The fifth generation of the LTO (Linear Tape-Open) standard is known for its adoption of LTFS (Linear Tape File System). The goal of incorporating LTFS was to broaden LTO’s range of use, increase flexibility and facilitate the exchange and transport of data. It gives users disk-like browsing and OS-level file access without the need for dedicated backup software. But while LTFS does represent a major milestone for tape-based archiving, much work remains to be done before this technology can be considered fully developed.
Tape as a storage medium has historically had its share of detractors and opponents, and the need for positive information has likely been a factor in spurring the development of LTFS. This article will touch on the benefits of the marriage between LTO and LTFS, as well as the issues that still remain.
Tape and Its Reputation
One interesting yet lesser-known fact about LTO is that the standard originated from combining the best features of several preceding technologies: the single-reel cartridge from DLT; the chip within the cartridge from AIT; the servo tracks from SLR; and error correction from DDS. This “Best of all Worlds” product is something of a rarity in the IT world, and created in LTO a highly professional and secure storage medium that has proven to be unmatched in cost and efficiency, particularly for long-term archiving and off-site storage. Yet tape is still much maligned, dogged by misconceptions concerning reliability, speed and cost.
Now let's have a closer look at LTFS. What exactly is LTFS, how is it implemented and what is the benefit for the user? LTO-5 introduced an option to partition tape media. Partition 0 (37GB in size) holds the index and metadata, whereas partition 1 holds the actual data as well as a backup copy of the index.
When a tape is mounted, the index is read from partition 0, loaded into the RAM of the connected computer and copied into a temp directory. There the index is updated, and it is then written back to partition 0. Additionally, a backup copy of the index is written after the actual stored data (on partition 1). The index itself is an XML structure (in UTF-8) that holds the file names and directories of all the files written to tape. For safety, the index is thus kept twice on the tape.
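To make the idea of an XML index more concrete, here is a minimal Python sketch that walks a heavily simplified, made-up index and lists the files it describes. The element names (`ltfsindex`, `directory`, `contents`, `file`, `name`, `length`) are only loosely modelled on the LTFS format and are assumptions of this example; a real index carries considerably more information and should be checked against the LTFS format specification.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative index. NOT a complete or validated LTFS index;
# all element names and values here are assumptions made for this sketch.
SAMPLE_INDEX = """
<ltfsindex version="2.0.0">
  <directory>
    <name>ARCHIVE_001</name>
    <contents>
      <file><name>readme.txt</name><length>1024</length></file>
      <directory>
        <name>footage</name>
        <contents>
          <file><name>clip01.mov</name><length>734003200</length></file>
        </contents>
      </directory>
    </contents>
  </directory>
</ltfsindex>
"""


def list_files(directory, prefix=""):
    """Walk one <directory> element and return (path, size_in_bytes) pairs."""
    path = prefix + "/" + directory.findtext("name")
    entries = []
    contents = directory.find("contents")
    if contents is None:
        return entries
    for child in contents:
        if child.tag == "file":
            size = int(child.findtext("length"))
            entries.append((path + "/" + child.findtext("name"), size))
        elif child.tag == "directory":
            # Recurse into subdirectories, carrying the path built so far.
            entries.extend(list_files(child, path))
    return entries


root = ET.fromstring(SAMPLE_INDEX.strip())
files = list_files(root.find("directory"))
for path, size in files:
    print(f"{size:>12}  {path}")
```

Because the whole directory tree lives in this one structure, a driver can present the contents of a tape without reading any of the file data itself.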
(Almost) Like a Disk
When an LTFS tape (LTO-5) is inserted into a drive, it mounts and displays its contents just like a disk (assuming the correct driver is installed), and the user sees the files listed on the tape. This is certainly an important improvement, in that tapes written with older backup software did not mount by themselves. Mounting them would have been of little use in any case: every vendor wrote its own proprietary format or container files to tape, and since those container files held all the attributes, ACLs, extended Finder attributes, etc., cross-platform compatibility depended on the vendor's format.
At present LTFS still has its issues with special characters in file names, as well as in the file path. Extended attributes are not always supported between platforms, and this generates warning messages that can sometimes be ambiguous. These conditions are largely dependent upon which OS the LTO medium was written with. To compensate, the user would have to document this information for data exchange, making it a cumbersome process.
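One way to reduce surprises before a tape changes hands is to screen file names ahead of writing. The sketch below is only an illustration: it flags the characters that Windows rejects in file names, plus ASCII control characters. Treating that set as the list of "risky" characters for LTFS exchange is an assumption of this example, not a rule taken from any LTFS driver.

```python
# Characters that Windows rejects in file names. Using this set as a
# cross-platform "risky" list is an assumption of this sketch.
RISKY_CHARS = set('<>:"/\\|?*')


def risky_characters(file_name):
    """Return the sorted characters in file_name that may cause trouble."""
    return sorted({ch for ch in file_name if ch in RISKY_CHARS or ord(ch) < 32})


def check_names(file_names):
    """Map each problematic name to the characters that should be reviewed."""
    report = {}
    for name in file_names:
        found = risky_characters(name)
        if found:
            report[name] = found
    return report


# Hypothetical file names, for illustration only.
report = check_names(["interview_final.mov", "notes: draft?.txt", "budget 2012.xls"])
for name, chars in report.items():
    print(f"{name!r}: review {chars}")
```

A report like this can be kept with the tape, which covers part of the documentation burden described above.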
Write, Read, Delete
To write a file to tape, you drag it onto the drive symbol or the window of the tape, just as you would with a disk. But this is where the similarity ends. Since tape is a linear medium, access to files works completely differently than with disks. Files are always appended after the data previously written to the data partition. Deleting a file does not free up space; to reclaim it, all the remaining files have to be read off the tape and written back again. Just reading a fully populated 1.5TB LTO-5 tape can take up to three hours. Writing back to tape can take just as long.
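The append-only behaviour can be illustrated with a toy model. This is a sketch under simplifying assumptions (sizes in arbitrary units, no index overhead, hypothetical class and method names) and not how an LTFS driver is implemented; it only shows why deleting a file leaves the free space unchanged until the remaining files are rewritten.

```python
class AppendOnlyTape:
    """Toy model of a linear medium: data is only ever appended."""

    def __init__(self, capacity):
        self.capacity = capacity  # total capacity, arbitrary units
        self.written = 0          # space consumed on the medium so far
        self.index = {}           # visible files: name -> size

    def write(self, name, size):
        if self.written + size > self.capacity:
            raise IOError("tape full")
        self.written += size      # new data always goes to the end
        self.index[name] = size

    def delete(self, name):
        # Only the index entry disappears; the data stays on the tape.
        del self.index[name]

    def free_space(self):
        return self.capacity - self.written

    def reclaim(self):
        """Return a fresh tape holding only the files still in the index."""
        fresh = AppendOnlyTape(self.capacity)
        for name, size in self.index.items():
            fresh.write(name, size)
        return fresh


tape = AppendOnlyTape(capacity=1000)
tape.write("a.mov", 400)
tape.write("b.mov", 300)
tape.delete("a.mov")
print(tape.free_space())   # still 300: deleting freed nothing
fresh = tape.reclaim()
print(fresh.free_space())  # 700 after rewriting the surviving file
```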
For users accustomed to the non-linear access of a hard disk, USB stick or SSD, this linear process can be cumbersome. Marketing tape-based media as comparable to hard disks in this respect leads to unnecessary misunderstanding and disappointment. For example: “The tape can be utilized just like a hard disk or other removable media, including directory tree structures.” (www.trustlto.com). The issue of access rights further complicates things. Since the index is cached (in /tmp/ltfs) while the tape is in the drive, the user has to have access rights to its directory, as well as for the mount point of the LTO drive. (Oracle LTFS)
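A quick check of these access rights can save a failed mount. The following sketch assumes the index cache location mentioned above (`/tmp/ltfs`) and a hypothetical mount point (`/mnt/ltfs`); actual paths depend on the driver and platform, and the function name is made up for this example.

```python
import os


def check_ltfs_access(index_cache_dir, mount_point):
    """Return a list of problems; an empty list means access looks fine."""
    problems = []
    for label, path in (("index cache", index_cache_dir), ("mount point", mount_point)):
        if not os.path.isdir(path):
            problems.append(f"{label} {path} does not exist or is not a directory")
        elif not os.access(path, os.R_OK | os.W_OK | os.X_OK):
            problems.append(f"{label} {path} is not fully accessible to this user")
    return problems


# /tmp/ltfs is the cache location named in the text; /mnt/ltfs is hypothetical.
for problem in check_ltfs_access("/tmp/ltfs", "/mnt/ltfs"):
    print(problem)
```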
Search and Find
As with CDs, DVDs or disks, each LTFS tape holds a directory of all files contained on the medium. When searching for a file that might be located on one of several tapes, each of those tapes must be individually placed in the drive, and its index read before the directory can be displayed. Depending on the number of files on the tape, this process can take several minutes. If several tapes are to be screened, this can add up to half an hour or more just to locate a file. The actual reading process of the file also requires time, but is relatively fast at 140MB/s (assuming the host computer can sustain this data rate).
Traditional backup software like Archiware's PresSTORE has a clear advantage over this procedure. All written files and versions are stored in an index that can be browsed or searched by the user at any time. When the desired file is located, the respective tape is called up and positioned at the exact location of the file, since this information is also stored.
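The arithmetic behind these figures is simple enough to sketch. The 140 MB/s rate comes from the text; the number of tapes, the minutes needed per tape and the file size are hypothetical inputs chosen for illustration, and 1.5 TB is the native capacity of an LTO-5 cartridge.

```python
DRIVE_RATE_MB_PER_S = 140  # streaming rate quoted in the text


def read_seconds(size_mb, rate_mb_per_s=DRIVE_RATE_MB_PER_S):
    """Time to stream size_mb megabytes at the given sustained rate."""
    return size_mb / rate_mb_per_s


def search_minutes(tape_count, minutes_per_tape):
    """Worst case: every tape must be loaded and its index read in turn."""
    return tape_count * minutes_per_tape


# Hypothetical: 8 tapes, 4 minutes each to load, mount and read the index.
print(search_minutes(8, 4))            # 32 minutes before the file is located
print(read_seconds(10_000))            # a 10 GB file, in seconds
print(read_seconds(1_500_000) / 3600)  # a full 1.5 TB tape, in hours
```

The sketch ignores seek time within a tape, so the actual figures are likely to be somewhat higher.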
One factor in favor of LTFS is that it is vendor-independent, since no backup software is required for reading and writing to tape; only driver software is needed. But it is here where the term “vendor-independent” can be somewhat misleading. Each of the three LTO drive manufacturers (HP, IBM and Quantum) creates its own driver software. Since each company supports OS X, Windows and Linux, there are altogether nine different drivers. How they will be kept up to date over the years is unknown.
Essentially, dependency on backup software vendors has been traded for dependency on drive manufacturers. Backup software vendors who support tape have invested heavily in gaining the necessary know-how and market share, and software development is at the core of their business. In contrast, drive manufacturers have the majority of their business invested in traditional storage solutions that do not use LTFS, which gives them little incentive to develop and update their LTFS software. Users who have invested in their product can easily end up stuck with an outdated OS version or driver, losing access to their data. This recently happened when LTFS moved from version 1.0 to version 2.0 and new media would not even mount with older drivers (IBM troubleshooting).
On a smaller scale, administering a few tapes and putting them in the drive to locate a file seems manageable. As the number of tapes increases this process becomes increasingly unwieldy. If seek time is relevant, LTFS can only be used for archiving a small number of tape volumes. It is certainly not feasible on a larger scale in a professional application.
Traditional archive software like PresSTORE P4 compiles a catalog of all tapes that are part of an archive. This catalog can be browsed and searched anytime, supported with media previews to streamline the process. Searching by file name, date of archiving, path or metadata (even individually formulated) is standard. Any file can be located within seconds. Multiple files can be combined in a restore selection. The physical tape is only called up by the software when starting the actual restore process, and in a tape library setting the tape will be mounted in the drive automatically.
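The principle of such a catalog can be shown in a few lines. This is a minimal sketch of the idea only, with hypothetical class names, tape labels and positions; it is not a description of how PresSTORE or any other product stores its catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    path: str         # original path of the archived file
    tape_label: str   # tape that holds it (hypothetical labels below)
    position: int     # position on that tape, so the drive can seek directly
    archived_on: str  # ISO date of archiving


class Catalog:
    def __init__(self):
        self.entries = []

    def add(self, entry):
        self.entries.append(entry)

    def search(self, text):
        """Case-insensitive substring search over paths; no tape is touched."""
        needle = text.lower()
        return [e for e in self.entries if needle in e.path.lower()]

    def tapes_needed(self, selection):
        """Tapes to load for a restore selection, each listed once, sorted."""
        return sorted({e.tape_label for e in selection})


catalog = Catalog()
catalog.add(CatalogEntry("/projects/spot_a/final_cut.mov", "TAPE01", 1200, "2012-03-01"))
catalog.add(CatalogEntry("/projects/spot_b/final_cut.mov", "TAPE07", 88, "2012-04-15"))
catalog.add(CatalogEntry("/projects/spot_b/notes.txt", "TAPE07", 90, "2012-04-15"))

hits = catalog.search("FINAL_CUT")
print([(e.tape_label, e.position) for e in hits])
print(catalog.tapes_needed(hits))
```

The search runs entirely against data held on disk; a tape is requested only once the restore selection is final.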
With its emphasis on direct attached storage (DAS) at a single workstation, LTFS seems to underestimate LTO technology and its potential. For the user, the advantages of LTFS are more than offset by its high complexity and limited ease of use. With “datacenter on the desktop” and new technologies like Thunderbolt, there are additional challenges for the professional workstation user in the very near future. The more clearly those are presented and the more transparent they are, the more they will be embraced by users.
Who LTFS Is For
Considering all the aforementioned factors, attributes and limitations, it is clear that LTFS is only appropriate in smaller computing environments. Nevertheless, maintaining a consistent archive structure and documentation to facilitate file location and retrieval is still crucial in all environments. The additional security of redundant, off-site backup is also critical. The total number of tapes should still be manageable manually, and it is thus clear that long-term archiving with LTFS is not advisable at this point. While LTO-5 is an interesting medium for data transport due to its high data density and robust technology, driver development and future compatibility over the long term are still unknown factors.
LTFS is still something of a novelty for LTO tape. At this stage neither the idea nor the implementation is convincing. It bears many of the attributes of a professional medium, but with a somewhat ill-fitting feature set for most users. Communicating the real advantages of LTO tape more effectively and to a broader audience would go a long way toward attracting new users. One approach would be to emphasize the unrivaled TCO of tape versus disk. Tape drives and libraries could be designed to be more attractive, with better desktop compatibility for new users and applications. There is much left to do.