by Julia Friedrich & Ibrahim Tannir
When planning a data management strategy, there are a number of questions that need to be considered: Do I need an immediate failover solution? What is an acceptable backup frequency? What is the maximum retention period for past data? How can I make sure my data is safe in case of a natural disaster?
Many helpful guides to creating such strategies exist (here's one example focusing on backup); following one should result in a detailed description of your strategy. But do not stop at the theory: create use cases playing out concrete scenarios for your specific business, and test your failover/backup/archive setup regularly to eliminate bottlenecks and flaws.
As a software-only supplier, one question we get asked frequently is “Which storage media should I use for my Archive/Backup?” This question, however, can never be answered categorically. The selection of the ideal storage hardware – which may well be a mix of different media – depends heavily on your individual requirements and strategy.
Moreover, as professional-grade storage usually comes with a considerable financial investment, it is imperative that the choice be considered meticulously to avoid spending money on the wrong kind of storage.
Although initial investments may well be a significant factor in your decision, make sure you consider cost over time – you might find that after a few years, it’s a completely different picture.
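To make the cost-over-time point concrete, here is a minimal sketch of how upfront and recurring costs add up. All prices below are invented placeholders, not vendor quotes; substitute your own figures:

```python
# Illustrative cumulative-cost comparison. The upfront and per-TB prices
# are hypothetical placeholders chosen only to show the crossover effect.
def cumulative_cost(upfront, per_tb_per_month, tb, months):
    """Total spend after `months` of storing `tb` terabytes."""
    return upfront + per_tb_per_month * tb * months

TB = 100  # assumed archive size
for months in (12, 36, 60):
    disk = cumulative_cost(upfront=5_000, per_tb_per_month=1.0, tb=TB, months=months)
    tape = cumulative_cost(upfront=12_000, per_tb_per_month=0.2, tb=TB, months=months)
    cloud = cumulative_cost(upfront=0, per_tb_per_month=4.0, tb=TB, months=months)
    print(f"{months:>2} months: disk ${disk:,.0f}  tape ${tape:,.0f}  cloud ${cloud:,.0f}")
```

With these made-up numbers, cloud is cheapest in year one but the most expensive option by year five, which is exactly the "completely different picture" effect described above.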
Another important aspect is expandability. With ever-growing media formats, you should make sure that your storage will be able to cope with future developments when it comes to capacity.
And what about availability? In case of disk failure or user error, do you need your data straight away, or can you afford a few hours or days to get it all back?
Can you store an extra copy off-site so it's safe even if your building is damaged by flood or fire, or simply in case of a power outage?
These are just a few aspects to consider, and that is why we cannot answer the original question definitively.
Having said that, what we can do is give you an overview of the three most common types of storage for professional data management – specifically disk, LTO tape and the cloud – and their advantages and disadvantages:
1. Disk
HDD (spinning disk) or SSD (solid-state). While SSD does not suffer from mechanical failure as much as spinning disk does, its price per TB is around four to five times higher than HDD's. Here's a comparison of both disk types:
Advantages:
- Low initial investment for the disks themselves
- Growing capacity
- Direct access to data
- Easily attached to the system
- Easily replaced in case of failure
- Can be pooled into large capacities with high throughput using additional technology (RAID, NAS)
- Multitude of vendors and sizes
Disadvantages:
- Relatively short lifespans, high failure rates
- Require additional, not exactly cost-effective hardware to pool capacity and achieve high throughput
- Require special conditions (cooling, clean air) to ensure proper operation
- Energy-intensive
- Space-intensive: relatively large volume per TB of storage
- Interfaces change frequently between generations, rendering older hardware unusable
- Mechanically sensitive and therefore not well suited for transport and offline storage
- Not reliable for long-term storage: a disk left on the shelf may not spin up again
- Cost-ineffective for offline and (not-so-)long-term storage
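To illustrate the pooling point above: combining disks with RAID trades raw capacity for redundancy. Here is a minimal sketch for RAID 6 (which survives two simultaneous disk failures); the disk counts used are just example values:

```python
def raid6_usable_tb(disk_count, disk_tb):
    """RAID 6 reserves two disks' worth of space for parity,
    so usable capacity is (n - 2) disks."""
    if disk_count < 4:
        raise ValueError("RAID 6 needs at least 4 disks")
    return (disk_count - 2) * disk_tb

# 12 x 8 TB disks give 96 TB raw but only 80 TB usable:
print(raid6_usable_tb(12, 8))
```

The gap between raw and usable capacity, plus the controller/NAS hardware itself, is part of the "not-so-cost-effective" overhead listed above.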
2. LTO Tape
Unlike video tape, LTO (Linear Tape-Open) is an open standard for magnetic data tape.
Advantages:
- Cost-effective in terms of storage capacity (lowest price/TB)
- Cost-effective in terms of running expenses (electricity, cooling, maintenance)
- Cost-effective in terms of physical space (highest GB per volume of space)
- Large capacity per media – 6 TB/tape native with LTO-7
- High sustained write speeds, exceeding typical spinning disk
- Less sensitive to mechanical damage thanks to no permanently spinning parts
- Easy transport and off-site storage
- 30-year shelf life (under appropriate conditions)
- Built-in encryption
- Built-in compression
- Highly redundant and reliable error detection and correction with inherent read-after-write verification
- Standardized, long-lived connection interfaces (SAS, Fibre Channel)
- Established technology with well-defined growth path for several future generations (new generation approx. every 2-3 years)
- Controlled by an open consortium of major industry players (IBM, HP, Quantum)
- Backward compatibility: drives read tapes from two generations back and write one generation back
Disadvantages:
- Requires a special drive or tape library
- High entry and initial costs for drive/library (but a quick break-even point)
- Linear, sequential storage technology plus physical loading into the drive: it may take several minutes to load a tape, locate the data, and commence retrieval
- Prone to wear and tear if re-used often (approx. 100x), and therefore better suited for long-term archive; tapes used for recurring backup should be replaced in approx. 3-year cycles
- Proper storage conditions (temperature/humidity) required to achieve a 10+ year shelf life
- Drives and libraries require periodic maintenance and cleaning
- Not as easily plugged in and put into operation as disk
- Requires proper/proprietary drivers and/or a technology-aware application
- An LTFS "standard" exists, but its support and maintenance vary between the individual members of the consortium
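To show how capacity and built-in compression interact when sizing a tape archive, here is a small sketch using the LTO-7 native capacity mentioned above. The 2.5:1 ratio is the vendor-quoted maximum; real-world ratios depend entirely on your data (already-compressed video gains almost nothing):

```python
import math

LTO7_NATIVE_TB = 6.0  # native (uncompressed) capacity per LTO-7 tape

def tapes_needed(archive_tb, compression_ratio=1.0):
    """Number of tapes required for an archive of `archive_tb` terabytes,
    given an assumed hardware-compression ratio."""
    effective_tb_per_tape = LTO7_NATIVE_TB * compression_ratio
    return math.ceil(archive_tb / effective_tb_per_tape)

print(tapes_needed(100))       # incompressible data: 17 tapes
print(tapes_needed(100, 2.5))  # best-case 2.5:1 compression: 7 tapes
```

When budgeting, it is safer to plan with the native figure and treat compression gains as a bonus.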
3. Cloud storage
Off-site storage via an internet/network connection, offered by large providers (Google, Amazon, Microsoft, etc.). Private (local) clouds, i.e. storage on your own hardware using cloud technology, exist, but incur the same cost and maintenance as local disk/tape storage, so they are not considered in this comparison.
Advantages:
- Extreme capacities
- No initial capital expenditure (pay-as-you-go, mostly monthly payments)
- Easily expandable or reducible
- Zero maintenance on the business side (no technicians/technical knowledge required)
- High reliability
- Availability of redundancy strategies
- Allows long-term storage
- Technology changes are transparent to, and do not affect, the customer
- Shareability of data
- Built-in encryption
Disadvantages:
- Perpetual expenses
- Slow access due to the nature of the connection (WAN)
- Requires dedicated, expensive network lines for faster transfer
- Not suitable for large amounts of data if quick or immediate retrieval is required (quick access is often charged at a premium)
- Requires specialized software for transferring and accessing data
- Data is out of the business's control: total dependency on the provider
- Not on premises: no WAN connection, no data
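To put the WAN-speed caveat in numbers, here is a rough back-of-the-envelope sketch of how long a full restore takes over a network line. The 80% utilization factor is an assumption covering protocol overhead and shared use:

```python
def restore_hours(data_tb, line_mbit_per_s, utilization=0.8):
    """Rough time (hours) to pull data back over a WAN link.
    `utilization` is an assumed effective fraction of the line rate."""
    bits = data_tb * 1e12 * 8                       # terabytes -> bits
    seconds = bits / (line_mbit_per_s * 1e6 * utilization)
    return seconds / 3600

# Restoring 10 TB over a dedicated 1 Gbit/s line: ~27.8 hours
print(f"{restore_hours(10, 1000):.1f} h")
```

More than a day for 10 TB on a fast line, before any provider-side retrieval delays, is why the cloud is a poor fit when immediate recovery of large data sets is required.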
So, in summary, those are the most important factors to weigh.
We hope that these points can support you in making an informed choice when selecting the best type of storage for your data management setup. Think short-term vs. long-term expenses, maintenance costs, expandability, etc., and don't make a rash decision.
Also, keep in mind that it probably makes sense to employ a combined strategy. For example, it may be more convenient to accommodate short-term changes in company policy with a cloud-based solution, and to move the data to LTO tape once those changes turn out to be long-term strategies.
Lastly, it will always be necessary to keep a certain amount of critical data on-site.
So focus on building an environment that incorporates the best of both (or possibly all three) worlds – for example a disk-to-disk-to-tape strategy.