I’ve never been fond of VTLs (Virtual Tape Libraries). I like deduplication, but it quickly became a feature rather than a product. And lately we all want more out of our backups, don’t we?
Data Domain did a great job of building a brilliant deduplication-based appliance back in 2001, but that was 2001. And, while it lasted, being able to write backups faster than usual while taking advantage of great space efficiency was a good thing… but time passes and alternatives are now easy to find. This makes the VTL less appealing than in the past (something lately visible in EMC’s financial results too).
Another technology I’ve always been fond of is ZFS. It’s also super easy to build a ZFS-based box! You can buy ready-to-run appliances or build one from scratch (with OpenSolaris/Illumos, Ubuntu and probably other OSes). With off-the-shelf commodity hardware you can build a 4U box with 45 or 60 disks. We’re talking about up to 480TB raw (with 8TB disks, before RAID, compression and dedupe), which could easily become three times that with compression and dedupe (almost 1.5PB!). It’s a lot of space and it comes at a low cost.
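The arithmetic above can be sketched in a few lines; note that the 8TB drive size and the 3x data-reduction ratio are the round numbers assumed in the text, not measurements from a real pool:

```python
# Back-of-the-envelope capacity math for a 4U, 60-bay ZFS box.
# 8TB drives and a 3x combined compression + dedupe ratio are the
# round numbers assumed above, not measured values.
DISK_TB = 8
BAYS = 60
REDUCTION_RATIO = 3  # optimistic combined compression + dedupe

raw_tb = DISK_TB * BAYS                  # 480 TB raw, before RAID overhead
effective_tb = raw_tb * REDUCTION_RATIO  # 1440 TB, i.e. almost 1.5 PB

print(f"raw: {raw_tb} TB, effective: ~{effective_tb / 1000:.2f} PB")
```

Real usable capacity would of course be lower once you subtract RAID-Z parity and spares, but even pessimistic assumptions leave an enormous amount of cheap space.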
Do you have a preference for Microsoft? Not a problem – do it with Windows Server. It works like a charm thanks to Storage Spaces, dedupe and many other features that do the same job as a ZFS-based box (probably even better, if you think about all the fancy things you can do around SMB3 and SOFS).
What about Linux? Do it with Ceph. Building a scale-out cluster could be a little trickier than a single host, but in this case you can have much more scalability and several additional features.
Either way, you can save a lot of money while achieving (very) good results and taking full advantage of a solution built out of commodity hardware and an operating system. A VTL is more efficient, you say? I’d like to see some proof of that… especially proof that puts TCO, TCA and real-world backup jobs on the same table.
But yes, I agree that this is not enough sometimes…
Take the VTL for what it is
At the end of the day a VTL is just a dumb data repository; you want it reliable, cost-effective, scalable, easy to use and efficient. What about a different approach then? Object storage is cheaper than a VTL, has better features in terms of e-vaulting (aka replication, in this case), TCO, reliability and durability, and can scale much further than any VTL.
An object store is more flexible than a box which supports only a few protocols for a specific use case. You can go for the “traditional” gateway approach (like NetApp AltaVault, for example), which gives you a VTL-like front-end with all the flexibility of an object store at the back-end, or you can write directly to an S3 repository. The latter is becoming quite common in enterprise backup software and is supported by many object storage vendors.
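Writing backups straight to an object repository also pairs naturally with deduplication, because objects can be addressed by their content. Here is a toy sketch of that idea — the class and method names are mine for illustration, not any real vendor SDK or the S3 API:

```python
import hashlib


class ToyObjectStore:
    """Minimal content-addressed store: an object's key is the SHA-256
    of its contents, so identical blobs are stored exactly once.
    A toy sketch of the dedupe-friendly model, not a real S3 client."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)  # duplicate puts cost nothing
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    @property
    def stored_bytes(self) -> int:
        return sum(len(b) for b in self._blobs.values())


store = ToyObjectStore()
k1 = store.put(b"full backup, monday")
k2 = store.put(b"full backup, monday")  # same content, same key
print(k1 == k2, store.stored_bytes)     # only one copy is actually kept
```

Real backup software chunks data and keeps an index on top of a scheme like this, but the principle is the same: the flat key/value model of object storage makes dedupe and replication straightforward.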
Scalability is no longer an issue and the storage repository can do more than backups only… much more. But if this is not enough for you, there is a third way to rethink VTLs today.
Or you can ask for more than a VTL
You prefer smart over dumb? Why not put everything together? Some new startups, like Cohesity for example, have totally redefined the concept of secondary storage and data protection – and, consequently, the idea of the VTL too.
In fact, this type of appliance can ingest data through its integrated backup system or by leveraging backup software like Veeam. But contrary to what happens with a dumb storage system like a VTL, data ingestion is only the first step of a much more complete process.
Once data is in the system it’s possible to search it directly in a Google-like fashion, analyze it and make copies for other uses (test/dev, for example). The options are plentiful, as are the potential savings brought about by consolidating, optimizing and centralizing several secondary workloads. These appliances can also leverage the cloud for tiering, archiving and disaster recovery. At the end of the day we are dealing with another level of efficiency, one that leads to faster, integrated backups, better data management and quicker restores.
Closing the circle
VTL is dead! Just kidding… 🙂 But VTLs are becoming less appealing for modern infrastructures. They simply no longer make much sense… though please take this comment with a grain of salt, as I’ve never been fond of VTLs (when I was a Sun Microsystems VAR, I was the greatest fan of the Sun Fire X4500, and it always served my customers as well as the best of VTLs 😉).
VTLs don’t usually scale enough, they are not efficient enough, and they are not flexible enough. Startups like Cohesity, on the other hand, have coined the term Hyperconverged Secondary Storage. I don’t know if they have chosen the right words to describe it, but I like what they are doing. When you think about HCI you think about storage and compute collapsed together, and end users like it because they want to simplify their infrastructure. Secondary storage is the same, but instead of storage and compute it is about collapsing data and data services together to simplify data management.
Disclaimer: I have recently written extensively about VTL alternatives, in one way or another. This is part of my “Flash & Trash” idea and many vendors have asked me to elaborate on it. In fact, I’ve worked for all the vendors mentioned in this article (and others), including Cohesity for which I published a new paper on this precise topic a few days ago.