Yesterday, Veeam announced a $500M funding round. The company is privately held and profitable, with a pretty solid revenue stream coming from hundreds of thousands of happy customers. But, still, they raised $500M!
I didn’t see it coming, but if you look at what is happening in the market, it’s not a surprising move. The market valuations of companies like Rubrik and Cohesity are off the charts, and it is pretty clear that while they are spending boatloads of money to fuel their growth, they are also developing platforms that go well beyond traditional data protection.
Backup is boring
Backup is one of the most tedious, yet critical, tasks to be performed in the IT space. You need to protect your data and save a copy of it in a secure place in case of a system failure, human error or worse, such as natural disasters and cyberattacks. But as critical as it is, the differences between backup solutions are getting thinner and thinner.
Vendors like Cohesity got it right from the very beginning of their existence. It is quite difficult, if not impossible, to consolidate all your primary storage systems in a single large repository, but if you concentrate backups on a single platform then you have all of your data in a single logical place!
In the past, backup was all about throughput and capacity, with very little CPU power, and media devices were designed for a few sequential data streams (tapes and deduplication appliances are perfect examples). Why are companies like Rubrik and Cohesity so different, then? Well, from my point of view, they designed an architecture that enables you to do much more with backups than was possible in the past.
Next-gen backup architectures
Adding a scale-out file system to this picture was the real game changer. Every time you expand the backup infrastructure to store more data, the new nodes also add CPU power and memory capacity. With all these resources at your disposal, and the data that can be collected through backups and other means, you’ve just built a big data lake… and with all that CPU power available, you are just one step away from transforming it into a very effective big data analytics cluster!
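To make the scaling argument a bit more concrete, here is a minimal sketch in Python. The `Node` and `Cluster` classes and the node sizes are hypothetical and do not describe any vendor's hardware; the only point is that in a scale-out design, compute and memory grow together with capacity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One hypothetical scale-out backup node (sizes are illustrative)."""
    capacity_tb: int
    cpu_cores: int
    memory_gb: int


@dataclass
class Cluster:
    """A toy cluster whose resources are the sum of its nodes."""
    nodes: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        # Expanding for capacity also brings CPU and memory along.
        self.nodes.append(node)

    def totals(self) -> dict:
        return {
            "capacity_tb": sum(n.capacity_tb for n in self.nodes),
            "cpu_cores": sum(n.cpu_cores for n in self.nodes),
            "memory_gb": sum(n.memory_gb for n in self.nodes),
        }


# Assumed node size, chosen only for the example.
EXAMPLE_NODE = Node(capacity_tb=48, cpu_cores=16, memory_gb=128)

cluster = Cluster()
for _ in range(3):
    cluster.add_node(EXAMPLE_NODE)
before = cluster.totals()

cluster.add_node(EXAMPLE_NODE)  # bought to store more backups...
after = cluster.totals()        # ...but compute grew as well
```

Compare this with a tape library or a deduplication appliance, where adding capacity usually leaves the available compute more or less unchanged.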
From data protection to analytics and management
Starting from this background, it isn’t difficult to explain the shift that is happening in the market and why everybody is talking more about the broader concept of data management rather than data protection.
Some of you may argue that it’s wrong to associate data protection with data management, and that in this particular case the term data management is misleading and inappropriately applied. But there is much to be said about it, and it could very well become the topic of another post. Also, I suggest you take a look at the report I recently wrote about unstructured data management to get a better understanding of my point of view.
Data management for everybody
Now that we have the tool (a big data platform), the next step is to build something useful on top of it, and this is the area where everybody is investing heavily. Even though Cohesity is leading the pack and started showing the potential of this type of architecture years ago with its analytics workbench, the race is open and everybody is working on out-of-the-box solutions.
In my opinion these out-of-the-box solutions, which will be nothing more than customizable big data jobs with a nice and easy-to-use UI on top, will put data management within reach of everyone in your organization. This means that data governance, security and many business roles will benefit from it!
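As a rough illustration of what a "customizable big data job" could look like under the hood, here is a small Python sketch. Everything in it is an assumption made for the example: the `FileRecord` catalog format, the `Job` class, and the `stale_large_files` template are hypothetical and are not taken from any product mentioned in this post.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata kept for each file found in the backups."""
    path: str
    owner: str
    size_bytes: int
    age_days: int


@dataclass(frozen=True)
class Job:
    """A named, reusable filter over the backup catalog."""
    name: str
    matches: Callable[[FileRecord], bool]


def run_job(job: Job, catalog: Iterable[FileRecord]) -> List[str]:
    # Return the paths of all records the job selects, in catalog order.
    return [record.path for record in catalog if job.matches(record)]


def stale_large_files(min_size_bytes: int, min_age_days: int) -> Job:
    # A job "template": the two parameters are what a UI would expose.
    return Job(
        name=f"stale files >= {min_size_bytes} B, >= {min_age_days} days",
        matches=lambda r: (r.size_bytes >= min_size_bytes
                           and r.age_days >= min_age_days),
    )


# Made-up catalog entries, for demonstration only.
CATALOG = [
    FileRecord("/finance/2015-ledger.bak", "alice", 5_000_000_000, 1200),
    FileRecord("/hr/contracts.zip", "bob", 200_000_000, 900),
    FileRecord("/eng/build-cache.tar", "carol", 8_000_000_000, 12),
]

report = run_job(stale_large_files(1_000_000_000, 365), CATALOG)
```

In a sketch like this, the governance or security person only picks a template and fills in two numbers; the platform does the rest against data it already holds.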
A quick solution roundup
As mentioned earlier, Cohesity is in a leading position at the moment and they have all the features needed to realize this kind of vision, but we are just at the beginning and other vendors are working hard on similar solutions.
Rubrik, which has a similar architecture, has chosen a different path. They’ve recently acquired Datos IO and started offering NoSQL DB data management. Even though NoSQL is growing steadily in enterprises, this is a niche use case at the moment and I expect that sooner or later Rubrik will add features to manage data they collect from other sources.
Not long ago I spoke highly of Commvault, and Activate is another great example of their change in strategy. This is a tool that can be a great companion to their backup solution, but can also stand alone, enabling the end user to analyze, get insights and take action on data. They’ve already demonstrated several use cases in fields like compliance, security, e-discovery and so on.
Getting back to Veeam… I really loved their DataLabs and what it can theoretically do for data management. Still not at its full potential, this is an orchestrator tool that allows you to take backups, create a temporary sandbox, and run applications against them. It is not fully automated yet, and you have to bring your own application. If Veeam can make DataLabs ready to use with out-of-the-box applications, it will become a very powerful tool for a broad range of use cases, including e-discovery, ransomware protection, index & search and so on.
These are only a few examples of course, and the list is getting longer by the day.
Closing the circle
Data management is now key in several areas. We’ve already lost the battle against data growth and consolidation, and at this point finding a way to manage data properly is the only way to go.
With ever larger storage infrastructures under management, and sysadmins who now have to manage petabytes instead of hundreds of terabytes, there is a natural shift towards automation for basic operations, and the focus is more on what is really stored in the systems.
Furthermore, with the increasing amount of data, expanding multi-cloud infrastructures, demanding new regulations like GDPR, and ever-evolving business needs, the goal is to maintain control over data no matter where it is stored. And this is why data management is at the center of every discussion now.