The traditional enterprise storage market is declining, and there are several reasons why. Some of them are easier to identify than others, but one of the most interesting is that workloads, and hence storage requirements, are becoming increasingly polarized.
Storage as we know it, SAN or NAS, will become less relevant in the future. We’ve already had a glimpse of this with hyperconvergence, but that kind of infrastructure tries to balance all resources – sometimes at the expense of overall efficiency – and it is more compute-driven than data-driven. Data-intensive workloads have different requirements and need different storage solutions.
The Rise of Flash
All-flash systems are gaining in popularity and are more efficient than their hybrid and all-disk counterparts. Inline compression and deduplication, for example, are much more viable on a flash-based system than on others, making it easier to achieve better performance even from the smallest of configurations. This means doing more with less.
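To make the data-reduction point a bit more concrete, here is a conceptual sketch of inline deduplication – my own toy illustration, not any vendor’s actual implementation, with invented names and block sizes. Each incoming block is fingerprinted, and blocks whose fingerprint has already been seen are stored only once:

```python
import hashlib

class DedupStore:
    """Toy inline deduplication: store each unique block once, keyed by its hash."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block data (physical capacity)
        self.index = []    # logical write order -> fingerprint (logical capacity)

    def write(self, block: bytes) -> None:
        fp = hashlib.sha256(block).hexdigest()   # fingerprint the incoming block
        if fp not in self.blocks:                # only new content consumes capacity
            self.blocks[fp] = block
        self.index.append(fp)                    # the logical view always grows

    def physical_size(self) -> int:
        return sum(len(b) for b in self.blocks.values())

    def logical_size(self) -> int:
        return sum(len(self.blocks[fp]) for fp in self.index)


store = DedupStore()
for block in [b"A" * 4096, b"B" * 4096, b"A" * 4096]:   # one duplicate block
    store.write(block)
print(store.logical_size(), store.physical_size())       # 12288 logical vs 8192 physical
```

Reading deduplicated data back means lots of small random reads to reassemble it, which is cheap on flash and painful on spinning disks – that is essentially why this kind of inline data reduction is so much more practical on all-flash systems.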
At the same time, all-flash allows for better performance and lower latency and, even more importantly, that latency is much more consistent and predictable over time.
With the introduction of NVMe and NVMe-oF, protocols specifically designed for faster access to flash media (attached to the PCIe bus), latency will drop even further, to the order of hundreds of microseconds or less.
The Rise of Objects (and cloud, and scale-out)
At the same time, what I’ve always described as “Flash & Trash” is actually happening. Enterprises are implementing large-scale, capacity-driven storage infrastructures to store all their secondary data. I’m quite fond of object storage, but there are several ways of tackling it, and the common denominators are scale-out design, software-defined storage and commodity hardware to get the best $/GB.
Sometimes your capacity tier could be the cloud (especially for smaller organizations with small amounts of inactive data to store), but the concept is the same, as are the benefits. At the moment the best $/GB is still obtained with hard disks (or tape), but at the current rate of advancement in flash manufacturing, before you know it we’ll see large SSDs replacing disks in these systems too.
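As a minimal sketch of what using such a capacity tier looks like from the application side – assuming an S3-compatible object store and the boto3 library; the endpoint, bucket and credentials below are placeholders:

```python
import boto3

# Hypothetical S3-compatible endpoint and bucket; adjust to your environment.
s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Push a cold backup to the capacity tier...
with open("backup-2017-01.tar.gz", "rb") as f:
    s3.put_object(Bucket="secondary-data", Key="backups/backup-2017-01.tar.gz", Body=f)

# ...and read it back only when it is actually needed.
obj = s3.get_object(Bucket="secondary-data", Key="backups/backup-2017-01.tar.gz")
data = obj["Body"].read()
```

The application-level pattern is the same whether the bucket lives in a public cloud or on a scale-out object store in your own datacenter, which is why the $/GB discussion is largely independent of where the capacity tier physically sits.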
The next step
Traditional workloads are served well by this type of two-tier storage infrastructure, but it’s not always enough.
The concept of memory-class storage is surfacing more and more often in conversations with end users, and other CPU-driven techniques are taking the stage as well. Once again, the goal is getting results faster – before others do – if you want to improve your competitiveness.
With new challenges coming from real-time analytics, IoT, deep learning and so on, even traditional organizations are looking at new forms of compute and storage. You can see it from cloud providers too: many of them are tailoring specific services and hardware options (GPUs or FPGAs, for example) to target these new requirements.
The number of options is growing pretty quickly in this segment and the most interesting ones are software-based. Take DataCore and its Parallel I/O technology as an example (I recently wrote a paper about it). By parallelizing the data path and taking advantage of multicore CPUs and RAM, it’s possible to achieve incredible storage performance without touching any other component of the server.
This software uses available CPU cores and RAM as a cache to reorganize writes, avoiding any form of queuing and serving data faster. It radically changes the way you can design your storage infrastructure, completely decoupling performance from capacity. And, because it is software, it can also be installed on cloud VMs.
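To make the general idea more concrete – this is a generic write-back caching sketch of my own, not DataCore’s actual design – writes are absorbed and coalesced in RAM, while a background thread destages them to the slower persistent tier, so the application never waits in a device queue:

```python
import threading
import time

class WriteBackCache:
    """Generic illustration: absorb writes in RAM, coalesce per block, destage asynchronously."""

    def __init__(self, backend_write):
        self.backend_write = backend_write   # callable(block_id, data) -> persists data
        self.dirty = {}                      # block_id -> latest data (coalesced in RAM)
        self.lock = threading.Lock()
        self.flusher = threading.Thread(target=self._flush_loop, daemon=True)
        self.flusher.start()

    def write(self, block_id, data):
        # The caller returns immediately: the write lands in RAM, not in a device queue.
        with self.lock:
            self.dirty[block_id] = data      # overwrites of the same block are coalesced

    def _flush_loop(self):
        while True:
            time.sleep(0.1)                  # destage interval
            with self.lock:
                batch, self.dirty = self.dirty, {}
            for block_id, data in batch.items():
                self.backend_write(block_id, data)   # slow, capacity-oriented persistent tier


def slow_backend(block_id, data):
    time.sleep(0.01)                         # simulate a capacity-tier write
    print(f"persisted block {block_id} ({len(data)} bytes)")


cache = WriteBackCache(slow_backend)
for i in range(100):
    cache.write(i % 10, b"x" * 4096)         # 100 logical writes collapse into 10 destages
time.sleep(0.5)                              # give the flusher time to drain
```

In this toy example, 100 logical writes collapse into 10 physical destages. A real product is of course far more sophisticated (parallel workers per core, crash consistency, and so on), but the point is the same: front-end performance is decoupled from back-end capacity.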
A persistent storage layer is still necessary, but it will be inexpensive if based on the scale-out systems I’ve mentioned above. Furthermore, even though software like DataCore’s Parallel I/O can work with all existing applications, modern applications are now designed with the assumption that they may run on some sort of ephemeral storage, and when it comes to analytics we usually work with copies of data anyway.
Servers are storage
Software-defined scale-out storage usually means commodity x86 servers; the same is true for HCI, and very low latency solutions are heading towards a similar approach. Proprietary hardware can’t compete: it’s too expensive and evolves too slowly compared to the rest of the infrastructure. Yes, niches where proprietary systems remain a good fit will exist for a long time, but this is not where the market is going.
Software is what makes the difference… everywhere now. Innovation and high performance at low cost are what end users want. Solutions like DataCore’s do exactly that, making it possible to do more with less, but also to do much more, and quicker, with the same resources!
Closing the circle
Storage requirements are continuing to diversify and “one-size-fits-all” no longer works (I’ve been saying that for a long time now). Fortunately, commodity x86 servers, flash memory and software are helping to build tailored solutions for everyone at reasonable cost, making high-performance infrastructures accessible to a much wider audience.
Most modern solutions are built out of servers. Storage, as we traditionally know it, is becoming less of a discrete component and more blended into the rest of the distributed infrastructure, with software acting as the glue and making things happen. Examples can be found everywhere: large object storage systems have started implementing “serverless” or analytics features for massive data sets, while CPU-intensive and real-time applications can leverage CPU-data vicinity and internal parallelism through a storage layer which can be ephemeral at times… but screaming fast!
[Quick Disclaimer: DataCore is a client of mine]
Well, Mr. Signoretti has written a very thoughtful and forward-looking blog on the evolution of data storage. It bears reading and re-reading.
Traditional storage is called traditional because it has been around a long time. It has been around a long time because storage is a conservative business. Apps blow up all the time and people just re-launch them or maybe the supplier of the app issues an update or fix for the problem. No one gets hurt when an app goes belly-up. If your storage array fails to perform there is widespread panic in the organization. Storage arrays have been engineered to exacting performance and durability levels, which is why they are expensive to acquire and manage. That said, scalability is the downfall of traditional storage. Scaling up traditional storage is a real budget buster in the IT department because it is expensive, time-consuming, and labor intensive. Expanding and/or replacing traditional storage systems every 3-5 years is financially unsustainable in an era of continuous data growth.
Software-defined storage (SDS) running on commodity off-the-shelf (COTS) storage servers solves the scalability problem. Object storage easily handles the demands for scalability and low cost, while flash easily handles the demands for performance. All storage can now be divided into two parts: object storage for all data that is not transactional, and flash for all data that is. The amount of unstructured (non-transactional) data in an organization is very large compared to the amount of transactional data.
Flash and object-based storage have formed the foundation of a new storage architecture. And even though the origins of these technologies go back over 10 years, they have yet to see widespread adoption, due to the fear, uncertainty and doubt (FUD) of people habituated to working with traditional storage. All of this is understandable, but storage evolution waits for no one. New storage models will emanate from this architecture over time, because people will want more from their storage, and they will want it faster and at lower cost.