If you work with storage and virtualization, you probably already know just how important storage is for infrastructure design. Taking some basic features like availability and reliability for granted, the most important characteristics are scalability, manageability and integration with the upper software layers.
I’m not a fan of all these “software defined” buzzwords, but something is really happening here and (remember Nicira?) in this case too the most interesting things are happening on the OpenStack and startup side. At the same time, you can see a “next generation traditional” approach that has a lot to say.
Virtualization and Storage
If you have a very capable storage system but it is not integrated with the hypervisor, then almost all its capabilities are useless. The integration between storage and the hypervisor is done through plug-ins and APIs. VMware, for example, provides plenty of APIs, and the storage vendor can add array management functionality to vCenter through plug-ins. Nowadays, almost all vendors can show you a certain level of integration between VMware and their products.
Small, and maybe medium, virtualized infrastructures can rely on traditional shared storage but, as the numbers grow, storage becomes a real pain. It’s not only a performance issue: automation, management and provisioning could become the real problem… especially if we need many arrays to serve a single big infrastructure.
Cloud, at least for the numbers involved, can only worsen this.
Cloud and Storage
If you look closely at storage solutions for cloud infrastructures you’ll find two radically different approaches that share a common basic design: scale-out. They are also converging on Ethernet as the transport medium, and iSCSI seems to be the preferred communication protocol. The main difference lies in the way storage is organized: distributed or shared.
In the first case we have some startups working on very radical solutions that are very comparable to software defined networking: the final goal of software defined networking (and storage?) is to split the “control plane” from the “data plane”. The control plane (all the management and provisioning stuff, data protection, IO management, snapshots, replicas, and so on) goes into the servers, while the data plane (where data is physically stored) is commodity hardware (for example, disks inside the servers or dumb shared storage). The control plane is also responsible for all the tasks needed to manage data coherency and distribution, resiliency and availability in what has become a scale-out storage cluster. Actually, this isn’t an official definition of software defined storage, but it looks like the most coherent one to me.
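To make the control plane / data plane split a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions and not how any specific product works: the class names (`DataNode`, `ControlPlane`), the hash-based placement and the fixed replica count are all hypothetical. The point is simply that the data nodes stay dumb while every decision about placement and resiliency lives in software.

```python
import hashlib


class DataNode:
    """Data plane (hypothetical): a dumb store that keeps whatever it is handed."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, key, data):
        self.blocks[key] = data

    def read(self, key):
        return self.blocks.get(key)


class ControlPlane:
    """Control plane (hypothetical): decides placement and replication, stores no data."""

    def __init__(self, nodes, replicas=2):
        if replicas > len(nodes):
            raise ValueError("replicas cannot exceed the number of nodes")
        self.nodes = list(nodes)
        self.replicas = replicas

    def _placement(self, key):
        # Hash the key to pick a starting node, then use the following nodes
        # in order as additional replicas (an assumed, deliberately naive policy).
        start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.replicas)]

    def write(self, key, data):
        for node in self._placement(key):
            node.write(key, data)

    def read(self, key):
        # Return the first surviving replica, so one lost copy is tolerated.
        for node in self._placement(key):
            data = node.read(key)
            if data is not None:
                return data
        return None


nodes = [DataNode(f"node{i}") for i in range(3)]
cluster = ControlPlane(nodes, replicas=2)
cluster.write("vol1/block0", b"hello")
print(cluster.read("vol1/block0"))
```

A real system would also have to rebalance data when nodes join or leave, which this naive modulo placement does not attempt.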
In practice, you can build a 100% scale-out storage array from the ground up embedded into the servers’ infrastructure. Each node of your cluster has CPU, RAM and local shared disks; each time you need storage you add a new node to the cluster.
The big advantage of this approach is that you get a truly “software defined” storage without scalability problems and with huge overall performance at a low cost but, on the other hand, joining storage and CPU power in the same node has its drawbacks.
The first problem that comes to mind is that if you have unexpected growth only in CPU or only in space you are forced to waste resources: your building block should be as standard as possible, so every node you add brings both. The same problem will arise again with each new generation of servers (or disks). In the end you’ll get an unbalanced cluster, with latency and performance discrepancies in both CPU and storage.
Other concerns, as a direct consequence of what is described above, are about the performance of the single data volume (LUN): the underlying complexity introduced by this kind of architecture, with CPU used for both storage and compute tasks, coupled with different types (generations) of nodes connected over Ethernet links, makes LUN performance and latency unpredictable. In short, service level profiling and QoS mechanisms are very hard to implement and manage.
Potentially, this kind of distributed storage is the perfect fit for commodity cloud infrastructures or, at the opposite end, it could be good for small enterprise infrastructures without particularly demanding requirements.
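A quick back-of-the-envelope calculation shows how this waste appears. The sketch below is only illustrative: the function name `nodes_needed` and all the numbers (16 cores and 20 TB per node, a demand of 64 cores and 400 TB) are made up for the example.

```python
import math


def nodes_needed(cpu_demand, tb_demand, cores_per_node, tb_per_node):
    """Return (nodes, idle_cores, idle_tb) for a cluster of identical nodes.

    With a standard building block, the scarcer resource dictates the node
    count and the other resource is bought whether it is needed or not.
    """
    by_cpu = math.ceil(cpu_demand / cores_per_node)
    by_space = math.ceil(tb_demand / tb_per_node)
    nodes = max(by_cpu, by_space)
    idle_cores = nodes * cores_per_node - cpu_demand
    idle_tb = nodes * tb_per_node - tb_demand
    return nodes, idle_cores, idle_tb


# Hypothetical workload: capacity grows much faster than compute.
print(nodes_needed(cpu_demand=64, tb_demand=400, cores_per_node=16, tb_per_node=20))
# → (20, 256, 0)
```

In this made-up case four nodes would cover the CPU demand, but the capacity demand forces twenty, leaving 256 cores idle.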
More traditional shared storage is the other way to go. Actually, defining it as traditional is quite wrong because I’m thinking about 100% scale-out systems based on commodity hardware. As you can easily imagine, this approach gives up the savings of putting disks into the servers. On the other hand, LUN behavior is much more predictable and QoS is possible. In many cases this storage design also allows you to adopt data reduction techniques (deduplication, for example). Shared storage is also more manageable and it can grow as needed in space or in performance.
A step further
A few months ago OpenStack started a new project called Cinder. Cinder is an attempt to separate the Nova volume block storage service into its own project. Obviously, this separation will bring a lot of advantages in terms of future improvements and will speed up the development process. In fact, the main goal of Cinder is to provide standard APIs for provisioning and management, plus drivers to deal correctly with different storage systems; now it’s much easier to integrate storage with OpenStack.
Almost all vendors have already jumped on the Cinder bandwagon (market leaders like NetApp and EMC are developing their own drivers too) but the most active contributors, as usual, are startups.
Potentially massive scale-out storage is not for everyone. There aren’t many solutions to evaluate at the moment, and some of them should be considered immature. But, in my personal opinion, there are a few startups that deserve a look: Inktank, Scality, Coraid and SolidFire. You’ll find a lot of documentation on the companies’ websites about architectures and philosophies that will help you get a full picture of what I wrote above.
They are certainly not the only ones, but they are enough to give you a really interesting view of what is happening in the storage landscape for cloud infrastructures.
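The “standard API plus per-vendor driver” idea can be sketched in a few lines of Python. To be clear, this is not Cinder’s actual driver interface: `VolumeDriver`, `InMemoryDriver` and `VolumeService` are hypothetical names I made up to show the pattern, and a real driver would talk to an array instead of a dictionary.

```python
from abc import ABC, abstractmethod


class VolumeDriver(ABC):
    """Hypothetical contract every storage backend must implement."""

    @abstractmethod
    def create_volume(self, name, size_gb):
        ...

    @abstractmethod
    def delete_volume(self, name):
        ...

    @abstractmethod
    def list_volumes(self):
        ...


class InMemoryDriver(VolumeDriver):
    """Stand-in backend; a vendor driver would call the array's own API here."""

    def __init__(self):
        self._volumes = {}

    def create_volume(self, name, size_gb):
        if name in self._volumes:
            raise ValueError(f"volume {name!r} already exists")
        self._volumes[name] = size_gb

    def delete_volume(self, name):
        del self._volumes[name]

    def list_volumes(self):
        return dict(self._volumes)


class VolumeService:
    """One provisioning API in front of many backends (hypothetical)."""

    def __init__(self):
        self._drivers = {}

    def register(self, backend, driver):
        self._drivers[backend] = driver

    def create(self, backend, name, size_gb):
        self._drivers[backend].create_volume(name, size_gb)

    def delete(self, backend, name):
        self._drivers[backend].delete_volume(name)

    def list_volumes(self, backend):
        return self._drivers[backend].list_volumes()


service = VolumeService()
service.register("array-a", InMemoryDriver())
service.create("array-a", "vol1", 10)
print(service.list_volumes("array-a"))
```

The caller only ever sees the service methods, so swapping one array for another means registering a different driver and nothing else.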
They are proposing solutions that range from distributed storage to shared storage: you’ll certainly find some overlaps and they could (or perhaps they already do) compete against each other for some cloud infrastructures. I’m also sure that there is some space for them in not-so-huge enterprise private clouds too, but only time will tell.