Secondary storage, the missed opportunity for object storage

I’ve had a lot of conversations lately with vendors and end users about secondary storage. The scenario is quickly changing with new solutions targeting this space and more and more end users adopting different forms of secondary storage.

I started writing about object storage a few years ago and I’ve been writing about it ever since. But even though its potential remains huge, I believe that radical changes are altering the market landscape as a consequence of a new class of smarter solutions. “Flash & Trash”, remember? But Trash not as in garbage in a landfill; Trash as a resource, as in the recycling process.

I don’t want to repeat myself here (and this article could help to set up the background), but secondary storage is a huge opportunity for the future (in the range of 80-85% of installed capacity and 50% of overall storage revenues). The advantage for the end user comes from a lower $/GB, scalability and many other features designed to lower TCO while readying organizations of any size for Big Data and IoT.

Object storage has failed expectations

Many object storage vendors have made the same mistake in the past: they focused on basic technology aspects without looking at customer needs. These vendors spent most of their resources on pure object storage, which is good for a few massive deployments and next-generation cloud apps, but not for traditional large and medium-sized organizations. And it is clear to everybody that the number of hyper-scale end users is very limited. Trying to sell big infrastructures to a few customers is not going to last long… especially if the user is big enough to design and build its own infrastructure components!

If a few years ago selling in the range of 0.5 to 10 PB was only for a handful of vendors, now there are plenty of solutions capable of doing well in that area. This is no longer a problem of scaling in capacity, which is taken for granted. End users look at other aspects like flexibility, ease of use, security, data services, analytics, performance and integration with the rest of the infrastructure. Users want the horizontal infrastructure I’ve described in the past, one that is capable of serving many different needs concurrently. Most, if not all, vendors are now fully aware of this, but not all have a product ready to meet this need.

There are exceptions of course. For example, not everyone is aware that 50% of Scality revenues already come from scale-out NAS (same backend of course, just different protocols exposed!). And others (HDS and DDN) have built an ecosystem around their object storage. But if your business is based on a single product, which is quickly becoming a feature, and you are not thinking about real customer needs… then I tend to think your chances of having a prosperous future are quite low.

After the last round of acquisitions, and with HPE investing $10M in Scality while waiting to understand what to do, it is clear that object storage is just a component (or a back-end technology) and it’s just flowing into something bigger, more interesting and more aligned with real user needs.

Secondary storage (three examples)

Without always referring to the usual suspects, I’d like to bring your attention to three different examples to explain what I mean: Caringo, Ctera and Ceph. [Quick disclaimer: I gave a speech at the Ctera SKO in January, and I’ve written a paper for Caringo in the past.]

Caringo, whom I met a few weeks ago during TFD10, looks like a renewed startup to me. I got the same good impression last November, when I was briefed about some advancements in the product, and this feeling was confirmed once again. I don’t even know if you can consider them a startup anymore (founded in 2005 and with more than 500 customers). They started with object storage very early, but the latest add-ons to their product are the ones that are triggering my interest… more than ever.

They are working on two fronts and the outcome is really compelling. On one side you have FileFly, which is a software component you install on your Windows file servers. The idea is that FileFly, thanks to a good centralized policy engine, leverages Swarm as a single huge repository for the entire organization, leaving only a cache or stubs on local servers. I’m over-simplifying here, and there are several use cases, but let me start from the easiest scenario (and let you discover more on their website). In this case, not only is the server protected and optimized (think, for example, of a server connected to a SAN: its real footprint on primary storage becomes minimal!) but you also obtain a form of DR (especially if the server is in a remote office).
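
To make the stub idea a bit more concrete, here is a minimal sketch of what an age-and-size tiering policy could look like. It is not FileFly's actual policy engine or API: the names (`FileInfo`, `select_for_tiering`, `local_footprint`), the 90-day and 1 MB thresholds, and the 4 KB stub size are all assumptions I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical description of a file on a local file server."""
    path: str
    size_bytes: int
    days_since_access: int


def select_for_tiering(files, min_age_days=90, min_size_bytes=1_000_000):
    """Pick files a policy would move to the central object store.

    Assumed rule: a file is tiered when it is both cold (not accessed
    for min_age_days) and large enough to be worth moving.
    """
    return [
        f for f in files
        if f.days_since_access >= min_age_days and f.size_bytes >= min_size_bytes
    ]


def local_footprint(files, tiered, stub_bytes=4096):
    """Bytes left on primary storage once tiered files are replaced by stubs."""
    tiered_paths = {f.path for f in tiered}
    return sum(
        stub_bytes if f.path in tiered_paths else f.size_bytes
        for f in files
    )


files = [
    FileInfo("projects/old-report.doc", 5_000_000, 200),  # cold and large
    FileInfo("projects/budget.xls", 2_000_000, 10),       # large but hot
    FileInfo("notes/todo.txt", 500, 400),                 # cold but tiny
]
tiered = select_for_tiering(files)
before = sum(f.size_bytes for f in files)
after = local_footprint(files, tiered)
print(f"tiered: {[f.path for f in tiered]}")
print(f"local footprint: {before} -> {after} bytes")
```

In this toy example only the cold, large file is replaced by a stub, and the local footprint drops accordingly. A real product will obviously use richer policies than this.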

It doesn’t end here: the second part is even more interesting. Swarm now provides a Search portal which can perform (and save) sophisticated queries on the metadata of every object saved in the cluster. It becomes easy to search the entire file domain, even for the most distributed organizations! I really like the idea and I’d like to see it improved with full content searches in the future.

Swarm is becoming more and more capable of serving different applications. Take HDFS and NFS support, for example. It’s not unique but, again, the object store is just the backend that makes it possible to expose multiple protocols and ready-to-use solutions in a software-defined storage fashion.

Ceph is another example that is worth a mention. At the beginning, the goal of an open source, state-of-the-art, scale-out, unified software-defined storage system seemed just too much to achieve, and its biggest problem was exactly this: visionary but too immature at the same time. Then Red Hat acquired Inktank and things changed drastically. A small group of enthusiasts has become a large, vibrant community (I hadn’t paid much attention to Ceph for some time, but after a short chat with Red Hat last week I did my homework, and the quantity and quality of code submissions to the official repositories are amazing). For example, Ceph is currently one of the best options for OpenStack installations and is deployed by organizations of every size… And this is one of its biggest advantages: anyone with a minimum of Linux skills can deploy it for free and grow from there.

And it’s not just Ceph itself, but the ecosystem around it that I like. For example, I really love a solution developed by a small company I met a few months ago, Outpace.IO, which is also working on interfacing single disks with a small CPU/RAM card to run OSDs directly on the disk… a really compelling idea!

Ceph has great (and unexplored) potential in many different areas. It is still immature for some use cases, but the potential is great nonetheless. And the big difference between Ceph and the majority of object storage systems on the market today is that Ceph has been designed to be multi-protocol from day one.

Ctera is not a storage system, but it is one of the few storage solutions capable of delivering what private cloud storage promises, and end users love it. To be fair, Ctera can work on any kind of backend, from AWS S3 to on-premises NAS, but, realistically, I think that object storage is its perfect match. I fell in love with Ctera a long time ago, the first time I had the opportunity to look into its products. It started with a remote backup/NAS solution with a cloud backend; now it has a complete end-to-end line of products that also delivers enterprise Sync&Share and remote and cross-cloud data protection.

My point here is that the backend is almost irrelevant (Ctera also provides concurrent multi-cloud support and cloud-to-cloud migration capabilities). Ctera has done a great job in supporting all the possible backends, and now almost all object storage vendors rely on it to provide this kind of service. This is the solution, while the object store is just a repository… Good luck, object storage vendors: what is your differentiator here?

Closing the circle

Now that enterprises are seriously looking at large-scale storage repositories and object stores, object storage is becoming less relevant: it’s now just an access method. Once again, real-world enterprises are more interested in ready-to-use solutions and not just in the enabling technology. In this case, Ctera, Ceph and Caringo are examples of products/vendors that are going in the right direction.

Most of the vendors have finally understood what is happening, but for some of them it’s too late. For those that have finally discovered that the world is not made of Exabytes but 1000s of Petabytes, it’s too much of a problem scaling down and maintaining a balanced architecture (especially now that we have 10TB per disk). On the other hand, others are working to become the horizontal platform I’ve mentioned many times but, again, the product was not designed with this in mind and now it’s hard to catch up and become credible in the eyes of the user.
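
A quick back-of-the-envelope calculation shows why scaling down gets awkward with big disks. The numbers below are my own assumptions, not taken from any specific product: an 8+4 erasure coding scheme, 10 TB disks and 60-disk enclosures, with 1 PB counted as 1,000 TB.

```python
import math


def disks_needed(usable_tb, disk_tb=10, data_shards=8, parity_shards=4):
    """Disks required for a usable capacity under k+m erasure coding.

    Raw capacity is usable capacity times (k + m) / k; this ignores
    spares, filesystem overhead and free-space headroom.
    """
    raw_tb = usable_tb * (data_shards + parity_shards) / data_shards
    return math.ceil(raw_tb / disk_tb)


def enclosures_needed(disks, disks_per_enclosure=60):
    """Number of fully populated enclosures needed to hold the disks."""
    return math.ceil(disks / disks_per_enclosure)


usable_tb = 1000  # 1 PB usable, assuming 1 PB = 1,000 TB
disks = disks_needed(usable_tb)
enclosures = enclosures_needed(disks)
print(f"{usable_tb} TB usable -> {disks} disks in {enclosures} enclosures")
```

Under these assumptions a 1 PB system needs only 150 disks, which fit in three dense enclosures. That is far fewer failure domains than the twelve an 8+4 scheme would ideally spread its shards across, which is one way to picture the balance problem at the low end.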

Exceptions exist of course but… look around, they are very few.

If you want to know more about this topic, I’ll be presenting at the next TECHunplugged conference in London on 12/5/16. It is a one-day event focused on cloud computing and IT infrastructure, with an innovative formula that combines a group of independent, insightful and well-recognized bloggers with disruptive technology vendors and end users who manage rich technology environments. Join us!

Independent Analyst, trusted advisor and Blogger (not necessarily in that order). Immersed in IT environments for over 20 years, Enrico began his career with Assembler in the second half of the '80s before moving on to UNIX platforms (but always with the Mac at heart) until now, when he has joined the "Cloudland". He constantly keeps an eye on how the market evolves and is continuously looking for new ideas and innovative solutions. He's a fond sailor and unsuccessful fisherman. You can find Enrico's social profiles here: http://about.me/esignoretti