IBM is planning to acquire Cleversafe, HGST bought Amplidata a while ago, and RedHat did the same with Ceph 18 months ago. Put this alongside the fact that disk shipments are doing very well (even when compared to flash), and it’s no wonder object storage is so hot now.

Why Cleversafe

From my POV it’s a good move for IBM’s cloud strategy. They can leverage it to build a very scalable S3-like service, as well as sell it as part of their hybrid/managed cloud deployments.

It’s one of the best products on the market. It does have a couple of limitations that I have always considered a threat to its growth (erasure coding is not good for everything, and the architecture implies big installations from the first deployment). Even so, its maturity, a good ecosystem of partners and the proven ability (many times over) to scale well beyond 100PB are major advantages for enterprise customers and xSPs like SoftLayer.

The need for scale-out storage

I’m not going to talk again about why we need object storage or, more generally, highly scalable object-based storage for unstructured data (there is a video recording of the last TECHunplugged event which sums it all up 😉). But it’s a fact: enterprises are adopting it for various reasons, and it’s changing how large storage infrastructures are implemented and consumed.

Modern data protection (like cloud-based VTL appliances or direct D2O backup – d2O? I just conjured that up, it means Disk to Object!)
Private cloud storage (like Sync&share)
Data lakes building up with vendors already working on in-place data analytics!

These are only examples, but they should give you an idea of why object storage is getting all this attention from vendors and end users.

A look at the market

We now have Cleversafe going into IBM’s hands and, as I mentioned earlier, Ceph and Amplidata have already gone to RedHat and HGST. What about the rest of the market?

On the startup side:
Scality is really focused on its growth now (targeting an IPO for 2017). It’s one of the best, and most scalable, object storage solutions out there, and they are doing pretty well with RING, their product, which is also sold as scale-out NAS (accounting for 40% of their sales!).
Caringo, with its 500+ customers, and the latest addition to its platform, is really interesting. The product is mature, and they can start very small (under 100TB of usable space) enabling mid-size customers to adopt OBS.
Cloudian is another startup with cool ideas. Their 100% compatible S3 API, the new (cool) hardware, analytics and a couple of other nice features (like tiering to the cloud) make them really interesting in the enterprise space.
SwiftStack. I got to talk to Joe Arnold (founder of the company) a few days ago for my podcast. I have to say that I was skeptical about the product, but the soon-to-be-released 3.0 version looks much more on par with its competitors now… on paper at least.

Primary vendors are doing well too:
HDS: I’ve always loved them for the ecosystem they’ve built around the object storage platform. It’s the only end-to-end solution available at the moment that includes Sync&share and remote NAS appliances from a single vendor. And they are also adding many features that I find really compelling for enterprise customers. Potential development with HSP is intriguing!
NetApp: It seemed that StorageGRID was lost in the fog. But all the new releases and the roadmap show quite the contrary. New, recently acquired products like AltaVault and a broader set of features are making this product competitive in a wider set of use cases.
DDN: I put DDN with the established vendors because they can’t be considered a startup any longer (more than $300M in revenue last year), but they still have the agility of a small company. WOS360 doesn’t have the visibility it deserves. It’s a good product and it has been deployed in infrastructures that see a growth of more than 1 Billion per week! Its high-performance design doesn’t fit in smaller infrastructures, but integration with the rest of DDN’s products makes WOS an interesting solution for customers with high-performance needs at scale.
HGST, as well as Seagate, is building a strong offering, and Object Storage, thanks to Amplidata, is an important part of it. They are investing and working on interesting things for the future!
EMC: With EMC it’s hard. They have too many competing products. I personally love Isilon (even though it’s not a proper object storage system). Atmos? ECS? What else? All the rest seems like half-baked stuff: old design, not very scalable and without a clear strategy. Am I wrong?

…And the newcomers

There is a whole group of new startups that can be included in this scale-out kind-of-object-based storage too. Some of them are still in stealth (iguaz.io is an example) but others are already shipping their products (sometimes still in a controlled release):
Hedvig (founded by Avinash Lakshman) is promising multi-protocol scale-out storage
Cohesity (founded by Mohit Aron), GFS DNA and smart data services.
Rubrik, very similar to Cohesity from the backend perspective, but very focused on seamless backup.
Nutanix, with its new scale-out storage offering based on the same technology used in its hyperconverged systems (GFS again!)
QuoByte, a Linux-oriented, European-based startup, focusing again on a GFS-like scale-out FS with an interesting HDFS interface.
Rozo Systems. Fast as hell erasure coding scale-out NAS.
More… ?

Closing the circle

Who have I missed? As always, I can’t mention everyone! Sometimes I just don’t believe in what they do, or they are totally absent in the market… in other cases, I haven’t received any briefings lately and I prefer not to talk about something I’m not updated about.

The most interesting things are happening in two areas today: Flash on one side and what I usually call “Trash” (all the rest) on the other. In the first area you find high IOPS, low latency storage with local high performance requirements, while in the other it’s all about throughput, parallelism and data accessed from everywhere (which I usually call distributed performance).
It’s interesting to note that all the newcomers I’ve mentioned in this post have some characteristics inherited from both groups but, from my point of view, they are still much more on the capacity side, even if performance is the core proposition for some of them.

[This is a disclaimer! Remember that I’m doing, or have done, some work for various vendors mentioned in this article…]