This is the second article of a short series about Object Storage (here is the first one), whose intent is to give a brief introduction to each Object Storage player. As I mentioned earlier, this is not a comparison between products but rather an attempt to give the reader an idea of the number of solutions available and how they are positioned.
In the first article I talked about Amplidata, Caringo, Cleversafe and Cloudian. Today I’m going to write about Data Direct Networks WOS, EMC Atmos, Hitachi Data Systems HCP, Inktank Ceph, NetApp StorageGRID, OpenStack Swift and SwiftStack, and Scality RING.
Data Direct Networks WOS
I was briefed on WOS (Web Object Scaler) a few months ago and it strongly reflects the DNA of its company. In fact, DDN was born in 1998 and is very focused on high performance storage solutions for applications like HPC, Oil & Gas, media & entertainment, genome sequencing and so on.
The WOS architecture is very flexible and allows the cluster to be configured in different ways with regard to data protection and replication. At the same time, contrary to many other object storage systems, data is written directly to disk, without the limits and constraints introduced by a filesystem. WOS is usually sold as a hardware solution based on very high density 4U appliances connected with 10Gb Ethernet or InfiniBand. WOS provides access via APIs as well as file gateways, alongside integration with other DDN products. On the flip side, the partner ecosystem is not one of the most crowded.
This solution is geared to the very high end and to object storage infrastructures which require scalability and very high performance.
EMC Atmos
EMC, the worldwide leader in the storage industry, practically started the Object Storage market many years ago with Centera; despite all its limitations, that product was adopted by many ISVs. Atmos, the current product, is much more open and has a modern design.
EMC has built an interesting ecosystem of partners and solutions around Atmos that makes it very suitable for most enterprise needs.
Atmos can be viewed as a very general-purpose Object Storage system that doesn’t bring the best in performance or in scalability but, thanks to a certain level of flexibility, can be easily adopted and managed by medium to large companies. In fact, the architecture allows different kinds of configurations to be deployed and has features like erasure coding; on the other hand, it has a long way to go to match the scalability and performance of high end solutions.
It can be ordered as a hardware product or as a VSA; the latter is the best option for starting with limited quantities of data and concurrent connections.
Hitachi Data Systems HCP
HCP (Hitachi Content Platform) comes from the acquisition of Archivas in 2007. Fortunately, HDS has left the development team almost intact and HCP is now an important part of HDS’ strategy for content and file solutions.
The product shows a conservative approach in many respects but it also has some interesting functionalities which are well suited to enterprise usage (single-instance file storage is one of those I like best). HDS has been working hard to extend the solutions portfolio and the partner ecosystem. At the end of the day this is the only product on the market that can offer a complete set of end-to-end solutions, from the object store to the gateways, all directly supported by a primary storage vendor. Management tools, as usually happens with HDS, should be revised and simplified.
HCP is available in three different configurations: VSA, physical appliances with local storage and physical appliances on top of HDS block storage. HDS also offers a sort of “HCP as a service” in some countries (directly or through certified partners), giving the end user freedom of choice on how to implement its object storage infrastructure (private or hybrid).
Inktank Ceph
Ceph is an open source distributed Object Store and File System available on Linux platforms, and Inktank is the company that provides commercial services and support.
The Ceph architecture is brilliantly designed and has an interesting roadmap (although I should check the latest updates). The implementation is still immature from an “enterprise” point of view but it is rapidly improving.
Theoretically speaking, this is something very far from ordinary enterprise environments, but in the last year I have spotted Ceph clusters in a growing number of data centers for basically three reasons: it’s cheap, it’s resilient and it scales (no small thing, indeed!). Almost 100% of the installations are small clusters (fewer than 10 nodes, hundreds of TBs) and they are used for storing the type of secondary data that doesn’t have enough value to be stored in primary storage systems but that no one wants to, or can, delete (logs, sensor readings, secondary backups, and so on). Sometimes the cluster was moved from development/test to production just because it works! At the same time, these clusters are still available to internal developers and some of them are figuring out how to use them more intelligently.
Ceph is compatible with OpenStack (Swift+Cinder) and Amazon S3, but you then need third-party solutions to complete the infrastructure.
NetApp StorageGRID
StorageGRID comes from the acquisition of Bycast in 2010. After the acquisition, probably due to the lack of a precise strategy, the product almost disappeared for a couple of years, until a new version (9.0) suddenly came out in 2012 with S3/CDMI support and a few new features.
StorageGRID is a software product but NetApp can offer a complete solution bundled with servers and its storage systems. The partner ecosystem is limited (most of the partners were present before the acquisition) and there are no end-to-end out-of-the-box solutions like the ones you can find from other major vendors. StorageGRID also offers basic NFS/SMB services to ingest files but this feature is not intended to replace traditional NAS filers.
NetApp doesn’t seem to be pushing it much, which is the biggest problem with this product and, consequently, the risk for the end users.
OpenStack Swift (SwiftStack)
Swift is an open source project which is part of the OpenStack cloud management platform. SwiftStack was founded in 2011 by some of the Swift project leaders to develop, deliver and support a commercial version of Swift.
The architecture design is similar to what I wrote previously: front-end nodes that manage access and IO operations, coupled with a back-end of high-capacity nodes that provide space. The product is also receiving continuous updates and feature improvements.
Maturity issues aside, SwiftStack is gaining a lot of attention (also thanks to OpenStack) and its partner ecosystem is growing quickly.
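To make the front-end/back-end split a little more concrete, here is a minimal Python sketch of how a front-end node could decide which back-end nodes hold a given object. It is only an illustration of the general hash-and-partition idea, not Swift's actual ring code: the node names, the partition count, the replica count and the function names are all assumptions made up for this example.

```python
import hashlib

PART_POWER = 8   # assumed: 2**8 = 256 partitions
REPLICAS = 3     # assumed: three copies of every object


def build_partition_table(nodes, part_power=PART_POWER, replicas=REPLICAS):
    """Assign every partition to `replicas` distinct back-end nodes (round-robin)."""
    if len(nodes) < replicas:
        raise ValueError("need at least as many nodes as replicas")
    table = []
    for part in range(2 ** part_power):
        table.append([nodes[(part + r) % len(nodes)] for r in range(replicas)])
    return table


def partition_for(path, part_power=PART_POWER):
    """Hash the object path and keep the top `part_power` bits as partition id."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - part_power)


def nodes_for(path, table, part_power=PART_POWER):
    """What a front-end node would do: path -> partition -> back-end nodes."""
    return table[partition_for(path, part_power)]


if __name__ == "__main__":
    # Hypothetical back-end storage nodes.
    storage_nodes = ["storage-1", "storage-2", "storage-3", "storage-4"]
    table = build_partition_table(storage_nodes)
    print(nodes_for("/account/container/photo.jpg", table))
```

The point of the indirection through partitions is that the front-end nodes stay stateless: any of them can compute the same placement from the same table, so more front-end nodes can be added for throughput and more back-end nodes for capacity.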
Scality RING
Scality is a startup, founded in 2009, that is developing a next generation software-only object storage solution called RING.
The modern architecture of the product allows end users to configure the cluster with different data protection mechanisms (multiple copies or erasure coding) as well as different data replication policies and geo distribution.
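The difference between the two protection mechanisms is easier to appreciate with some arithmetic. The Python sketch below compares the raw capacity needed for multiple copies versus erasure coding. The specific schemes (3 copies, 10 data + 4 parity fragments) and the 1,000 TB figure are illustrative assumptions of mine, not Scality defaults, and the class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replication:
    copies: int  # total number of full copies kept

    def raw_per_usable(self) -> float:
        # Every usable byte is stored `copies` times.
        return float(self.copies)

    def tolerated_failures(self) -> int:
        # Data survives as long as one copy is left.
        return self.copies - 1


@dataclass(frozen=True)
class ErasureCoding:
    data_fragments: int    # k: fragments the object is split into
    parity_fragments: int  # m: extra fragments computed from the data

    def raw_per_usable(self) -> float:
        # k + m fragments are stored for every k fragments of data.
        return (self.data_fragments + self.parity_fragments) / self.data_fragments

    def tolerated_failures(self) -> int:
        # Any k of the k + m fragments are enough to rebuild the object.
        return self.parity_fragments


def raw_capacity_tb(usable_tb: float, scheme) -> float:
    """Raw capacity required to offer `usable_tb` with the given scheme."""
    return usable_tb * scheme.raw_per_usable()


if __name__ == "__main__":
    for scheme in (Replication(copies=3), ErasureCoding(10, 4)):
        print(
            f"{scheme}: {raw_capacity_tb(1000, scheme):.0f} TB raw for 1000 TB usable, "
            f"survives {scheme.tolerated_failures()} lost fragments/copies"
        )
```

Under these assumptions, erasure coding needs far less raw capacity for the same usable space while tolerating more simultaneous losses, at the cost of extra computation and more network traffic on reads and rebuilds. This is why being able to choose the mechanism per workload matters.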
Scality has already demonstrated very high speed configurations (based on SSDs) and policy-based automatic tiering capabilities. Together, these two allow the customer to have a single solution capable of both serving frequently accessed data and long-term archiving. The architecture is very flexible and scalable, targeting large multi-petabyte installations.
Scality RING can be accessed via proprietary and standard APIs (CDMI and S3, for example) but also through many connectors and gateways: scale-out FS, NAS protocols, OpenStack Cinder for block storage and HDFS. It can also be used directly as mail server storage.
Scality has a very clear strategy and a growing list of partners who are building a good ecosystem of hardware and software solutions for large ISPs and Enterprises.
Why it matters
Object storage brings several benefits to the enterprise and can easily help with TCO when the numbers get big. The bigger the amount of unstructured data you have to manage, the more Object Storage makes sense.
The foremost issue is managing data growth, but storing data without having a clue about what’s going on makes the data management problem even bigger. Object Storage takes a totally different approach and addresses the problem by adding value and new opportunities:
TCO: A well-designed Object Storage platform and its gateways could be much more cost effective when compared to a traditional storage system.
Private cloud enabler: Centralization, consolidation, self/auto provisioning, scaling, abstraction, access methods (and so on) are all at the base of a private cloud infrastructure.
Dark data and Big Data: As a consequence of the first two points, the enterprise object store becomes a huge repository. It’s only a matter of finding the tools (and the ideas!) that suit you best to extract value from it!
As mentioned in the previous article, I’ll be attending Next Generation Object Storage Summit next March in Los Angeles. Please, don’t hesitate to drop me an email or a DM if you are going to be there and want to chat about objects!
And, as always, every comment is warmly welcomed!