Primary storage = block.
Secondary storage = file.
Object storage comes third.
Object storage is immensely scalable, cheaper, durable… and slooooower. Could that be changing soon? Let me focus on on-premises object storage here.
More applications access object stores
Many end users start with a single application (sync & share, for example) and then add more and more (and more…). But most of them don’t need the level of scale usually found at service providers, and many object storage systems now support tiering mechanisms to slower repositories or cloud services. Adding flash memory to these small (5-10 node) clusters becomes fundamental to achieving acceptable performance.
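To give an idea of what that tiering looks like from the application side, here is a minimal sketch of an S3 lifecycle rule set through boto3. The endpoint, bucket name and storage class are assumptions for illustration only; on-premises S3-compatible stores that support lifecycle rules use their own class names and policies, so treat this as a shape, not a recipe.

import boto3

# Hypothetical on-premises S3-compatible endpoint and bucket (assumptions).
s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.internal",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Tier objects under the "archive/" prefix to a cheaper/slower class after 30 days.
# "STANDARD_IA" is just the AWS example name; products differ.
s3.put_bucket_lifecycle_configuration(
    Bucket="sync-and-share",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-cold-data",
                "Filter": {"Prefix": "archive/"},
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
            }
        ]
    },
)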
More developers use object stores
Young developers started their careers using Amazon AWS services and the like. VMs have ephemeral storage and, for them, persistent storage means S3. If you ask them to develop a new application, they ask for an S3 service.
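This is what "persistent storage is S3" looks like in practice: a couple of API calls against a bucket. A quick sketch with boto3, where the endpoint and bucket are hypothetical; pointing the same client at an on-premises S3-compatible gateway is just a matter of changing endpoint_url, and credentials are assumed to come from the environment.

import boto3

# Hypothetical on-prem endpoint; swap endpoint_url for any S3-compatible store.
s3 = boto3.client("s3", endpoint_url="https://objects.example.internal")

# The application's "persistent storage" is simply objects in a bucket.
s3.put_object(
    Bucket="app-data",
    Key="users/42/profile.json",
    Body=b'{"name": "example"}',
)

obj = s3.get_object(Bucket="app-data", Key="users/42/profile.json")
print(obj["Body"].read())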
Application repatriation and hybrid cloud are another trend. With large organizations moving some applications back to their datacenters, we’ll see more and more pressure on object storage systems… and some applications need more performance than others but, again, not always more scale.
More use cases for object stores
Scale-out NFS/file systems are becoming more important as a front-end for object storage, but that’s not all: the types of workloads accessing object storage are growing too. It’s interesting, for example, that Pure Storage FlashBlade supports NFS and S3 (with support for other protocols to be added in the future).
Big Data applications are starting to take advantage of S3 too… so why not make it faster? Again, some object storage vendors offer tiering… and the same is true for workloads like content distribution (the most accessed objects can be served from flash).
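As an example of Big Data on S3, here is a minimal PySpark sketch that reads a dataset straight from an S3-compatible store via the s3a connector. The endpoint, bucket and column names are made up for illustration, and it assumes the hadoop-aws/s3a libraries are on the classpath; the point is simply that these frequently scanned datasets are exactly the kind of objects that benefit from a flash tier.

from pyspark.sql import SparkSession

# Hypothetical endpoint and bucket; s3a works against on-prem stores too.
spark = (
    SparkSession.builder.appName("s3-analytics")
    .config("spark.hadoop.fs.s3a.endpoint", "https://objects.example.internal")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# Hot analytics data read directly from the object store.
df = spark.read.parquet("s3a://analytics/events/2016/")
df.groupBy("event_type").count().show()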
Speeding up objects (and closing the circle)
Up to now we’ve always had slow object storage systems, with various forms of cache deployed on dedicated gateways or directly in the application. Flash, if present at all, was deployed only in small quantities to manage metadata.
This approach has the best $/GB but, depending on the application, complexity and data consistency can be an issue.
Now, with modern object storage platforms, a first tier of flash is not difficult to implement and, for some use cases, even an all-flash object storage system could be a good idea… (a few weeks ago I had the chance to talk with an end user who has deployed an all-flash Ceph cluster!)
Some primary storage vendors already have an object-based architecture at the back-end, and for them it’s just a matter of mapping the S3 protocol onto the front-end… and, yes, they are registering this need… it won’t take long.
If you want to know more about this topic, I’ll be presenting at the next TECHunplugged conference in London on 12/5/16: a one-day event focused on cloud computing and IT infrastructure, with an innovative formula that combines a group of independent, insightful and well-recognized bloggers with disruptive technology vendors and end users who manage rich technology environments. Join us!
There really isn’t much point to putting objects on SSD. The vast majority of object workloads are sequential access in nature and wouldn’t take advantage of the random access benefits of SSD. HTTP as a transport is not good for random access. It certainly wouldn’t hurt for tiny files, but for anything larger, 10TB+ HDDs are better suited.
Hi Tim,
I agree with you, it depends on the workload and the object size… but there are end users asking for it and vendors working on it.
I was a little confused reading this. The storage industry is optimizing along two types of workloads: the transactional, IOPS-are-everything workload (databases), and the unstructured data workload (files). Everything in the middle, formerly occupied by mid-range SAN and NAS arrays, is a shrinking revenue market.
In these camps there are huge differences in what constitutes good performance. OLTP cares about latency (all-flash arrays and NVMe) and video streaming cares about throughput (file and object systems). Conventional object storage is an architecture optimized for unstructured data, with loose-consistency clustering resulting in slow transaction latencies. But stateless HTTP servers (the RESTful front-end of an object system) and data servers (the object back-end) can scale out, and provide VERY high throughput in a well-designed system.
Lastly, we have to express caution around straddling strategies implemented by Ceph and others. Most Ceph implementations today are, ironically, generic iSCSI, not large-scale object systems. It’s just a cheap and not-that-fast iSCSI array for the OpenStack proof-of-concept. OK, it can wear two hats. But as soon as an object storage designer commits to supporting iSCSI transaction latencies and file-system distributed lock management, they have instantly imposed a lot of architectural constraints that will show up as weak performance for every workload, topological inflexibility, and complex tuning efforts.
To summarize, object is optimized for unstructured workloads, and can deliver very high throughput. But force-fitting traditional object into transactional workloads imposes implementation tradeoffs and results in a poor customer experience.
Thank you for chiming in,
the storage industry is not really focusing on IOPS lately, but rather on latency. That doesn’t change much for the point I’m making here, though.
In this article I never mention object as a replacement for block. The idea is not to have an object store that is as fast as block (iSCSI/FC); the idea is that developers want faster object stores that can serve more objects, faster. That’s it. Especially when the numbers get small (on the order of 100TB for on-premises installations), it’s hard to serve a decent number of objects per second… and this impacts overall performance.