And I’m not talking about Flash memory here! Well, not in the way we usually think of flash, at least. I wrote about this a long time ago: in-memory storage makes sense. Not only does it make sense now, but it’s becoming a necessity.
The number of applications taking advantage of large memory capacity is growing like crazy, especially in Big Data analytics, HPC, and all those cases where you need large amounts of data as close as possible to the CPU.
Yes, memory. But not for everyone
We could argue that any application can benefit from a larger amount of memory, but that’s not always true. Some applications are not written to take advantage of all the memory available, or they simply don’t need it because of the nature of their algorithms.
Scale-out distributed applications can benefit the most from large memory servers. In this case, the more memory they can address locally, the lower the chance that they need to access remote, slower resources (another node or shared storage, for example).
RAM is fast, but it doesn’t come cheap. Access speed is measured in nanoseconds (ns), but it is prohibitively expensive if you think in terms of TB instead of GB.
On the other hand, flash memory (NAND) brings latency into the microsecond range but is way cheaper. There is another catch, however: if it is used through the wrong interface, it can become even slower and less predictable.
Intel promises to bring more to the table soon, in the form of 3D XPoint memory: a new class of chips that will offer better performance than flash, persistence, and a competitive price point (between flash and RAM). But we’re not there yet…
In any case, the whole game is one of trade-offs. You can’t have 100% RAM at a reasonable cost, and RAM can’t be used as a persistent storage device. There are some design considerations to take into account when implementing this kind of solution, especially if we are talking about ephemeral storage.
Leading edge solutions
The first is a hardware solution: a memory DIMM that uses NAND chips. From the outside it looks like a normal DIMM, just slower… but a DIMM nonetheless (as close as possible to the CPU!). A special Linux driver mitigates the speed difference through a smart tiering mechanism between actual and “fake” RAM. I’ve simplified as much as I could here; the technology behind this product is very advanced (watch this SFD video if you want to know more) and it works very well. Benchmarks are off the charts, and the savings, by using fewer servers to do the same job in less time, are unbelievably good as well!
The second is based on a similar concept but is implemented as a software solution: NVMe (or slower) devices can be presented as RAM. Once again, this is a simplification, but it gives you the idea. In this case too, overall performance is very good, and the overall cost is lower than for the former… and, since this is software only, you can run it on a flash-based AWS EC2 instance, for example! The other good news is that the software is free at the moment. The bad news lies in the maturity of the product… but I’d keep an eye on Plexistor because the concept is really appealing.
It’s clear that, looking at the level of sophistication, maturity and performance predictability, the two solutions target two different types of end users at the moment, but they are equally interesting from my point of view.
And the rest of us?
Not all applications can take advantage of large memory servers… and caching remains a good option. Caching started as a point solution to speed up access to traditional storage back when flash memory was absurdly expensive; it is now becoming an interesting way to do the same for large-capacity storage systems (scale-out NAS and object storage, for example).
Caching is less efficient, but it brings benefits similar to those described above while being easier and cheaper to implement. An example? Take a look at Intel CAS software. It comes for free with Intel flash drives and has some really neat features! The benchmark Intel showed at the last TFD, on the use of CAS in conjunction with Ceph, is quite impressive.
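The basic idea behind a read cache sitting in front of slow storage can be sketched in a few lines. This is a minimal, hypothetical LRU cache for illustration only, not how Intel CAS is actually implemented:

```python
from collections import OrderedDict

class ReadCache:
    """Minimal LRU read cache in front of a slow backend (illustrative only)."""

    def __init__(self, backend, capacity: int):
        self.backend = backend          # any dict-like slow store
        self.capacity = capacity
        self.cache = OrderedDict()      # block_id -> data, kept in LRU order
        self.hits = self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)   # mark as recently used
            self.hits += 1
            return self.cache[block_id]
        self.misses += 1
        data = self.backend[block_id]          # slow path: fetch from backend
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict least recently used
        return data

backend = {n: f"block-{n}" for n in range(1000)}  # stands in for slow storage
cache = ReadCache(backend, capacity=100)
for n in [1, 2, 3, 1, 2, 3]:
    cache.read(n)
print(cache.hits, cache.misses)  # prints "3 3"
```

As long as the working set fits in the fast tier, repeated reads are served at cache speed and the slow backend only sees the first access: the same principle applies whether the backend is an HDD array, a scale-out NAS or an object store.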
Closing the circle
Data growth is hard to manage, but these solutions can really help to do more with less. Moving data closer to the CPU has always been crucial, now more than ever, for at least three reasons:
1) CPUs are becoming more and more powerful (the latest best-of-breed Intel CPU, launched yesterday, has 22 cores and 55 MB of cache!), and I’m sure you want to keep those cores fed all the time… right?
2) Capacity grows faster than everything else. A low $/GB is essential, and there is no way to use all-flash systems for everything. In fact, it’s still quite the contrary: even if flash is good enough to serve all primary data needs now, it’s also true that workloads involving huge amounts of data still live on other forms of storage. Once again, multi-tier storage is not efficient enough (HDD-flash-RAM means too many hops to manage). Better to have very cheap capacity plus ultra-fast local memory.
3) As a consequence of 1 and 2, it’s more common now to have large object stores and compute clusters at some distance from each other, in a multiple- or hybrid-cloud fashion (look at what you can do with Avere, for example!). In this scenario, since network latency is unavoidable, you’d better have a way to bring data close to the CPU, and again, caching and ephemeral storage are the best solutions at the moment.
If you want to know more about this topic, I’ll be presenting at the next TECHunplugged conference in London on 12/5/16: a one-day event focused on cloud computing and IT infrastructure, with an innovative formula that combines a group of independent, insightful and well-recognized bloggers with disruptive technology vendors and end users who manage rich technology environments. Join us!
Disclaimer: I was invited to SFD9 and TFD10 by Tech Field Day and they paid for travel and accommodation, I have not been compensated for my time and am not obliged to blog. Furthermore, the content is not reviewed, approved or edited by any other person than the Juku team.