“How to Solve Performance Problems at Petabyte-Scale” is the second report in a short series that I wrote for GigaOM Research (the first is available at this link).

Here is the Executive Summary of the report:

Today’s huge storage systems, on the order of many petabytes, are associated more with capacity than performance, but that perception is changing. Until recently, the most requested storage feature was active archiving, but the cloud, new technologies, and the growth of mobile applications now demand performance as well as capacity.

The three usual measurements for storage-system performance are input/output operations per second (IOPS), throughput, and latency. Combining the three at a reasonable price is challenging, especially at high capacity. Even more demanding is the number of clients, applications, and workloads that contend for system resources from a multi-petabyte storage infrastructure. Adding to these demands is the challenge of achieving high performance from a distributed storage system spanning a geographically large, often global, area.
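The three metrics are related: throughput is roughly IOPS multiplied by the I/O size, which is why a system can look fast on one metric and slow on another. A back-of-the-envelope sketch (the numbers below are illustrative, not taken from the report):

```python
def throughput_mb_s(iops: int, io_size_kb: int) -> float:
    """Approximate throughput as IOPS times the I/O size."""
    return iops * io_size_kb / 1024

# 100,000 IOPS of 4 KB random I/O moves only ~390 MB/s...
small_io = throughput_mb_s(100_000, 4)    # 390.625 MB/s
# ...while just 2,000 IOPS of 1 MB sequential I/O moves 2,000 MB/s.
large_io = throughput_mb_s(2_000, 1024)   # 2000.0 MB/s

print(small_io, large_io)
```

This is why a workload mix matters: small random I/O stresses IOPS and latency, while large sequential I/O stresses throughput, and a multi-petabyte system serving many clients sees both at once.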

The first report in this four-part series describes how a traditional network-attached storage (NAS) system can scale to a few hundred terabytes and sometimes a few petabytes. But some scale-out NAS systems, though amazingly fast, are still not sufficient for webscale and large-organization infrastructures that must deliver new levels of scalability and uncompromised performance while serving tens of thousands of local and remote clients with massive throughput. An additional challenge is coping with long-distance data communication.

A deeper look at local and distributed performance helps illustrate the problem. For local performance, the clients are traditional servers and PCs, and connections are almost always reliable. For distributed performance, a variety of connections, protocols, and devices produce and consume data at blistering speeds, demanding efficiency and productivity.

Some next-generation multi-petabyte scale-out storage infrastructures have the feature set needed to serve performance and capacity workloads simultaneously, whether data is saved locally or distributed globally. Separate load balancing, smart-caching techniques, scale-out file-system interfaces, clever use of flash memory, and so on, work together to scale capacity at the back end while delivering the needed performance at the front end.
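To make the smart-caching idea concrete, here is a minimal, hypothetical sketch (not any vendor's actual implementation): hot objects are served from a small, fast tier, while misses fall through to the slow, high-capacity backend. An LRU (least-recently-used) eviction policy keeps the fast tier small:

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache tier: hot objects in fast memory, misses go to the backend."""

    def __init__(self, capacity: int, backend):
        self.capacity = capacity
        self.backend = backend          # stand-in for the high-capacity store
        self.cache: OrderedDict = OrderedDict()

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as recently used
            return self.cache[key]
        value = self.backend(key)           # slow path: fetch from backend
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return value

# Usage with a stand-in backend function:
store = LRUCache(2, backend=lambda k: f"object-{k}")
store.get("a"); store.get("b"); store.get("a")
store.get("c")                  # evicts "b", the least recently used
print(list(store.cache))        # ['a', 'c']
```

Real multi-petabyte systems layer this idea across flash and DRAM tiers and combine it with load balancing across nodes, but the principle is the same: keep the working set close to the clients while capacity scales independently behind it.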

The full report is available here (registration required).

[Disclaimer: The reports are underwritten by Scality]