This is the last article in a short series of blog posts I wrote about data-aware storage, sponsored by Data Gravity. The full paper will be released next week. Follow the links to find the first and second articles of the series.
The ability to search and analyze all enterprise data opens up a new world of possibilities, especially if it comes in a transparent fashion and without additional costs, directly embedded in your primary storage.
Early enterprise storage systems were designed around resiliency, data protection and performance. Later, data services (snapshots, for example) became table stakes. Now all these features are taken for granted, and organizations of any size are demanding more control over data, users and workloads. Data-aware storage systems are the answer.
A traditional, layered software-based approach is possible, but installing and managing external tools is expensive and often requires specific skills that have to be maintained over time. Building a separate external analytics platform only adds cost and complexity.
Implementing a data-aware storage infrastructure enables the adoption of new and smarter strategies in data management, allowing a complete rethinking of many organizational processes. The use cases are endless but the most common can be found in areas like data discovery, infrastructure TCO, proactive security, auditing, compliance and chargeback.
Data search and discovery
The ability to search the entire contents of the storage system through simple Google-like interfaces allows users and administrators to find relevant information quickly. Thanks to this capability it is much easier to re-use content that is already available and leverage the organization’s knowledge.
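As a rough illustration of how such Google-like search works under the hood, here is a toy inverted index in Python. All file names and contents are invented for the example; real data-aware products index far richer metadata than plain words.

```python
from collections import defaultdict

def build_index(documents):
    """Map each word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

def search(index, query):
    """Return the documents containing every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = index.get(words[0], set()).copy()
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical corpus, for illustration only.
docs = {
    "q3_report.docx": "quarterly revenue report for the sales team",
    "hr_policy.pdf": "vacation policy for all employees",
    "sales_deck.pptx": "sales pitch deck revenue projections",
}
index = build_index(docs)
print(search(index, "sales revenue"))  # both sales-related documents match
```

A production system would add stemming, ranking and permission-aware filtering, but the core idea, looking up pre-built metadata instead of scanning files on demand, is what makes the interactive, Google-like experience possible.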
Thanks to the rich metadata maintained by this kind of system, it’s possible to have a complete view of what is really stored on your primary storage: which users consume the most capacity, for what purpose, and what the access patterns are. Modern user interfaces help to visualize this data quickly, accelerating decisions about which data is worth keeping on primary storage and which is better moved to cheaper archiving systems or even deleted. This keeps primary storage lean and efficient while directing storage spending to the right resources. SysAdmins finally get a powerful tool that helps them make defensible decisions about how to manage “keep everything forever” storage policies.
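The kind of decision support described above can be sketched with a few lines of Python. The metadata records below are hypothetical; a real data-aware system would expose similar fields (owner, size, last access) through its own interface.

```python
from datetime import datetime, timedelta

# Hypothetical per-file metadata, as a data-aware system might expose it.
files = [
    {"path": "/proj/model.bin", "owner": "alice", "size_gb": 120,
     "last_access": datetime(2015, 1, 10)},
    {"path": "/proj/notes.txt", "owner": "bob", "size_gb": 1,
     "last_access": datetime(2016, 3, 2)},
    {"path": "/archive/old.tar", "owner": "alice", "size_gb": 300,
     "last_access": datetime(2013, 6, 1)},
]

def archive_candidates(files, now, max_idle_days=365):
    """Files untouched for longer than max_idle_days are candidates
    for a cheaper archive tier (or deletion)."""
    cutoff = now - timedelta(days=max_idle_days)
    return [f for f in files if f["last_access"] < cutoff]

def usage_by_owner(files):
    """Total capacity consumed per user, in GB."""
    totals = {}
    for f in files:
        totals[f["owner"]] = totals.get(f["owner"], 0) + f["size_gb"]
    return totals

now = datetime(2016, 6, 1)
old = archive_candidates(files, now)
```

The same two queries, "who uses the capacity" and "what has gone cold", are what back the visual dashboards these products offer.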
Proactive security
Another advantage of rich metadata and file analytics is the ability to quickly discover files based on their content or on user-defined rules. Every file that lands on the primary storage system is tagged and indexed, even if it is saved inside a VM, and alarms can be raised to flag potential data breaches or leaks. Advanced reporting can help identify where sensitive data (PII, PHI, PCI, etc.) resides and ensure that it is properly secured and not being accessed inappropriately.
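A minimal sketch of such content-based tagging, using simple regular expressions as the user-defined rules. The patterns below are illustrative only; commercial products ship far richer classifiers for PII, PHI and PCI data.

```python
import re

# Hypothetical content rules; each tag maps to a pattern.
PII_RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def tag_content(text):
    """Return the set of PII tags whose pattern matches the text."""
    return {tag for tag, pattern in PII_RULES.items() if pattern.search(text)}

sample = "Employee SSN 123-45-6789 on file."
print(tag_content(sample))
```

In a data-aware system this tagging happens at ingest, so a report on "every file containing an SSN" is a metadata lookup, not a full scan of the storage.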
Compliance and auditing
The same technology described above can be applied to verify compliance and perform real-time auditing of users and data. The storage system holds all the information about every file, user and access pattern. By correlating them, and by visualizing the results with simple graphs, it becomes very easy to understand how data flows through the system and who is doing what.
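Correlating users, files and accesses can be as simple as joining an audit log against a policy. The log entries, file names and policy sets below are all invented for the sketch.

```python
from collections import Counter

# Hypothetical audit log entries: (user, file, operation).
events = [
    ("alice", "/hr/salaries.xlsx", "read"),
    ("alice", "/hr/salaries.xlsx", "read"),
    ("bob", "/eng/design.doc", "write"),
    ("mallory", "/hr/salaries.xlsx", "read"),
]

SENSITIVE = {"/hr/salaries.xlsx"}   # files flagged by content rules
AUTHORIZED = {"alice"}              # users allowed to touch them

def flag_unauthorized(events):
    """Users who touched sensitive files without authorization."""
    return {u for u, f, _ in events if f in SENSITIVE and u not in AUTHORIZED}

def access_summary(events):
    """How often each user touched each file (the 'who is doing what' view)."""
    return Counter((u, f) for u, f, _ in events)
```

The graphs mentioned above are essentially visualizations of these two aggregations, rendered continuously instead of on demand.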
What applies to file access patterns is even more true for content. The system can easily discover user-defined sensitive information in any file, including files stored in VMs, and provide a complete map of where it lives. This information can also be exported to create reports or to produce legal evidence when necessary.
Chargeback and show-back
Thanks to all the information available from the system, it is also easy to export specific, detailed reports on storage usage and to build chargeback and show-back mechanisms for budgeting, providing evidence of how resources are actually used.
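A chargeback report is little more than per-department usage multiplied by an internal rate. The departments, consumption figures and rate below are assumptions for illustration only.

```python
RATE_PER_GB_MONTH = 0.05  # assumed internal rate, for illustration only

# GB consumed per department, as reported by the storage system.
usage = {
    "engineering": 5000,
    "marketing": 800,
    "finance": 1200,
}

def chargeback_report(usage, rate=RATE_PER_GB_MONTH):
    """Monthly cost per department, plus the overall total."""
    lines = {dept: gb * rate for dept, gb in usage.items()}
    lines["total"] = sum(lines.values())
    return lines

report = chargeback_report(usage)
print(round(report["engineering"], 2))
```

Show-back is the same report without the billing step: the numbers are published to make consumption visible rather than to charge for it.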
Closing the circle
Fully data-aware solutions are the first of a new class of infrastructure components that are smarter and much more proactive. Unlike their predecessors, these new building blocks can interpret what they are doing and, depending on the depth of the implementation, deliver insightful information not just to the system administrator but to everyone across the organization.
This is happening not only in storage, but similar examples can also be found in networking and computing. This new advanced approach dramatically changes the role of IT regarding business intelligence and enables organizations to answer many more questions faster. It’s helping to move the needle from traditional paradigms to next generation data-integrated infrastructures that are finally capable of coping with growing user expectations.
With data-aware storage, the overall infrastructure is simplified and becomes more agile, while more key organizational roles now have access to an unprecedented set of insightful information about all aspects of one of the most valuable assets for any modern enterprise: its own data. And it is made possible without implementing any expensive and complex additional infrastructure to maintain.
Depending on the use cases and user needs, solutions vary in the depth of information they can extract from data, but their potential and value for the end user are unquestionable.