No longer capable of remaining on the sidelines as a separate administrative domain, today's networked storage must be managed with a deeper awareness of business objectives.
But in an era of compliance, litigation, and highly interactive, data-dependent apps fine-tuned for maximum responsiveness, it takes more than a shift in philosophy to establish the kind of business-conscious storage environment that can deliver a true competitive advantage. It takes management tools born of the need to mitigate the downsides of the deluge of data today's enterprises face.
Enter data classification, CDP (continuous data protection), data deduplication, and tiered storage — three recent advances and one revamped mainstay poised to hone your daily storage operations.
Seemingly unrelated, these four technologies share a common objective: alleviating the pain of enterprise data management.
Whether providing improved data protection, reducing required capacities, ensuring a more flexible infrastructure, or presenting deep insights into stored data content, they seek to better align the traditionally technical benchmarks of storage management — capacity, performance, and so on — with business-related metrics, such as relevance, integrity, and responsiveness. In so doing, they are fast becoming essential tools for enterprises looking to derive greater advantage from existing and future storage assets.
Long the unacknowledged elephant in the room of storage management, data classification is finally receiving some much-deserved attention from storage vendors. Compliance and e-discovery may be among the central motivating factors for this trend, but enterprises are fast finding data-level awareness of stored content to be an essential component of any comprehensive storage management strategy.
The rise of networked storage as a separate administrative domain has resulted in numerous benefits, including consolidated management and improved scalability. Yet this strategy has led enterprises to manage their storage containers without much understanding of the data content housed therein.
As a consequence, looking at data from the storage side rather than from the application front end is a lot like entering a gigantic warehouse full of mysterious, cursorily labelled boxes. And when it comes to protecting data off premises or responding to requests from a judge or an opposing party in court, not to mention surfacing what your enterprise already knows, having precious information buried deep in storage silos can prove detrimental to your bottom line.
Making sense of what is stored in those mysterious boxes is the primary objective of data classification.
Chief among the benefits of data classification is the ability to allocate data to the appropriate storage tier. Compellent's Data Progression, for example, automatically classifies blocks of data according to criteria such as age and frequency of access, then pushes them to tiers accordingly.
Data Progression has the unique capability of decoupling blocks from their file wrapper, but on any other storage system, administrators can combine analysis of standard file metadata (name, file type, date created, and so on) with simple classification criteria to identify files that need to move elsewhere.
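The metadata-driven approach described above can be sketched roughly as follows. This is a minimal illustration in Python, not any vendor's implementation: the FileInfo record, the tier names, and the 30-day and 365-day thresholds are all assumptions invented for the example, and a real policy would likely weigh file type and business value as well.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    """Hypothetical record holding the standard metadata used for tiering."""
    name: str
    size_bytes: int
    last_accessed: datetime


def file_info_from_path(path: str) -> FileInfo:
    """Build a FileInfo from the file system's own metadata.

    Note: access times can be unreliable on volumes mounted with
    access-time updates disabled.
    """
    st = os.stat(path)
    return FileInfo(
        name=os.path.basename(path),
        size_bytes=st.st_size,
        last_accessed=datetime.fromtimestamp(st.st_atime),
    )


def choose_tier(info: FileInfo, now: datetime) -> str:
    """Pick a tier from how long the file has gone unaccessed.

    Thresholds and tier names are illustrative assumptions.
    """
    idle = now - info.last_accessed
    if idle <= timedelta(days=30):
        return "tier1-fast"
    if idle <= timedelta(days=365):
        return "tier2-capacity"
    return "tier3-archive"


def plan_moves(files: list[FileInfo], now: datetime) -> dict[str, list[str]]:
    """Group file names by the tier each one should live on."""
    plan: dict[str, list[str]] = {}
    for info in files:
        plan.setdefault(choose_tier(info, now), []).append(info.name)
    return plan
```

Passing the current time in explicitly, rather than reading the clock inside choose_tier, keeps the policy easy to test and to replay against historical data.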
Although relatively easy to implement, that kind of functionality proves inadequate for more ambitious classification exercises. To comply with regulations such as HIPAA, to respond to FRCP (Federal Rules of Civil Procedure) e-discovery requests, or to assess risks of disclosure, companies need more comprehensive data classification tools capable of finding files that contain sensitive information such as Social Security numbers, credit card numbers, or other private personal or corporate data.
Data classification solutions of this calibre provide the applications and structure to search for those needles in companies' archive haystacks, scanning for relevant patterns and creating rules to automatically assign data to the proper containers. Implementing such tools is often a recursive exercise in which the human element must complement the results of the search and classification engines.
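To make the idea of pattern scanning plus rules concrete, here is a small sketch in Python. It is an illustration only and does not reflect how any product named in this article works: the regular expressions, the label names, the container names, and the rule order are assumptions chosen for the example. Candidate card numbers are filtered with the standard Luhn checksum to cut down on false positives.

```python
import re

# Illustrative patterns; real tools use far richer detection.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13 to 16 digits

# Ordered rules: the first matching label decides the container.
# Container names are hypothetical.
RULES = [
    ("credit_card", "restricted-pci"),
    ("ssn", "restricted-pii"),
]
DEFAULT_CONTAINER = "general"


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_labels(text: str) -> set[str]:
    """Scan text and return the set of sensitive-data labels found."""
    labels = set()
    if SSN_RE.search(text):
        labels.add("ssn")
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            labels.add("credit_card")
            break
    return labels


def assign_container(text: str) -> str:
    """Apply the rules in order and return the target container."""
    labels = find_labels(text)
    for label, container in RULES:
        if label in labels:
            return container
    return DEFAULT_CONTAINER
```

Pattern matches of this kind are only candidates. As noted above, a person still has to review the results and refine the rules, which is why the process tends to be iterative.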
Infoscape — EMC's ambitious and still evolving data classification project — is the cornerstone of the company's ILM (information lifecycle management) strategy. Using templates, Infoscape users can quickly identify the steps and rules needed for each classification task.
Templates, however, can help only to a point, and EMC is finding that customers may have to manage documents outside Infoscape. "[In Infoscape], we have implemented a copy to Documentum feature," says Sheila Childs, director of marketing at EMC.
Kazeon Information Server is another comprehensive data classification solution. Michael Marchi, vice president of solutions marketing at Kazeon, contends that e-discovery, compliance, and security are driving enterprises to incorporate integrated data classification solutions into their overall storage management strategies.
First launched in 2005, Kazeon's Information Server houses content-aware indexing, data classification, search, reporting, and migration in a single appliance in an effort to meet those needs. Information Server is also offered by NetApp to manage, for example, the retention dates of files created by NetApp's data protection offerings.
Index Engines, as its name suggests, leverages indexing as a means for creating metadata that makes corporate data easily searchable. The added twist this vendor offers, however, is the ability to create online metadata from files on tape reels, a lifesaver for companies housing a multitude of media in their vaults.
Despite the advances of such offerings, it would be disingenuous to paint data classification as a mature technology. That said, the technology is evolving and may in fact be the most effective means currently available for maintaining compliance, protecting sensitive data, and ensuring adequate responsiveness in the event of litigation. No other technology comes close to supplying an answer for those needs.