Computerworld

A rationale for ILM: Automated data management is a must-have to avoid a storage Pleistocene

I have been traveling more than usual lately. Although my schedule has been hectic, it has given me the opportunity to meet new people and exchange ideas. One notion I keep hearing about is how easy it has become to set up a SAN, which is good news -- especially considering that I keep hearing it from normal people with solid IT backgrounds, not from storage gurus.

Those statements usually refer to entry-level solutions consisting of a few servers, a switch, and a single array. Generally, the products mentioned are built around either FC (Fibre Channel) or iSCSI (Internet SCSI) transport.

Why is this good news? Because ease of use has a direct impact on lowering the TCO of storage products, which translates into more purchases of networked storage solutions. There is also a psychological impact: Customers are more likely to buy products that they perceive as less intimidating.

Moving to networked storage also carries a strategic advantage: it decouples storage from application servers, which in turn opens the way to advanced storage administration frameworks such as ILM (information lifecycle management).

If you missed it, you should read Bob Francis' excellent article on the ILM definition set out by the Storage Networking Industry Association (SNIA). The way I interpret the SNIA definition, ILM is about automatically determining the most cost-effective support structure for your data -- a task that for a variety of reasons is easier to define than to achieve.

One major obstacle to ILM is that business applications do not define the business relevance of their data in terms that other applications can easily understand. Not surprisingly, the lack of a common data definition language leaves both storage hardware vendors and ISVs working in the dark. For example, backup and restore applications take what I call the "fire-hose approach" to data protection: Their focus is to dump files from point A to point B as quickly as possible, regardless of the data content.

This approach was adequate 20 years ago when companies had much less data to handle and when identifying the content of databases and files was so much easier. A more sensible approach would be to spread the output of each backup to multiple devices according to business rules that analyze the data content of each source file. Some data would be saved to disk for quick retrieval; other data would be moved to optical media for secure, long-term storage; and other data would end up on tape media for affordable, medium-term storage.
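
If that rule-driven routing sounds abstract, here is a minimal sketch in Python of how such a policy might be expressed. The tiers, categories, and rules are my own illustrative assumptions, not any vendor's product:

```python
# Hypothetical sketch of rule-driven backup tiering; the tiers, categories,
# and rules below are illustrative assumptions, not a real product's API.
from dataclasses import dataclass

@dataclass
class BackupFile:
    path: str
    category: str   # assumed business classification, e.g. "transactional"
    age_days: int

def choose_tier(f: BackupFile) -> str:
    """Route a file to a storage tier according to simple business rules."""
    if f.category == "transactional" and f.age_days < 30:
        return "disk"     # quick retrieval
    if f.category == "compliance":
        return "optical"  # secure, long-term storage
    return "tape"         # affordable, medium-term storage

backup_set = [
    BackupFile("/db/orders.dat", "transactional", 2),
    BackupFile("/legal/contracts.pdf", "compliance", 400),
    BackupFile("/logs/web.log", "archive", 180),
]

for f in backup_set:
    print(f"{f.path} -> {choose_tier(f)}")
```

The hard part, of course, is not the routing logic but the classification itself: without a shared way to label data by business relevance, every vendor ends up inventing its own categories.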

If you're unsure about the cost gap between different backup solutions, a recent paper from the Tape Technology Council, a nonprofit organization formed by prime vendors of tape drives and media, offers some interesting figures. The Council analyzed three approaches to protecting 10TB of data, comparing solutions built around a tape library, an optical jukebox, and a RAID disk array.

As expected, the report shows that a tape-based solution is the least expensive alternative (about US$100,000 over three years), an optical library solution costs 50 percent more, and a mirrored disk array costs four times as much.
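
To make those ratios concrete, here is the arithmetic spelled out in a short Python snippet; the per-terabyte breakdown is my own extrapolation from the report's figures:

```python
# Back-of-envelope arithmetic based on the report's figures; the report
# gives the tape cost and the relative multipliers, the rest follows.
CAPACITY_TB = 10
TAPE_COST = 100_000  # about US$100,000 over three years

costs = {
    "tape library": TAPE_COST,
    "optical jukebox": TAPE_COST * 1.5,    # 50 percent more
    "mirrored disk array": TAPE_COST * 4,  # four times as much
}

for solution, cost in costs.items():
    print(f"{solution}: ${cost:,.0f} total, "
          f"${cost / CAPACITY_TB:,.0f} per TB over three years")
```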

Admittedly, the paper doesn't consider factors, such as a higher ROI, that could divert a user from the cheapest solution. That's exactly why ILM is important: It replaces a one-size-fits-all approach to data management with a more focused strategy in which you decide the best possible allocation in your infrastructure for each piece of data.

Does this sound more like science fiction than storage administration? Perhaps it's because ILM is still largely a work in progress, but in the not-too-distant future, these kinds of scenarios will become common. Why am I so sure? Because a future without ILM (or whatever name that concept takes) would likely be the equivalent of an ice age for storage, and that's something neither vendors nor customers can afford.

Mario Apicella is a senior analyst at the InfoWorld Test Center.