When it comes to storage infrastructure, few topics have been more talked about in 2011 than the phenomenon of ‘Big Data’: the exponential growth of data, largely off the back of the huge transactional systems which underpin global commerce, all of which must be stored, backed up and archived.
For IBRS analyst Kevin McIsaac, part of why Big Data has become such an issue is that, while transactional data has been piling up, unstructured data such as Word and Excel files, videos, photographs and emails has grown exponentially.
“The amount of data we have which are files, images and graphics, is now between 65 to 80 per cent of our data so it’s by far the largest part,” McIsaac says. “Transactional data has been growing at around 35 per cent per annum whereas unstructured data is growing anywhere from 60 to 100 per cent per annum.”
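To put those growth rates in perspective, the short Python sketch below converts an annual growth rate into a doubling time. It assumes growth compounds steadily year on year, which is a simplification for illustration, and the function name is hypothetical rather than anything used by the analysts quoted.

```python
import math

def doubling_time_years(annual_growth: float) -> float:
    """Years for a data set to double at a constant compound annual growth rate.

    annual_growth is a fraction, e.g. 0.35 for 35 per cent per annum.
    Assumes steady compounding, which real data growth rarely follows exactly.
    """
    if annual_growth <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / math.log(1 + annual_growth)

# Rates quoted by McIsaac: transactional vs unstructured data.
for label, rate in [("transactional", 0.35),
                    ("unstructured (low)", 0.60),
                    ("unstructured (high)", 1.00)]:
    print(f"{label}: doubles in about {doubling_time_years(rate):.1f} years")
```

Under that assumption, transactional data doubles roughly every 2.3 years, while unstructured data doubles every 1 to 1.5 years.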
Add to these two issues the increasing number of devices each employee now uses within the work IT environment, and the data generated from new communication media such as Twitter, Facebook and LinkedIn, and you’d think that making sure organisational backup and archiving hardware, software and strategies were optimised would be a major priority. But you’d be wrong.
If the storage experts are to be believed, plenty of organisations are confusing two key functions of storage management, namely backup and archiving, resulting in increased costs, higher power consumption, widening backup windows, recovery processes with mixed success, and unnecessary data sets.
So just what is backup and what is archiving and how did the two come to be confused?
Backup vs. archiving
On paper, the distinction between backup and archiving is black and white. As a data storage process, backup is more immediate and aims to enable the business to recover data quickly and from a recent period of time, typically between one and six months, and is often carried out using disk as the storage medium. Archiving is the storage of data for many years, such as for future financial and compliance audits. Given the span of time, and the cost to spin disk for years at a time on the off-chance data is needed, the task of storing this data often falls to tape-based platforms.
Where organisations often go wrong in their storage strategies is in keeping their backup data on disk for years despite the fact that the IT department is unlikely to be called on to recover data that is more than a month or two old. “Archive is different from backup; its purpose is long-term retention, while backup is to restore a few hours, a day or a few weeks later,” IBRS’ McIsaac says.
In addition to confusing archiving with backup, Telsyte senior analyst Rodney Gedda says some IT departments do not apply a ‘use-by date’ to their data to allow it to move from backup to archiving.
“I think there tends to be a bit of a mish-mash out there, you’ve got organisations like the ABC which previously put everything on archive tapes and have now decided they need to access it, or it might be a case of thinking you need to infinitely store everything and have access to it,” he says. “However, that’s not a prudent decision as you end up with expensive disk systems running and archiving and backup programs that aren’t necessary.”
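The ‘use-by date’ idea can be sketched as a simple age-based rule. The Python example below is illustrative only: the 180-day backup window loosely follows the six-month upper bound mentioned earlier, while the seven-year archive period, the function name and the tier labels are all assumptions rather than figures from any analyst or product.

```python
from datetime import date, timedelta

# Hypothetical thresholds for illustration only; real values come from
# business and compliance requirements.
BACKUP_WINDOW = timedelta(days=180)           # roughly six months on disk
ARCHIVE_RETENTION = timedelta(days=7 * 365)   # assumed seven-year retention

def storage_tier(created: date, today: date) -> str:
    """Return 'backup', 'archive' or 'expire' for data created on a given date."""
    age = today - created
    if age < timedelta(0):
        raise ValueError("creation date is in the future")
    if age <= BACKUP_WINDOW:
        return "backup"    # recent: keep on disk for fast recovery
    if age <= ARCHIVE_RETENTION:
        return "archive"   # past its use-by date: move to cheaper long-term media
    return "expire"        # retention period over: candidate for deletion

print(storage_tier(date(2011, 1, 1), date(2011, 2, 1)))
```

The point of the rule is that every data set always has exactly one home, so nothing lingers on expensive disk simply because no one decided otherwise.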
When it comes to the success of backup and archiving, Gartner analyst Phil Sargeant says a contributing factor to the confusion has been the tendency for vendors themselves to focus more heavily on the backup and recovery space, with archiving platforms only becoming more prominent in the last few years.
Despite some confusion between the two functions, he says larger corporates generally get it right more often than SMEs. “There is no doubt that the lines have blurred, I’d say a lot of the larger companies I speak to are fairly sophisticated so they don’t confuse that… they know the distinction between backup and recovery and archiving, I’d say small to medium businesses are the ones who get a little bit confused at times,” he says.
“When there wasn’t a major distinction between the two a lot of people got confused because they could almost use backup and recovery and archiving interchangeably because there wasn’t the big data sets, but that’s certainly not the case today.”
One organisation having to come to grips with backup and archiving is Australian financial services firm Teacher’s Credit Union (TCU). Its CIO, Colin Thomas, says that the additional impetus of regulations to retain health and financial data for certain lengths of time has prompted it to take a much closer look at its storage strategy.
“It’s much easier to back things up than to take the position of classifying data and actually start to expire data you no longer need,” Thomas says. “[Keeping everything] was a previous approach we took.”
The firm is now undergoing a data classification process to determine specific retention periods for different types of data.
“We’re implementing a document records management solution as well so we have to determine retention periods and whether or not we need to keep the data on WORM (Write Once Read Many) storage or whether it needs to be retained on normal disk type storage.”
“We have a preference to keep all our long-time member information that’s required for contractual reasons on WORM storage; it can’t be changed once you store it, and the regulators and the courts tend to require evidence that you have a document in its original state.
“In the past we would have done optical storage but now we’re able to keep that online using EMC technologies to ensure the data we hold on disk cannot be changed from its original state.”
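A data classification exercise of the kind Thomas describes boils down to a lookup from data class to retention period and storage type. The Python sketch below shows one way that might look; the class names, retention periods and WORM flags are invented for illustration and are not TCU’s actual policy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetentionPolicy:
    retention_years: int   # how long the data must be kept
    worm: bool             # True if it must sit on write-once storage

# Hypothetical classes and periods, for illustration only.
POLICIES = {
    "member_contract": RetentionPolicy(retention_years=7, worm=True),
    "general_correspondence": RetentionPolicy(retention_years=2, worm=False),
}

def policy_for(data_class: str) -> RetentionPolicy:
    """Look up the retention policy, refusing to guess for unclassified data."""
    try:
        return POLICIES[data_class]
    except KeyError:
        raise ValueError(f"unclassified data: {data_class!r}") from None

print(policy_for("member_contract"))
```

Raising an error for unclassified data, rather than falling back to keeping everything, reflects the shift away from the “keep everything” approach described above.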