Data Reduction Chapter 2: Surviving the tidal wave of data - options for data reduction
In Chapter 1, we discussed the struggles that storage administrators are having with the tidal wave of data. In this chapter, we'll begin talking about how data reduction technologies can help you survive and even thrive in the face of these challenges.
IBM takes a holistic approach to data reduction, unlike competitors that offer point solutions to problems that they may in fact be causing. For example, a huge contributor to data growth is the repeated duplication of large amounts of data every time you perform a full backup.
So, one option is to avoid data growth from unnecessary data duplication by backing up only the data that has changed since the last backup. This addresses the cause of the problem, not the symptom. For example, if you have a 5 percent per week data change rate, 95 percent of your data didn't change this week. If you perform a full backup this weekend, you're duplicating almost everything you backed up last weekend. Not only does that take a lot of storage capacity, but it also takes a long time, and these problems only get worse as you create more new data. It's no wonder that data deduplication products are so popular: they were designed to eliminate all this duplicate data. And when they claim to reduce your backup storage footprint by 95 percent or more, this is exactly the data that they're talking about.
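To make the arithmetic above concrete, here is a minimal sketch in Python. It compares the storage consumed by repeated weekly full backups with one full backup followed by changed-data-only backups. The 5 percent change rate comes from the example above. The 10 TB dataset size, the four-week retention window, and the function names are assumptions chosen purely for illustration, and the model ignores data growth and any compression.

```python
def full_backup_storage(dataset_tb: float, weeks: int) -> float:
    """Storage used when every weekly backup is a complete copy."""
    return dataset_tb * weeks


def incremental_backup_storage(dataset_tb: float, weeks: int, change_rate: float) -> float:
    """Storage used by one initial full backup plus changed-data-only backups."""
    return dataset_tb + dataset_tb * change_rate * (weeks - 1)


def duplicated_fraction(change_rate: float) -> float:
    """Share of a repeat full backup that was already captured the week before."""
    return 1.0 - change_rate


if __name__ == "__main__":
    dataset_tb = 10.0    # hypothetical dataset size, not from the text
    weeks = 4            # hypothetical number of weekly backups retained
    change_rate = 0.05   # the 5 percent weekly change rate from the text

    full = full_backup_storage(dataset_tb, weeks)
    incremental = incremental_backup_storage(dataset_tb, weeks, change_rate)

    print(f"Weekly full backups:        {full:.1f} TB")
    print(f"Full + changed-data only:   {incremental:.1f} TB")
    print(f"Duplicate share per full:   {duplicated_fraction(change_rate):.0%}")
```

Under these assumed inputs, four weekly full backups occupy 40 TB, while one full backup plus three changed-data-only backups occupy about 11.5 TB. Real environments will differ, since change rates vary from week to week and the data set itself keeps growing.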
Another option is to determine what different types of data you have and categorize them so that you can manage them most effectively, by moving less frequently accessed data to lower-cost tiers of storage, and by deleting data that you no longer need or want. This will shorten your backup cycles and improve application performance, as well as reduce or delay the need to buy more primary storage capacity.
A third option is to put automated processes in place, based on policies that meet business requirements and/or service level agreements, to migrate, archive and delete data. There are several actions that can be taken on your data files based on criteria such as age, how long it has been since last access, which application created it, etc. These automated solutions can include:
Transparent migration of data from production storage systems to a hierarchy of secondary systems; the data remains on-line and available without any modifications to applications.
Archival of data, removing it completely from production systems and storing it in secure storage where retention policies can be set and managed.
Expiration of data, deleting it from all storage once it is no longer needed or to meet corporate governance policies.
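The policy-driven actions listed above (migrate, archive, expire) can be sketched as a simple rule evaluation. This is an illustrative sketch only, not how any particular product implements its policies. The class names, the field names, and every threshold (90 days idle, 365 days idle, seven years of age) are hypothetical, and a real policy would also weigh criteria such as the creating application and the applicable service level agreement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical thresholds; real values come from business requirements."""
    migrate_after_idle_days: int = 90
    archive_after_idle_days: int = 365
    expire_after_age_days: int = 7 * 365


@dataclass(frozen=True)
class FileRecord:
    path: str
    age_days: int    # days since the file was created
    idle_days: int   # days since the file was last accessed


def choose_action(record: FileRecord, policy: Policy = Policy()) -> str:
    """Return the single action the policy selects for this file."""
    # Expiration is checked first: data past its retention age is deleted.
    if record.age_days >= policy.expire_after_age_days:
        return "expire"
    # Long-idle data leaves production storage for the archive.
    if record.idle_days >= policy.archive_after_idle_days:
        return "archive"
    # Moderately idle data moves to a lower-cost tier but stays online.
    if record.idle_days >= policy.migrate_after_idle_days:
        return "migrate"
    return "keep"


if __name__ == "__main__":
    samples = [
        FileRecord("/data/report-current.doc", age_days=10, idle_days=2),
        FileRecord("/data/q1-results.xls", age_days=200, idle_days=120),
        FileRecord("/data/old-project.zip", age_days=900, idle_days=500),
        FileRecord("/data/ancient-log.txt", age_days=3000, idle_days=2900),
    ]
    for record in samples:
        print(f"{record.path}: {choose_action(record)}")
```

Ordering the checks from most to least drastic means each file receives exactly one action, which keeps the sketch easy to reason about. A production policy engine would more likely let several rules apply and resolve conflicts explicitly.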
The last option is to compress and deduplicate the data you end up putting into your data protection and retention systems. Data deduplication is the most popular technology in this category, and we'll discuss it and the other technologies mentioned above in greater detail in future chapters of this blog.
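As a rough preview of what compressing and deduplicating data means in practice, here is a minimal Python sketch. It is an assumption-laden illustration, not a description of any specific product: it splits data into fixed-size chunks, identifies duplicates by SHA-256 hash, and compresses each unique chunk once with zlib. The function names and the 4096-byte chunk size are hypothetical, and commercial products typically use more sophisticated techniques such as variable-size chunking.

```python
import hashlib
import zlib


def dedupe_and_compress(data: bytes, chunk_size: int = 4096):
    """Store each unique chunk once (compressed) and record the chunk order."""
    store = {}    # chunk hash -> compressed chunk bytes
    recipe = []   # ordered list of chunk hashes needed to rebuild the data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            # First time this content is seen: keep one compressed copy.
            store[digest] = zlib.compress(chunk)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)


if __name__ == "__main__":
    # Three identical chunks plus one different chunk (synthetic data).
    data = b"A" * 4096 * 3 + b"B" * 4096
    store, recipe = dedupe_and_compress(data)
    stored_bytes = sum(len(blob) for blob in store.values())
    print(f"Chunks referenced: {len(recipe)}, unique chunks stored: {len(store)}")
    print(f"Original bytes: {len(data)}, stored bytes: {stored_bytes}")
    assert restore(store, recipe) == data
```

The synthetic input here is deliberately repetitive, so the savings it shows are far better than typical real-world data would produce.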
To learn more, please visit the Data Reduction Solutions web page and stay tuned for Chapter 3 in which we'll dig into the first step in effective data reduction.
Richard Vining
Tags: archive storage-blog data-management deduplication hsm backup data-reduction space-managment