Data Reduction Chapter 8: Deduplication with Tivoli Storage Manager 6, FastBack and ProtecTIER
So far in this series, we’ve detailed the challenges that the tidal wave of data is placing on storage administrators, and how a smarter, more holistic and comprehensive approach to data reduction is needed to survive in a way that lets you do more with less.
We covered eliminating the largest source of duplicate data (full backups) and automating the migration, archiving and deletion of older data. Then, in chapter 7, we covered the basics of data deduplication. Now we’ll detail the differences between IBM’s deduplication offerings, and when to best use each.
Let’s talk first about the deduplication capabilities of Tivoli Storage Manager (TSM). This feature is included at no additional charge for TSM 6 Extended Edition customers. This solution can help to reduce recovery times by enabling you to store more backup data and recovery points on disk rather than tape. It works with the data from all sources – via normal backups, data imported via the TSM API, as well as archive and HSM data. TSM deduplicates your disk-based data pools as a post-process, so there is no impact on backup performance. After running, it automatically reclaims the storage that has been freed up.
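To make the "post-process, then reclaim" idea concrete, here is a minimal Python sketch. It is not TSM's actual algorithm: the fixed 4-byte chunks, the SHA-256 digests and the names `post_process_dedup` and `restore` are all assumptions chosen to keep the example small. The point it illustrates is that the data is already sitting in the disk pool when deduplication runs, so the backup itself is not slowed down, and the space freed is only known (and reclaimable) afterwards.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, purely for readability


def post_process_dedup(pool):
    """Deduplicate objects that were already written to a disk pool.

    `pool` maps an object name to its bytes. Returns a tuple of
    (chunk_store, recipes, bytes_reclaimed).
    """
    chunk_store = {}  # digest -> chunk bytes, each unique chunk kept once
    recipes = {}      # object name -> ordered list of chunk digests
    bytes_before = sum(len(data) for data in pool.values())

    for name, data in pool.items():
        digests = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            chunk_store.setdefault(digest, chunk)  # store only the first copy
            digests.append(digest)
        recipes[name] = digests

    bytes_after = sum(len(chunk) for chunk in chunk_store.values())
    return chunk_store, recipes, bytes_before - bytes_after


def restore(name, chunk_store, recipes):
    """Rebuild an object from its recipe of chunk digests."""
    return b"".join(chunk_store[digest] for digest in recipes[name])


if __name__ == "__main__":
    # Two backups that share their first eight bytes.
    pool = {"monday": b"AAAABBBBCCCC", "tuesday": b"AAAABBBBDDDD"}
    store, recipes, reclaimed = post_process_dedup(pool)
    print("bytes reclaimed:", reclaimed)
    print("restore ok:", restore("tuesday", store, recipes) == pool["tuesday"])
```

In this toy pool, 24 bytes go in and four unique 4-byte chunks remain, so 8 bytes can be reclaimed once the pass completes.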
TSM already eliminates the most common cause of duplicate data – full backups – so the reduction ratios you can expect from TSM’s deduplication solution are fairly modest; the average is about 40%. But when combined with its progressive incremental backup approach and built-in data compression, TSM’s effective data reduction rate is extremely competitive with any other solution on the market, as detailed in a commissioned report written by Enterprise Strategy Group (ESG). Fair warning: registration is required to download the report, sorry.
Announced today, Tivoli Storage Manager FastBack v6.1 also includes target-side data deduplication to help reduce the capacity required in the FastBack backup repository, adding to its value as the leading near-instant recovery solution on the market for business critical Windows servers and remote/branch offices. Also announced today was Linux support and tighter integration with the Tivoli Storage Manager Integrated Solutions Console (ISC), delivering on IBM’s vision of true enterprise-wide Unified Recovery Management.
IBM System Storage ProtecTIER is a technology leader in performance, scalability, data integrity and reliability. In true apples-to-apples comparisons, this solution is the fastest on the market in real customer environments. A single ProtecTIER system can easily scale in both performance (1000 MB/sec) AND capacity (1PB of deduplicated data). ProtecTIER is one of the few solutions that doesn’t rely on a hash algorithm to decide that two pieces of data match; instead it performs a byte-level differential comparison to confirm that data really is a duplicate, for enterprise-class data integrity. And ProtecTIER features all IBM best-of-breed components versus the inexpensive OEM'd parts found in competitive products.
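Why does byte-level verification matter? The Python sketch below is not ProtecTIER's implementation; the `VerifiedStore` class and its deliberately weak `_cheap_key` lookup are hypothetical, invented for illustration. It shows the general principle: an index can be used to find candidate matches quickly, but a block is only treated as a duplicate after a byte-for-byte comparison, so two different blocks that happen to share an index key can never be confused with each other.

```python
class VerifiedStore:
    """Toy block store that confirms every duplicate byte-for-byte."""

    def __init__(self):
        self._buckets = {}       # cheap key -> list of stored blocks
        self.blocks_written = 0  # number of blocks physically stored

    @staticmethod
    def _cheap_key(block):
        # Hypothetical, deliberately weak lookup key (length + first two
        # bytes). Collisions are expected; it only narrows the search.
        return (len(block), block[:2])

    def write(self, block):
        """Store `block`; return (reference, was_new)."""
        key = self._cheap_key(block)
        bucket = self._buckets.setdefault(key, [])
        for index, stored in enumerate(bucket):
            if stored == block:  # byte-level comparison decides
                return (key, index), False
        bucket.append(block)
        self.blocks_written += 1
        return (key, len(bucket) - 1), True

    def read(self, reference):
        key, index = reference
        return self._buckets[key][index]


if __name__ == "__main__":
    store = VerifiedStore()
    ref_a, new_a = store.write(b"ABxx")
    ref_b, new_b = store.write(b"AByy")  # same cheap key, different bytes
    ref_c, new_c = store.write(b"ABxx")  # a genuine duplicate
    print(new_a, new_b, new_c)
    print(store.blocks_written)
```

Here the second block collides with the first on the lookup key but is still stored separately, while the third write is recognized as a true duplicate and stores nothing new.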
ProtecTIER has been proven in very large production environments and is supported worldwide by IBM’s services operations. The TS7650 ProtecTIER Deduplication Family ranges from small (7TB) to medium (18TB) to large-scale (36TB) appliances. And the TS7650G gateway offerings allow you to add the storage of your choice, up to 1PB. Active-Active cluster configurations also provide high availability capabilities.
Video on ProtecTIER: http://www.youtube.com/watch?v=6Uk41HpCTqo&feature=related
Review - Choosing TSM or ProtecTIER for Data Deduplication
While TSM works very well in ProtecTIER environments, you wouldn’t use both TSM deduplication and ProtecTIER deduplication simultaneously. That would require twice as much work for no additional benefit. So when should you choose one over the other? Both solutions offer the benefits of target-side deduplication: greatly reduced storage capacity requirements (especially when using TSM’s progressive incremental backup), lower operational costs, lower energy usage and a lower Total Cost of Ownership. You also get faster recoveries with more data on disk.
Use TSM 6 built-in data deduplication when:
• You want deduplication operations to be completely integrated within TSM
• You want the benefits of deduplication without the costs of separate hardware or software (it is included at no additional charge with TSM 6 Extended Edition)
• You want end-to-end data lifecycle management with minimized data store requirements
Use ProtecTIER when:
• You need the highest performance (1000 MB/sec or more)
• You have a large amount of data and need scalable capacity and performance
• You need inline deduplication to avoid the operational impact of post processing
• You are deduplicating across multiple TSM (or other backup) servers
• You don’t have TSM and are performing weekly full backups.
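The checklists above can be written down as a small decision helper. This is only a sketch of the guidance in this post, not an IBM sizing tool; the `Environment` fields and the `recommend` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical description of a backup environment."""
    uses_tsm: bool
    needs_top_throughput: bool = False       # around 1000 MB/sec or more
    needs_scalable_capacity: bool = False    # large and growing data volumes
    needs_inline_dedup: bool = False         # no post-processing window
    multiple_backup_servers: bool = False    # dedup across several servers
    weekly_full_backups: bool = False


def recommend(env):
    """Return (recommendation, reasons) based on the checklist in the post."""
    reasons = []
    if env.needs_top_throughput:
        reasons.append("highest performance required")
    if env.needs_scalable_capacity:
        reasons.append("large data volume needing scalable capacity")
    if env.needs_inline_dedup:
        reasons.append("inline deduplication required")
    if env.multiple_backup_servers:
        reasons.append("deduplicating across multiple backup servers")
    if not env.uses_tsm and env.weekly_full_backups:
        reasons.append("no TSM and weekly full backups")

    if reasons:
        return "ProtecTIER", reasons
    if env.uses_tsm:
        return ("TSM 6 built-in deduplication",
                ["integrated with TSM, no separate hardware or software"])
    return "no recommendation from this checklist", []


if __name__ == "__main__":
    print(recommend(Environment(uses_tsm=True)))
    print(recommend(Environment(uses_tsm=True, multiple_backup_servers=True)))
```

Any single ProtecTIER criterion tips the recommendation in this sketch; a real decision would weigh cost and existing infrastructure as well.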
To learn more, please visit the Data Reduction Solutions web page and stay tuned for chapter 9, where we’ll summarize IBM’s holistic approach to data reduction and show you how we can help you survive the tidal wave of data.
"The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions."
Richard Vining | Tags: data-management backup deduplication data-reduction hsm archive storage-blog space-managment tivoli