Making Sense of Data Deduplication
Data deduplication was the buzzword of the month a little while back. Many cold-callers would open with “If you aren’t using dedupe, you are wasting money.” I don’t put much stock in the buzzword-of-the-month club, but data deduplication does offer some value to clients. Before I dive into the value that I see, let’s get a brief definition of what it is. Good old Wikipedia says data deduplication is “the elimination of duplicated or redundant data.” Well, we already got that much from the name, so let’s dig a little further. Deduplication avoids storing the same data over and over by referencing the original full copy, similar to how I just referenced Wikipedia: I gave you a hyperlink, or pointer, to go see all that Wikipedia has to offer without copying the entire page. Data deduplication works much the same way. It keeps a single copy of the duplicated data and replaces every other copy with a pointer to that original.
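To make the pointer idea concrete, here is a minimal sketch in Python. It is a toy, not how any particular vendor does it: the DedupStore class, the fixed block size, and the file names are all made up for illustration, and I am assuming a simple scheme where each block of data is identified by its SHA-256 hash.

```python
import hashlib


class DedupStore:
    """Toy block-level deduplicating store (hypothetical, for illustration only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # hash -> block bytes; each unique block is stored once
        self.files = {}   # file name -> list of hashes (the "pointers")

    def write(self, name, data):
        pointers = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Keep the block only if we have never seen this content before.
            self.blocks.setdefault(digest, block)
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name):
        # Follow the pointers to rebuild the original file.
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


# Tiny block size so the effect is visible on short strings.
store = DedupStore(block_size=4)
store.write("report_v1.txt", b"AAAABBBBCCCC")
store.write("report_v2.txt", b"AAAABBBBDDDD")

print(store.read("report_v2.txt"))  # b'AAAABBBBDDDD'
print(store.stored_bytes())         # 16 bytes stored for 24 bytes written
```

The two files share their first two blocks, so those blocks are stored once and both files simply point to them. Real products differ in how they split data into blocks and where the deduplication happens, which is a big part of each vendor’s “special sauce.”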
Now that we have a basic understanding of what it does, the real question is: why does it matter to you? Company storage data doubles every 1 to 2 years. The cost per GB of storage products such as disk and tape continues to fall, but not fast enough to keep pace with the growing amount of data that must be stored. There are also recurring costs of storage, such as power, cooling, and real estate. As a result, your total storage costs keep climbing with every new device you plug in. Deduplication lets you minimize the space required to store your files, so you get more out of every disk, limit power consumption, and keep data center sprawl to a minimum. That saves you money! In short, you should consider data deduplication when you want to store more data, in less space, for less cost.
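If you want to put a number on “more data, in less space,” the usual yardsticks are the deduplication ratio and the percentage of capacity saved. The short Python sketch below shows the arithmetic. The dedup_savings function is my own helper, and the 10,000 GB and 2,500 GB figures are made-up inputs, not measurements from any real environment.

```python
def dedup_savings(logical_gb, stored_gb):
    """Return (dedup ratio, percent of capacity saved).

    logical_gb: how much data you asked the system to store
    stored_gb:  how much physical space it actually used
    """
    ratio = logical_gb / stored_gb
    saved_pct = (1 - stored_gb / logical_gb) * 100
    return ratio, saved_pct


# Hypothetical example: 10,000 GB of backups that occupy 2,500 GB on disk.
ratio, saved_pct = dedup_savings(logical_gb=10_000, stored_gb=2_500)
print(f"{ratio:.1f}:1 ratio, {saved_pct:.0f}% less physical capacity")
# 4.0:1 ratio, 75% less physical capacity
```

The ratio you actually see depends heavily on how repetitive your data is, so treat any vendor-quoted ratio as a starting point for your own testing.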
Multiple companies offer deduplication products, and each claims to have a better “special sauce” than the others. These vendors include HP, NetApp, and Data Domain. A good IT partner can help you work through which product will work best for you, and why and where to use deduplication technology.
A few interesting links on the topic:
HP White Paper: http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA1-9796ENW.pdf
List of White Papers by multiple vendors: http://whitepapers.businessweek.com/rlist/term/Data-Deduplication.html





