Wikibon estimates that the effects of applying compression to primary storage are additive when combined with traditional data de-duplication solutions (e.g. Data Domain, FalconStor, Diligent, etc.). This estimate is based on discussions with practitioners and an analysis of each technology.
As an example, let's assume:
- Primary storage compression ratio: 2:1
- Data de-duplication ratio for the backup stream: 10:1
In a best-case scenario, the two technologies in combination would yield a 20:1 data reduction ratio for the backup stream (2:1 x 10:1). In a worst-case scenario, the combined ratio would remain at 10:1, meaning compression adds nothing beyond de-duplication alone.
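A minimal sketch of the arithmetic, assuming the best case simply multiplies the two ratios and the worst case falls back to de-duplication alone (the figures and the multiplicative model are illustrative assumptions, not measured results):

```python
def combined_reduction(compression_ratio: float, dedup_ratio: float) -> tuple:
    """Return (best_case, worst_case) combined data reduction ratios.

    Best case: the savings compound, so the ratios multiply.
    Worst case: compression adds nothing beyond de-duplication.
    """
    best_case = compression_ratio * dedup_ratio
    worst_case = dedup_ratio
    return best_case, worst_case


# Illustrative figures from the example above: 2:1 compression, 10:1 de-dupe.
best, worst = combined_reduction(2.0, 10.0)
print(f"Best case:  {best:.0f}:1")   # 20:1
print(f"Worst case: {worst:.0f}:1")  # 10:1
```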
Wikibon believes the 'typical' rule-of-thumb is that, combined, these technologies will yield roughly a 15:1 data reduction ratio, assuming these base reduction ratios and appropriate data candidates. In practice, compression on primary storage is more likely to yield 30-60% improvements in capacity, so in real-world environments the combined or 'blended' ratio would be lower, but still substantially higher than data de-duplication as a standalone solution backing up uncompressed data.
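As a rough illustration of how a percentage capacity improvement feeds into the blended figure, the sketch below converts the improvement into an equivalent compression ratio and multiplies it by the de-duplication ratio. Interpreting a "30-60% improvement in capacity" as 1.3x-1.6x effective capacity is an assumption made here for illustration, and the multiplicative combination is the same idealised best case used above; real-world blends would typically land lower.

```python
def blended_ratio(capacity_improvement_pct: float, dedup_ratio: float) -> float:
    """Blended reduction ratio, assuming an X% capacity improvement is
    equivalent to a compression ratio of (1 + X/100) and that the
    compression and de-duplication savings compound."""
    compression_ratio = 1.0 + capacity_improvement_pct / 100.0
    return compression_ratio * dedup_ratio


# 30-60% real-world compression gains combined with a 10:1 de-dupe ratio.
for pct in (30, 60):
    print(f"{pct}% improvement -> ~{blended_ratio(pct, 10.0):.0f}:1 blended")
# Prints ~13:1 and ~16:1 under the idealised model; actual blends vary.
```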
Wikibon believes this rule of thumb should hold for either in-line data de-duplication (e.g. from Data Domain, Diligent, FalconStor, etc., targeted at backup and restore) or background de-duplication (e.g. the NetApp A-SIS feature, which is suited to finding duplicate 4K blocks in on-line storage).