De-duplicating backup tool on a block basis? [closed]
- by SST
I am looking for an (ideally free as in speech or beer) backup tool for Unix-like OS which can store deduplicated backups, i.e. only nonredundant content takes up additional space.
I already looked at dirvish (my first candidate) and rsnapshot which use hardlinks to achieve deduplication on a per-file level. However, as I want to back up large files (Thunderbird mailboxes 3GB, VMware images 10GB), such file are stored again entirely even if just a few bytes change. Then there are rsync-based tools like rdiff-backup which only store deltas and a current mirror. However, as the deltas are generated against each previous mirror, it is difficult to fine-tune the retention granularity (only keep one backup after a week, etc.) because the deltas would have to be re-evaluated.
Another approach is to partition content into blocks and store each block only if it is not stored yet, otherwise just linking it to the first occurrence. The only tool I know of that does this by now is obnam (http://liw.fi/obnam), and it even supports zlib-compression and gpg-encryption -- nice! But it is very slow, AFAICT.
Does any one know any other, solid backup software which supports deduplication on a sub-file level, ideally with at least some management options (show/select/delete generations...)?