Most cost-efficient way to back up Subversion data to S3?
- by sludge
I'm looking at using S3 as an offsite backup repo for my Subversion database. When I dump my SVN database, it's about 10 gigabytes. I would like to avoid the charge of uploading that data repeatedly.
The anatomy of this large file is such that new commits to Subversion only modify the tail of the file, with everything else staying the same. Because Amazon S3 does not allow you to "patch" files with changes, I would have to upload all ten gigs every time I run a backup, even after a single small commit to Subversion.
Here are the options as I see them:
Option 1
I am looking at duplicity, whose --volsize option splits the backup into volumes of a given size in megabytes. Is it possible to split the Subversion dumps this way so that further incremental backups are measured in megabytes?
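To make the "incremental backups measured in megabytes" idea concrete, here is a rough sketch of one alternative to splitting a single big dump: dumping only the revisions added since the last backup with `svnadmin dump --incremental -r START:END`, so each run produces a small new file to upload. The function names, the output file naming, and the way the last backed-up revision is tracked are all my own assumptions for illustration, and I have not tested this against a Berkeley DB repository.

```python
import subprocess
from pathlib import Path


def next_revision_range(last_backed_up, youngest):
    """Return (start, end) for revisions not yet backed up, or None if up to date."""
    if youngest <= last_backed_up:
        return None
    return last_backed_up + 1, youngest


def dump_command(repo_path, start, end):
    """Build the svnadmin call for an incremental dump of revisions start..end."""
    return ["svnadmin", "dump", str(repo_path),
            "--incremental", "-r", f"{start}:{end}"]


def youngest_revision(repo_path):
    """Ask the repository for its newest revision number via svnlook."""
    result = subprocess.run(["svnlook", "youngest", str(repo_path)],
                            check=True, capture_output=True, text=True)
    return int(result.stdout.strip())


def dump_new_revisions(repo_path, out_dir, last_backed_up):
    """Write a dump of only the new revisions; return its path, or None."""
    rev_range = next_revision_range(last_backed_up, youngest_revision(repo_path))
    if rev_range is None:
        return None  # nothing new since the last backup
    start, end = rev_range
    # Hypothetical naming scheme: one file per backed-up revision range.
    out_file = Path(out_dir) / f"svn-r{start}-r{end}.dump"
    with open(out_file, "wb") as fh:
        subprocess.run(dump_command(repo_path, start, end), stdout=fh, check=True)
    return out_file
```

Passing `last_backed_up=-1` on the first run would dump everything from revision 0, and later runs would only produce a file covering the new revisions. Restoring would then mean loading the files in revision order with `svnadmin load`.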
Option 2
Can I just back up the hot Subversion repository directly? This seems like a bad idea if it is in the middle of writing a commit. However, I have the option of taking the repo offline between the hours of midnight and 4am. Each revision in my Berkeley DB uses a file as its record.
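For Option 2, a minimal sketch of what I have in mind: only copy the repository during the offline window, and use `svnadmin hotcopy` rather than a plain file copy, since it is documented as producing a consistent copy even if the repository is in use. The helper names and the default window are assumptions taken from my situation, not anything standard.

```python
from datetime import time


def in_offline_window(now, start=time(0, 0), end=time(4, 0)):
    """True if the time of day falls inside the midnight-to-4am window."""
    return start <= now < end


def hotcopy_command(repo_path, dest_path):
    """Build the svnadmin call that copies the repository to dest_path."""
    return ["svnadmin", "hotcopy", str(repo_path), str(dest_path)]
```

A scheduled job could check `in_offline_window(datetime.now().time())` and, if it returns true, run the command with `subprocess.run(..., check=True)` before syncing the copy to S3.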