Backing up data stored on Amazon S3
Asked by Fiver on Server Fault, 2013-11-07.
I have an EC2 instance running a web server that stores users' uploaded files in S3. The files are written once and never change, but are retrieved occasionally by the users. We will likely accumulate somewhere around 200-500GB of data per year. We would like to ensure this data is safe, particularly against accidental deletion, and we would like to be able to restore deleted files regardless of the cause.
I have read about the versioning feature for S3 buckets, but I cannot tell whether recovery is possible for files that have no modification history. See the AWS docs on versioning here:
http://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectVersioning.html
The examples there don't cover the scenario where data is uploaded, never modified, and then deleted. Are files deleted in this scenario recoverable?
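To make the scenario concrete, here is a minimal sketch (Python with boto3; the bucket and key names are placeholders) of the recovery I have in mind. The assumption, which I'd like confirmed, is that deleting a versioned object merely adds a delete marker that can itself be deleted to restore the file:

```
import boto3

s3 = boto3.client("s3")
BUCKET = "my-uploads-bucket"  # placeholder
KEY = "user-file.dat"         # placeholder

# Enable versioning on the bucket (done once, up front).
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

# Simulate an accidental delete of a never-modified object.
s3.delete_object(Bucket=BUCKET, Key=KEY)

# Assumption: the original version still exists behind a delete marker.
resp = s3.list_object_versions(Bucket=BUCKET, Prefix=KEY)
for marker in resp.get("DeleteMarkers", []):
    if marker["Key"] == KEY and marker["IsLatest"]:
        # Deleting the delete marker should "undelete" the object.
        s3.delete_object(Bucket=BUCKET, Key=KEY, VersionId=marker["VersionId"])
```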
Then we thought we might simply back up the S3 files to Glacier using object lifecycle management:
http://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
But it seems this will not work for us, as the object is not copied to Glacier but moved there (more precisely, it appears that only the object's storage class changes, so there is still just one copy of the data).
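For reference, this is the kind of lifecycle rule I mean (a boto3 sketch with a placeholder bucket name). As I read the docs, the rule transitions the existing object to the GLACIER storage class; it does not produce a second, independent copy:

```
import boto3

s3 = boto3.client("s3")

# Transition objects to Glacier 30 days after creation. This changes
# the storage class of the one existing object; it does not create a
# separate backup copy.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-uploads-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-to-glacier",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to every object
                "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            }
        ]
    },
)
```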
So it seems there is no direct way to back up S3 data, and transferring the data from S3 to local servers would be time-consuming and could incur significant transfer costs over time.
Finally, we thought we would create a new bucket every month to serve as a monthly full backup, and copy the original bucket's data to the new one on Day 1. Then, using something like duplicity (http://duplicity.nongnu.org/), we would synchronize the backup bucket every night. At the end of the month we would move the backup bucket's contents into Glacier storage and create a new backup bucket from a fresh copy of the original bucket, then repeat the process. This seems like it would work and would minimize the storage/transfer costs, but I'm not sure whether duplicity allows bucket-to-bucket transfers directly, without bringing the data down to the controlling client first.
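Even if duplicity can't do it, I believe the S3 API itself supports server-side copies between buckets, so something like this sketch (boto3, placeholder bucket names) should avoid pulling the data down to the client:

```
import boto3

s3 = boto3.client("s3")
SRC = "my-uploads-bucket"          # placeholder
DST = "my-backup-bucket-2013-11"   # placeholder monthly backup bucket

# Server-side copy: the data moves within S3 and never transits this
# machine. (copy_object handles objects up to 5 GB in a single call;
# larger objects would need a multipart copy.)
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=SRC):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket=DST,
            Key=obj["Key"],
            CopySource={"Bucket": SRC, "Key": obj["Key"]},
        )
```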
So I guess there are a couple of questions here. First, does S3 versioning allow recovery of files that were never modified? Is there some way to "copy" files from S3 to Glacier that I have missed? Can duplicity or any other tool transfer files between S3 buckets directly, avoiding the transfer costs? And finally, am I way off the mark in my approach to backing up S3 data?
Thanks in advance for any insight you could provide!