S3 Backup Memory Usage in Python

Posted by danpalmer on Stack Overflow
Published on 2010-03-17T18:06:12Z

I currently use WebFaction for my hosting with the basic package that gives us 80MB of RAM. This is more than adequate for our needs at the moment, apart from our backups. We do our own backups to S3 once a day.

The backup process is this: dump the database, tar.gz all the files into a single archive named with the date of the backup, and upload it to S3 using the Python library provided by Amazon.
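
For context, the daily job is roughly the sketch below (the dump command and the paths are stand-ins for the real ones; the actual upload step is the code quoted further down):

import datetime
import subprocess
import tarfile

date = datetime.date.today().isoformat()
filename = 'backup-%s.tar.gz' % date

# 1. Dump the database (the real command depends on the database; 'pg_dump' is a stand-in)
subprocess.call('pg_dump mydb > dump-%s.sql' % date, shell=True)

# 2. Bundle the dump and the site files into one dated tar.gz archive
tar = tarfile.open(filename, 'w:gz')
tar.add('dump-%s.sql' % date)
tar.add('/home/me/webapps/mysite')  # stand-in path to the site files
tar.close()

# 3. Upload `filename` to S3 -- the step shown below that runs out of memory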

Unfortunately, it appears (although I don't know this for certain) that either my code for reading the file or the S3 code is loading the entire file into memory. As the file is approximately 320MB (for today's backup), the process uses about 320MB just for the backup. This causes WebFaction to kill all our processes, meaning the backup doesn't happen and our site goes down.

So this is the question: is there any way to avoid loading the whole file into memory, or are there other Python S3 libraries that are much better with RAM usage? Ideally it needs to use about 60MB at the most! If this can't be done, how can I split the file and upload the parts separately?
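
Something along these lines with boto is the kind of approach I'm hoping for, since set_contents_from_filename is handed a path rather than a pre-read string, although I haven't verified how little memory it actually uses (the credential names are placeholders):

from boto.s3.connection import S3Connection
from boto.s3.key import Key

# AWS_KEY, AWS_SECRET, BUCKET_NAME and filename are defined elsewhere in the script
conn = S3Connection(AWS_KEY, AWS_SECRET)
bucket = conn.get_bucket(BUCKET_NAME)
key = Key(bucket, 'daily/%s' % filename)
key.set_contents_from_filename(filename, headers={'x-amz-acl': 'public-read'})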

Thanks for your help.

This is the section of my backup script that caused the processes to be killed:

import mimetypes
import S3  # Amazon's sample S3 library; `connection` and BUCKET_NAME are set up earlier in the script

# read() pulls the whole ~320MB archive into memory at once
filedata = open(filename, 'rb').read()
content_type = mimetypes.guess_type(filename)[0]
if not content_type:
    content_type = 'text/plain'
print 'Uploading to S3...'
response = connection.put(BUCKET_NAME, 'daily/%s' % filename, S3.S3Object(filedata),
                          {'x-amz-acl': 'public-read', 'Content-Type': content_type})
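
If splitting the file is the only option, I imagine something roughly like the sketch below, reusing the same connection and bucket as above (the 50MB chunk size and the .partNNN key naming are arbitrary, and the parts would need to be re-joined after download):

CHUNK_SIZE = 50 * 1024 * 1024  # 50MB per part, to stay under the RAM limit

source = open(filename, 'rb')
part = 0
while True:
    chunk = source.read(CHUNK_SIZE)  # only one chunk is ever held in memory
    if not chunk:
        break
    key = 'daily/%s.part%03d' % (filename, part)
    connection.put(BUCKET_NAME, key, S3.S3Object(chunk),
                   {'x-amz-acl': 'public-read', 'Content-Type': 'application/octet-stream'})
    part += 1
source.close()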

