Why do dicts of defaultdict(int)'s use so much memory? (and other simple Python performance questions)

Posted by dukhat on Stack Overflow, 2010-04-30.
import numpy as num
from collections import defaultdict

topKeys = range(16384)
keys = range(8192)

table = dict((k,defaultdict(int)) for k in topKeys)

dat = num.zeros((16384,8192), dtype="int32")

print "looping begins"
#how much memory should this use? I think it shouldn't use more than a few
#times the memory required to hold (16384*8192) int32's (512 MB), but
#it uses 11 GB!
for k in topKeys:
    for j in keys:
        dat[k,j] = table[k][j]

print "done"

What is going on here? Furthermore, the following similar script takes far longer to run than the first one, and also uses an absurd amount of memory.

topKeys = range(16384)
keys = range(8192)
table = [(j,0) for k in topKeys for j in keys]

I guess Python ints might be 64-bit objects, which would account for some of this, but do these relatively natural and simple constructions really produce such massive overhead? These scripts seem to show that they do. So my questions are: what exactly causes the high memory usage in the first script, and the long runtime and high memory usage in the second? And is there any way to avoid these costs?
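
For the second script, a quick per-object check suggests the scale of the problem (again assuming a 64-bit CPython 2.x build; the exact byte counts are my estimates):

import sys

t = (0, 0)
print sys.getsizeof(t)   # one small tuple: 72 bytes on a 64-bit build

# the comprehension creates 16384 * 8192 = 134,217,728 tuples, plus an
# 8-byte pointer to each in the list, so the list alone needs on the
# order of 134e6 * (72 + 8) bytes, i.e. roughly 10 GB, before counting
# the int objects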
