Iterating over a large data set in a long-running Python process - memory issues?
Posted by user1094786 on Stack Overflow, 2012-06-05
I am working on a long-running Python program (one part of it is a Flask API, and the other is a realtime data fetcher).
Both of my long-running processes iterate quite often (the API one might even do so hundreds of times a second) over large data sets: second-by-second observations of certain economic series, for example 1-5 MB worth of data or even more. They also interpolate, compare, and do calculations between series, etc.
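For context, here is a stripped-down sketch of the kind of loop I mean; the series names, sizes, and the interpolation step are made up for illustration, but the access pattern is representative:

```python
import numpy as np

def load_series(series_id):
    # Placeholder: in reality this comes from my realtime fetcher / cache.
    # Here it is just an array of second-by-second observations.
    return np.random.random(500_000)

def compare_series(series_a, series_b):
    # Interpolate one series onto the other's index and take the difference.
    x = np.arange(len(series_a))
    xp = np.linspace(0, len(series_a) - 1, len(series_b))
    interpolated_b = np.interp(x, xp, series_b)
    return series_a - interpolated_b

# The API handler may run something like this hundreds of times a second:
a = load_series("gdp_realtime")
b = load_series("gdp_revised")
diff = compare_series(a, b)
```

Each call builds fresh temporary arrays of roughly that size and then drops them.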
What techniques can I use to keep my processes alive, and their memory usage under control, when iterating over, passing around, and processing these large data sets? For instance, should I use the gc module and trigger collection manually?
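To make the gc part concrete, this is the sort of pattern I am wondering about; the placeholder function and the one-second interval are just for illustration:

```python
import gc
import time

def process_latest_observations():
    # Placeholder for the real fetch/interpolate/compare work;
    # it builds and drops a largish temporary list each pass.
    data = [float(i) for i in range(200_000)]
    return sum(data) / len(data)

# Pattern I am considering: turn off automatic cyclic collection and
# trigger it myself between passes, so it does not kick in mid-request.
gc.disable()

for _ in range(5):              # the real loop would run indefinitely
    process_latest_observations()
    gc.collect()                # force a full collection at a quiet moment
    time.sleep(1)
```

Is something like this worthwhile, or does it just add overhead compared to leaving the collector alone?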
Any advice would be appreciated. Thanks!