PostgreSQL lots of large Arrays and Writes
- by strife911
Hi,
I am running a python program that spawns 8 threads and as each thread launch its own postmaster process via psycopg2. This is to maximize the use of my CPU-cores (8). Each thread call a series of SQL Functions. Most of these functions go through many thousands of rows each associated to a large FLOAT8[] Array (250-300) values by using unnest() and multiplying each FLOAT8 by an another FLOAT8 associated to each row. This Array approach minimized the size of the Indexes and the Tables. The Function ends with an Insert into another Table of a row of the same form (pk INT4, array FLOAT8[]). Some SQL Functions called by python will Update a row of these kind of Tables (with large Arrays).
Now I currently have configured PostgreSQL to use most of the memory for cache (effective_cache_size of 57 GB I think) and only a small amount of it for shared memory (1GB I think). First, I was wondering what the difference between Cache and Shared memory was in regards to PostgreSQL (and my application).
What I have noticed is that only about 20-40% of my total CPU processing power is used during the most Read intensive parts of the application (Select unnest(array) etc). So secondly, I was wondering what I could do to improve this so that 100% of the CPU is used. Based on my observations, it does not seem to have anything to do with python or its GIL.
Thanks