Simple aggregating query very slow in PostgreSql, any way to improve?

Posted by Ash on Stack Overflow See other posts from Stack Overflow or by Ash
Published on 2010-05-25T12:08:32Z Indexed on 2010/05/25 12:11 UTC
Read the original article Hit count: 192

Filed under:
|
|

HI

I have a table which holds files and their types such as

CREATE TABLE files (
    id          SERIAL PRIMARY KEY, 
    name        VARCHAR(255),
    filetype    VARCHAR(255),
    ...
);

and another table for holding file properties such as

CREATE TABLE properties (
    id          SERIAL PRIMARY KEY, 
    file_id     INTEGER CONSTRAINT fk_files REFERENCES files(id),
    size        INTEGER,
    ... // other property fields
);

The file_id field has an index.

The file table has around 800k lines, and the properties table around 200k (not all files necessarily have/need a properties).

I want to do aggregating queries, for example find the average size and standard deviation for all file types. But it's very slow - around 70 seconds for the latter query. I understand it needs a sequential scan, but still it seems too much. Here's the query

SELECT f.filetype, avg(size), stddev(size) FROM files as f, properties as pr 
 WHERE f.id = pr.file_id GROUP BY f.filetype;

and the explain

 HashAggregate  (cost=140292.20..140293.94 rows=116 width=13) (actual time=74013.621..74013.954 rows=110 loops=1)
   ->  Hash Join  (cost=6780.19..138945.47 rows=179564 width=13) (actual time=1520.104..73156.531 rows=179499 loops=1)
         Hash Cond: (f.id = pr.file_id)
         ->  Seq Scan on files f  (cost=0.00..108365.41 rows=1140941 width=9) (actual time=0.998..62569.628 rows=805270 loops=1)
         ->  Hash  (cost=3658.64..3658.64 rows=179564 width=12) (actual time=1131.053..1131.053 rows=179499 loops=1)
               ->  Seq Scan on properties pr  (cost=0.00..3658.64 rows=179564 width=12) (actual time=0.753..557.171 rows=179574 loops=1)
Total runtime: 74014.520 ms

Any ideas why it is so slow/how to make it faster?

© Stack Overflow or respective owner

Related posts about sql

Related posts about postgresql