Simple aggregating query very slow in PostgreSQL, any way to improve?

Posted by Ash on Stack Overflow
Published on 2010-05-25T12:08:32Z, indexed on 2010/05/25 12:11 UTC

Filed under: sql, postgresql, aggregate

Hi,

I have a table that holds files and their types:

CREATE TABLE files (
    id          SERIAL PRIMARY KEY, 
    name        VARCHAR(255),
    filetype    VARCHAR(255),
    ...
);

and another table that holds file properties:

CREATE TABLE properties (
    id          SERIAL PRIMARY KEY, 
    file_id     INTEGER CONSTRAINT fk_files REFERENCES files(id),
    size        INTEGER,
    ... -- other property fields
);

The file_id field has an index.

The files table has around 800k rows, and the properties table around 200k (not every file necessarily has or needs a properties row).

I want to run aggregate queries, for example finding the average size and its standard deviation for each file type. But it is very slow: around 70 seconds for the query below. I understand it needs a sequential scan, but that still seems like too much. Here is the query:

SELECT f.filetype, avg(pr.size), stddev(pr.size)
  FROM files AS f
  JOIN properties AS pr ON f.id = pr.file_id
 GROUP BY f.filetype;

and the EXPLAIN ANALYZE output:

 HashAggregate  (cost=140292.20..140293.94 rows=116 width=13) (actual time=74013.621..74013.954 rows=110 loops=1)
   ->  Hash Join  (cost=6780.19..138945.47 rows=179564 width=13) (actual time=1520.104..73156.531 rows=179499 loops=1)
         Hash Cond: (f.id = pr.file_id)
         ->  Seq Scan on files f  (cost=0.00..108365.41 rows=1140941 width=9) (actual time=0.998..62569.628 rows=805270 loops=1)
         ->  Hash  (cost=3658.64..3658.64 rows=179564 width=12) (actual time=1131.053..1131.053 rows=179499 loops=1)
               ->  Seq Scan on properties pr  (cost=0.00..3658.64 rows=179564 width=12) (actual time=0.753..557.171 rows=179574 loops=1)
Total runtime: 74014.520 ms
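
Reading the plan, the sequential scan on files accounts for roughly 62.6 of the 74 seconds, and the planner expected about 1.14 million rows there against the 805k it actually read. A minimal sketch of checks that might narrow this down, assuming the gap comes from stale statistics or dead rows (an assumption; the plan alone does not confirm it). Only the table name files comes from the schema above:

```sql
-- How large is the table on disk? A size far out of proportion to
-- ~800k rows could hint at wide rows or many dead tuples (assumption).
SELECT pg_size_pretty(pg_relation_size('files'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('files')) AS total_size;

-- Live vs. dead row estimates and the last maintenance times,
-- as tracked by the statistics collector.
SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum, last_analyze
  FROM pg_stat_user_tables
 WHERE relname = 'files';

-- Mark dead rows as reusable and refresh planner statistics,
-- then re-run the EXPLAIN ANALYZE to compare timings.
VACUUM ANALYZE files;
```

If the heap turns out to be reasonably sized and the statistics are current, the time is more likely plain disk I/O on the files table, and these checks will not help by themselves.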

Any ideas why it is so slow, or how to make it faster?
