Querying PostgreSQL with a very large result set
- by sanity
In an application I need to query a PostgreSQL database where I expect tens or even hundreds of millions of rows in the result set. I might run this query once a day, or even more frequently. The query itself is relatively simple, although it may involve a few JOINs.
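To make it concrete, here is a sketch of the kind of query I mean (the table and column names are made up for illustration), and how I would consume it through a server-side cursor so the client isn't buffering the whole result set at once. My concern is the disk-access side, not client memory:

```sql
-- Hypothetical schema: a large events table joined to two lookup tables.
BEGIN;

DECLARE big_result CURSOR FOR
    SELECT e.id, e.event_type, e.created_at, u.name, d.model
    FROM events e
    JOIN users   u ON u.id = e.user_id
    JOIN devices d ON d.id = e.device_id;

-- Repeated until no rows come back, to stream the result in batches.
FETCH FORWARD 10000 FROM big_result;

CLOSE big_result;
COMMIT;
```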
My question is: How smart is Postgres about avoiding a disk seek for each row of the result set? Given the time a hard-disk seek takes, doing one per row would be extremely expensive.
If this isn't an issue, how does Postgres avoid it? How does it know to lay out the data on disk so that it can be streamed out efficiently in response to this query?
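For what it's worth, I know I can inspect the planner's choices with EXPLAIN ANALYZE (same hypothetical tables as above), e.g.:

```sql
EXPLAIN ANALYZE
SELECT e.id, e.event_type, u.name
FROM events e
JOIN users u ON u.id = e.user_id;
```

but I'm not sure how to tell from the plan whether the executor will mostly be doing sequential reads or a seek per row.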