How to optimize this SQL query for a rectangular region?
- by Andrew B.
I'm trying to optimize the following query, but it's not clear to me what index or indexes would be best. I'm storing tiles in a two-dimensional plane and querying for rectangular regions of that plane. The table has, for the purposes of this question, the following columns:
id: a primary key integer
world_id: an integer foreign key which acts as a namespace for a subset of tiles
tileY: the Y-coordinate integer
tileX: the X-coordinate integer
value: the contents of this tile, a varchar if it matters.
I have the following indexes:
"ywot_tile_pkey" PRIMARY KEY, btree (id)
"ywot_tile_world_id_key" UNIQUE, btree (world_id, "tileY", "tileX")
"ywot_tile_world_id" btree (world_id)
And this is the query I'm trying to optimize:
ywot=> EXPLAIN ANALYZE SELECT * FROM "ywot_tile" WHERE ("world_id" = 27685 AND "tileY" <= 6 AND "tileX" <= 9 AND "tileX" >= -2 AND "tileY" >= -1 ); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on ywot_tile (cost=11384.13..149421.27 rows=65989 width=168) (actual time=79.646..80.075 rows=96 loops=1)
Recheck Cond: ((world_id = 27685) AND ("tileY" <= 6) AND ("tileY" >= (-1)) AND ("tileX" <= 9) AND ("tileX" >= (-2)))
-> Bitmap Index Scan on ywot_tile_world_id_key (cost=0.00..11367.63 rows=65989 width=0) (actual time=79.615..79.615 rows=125 loops=1)
Index Cond: ((world_id = 27685) AND ("tileY" <= 6) AND ("tileY" >= (-1)) AND ("tileX" <= 9) AND ("tileX" >= (-2)))
Total runtime: 80.194 ms
So the world is fixed, and we are querying for a rectangular region of tiles. Some more information that might be relevant:
All the tiles for a queried region may or may not be present
The height and width of a queried rectangle are typically about 10x10-20x20
For any given (world, X) or (world, Y) pair, there may be an unbounded number of matching tiles, but the worst case is currently around 10,000, and typically there are far fewer.
New tiles are created far less frequently than existing ones are updated (changing the 'value'), and that itself is far less frequent that just reading as in the query above.
The only thing I can think of would be to index on (world, X) and (world, Y). My guess is that the database would be able to take those two sets and intersect them. The problem is that there is a potentially unbounded number of matches for either for either of those. Is there some other kind of index that would be more appropriate?