How to speed up a slow UPDATE query

Posted by Mike Christensen on Stack Overflow See other posts from Stack Overflow or by Mike Christensen
Published on 2012-06-18T16:08:55Z Indexed on 2012/06/20 15:16 UTC
Read the original article Hit count: 179

Filed under:
|

I have the following UPDATE query:

UPDATE Indexer.Pages SET LastError=NULL where LastError is not null;

Right now, this query takes about 93 minutes to complete. I'd like to find ways to make this a bit faster.

The Indexer.Pages table has around 506,000 rows, and about 490,000 of them contain a value for LastError, so I doubt I can take advantage of any indexes here.

The table (when uncompressed) has about 46 gigs of data in it, however the majority of that data is in a text field called html. I believe simply loading and unloading that many pages is causing the slowdown. One idea would be to make a new table with just the Id and the html field, and keep Indexer.Pages as small as possible. However, testing this theory would be a decent amount of work since I actually don't have the hard disk space to create a copy of the table. I'd have to copy it over to another machine, drop the table, then copy the data back which would probably take all evening.

Ideas? I'm using Postgres 9.0.0.

UPDATE:

Here's the schema:

CREATE TABLE indexer.pages
(
  id uuid NOT NULL,
  url character varying(1024) NOT NULL,
  firstcrawled timestamp with time zone NOT NULL,
  lastcrawled timestamp with time zone NOT NULL,
  recipeid uuid,
  html text NOT NULL,
  lasterror character varying(1024),
  missingings smallint,
  CONSTRAINT pages_pkey PRIMARY KEY (id ),
  CONSTRAINT indexer_pages_uniqueurl UNIQUE (url )
);

I also have two indexes:

CREATE INDEX idx_indexer_pages_missingings
  ON indexer.pages
  USING btree
  (missingings )
  WHERE missingings > 0;

and

CREATE INDEX idx_indexer_pages_null
  ON indexer.pages
  USING btree
  (recipeid )
  WHERE NULL::boolean;

There are no triggers on this table, and there is one other table that has a FK constraint on Pages.PageId.

© Stack Overflow or respective owner

Related posts about sql

Related posts about postgresql