How can I efficiently manipulate 500k records in SQL Server 2005?

Posted by cdeszaq on Stack Overflow
Published on 2010-03-24T14:49:44Z


I receive a large text file from a customer each month containing updated information for 500,000 users. However, while processing this file, I often run into SQL Server timeout errors.

Here's the process I follow in my VB application that processes the data (in general):

  1. Delete all records from the temporary table to remove last month's data (e.g. DELETE FROM tempTable)
  2. Load the text file into the temp table
  3. Populate additional columns in the temp table, such as organization_id, user_id, and group_code
  4. Update the real tables based on the data computed in the temp table
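For reference, the load phase (steps 1 and 2) can be sketched in T-SQL; the file path and delimiters below are placeholders, not from the question:

```sql
-- Step 1: clear out last month's data. TRUNCATE is minimally logged
-- and far faster than a row-by-row DELETE for emptying a whole table.
TRUNCATE TABLE tempTable;

-- Step 2: bulk-load the customer's text file in one statement instead
-- of inserting row by row from the VB application.
BULK INSERT tempTable
FROM 'C:\imports\customer_updates.txt'   -- hypothetical path
WITH (
    FIELDTERMINATOR = '\t',   -- adjust to the file's actual delimiter
    ROWTERMINATOR   = '\n',
    TABLOCK                   -- allows a minimally logged load
);
```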

The problem is that commands like UPDATE tempTable SET user_id = (SELECT user_id FROM myUsers WHERE external_id = tempTable.external_id) frequently time out. I have tried raising the timeout to as much as 10 minutes, but the commands still fail. Now, I realize that 500k rows is no small number to manipulate, but I would think that a database purported to handle millions and millions of rows should cope with 500k fairly easily. Am I doing something wrong in how I am processing this data?
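Two standard remedies for this kind of timeout are (a) rewriting the correlated subquery, which SQL Server may evaluate once per row, as a join-based UPDATE, and (b) splitting the single 500k-row transaction into batches. A sketch, assuming the column and table names from the question and that indexes on external_id do not already exist:

```sql
-- Turn the per-row lookup into an index seek (creating these indexes
-- is an assumption about the schema).
CREATE INDEX IX_myUsers_external_id ON myUsers (external_id);
CREATE INDEX IX_tempTable_external_id ON tempTable (external_id);

-- (a) Join-based UPDATE instead of a correlated subquery:
UPDATE t
SET t.user_id = u.user_id
FROM tempTable t
INNER JOIN myUsers u ON u.external_id = t.external_id;

-- (b) If a single statement still times out, update in batches so each
-- transaction stays small (UPDATE TOP is available in SQL Server 2005):
WHILE 1 = 1
BEGIN
    UPDATE TOP (10000) t
    SET t.user_id = u.user_id
    FROM tempTable t
    INNER JOIN myUsers u ON u.external_id = t.external_id
    WHERE t.user_id IS NULL;   -- only rows not yet updated

    IF @@ROWCOUNT = 0 BREAK;
END;
```

The batched form trades one long lock-heavy transaction for many short ones, which also keeps the transaction log from growing unboundedly during the run.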

Please help. Any and all suggestions welcome.

© Stack Overflow or respective owner
