Best Approach for Checking and Inserting Records

Posted by nevets1219 on Stack Overflow See other posts from Stack Overflow or by nevets1219
Published on 2010-05-21T18:32:17Z Indexed on 2010/05/21 18:50 UTC
Read the original article Hit count: 156

Filed under:
|

In one of our existing C programs which purpose is:

Open connection to DB
for record in all_record:
  if record contain certain data:
    if record is NOT in table A: // see #1
      insert record information into table A and B // see #2
Close connection to DB
  1. select field from table where field=XXX
  2. 2 inserts

This is typically done every X months to sync everything up or so I'm told. I've also been told that this process takes roughly a couple of days. There is (currently) at most 2.5million records (though not necessarily all 2.5m will be inserted). One of the table contains 10 fields and the other 5 fields. There isn't much to be done about iterating through the records since that part can't be changed at the moment. What I would like to do is speed up the part where I query MySQL.

I'm not sure if I have left out any important details -- please let me know! I'm also no SQL expert so feel free to point out the obvious.

I thought about:

  1. Putting all the inserts into a transaction (at the moment I'm not sure how important it is for the transaction to be all-or-none or if this affects performance)
  2. Using Insert X Where Not Exists Y
  3. LOAD DATA INFILE (but that would require I create a (possibly) large temp file)

I read that (hopefully someone can confirm) I should drop indexes so they aren't re-calculated.

mysql Ver 14.7 Distrib 4.1.22, for sun-solaris2.10 (sparc) using readline 4.3

© Stack Overflow or respective owner

Related posts about mysql

Related posts about Performance