Most efficient way to maintain a 'set' in SQL Server?
- by SEVEN YEAR LIBERAL ARTS DEGREE
I have about 2 million rows of data, each with an artificial PK and two ID fields (so: PK, ID1, ID2). I have a unique constraint (and index) on ID1+ID2.
I get two kinds of updates, each covering a single ID1:
100-1000 rows of all-new data (ID1 is new)
100-1000 rows of largely, but not necessarily completely, overlapping data (ID1 already exists; some ID1+ID2 pairs may be new)
What's the most efficient way to maintain this 'set'? Here are the options as I see them:
Delete all the rows with ID1, insert all the new rows (yikes)
Query which of the incoming ID1+ID2 pairs already exist, then insert only the new ones
Insert all the new rows, ignore inserts that trigger unique constraint violations
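To make the second option concrete, here is a rough, runnable sketch of the "insert only what is missing" pattern. It is an illustration under assumptions, not a tuned SQL Server solution: SQLite (through Python's sqlite3 module) stands in for SQL Server, and the table names pairs and staging and the helpers make_db and insert_new_pairs are all made up for the example. The INSERT ... SELECT ... WHERE NOT EXISTS statement is standard SQL, so the same shape should carry over to T-SQL with a temp table or table variable as the staging area.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'pairs' table (PK, ID1, ID2)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pairs ("
        " pk INTEGER PRIMARY KEY,"
        " id1 INTEGER NOT NULL,"
        " id2 INTEGER NOT NULL,"
        " UNIQUE (id1, id2))"
    )
    return conn


def insert_new_pairs(conn, id1, id2_values):
    """Insert only the (id1, id2) pairs that are not already stored.

    Returns the number of rows actually added.
    """
    # Stage the incoming batch so the comparison is a single set-based statement.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS staging (id1 INTEGER, id2 INTEGER)")
    conn.execute("DELETE FROM staging")
    conn.executemany(
        "INSERT INTO staging (id1, id2) VALUES (?, ?)",
        [(id1, v) for v in set(id2_values)],  # de-duplicate within the batch
    )

    # Count only the changes made by the final insert.
    before = conn.total_changes
    conn.execute(
        "INSERT INTO pairs (id1, id2) "
        "SELECT s.id1, s.id2 FROM staging AS s "
        "WHERE NOT EXISTS ("
        " SELECT 1 FROM pairs AS p"
        " WHERE p.id1 = s.id1 AND p.id2 = s.id2)"
    )
    added = conn.total_changes - before
    conn.commit()
    return added
```

Called as insert_new_pairs(conn, 1, [10, 11, 12]) on a fresh database and then again with [11, 12, 13], the second call should add only the one pair that is new. The sketch says nothing about which option is fastest at 2 million rows; that is still the question.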
Any thoughts?