I need an approach to prevent inserting duplicate records into the database
- by Maurice
Apologies if this question is asked on the incorrect "stack".
A webservice that I call returns a list of data. The data from the webservice is updated periodically, so a call to the webservice made now could return the same data as a call made an hour later. Also, the data is returned based on a start and end date.
We have multiple users that can run the webservice search, and duplicate data is most likely to be returned (especially for historical data). However, I don't want to insert this duplicate data into the database.
I've created a db table in which the data is stored. The most important columns are:
Id int autoincrement PK
Date date not null --The date to which the data set belongs.
LastUpdate date not null --The date the data set was last updated.
UserName varchar(50) --The name of the user doing the search.
I use SQL Server 2008 Express with C# 4.0 and Visual Studio 2010. Entity Framework is used as the ORM. If stored procedures could be avoided in the proposed solution, that would be a plus.
Another way of interpreting what I'm asking for is as follows:
I have a million unique records in my table. A user does a new search. The user's search results contain around 300k records that are already in the db. An efficient solution for finding and inserting only the unique records is needed.
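To make the scenario concrete, here is a minimal sketch of the set-difference step I have in mind, done in memory. The names `DataRecord`, `DuplicateFilter` and `OnlyNew` are hypothetical, and the sketch assumes, purely for illustration, that two records count as duplicates when both `Date` and `LastUpdate` match; the real uniqueness rule may well be different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory shape of one row, limited to the columns listed above.
public class DataRecord
{
    public DateTime Date { get; set; }
    public DateTime LastUpdate { get; set; }
    public string UserName { get; set; }
}

public static class DuplicateFilter
{
    // ASSUMPTION: (Date, LastUpdate) identifies a record. This is illustrative only.
    private static Tuple<DateTime, DateTime> KeyOf(DataRecord r)
    {
        return Tuple.Create(r.Date, r.LastUpdate);
    }

    // Returns the incoming records whose key is not among existingKeys.
    // Repeats inside the incoming batch itself are dropped as well.
    public static List<DataRecord> OnlyNew(
        IEnumerable<Tuple<DateTime, DateTime>> existingKeys,
        IEnumerable<DataRecord> incoming)
    {
        var seen = new HashSet<Tuple<DateTime, DateTime>>(existingKeys);
        var fresh = new List<DataRecord>();
        foreach (var record in incoming)
        {
            // HashSet.Add returns false when the key is already present.
            if (seen.Add(KeyOf(record)))
            {
                fresh.Add(record);
            }
        }
        return fresh;
    }
}

public static class Program
{
    public static void Main()
    {
        // One key already stored in the table.
        var existing = new[]
        {
            Tuple.Create(new DateTime(2012, 1, 1), new DateTime(2012, 1, 2))
        };

        // One duplicate of the stored key and one genuinely new record.
        var incoming = new[]
        {
            new DataRecord { Date = new DateTime(2012, 1, 1), LastUpdate = new DateTime(2012, 1, 2), UserName = "a" },
            new DataRecord { Date = new DateTime(2012, 1, 3), LastUpdate = new DateTime(2012, 1, 4), UserName = "b" }
        };

        var fresh = DuplicateFilter.OnlyNew(existing, incoming);
        Console.WriteLine(fresh.Count); // prints 1
    }
}
```

This only illustrates the filtering itself. It does not cover loading the existing keys through Entity Framework, and it does nothing about two users inserting the same records at the same time, so I am not sure it holds up at the sizes described above.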