Intersect() and Except() are too slow with large collections of custom objects
- by Theo
I am importing data from another database.
My process loads the data from a remote DB into a List<DataModel> named remoteData, and the data from the local DB into a List<DataModel> named localData.
I then use LINQ to build a list of records that differ, so that I can update the local DB to match the data pulled from the remote DB, like this:
    var outdatedData = this.localData.Intersect(this.remoteData, new OutdatedDataComparer()).ToList();
Next, I use LINQ to build a list of records that exist in localData but no longer exist in remoteData, so that I can delete them from the local database, like this:
    var oldData = this.localData.Except(this.remoteData, new MatchingDataComparer()).ToList();
Finally, I do the opposite: I build a list of records that exist in remoteData but not in localData, so that I can add them to the local database, like this:
    var newData = this.remoteData.Except(this.localData, new MatchingDataComparer()).ToList();
Each collection holds about 70k records, and each of the three LINQ operations takes between 5 and 10 minutes to complete. How can I make this faster?
Here is the object the collections are using:
    internal class DataModel
    {
        public string Key1 { get; set; }
        public string Key2 { get; set; }
        public string Value1 { get; set; }
        public string Value2 { get; set; }
        public byte? Value3 { get; set; }
    }
The comparer used to check for outdated records:
    class OutdatedDataComparer : IEqualityComparer<DataModel>
    {
        public bool Equals(DataModel x, DataModel y)
        {
            var e =
                string.Equals(x.Key1, y.Key1) &&
                string.Equals(x.Key2, y.Key2) && (
                    !string.Equals(x.Value1, y.Value1) ||
                    !string.Equals(x.Value2, y.Value2) ||
                    x.Value3 != y.Value3
                );
            return e;
        }

        public int GetHashCode(DataModel obj)
        {
            return 0;
        }
    }
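To make the intent of this comparer explicit: the Intersect call is meant to return every local record whose keys match a remote record but whose values differ. Note that Equals here is not reflexive (a record never "equals" itself), which may not be a valid use of IEqualityComparer. Below is a minimal sketch of the same intent written as a dictionary lookup instead. It assumes that (Key1, Key2) is unique within remoteData; OutdatedFinder and FindOutdated are hypothetical names, not part of my current code.

```csharp
using System;
using System.Collections.Generic;

internal class DataModel
{
    public string Key1 { get; set; }
    public string Key2 { get; set; }
    public string Value1 { get; set; }
    public string Value2 { get; set; }
    public byte? Value3 { get; set; }
}

// Hypothetical helper, shown only to illustrate the intended result.
internal static class OutdatedFinder
{
    // Returns local records whose (Key1, Key2) exists in remoteData
    // but whose Value1, Value2 or Value3 differs from the remote record.
    public static List<DataModel> FindOutdated(
        List<DataModel> localData, List<DataModel> remoteData)
    {
        // Index the remote records by composite key.
        // Assumption: keys are unique; on duplicates the last record wins.
        var remoteByKey = new Dictionary<Tuple<string, string>, DataModel>();
        foreach (var remote in remoteData)
        {
            remoteByKey[Tuple.Create(remote.Key1, remote.Key2)] = remote;
        }

        var outdated = new List<DataModel>();
        foreach (var local in localData)
        {
            DataModel match;
            var key = Tuple.Create(local.Key1, local.Key2);
            if (remoteByKey.TryGetValue(key, out match) &&
                (!string.Equals(local.Value1, match.Value1) ||
                 !string.Equals(local.Value2, match.Value2) ||
                 local.Value3 != match.Value3))
            {
                outdated.Add(local);
            }
        }
        return outdated;
    }
}
```

With this shape, each local record costs one dictionary lookup rather than a comparison against many remote records, so the work should grow roughly linearly with the number of records.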
The comparer used to find old and new records:
    internal class MatchingDataComparer : IEqualityComparer<DataModel>
    {
        public bool Equals(DataModel x, DataModel y)
        {
            return string.Equals(x.Key1, y.Key1) && string.Equals(x.Key2, y.Key2);
        }

        public int GetHashCode(DataModel obj)
        {
            return 0;
        }
    }
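One detail that may be relevant: both comparers return a constant hash code. As far as I understand, Intersect and Except build a hash set internally, so a constant hash puts every record in the same bucket and each lookup falls back to calling Equals against many records. Below is a minimal sketch of what a key-based hash might look like for the matching case, assuming Key1 and Key2 together identify a record. KeyHashedMatchingComparer is a hypothetical name, not part of my current code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal class DataModel
{
    public string Key1 { get; set; }
    public string Key2 { get; set; }
    public string Value1 { get; set; }
    public string Value2 { get; set; }
    public byte? Value3 { get; set; }
}

// Hypothetical variant of MatchingDataComparer with a non-constant hash.
internal class KeyHashedMatchingComparer : IEqualityComparer<DataModel>
{
    public bool Equals(DataModel x, DataModel y)
    {
        return string.Equals(x.Key1, y.Key1) && string.Equals(x.Key2, y.Key2);
    }

    public int GetHashCode(DataModel obj)
    {
        // Combine the hashes of exactly the fields that Equals compares,
        // so records that are equal always produce the same hash code.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (obj.Key1 == null ? 0 : obj.Key1.GetHashCode());
            hash = hash * 31 + (obj.Key2 == null ? 0 : obj.Key2.GetHashCode());
            return hash;
        }
    }
}
```

The two Except calls would then take new KeyHashedMatchingComparer() in place of new MatchingDataComparer(). The same trick does not carry over directly to OutdatedDataComparer, because its Equals depends on the value fields being different, and a hash code cannot express "different".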