LINQ duplicate removal with a twist
- by Danthar
I have a list that contains all the status items of each order.
The problem I have is that I need to remove all the items whose status/log-date combination is not the highest.
E.g.:
var inputs = new List<StatusItem>();
//note that the third argument is simply a modifier that adds that many seconds
//to the current datetime, to make testing easier
inputs.Add(new StatusItem(123, 30, 1));
inputs.Add(new StatusItem(123, 40, 2));
inputs.Add(new StatusItem(123, 50, 3));
inputs.Add(new StatusItem(123, 40, 4));
inputs.Add(new StatusItem(123, 50, 5));
inputs.Add(new StatusItem(100, 20, 6));
inputs.Add(new StatusItem(100, 30, 7));
inputs.Add(new StatusItem(100, 20, 8));
inputs.Add(new StatusItem(100, 30, 9));
inputs.Add(new StatusItem(100, 40, 10));
inputs.Add(new StatusItem(100, 50, 11));
inputs.Add(new StatusItem(100, 40, 12));
var l = from i in inputs
        group i by i.internalId
        into cg
        select
            from s in cg
            group s by s.statusId
            into sg
            select sg.OrderByDescending(n => n.date).First();
This gives me the following list:
order 123 status 30 date 4/9/2010 6:44:21 PM
order 123 status 40 date 4/9/2010 6:44:24 PM
order 123 status 50 date 4/9/2010 6:44:25 PM
order 100 status 20 date 4/9/2010 6:44:28 PM
order 100 status 30 date 4/9/2010 6:44:29 PM
order 100 status 40 date 4/9/2010 6:44:32 PM
order 100 status 50 date 4/9/2010 6:44:31 PM
This is ALMOST correct. However, the last line, the one with status 50, needs to be filtered out as well, because it was overruled by status 40 in the history list. You can tell because its date is earlier than that of the last status item with status 40.
I was hoping someone could give me some pointers because I'm stuck.
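To make the rule concrete, here is a rough sketch of how I read it: per order, walk the history from newest to oldest and keep an item only if its status is lower than every status kept so far. This is only one possible interpretation, not a verified solution. The StatusItem class below is a hypothetical stand-in (the real one isn't shown here), it uses a fixed base time instead of the current datetime so the result is repeatable, and RemoveOverruled is a made-up helper name. It also assumes the dates within one order are distinct.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real StatusItem class.
public class StatusItem
{
    // Assumption: fixed base time so the example gives the same result every run.
    public static readonly DateTime BaseTime = new DateTime(2010, 4, 9, 18, 44, 20);

    public int internalId;
    public int statusId;
    public DateTime date;

    // The third argument is the number of seconds added to the base time.
    public StatusItem(int internalId, int statusId, int secondsOffset)
    {
        this.internalId = internalId;
        this.statusId = statusId;
        this.date = BaseTime.AddSeconds(secondsOffset);
    }
}

public static class StatusFilter
{
    // Hypothetical helper: for each order, scan the history newest-first and
    // keep an item only when its status is lower than all statuses kept so far.
    // A higher status that is followed by a lower one is treated as overruled.
    public static List<StatusItem> RemoveOverruled(IEnumerable<StatusItem> inputs)
    {
        var result = new List<StatusItem>();
        foreach (var order in inputs.GroupBy(i => i.internalId))
        {
            int lowestSeen = int.MaxValue;
            var kept = new List<StatusItem>();
            foreach (var item in order.OrderByDescending(i => i.date))
            {
                if (item.statusId < lowestSeen)
                {
                    kept.Add(item);
                    lowestSeen = item.statusId;
                }
            }
            kept.Reverse(); // restore chronological order within the order
            result.AddRange(kept);
        }
        return result;
    }
}
```

With the inputs above, StatusFilter.RemoveOverruled(inputs) should keep statuses 30, 40 and 50 for order 123 (offsets 1, 4 and 5) and statuses 20, 30 and 40 for order 100 (offsets 8, 9 and 12), dropping the status 50 entry of order 100.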