List<T> and IEnumerable difference
- by Jonas Elfström
While implementing this generic merge sort, as a kind of Code Kata, I stumbled on a difference between IEnumerable and List that I need help to figure out.
Here's the MergeSort
public class MergeSort<T>
{
public IEnumerable<T> Sort(IEnumerable<T> arr)
{
if (arr.Count() <= 1) return arr;
int middle = arr.Count() / 2;
var left = arr.Take(middle).ToList();
var right = arr.Skip(middle).ToList();
return Merge(Sort(left), Sort(right));
}
private static IEnumerable<T> Merge(IEnumerable<T> left, IEnumerable<T> right)
{
var arrSorted = new List<T>();
while (left.Count() > 0 && right.Count() > 0)
{
if (Comparer<T>.Default.Compare(left.First(), right.First()) < 0)
{
arrSorted.Add(left.First());
left=left.Skip(1);
}
else
{
arrSorted.Add(right.First());
right=right.Skip(1);
}
}
return arrSorted.Concat(left).Concat(right);
}
}
If I remove the .ToList() on the left and right variables it fails to sort correctly. Do you see why?
Example
var ints = new List<int> { 5, 8, 2, 1, 7 };
var mergeSortInt = new MergeSort<int>();
var sortedInts = mergeSortInt.Sort(ints);
With .ToList()
[0]: 1
[1]: 2
[2]: 5
[3]: 7
[4]: 8
Without .ToList()
[0]: 1
[1]: 2
[2]: 5
[3]: 7
[4]: 2
Edit
It was my stupid test that got me.
I tested it like this:
var sortedInts = mergeSortInt.Sort(ints);
ints.Sort();
if (Enumerable.SequenceEqual(ints, sortedInts)) Console.WriteLine("ints sorts ok");
just changing the first row to
var sortedInts = mergeSortInt.Sort(ints).ToList();
removes the problem (and the lazy evaluation).
EDIT 2010-12-29
I thought I would figure out just how the lazy evaluation messes things up here but I just don't get it.
Remove the .ToList() in the Sort method above like this
var left = arr.Take(middle);
var right = arr.Skip(middle);
then try this
var ints = new List<int> { 5, 8, 2 };
var mergeSortInt = new MergeSort<int>();
var sortedInts = mergeSortInt.Sort(ints);
ints.Sort();
if (Enumerable.SequenceEqual(ints, sortedInts)) Console.WriteLine("ints sorts ok");
When debugging You can see that before ints.Sort() a sortedInts.ToList() returns
[0]: 2
[1]: 5
[2]: 8
but after ints.Sort() it returns
[0]: 2
[1]: 5
[2]: 5
What is really happening here?