What guarantees are there on the run-time complexity (Big-O) of LINQ methods?
Posted by tzaman on Stack Overflow. Published on 2010-05-09T22:29:05Z.
I've recently started using LINQ quite a bit, and I haven't seen any mention of run-time complexity for any of the LINQ methods. Obviously, there are many factors at play here, so let's restrict the discussion to the plain IEnumerable LINQ-to-Objects provider. Further, let's assume that any Func passed in as a selector / mutator / etc. is a cheap O(1) operation.
It seems obvious that all the single-pass operations (Select, Where, Count, Take/Skip, Any/All, etc.) will be O(n), since they only need to walk the sequence once, although even this is subject to laziness.
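To make the laziness caveat concrete, here is a minimal sketch that counts how many source elements each operator actually pulls. The `Counted` helper and the `LazinessDemo` class are hypothetical names introduced for illustration. The sketch only observes behaviour on whatever runtime it is executed on; it is not a statement of any documented guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazinessDemo
{
    // Hypothetical helper: yields 0..n-1 and invokes onPull each time an element is produced.
    public static IEnumerable<int> Counted(int n, Action onPull)
    {
        for (int i = 0; i < n; i++)
        {
            onPull();
            yield return i;
        }
    }

    // Source elements pulled by Any() when the very first element matches.
    public static int PullsForAny(int n)
    {
        int pulls = 0;
        _ = Counted(n, () => pulls++).Any(x => x >= 0);
        return pulls;
    }

    // Source elements pulled by Where(...).Take(take) once the result is materialised.
    public static int PullsForTake(int n, int take)
    {
        int pulls = 0;
        _ = Counted(n, () => pulls++).Where(x => x % 2 == 0).Take(take).ToList();
        return pulls;
    }

    // Source elements pulled by Count() on a plain iterator (no ICollection shortcut available).
    public static int PullsForCount(int n)
    {
        int pulls = 0;
        _ = Counted(n, () => pulls++).Count();
        return pulls;
    }

    public static void Main()
    {
        Console.WriteLine($"Any:   {PullsForAny(1000)} pulls");
        Console.WriteLine($"Take:  {PullsForTake(1000, 3)} pulls");
        Console.WriteLine($"Count: {PullsForCount(1000)} pulls");
    }
}
```

The point of the sketch is that O(n) is an upper bound for these operators: short-circuiting ones such as Any and Take can stop early, while Count on a plain iterator has to walk everything.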
Things are murkier for the more complex operations. The set-like operators (Union, Distinct, Except, etc.) work using GetHashCode by default (afaik), so it seems reasonable to assume they use a hash table internally, making these operations O(n) as well, in general. What about the versions that use an IEqualityComparer?
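One way to probe the IEqualityComparer overloads is to pass a comparer that counts how often it is called. The sketch below does that for Distinct. `CountingComparer` and `SetOperatorDemo` are hypothetical names, and the call counts it prints are an observation about the runtime in use, not a documented contract.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: case-insensitive, and counts how often LINQ calls into it.
public sealed class CountingComparer : IEqualityComparer<string>
{
    public int HashCalls { get; private set; }
    public int EqualsCalls { get; private set; }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return StringComparer.OrdinalIgnoreCase.Equals(x, y);
    }

    public int GetHashCode(string s)
    {
        HashCalls++;
        return StringComparer.OrdinalIgnoreCase.GetHashCode(s);
    }
}

public static class SetOperatorDemo
{
    // Runs Distinct with the supplied comparer and materialises the result.
    public static List<string> DistinctWith(IEnumerable<string> words, CountingComparer comparer)
    {
        return words.Distinct(comparer).ToList();
    }

    public static void Main()
    {
        var words = new[] { "a", "A", "b", "B", "c", "a" };
        var comparer = new CountingComparer();

        var distinct = DistinctWith(words, comparer);

        Console.WriteLine(string.Join(", ", distinct));
        Console.WriteLine($"GetHashCode calls: {comparer.HashCalls}, Equals calls: {comparer.EqualsCalls}");
    }
}
```

If GetHashCode is called roughly once per element and Equals only for elements whose hashes collide, that is consistent with a hash-based implementation. A comparer with a poor GetHashCode (for example, one returning a constant) would push every element into the same bucket, so the O(n) expectation depends on the comparer as much as on LINQ.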
OrderBy would need a sort, so most likely we're looking at O(n log n). What if the sequence is already sorted? How about if I say OrderBy().ThenBy() and provide the same key to both?
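The OrderBy().ThenBy() case with a repeated key can at least be checked for its effect on the result and on how often the key selector runs. This is a minimal sketch. `OrderingDemo` and its methods are hypothetical names, and the selector-call counts are printed rather than asserted because they are an implementation detail.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Sort by a single key.
    public static List<string> SortOnce(IEnumerable<string> words, Func<string, int> key)
    {
        return words.OrderBy(key).ToList();
    }

    // Sort by the same key twice: the ThenBy can only break ties that OrderBy left,
    // and with an identical key every such tie stays a tie.
    public static List<string> SortTwice(IEnumerable<string> words, Func<string, int> key)
    {
        return words.OrderBy(key).ThenBy(key).ToList();
    }

    public static void Main()
    {
        var words = new[] { "pear", "fig", "plum", "kiwi", "yam" };

        int selectorCalls = 0;
        Func<string, int> byLength = w => { selectorCalls++; return w.Length; };

        var once = SortOnce(words, byLength);
        int callsOnce = selectorCalls;

        selectorCalls = 0;
        var twice = SortTwice(words, byLength);
        int callsTwice = selectorCalls;

        Console.WriteLine($"Same order: {once.SequenceEqual(twice)}");
        Console.WriteLine($"Key selector calls: OrderBy = {callsOnce}, OrderBy+ThenBy = {callsTwice}");
    }
}
```

Because OrderBy is documented as a stable sort, the redundant ThenBy should not change the output order. What it costs in extra key computations and comparisons is the part worth measuring.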
I could see GroupBy (and Join) using either sorting or hashing. Which is it?
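One observable hint for GroupBy is the order in which groups come back. The documentation for Enumerable.GroupBy states that groups are yielded in the order in which each key first appears in the source. The sketch below checks that. `GroupingDemo` is a hypothetical name, and the output order is evidence about behaviour, not proof of a complexity bound.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Returns the group keys in the order GroupBy yields them.
    public static List<char> KeysInOutputOrder(IEnumerable<string> words)
    {
        return words.GroupBy(w => w[0]).Select(g => g.Key).ToList();
    }

    public static void Main()
    {
        var words = new[] { "cherry", "apple", "banana", "avocado", "cranberry" };

        // Keys appear in first-occurrence order, not sorted order.
        Console.WriteLine(string.Join(", ", KeysInOutputOrder(words)));  // prints "c, a, b"
    }
}
```

A sort-based grouping would naturally produce keys in sorted order (a, b, c). First-occurrence order is what one would expect from a lookup table filled in a single pass.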
Contains would be O(n) on a List, but O(1) on a HashSet. Does LINQ check the underlying container to see if it can speed things up?
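Whether the LINQ extension method defers to the container can be tested with a collection that records which of its members get called. The documentation for Enumerable.Contains says that if the source implements ICollection<T>, that implementation's Contains is invoked. The sketch below checks this with a hypothetical `SpySet` wrapper around HashSet<int>. Both `SpySet` and `ContainsDemo` are illustrative names.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper around HashSet<int> that records which members LINQ touches.
public sealed class SpySet : ICollection<int>
{
    private readonly HashSet<int> _inner;

    public int ContainsCalls { get; private set; }
    public int EnumerationCalls { get; private set; }

    public SpySet(IEnumerable<int> items) { _inner = new HashSet<int>(items); }

    public int Count => _inner.Count;
    public bool IsReadOnly => false;
    public void Add(int item) => _inner.Add(item);
    public void Clear() => _inner.Clear();
    public bool Contains(int item) { ContainsCalls++; return _inner.Contains(item); }
    public void CopyTo(int[] array, int arrayIndex) => _inner.CopyTo(array, arrayIndex);
    public bool Remove(int item) => _inner.Remove(item);
    public IEnumerator<int> GetEnumerator() { EnumerationCalls++; return _inner.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class ContainsDemo
{
    // Calls the LINQ extension method explicitly, so the instance method is not picked by overload resolution.
    public static bool LinqContains(IEnumerable<int> source, int value)
    {
        return Enumerable.Contains(source, value);
    }

    public static void Main()
    {
        var spy = new SpySet(Enumerable.Range(0, 1000));

        bool found = LinqContains(spy, 999);

        Console.WriteLine($"Found: {found}");
        Console.WriteLine($"Contains calls: {spy.ContainsCalls}, enumerations: {spy.EnumerationCalls}");
    }
}
```

If the wrapper's Contains is hit and the enumerator is never requested, the extension method is delegating to the collection, so its cost is whatever the collection's own Contains costs. This says nothing about the overload that takes an explicit IEqualityComparer, which would need to be tested separately.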
And the real question - so far, I've been taking it on faith that the operations are performant. However, can I bank on that? STL containers, for example, clearly specify the complexity of every operation. Are there any similar guarantees on LINQ performance in the .NET library specification?
© Stack Overflow or respective owner