Does LINQ require significantly more processing cycles and memory than lower-level data iteration techniques?
- by Matthew Patrick Cashatt
Background
I am recently in the process of enduring grueling tech interviews for positions that use the .NET stack, some of which include silly questions like this one, and some questions that are more valid. I recently came across an issue that may be valid but I want to check with the community here to be sure.
When asked by an interviewer how I would count the frequency of words in a text document and rank the results, I answered that I would
Use a stream object put the text file in memory as a string.
Split the string into an array on spaces while ignoring punctuation.
Use LINQ against the array to .GroupBy() and .Count(), then OrderBy() said count.
I got this answer wrong for two reasons:
Streaming an entire text file into memory could be disasterous. What if it was an entire encyclopedia? Instead I should stream one block at a time and begin building a hash table.
LINQ is too expensive and requires too many processing cycles. I should have built a hash table instead and, for each iteration, only added a word to the hash table if it didn't otherwise exist and then increment it's count.
The first reason seems, well, reasonable. But the second gives me more pause. I thought that one of the selling points of LINQ is that it simply abstracts away lower-level operations like hash tables but that, under the veil, it is still the same implementation.
Question
Aside from a few additional processing cycles to call any abstracted methods, does LINQ require significantly more processing cycles to accomplish a given data iteration task than a lower-level task (such as building a hash table) would?