What are the fastest-performing options for a read-only, unordered collection of unique strings?
- by Dan Tao
Disclaimer: I realize the totally obvious answer to this question is HashSet<string>. It is absurdly fast, it is unordered, and its values are unique.
But I'm just wondering, because HashSet<T> is a mutable class, so it has Add, Remove, etc.; and so I am not sure if the underlying data structure that makes these operations possible makes certain performance sacrifices when it comes to read operations -- in particular, I'm concerned with Contains.
Basically, I'm wondering what are the absolute fastest-performing data structures in existence that can supply a Contains method for objects of type string. Within or outside of the .NET framework itself.
I'm interested in all kinds of answers, regardless of their limitations. For example I can imagine that some structure might be restricted to strings of a certain length, or may be optimized depending on the problem domain (e.g., range of possible input values), etc. If it exists, I want to hear about it.
One last thing: I'm not restricting this to read-only data structures. Obviously any read-write data structure could be embedded inside a read-only wrapper. The only reason I even mentioned the word "read-only" is that I don't have any requirement for a data structure to allow adding, removing, etc. If it has those functions, though, I won't complain.