What are the fastest-performing options for a read-only, unordered collection of unique strings?

Posted by Dan Tao on Stack Overflow See other posts from Stack Overflow or by Dan Tao
Published on 2010-06-17T16:21:02Z Indexed on 2010/06/17 17:03 UTC
Read the original article Hit count: 214

Disclaimer: I realize the totally obvious answer to this question is HashSet<string>. It is absurdly fast, it is unordered, and its values are unique.

But I'm just wondering, because HashSet<T> is a mutable class, so it has Add, Remove, etc.; and so I am not sure if the underlying data structure that makes these operations possible makes certain performance sacrifices when it comes to read operations -- in particular, I'm concerned with Contains.

Basically, I'm wondering what are the absolute fastest-performing data structures in existence that can supply a Contains method for objects of type string. Within or outside of the .NET framework itself.

I'm interested in all kinds of answers, regardless of their limitations. For example I can imagine that some structure might be restricted to strings of a certain length, or may be optimized depending on the problem domain (e.g., range of possible input values), etc. If it exists, I want to hear about it.

One last thing: I'm not restricting this to read-only data structures. Obviously any read-write data structure could be embedded inside a read-only wrapper. The only reason I even mentioned the word "read-only" is that I don't have any requirement for a data structure to allow adding, removing, etc. If it has those functions, though, I won't complain.

© Stack Overflow or respective owner

Related posts about .NET

Related posts about Performance