Data structure to build and lookup set of integer ranges

Posted by actual on Stack Overflow See other posts from Stack Overflow or by actual
Published on 2011-01-05T07:31:43Z Indexed on 2011/01/05 7:53 UTC
Read the original article Hit count: 208

I have a set of uint32 integers, there may be millions of items in the set. 50-70% of them are consecutive, but in input stream they appear in unpredictable order.

I need to:

  1. Compress this set into ranges to achieve space efficient representation. Already implemented this using trivial algorithm, since ranges computed only once speed is not important here. After this transformation number of resulting ranges is typically within 5 000-10 000, many of them are single-item, of course.

  2. Test membership of some integer, information about specific range in the set is not required. This one must be very fast -- O(1). Was thinking about minimal perfect hash functions, but they do not play well with ranges. Bitsets are very space inefficient. Other structures, like binary trees, has complexity of O(log n), worst thing with them that implementation make many conditional jumps and processor can not predict them well giving poor performance.

Is there any data structure or algorithm specialized in integer ranges to solve this task?

© Stack Overflow or respective owner

Related posts about algorithm

Related posts about data-structures