What's the recommended implementation for hashing OLE Variants?

Posted by Barry Kelly on Stack Overflow
Published on 2010-03-19T18:12:55Z Indexed on 2010/03/24 19:13 UTC

OLE Variants, as used by older versions of Visual Basic and pervasively in COM Automation, can store lots of different types: basic types like integers and floats, more complicated types like strings and arrays, and all the way up to IDispatch implementations and pointers in the form of ByRef variants.

Variants are also weakly typed: they silently convert a value to another type, depending on which operator you apply and the current types of the operands. For example, comparing two variants for equality, one containing the integer 1 and the other containing the string "1", returns True.

So assuming that I'm working with variants at the underlying data level (e.g. VARIANT in C++ or TVarData in Delphi - i.e. the big union of different possible values), how should I hash variants consistently so that they obey the right rules?

Rules:

  • Variants that hash unequally should compare as unequal, both in sorting and direct equality
  • Variants that compare as equal for both sorting and direct equality should hash as equal

It's OK if I have to use different sorting and direct comparison rules in order to make the hashing fit.
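As a rough illustration of the two rules, the sketch below checks a candidate hash against a candidate equality over a handful of sample values. Everything in it is a hypothetical stand-in: `loose_equal` models weakly typed equality by comparing text forms, and `type_aware_hash`, `normalized_hash` and `contract_violations` are names invented for this example, not part of any COM or Delphi API.

```python
from itertools import combinations


def loose_equal(a, b):
    """Toy stand-in for weakly typed equality: compare the text forms."""
    return str(a) == str(b)


def type_aware_hash(v):
    """Naive hash that mixes in the type name, so 1 and "1" differ."""
    return hash((type(v).__name__, v))


def normalized_hash(v):
    """Hash the same text form that loose_equal compares."""
    return hash(str(v))


def contract_violations(values, equal, hash_fn):
    """Return the pairs that compare equal but hash differently."""
    return [
        (a, b)
        for a, b in combinations(values, 2)
        if equal(a, b) and hash_fn(a) != hash_fn(b)
    ]


samples = [1, "1", 2, "2", "abc"]

# The type-aware hash is expected to report pairs such as (1, "1").
print(contract_violations(samples, loose_equal, type_aware_hash))

# Hashing the normalized form cannot disagree with loose_equal.
print(contract_violations(samples, loose_equal, normalized_hash))
```

The point of the check is that the hash must be computed from the same normalized form the comparison uses; any hash that looks at information the comparison ignores (here, the type) can break the second rule.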

My current approach is to normalize variants to strings where they fit and treat them as strings; otherwise I treat the variant data as an opaque blob, hashing and comparing its raw bytes. That has some limitations, of course: the numbers 1..10 sort as [1, 10, 2, ... 9], for example. This is mildly annoying, but it is consistent and very little work. However, I do wonder whether there is an accepted practice for this problem.
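The normalize-or-blob approach described above can be sketched as a single key function that drives hashing, equality and sorting alike. This is only a model under stated assumptions: Python `int` and `str` stand in for variants that fit as strings, `bytes` stands in for everything else, and the leading tag in the key (which keeps a string from ever equalling a blob with the same bytes) is an addition for this example. The names `variant_key`, `variant_hash`, `variant_equal` and `variant_sorted` are hypothetical.

```python
def variant_key(value):
    """Map a value to a (kind, payload) key shared by hash, equality and sort.

    Assumption: int and str model variants that fit as strings;
    bytes model any other variant, treated as an opaque blob.
    """
    if isinstance(value, (int, str)):
        return (0, str(value).encode("utf-8"))
    if isinstance(value, (bytes, bytearray)):
        return (1, bytes(value))
    raise TypeError(f"unsupported value: {value!r}")


def variant_hash(value):
    """Hash the normalized key, never the original value."""
    return hash(variant_key(value))


def variant_equal(a, b):
    """Two values are equal exactly when their keys are equal."""
    return variant_key(a) == variant_key(b)


def variant_sorted(values):
    """Sort by the same key, so ordering agrees with equality."""
    return sorted(values, key=variant_key)


print(variant_equal(1, "1"))                 # True
print(variant_hash(1) == variant_hash("1"))  # True
print(variant_sorted(list(range(1, 11))))    # [1, 10, 2, 3, 4, 5, 6, 7, 8, 9]
```

Because all three operations go through one key, they cannot disagree with each other; the price is the lexicographic ordering of numbers shown in the last line.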
