Search Lucene with precise edit distances

Posted by askullhead on Stack Overflow
Published on 2010-01-15T18:27:39Z Indexed on 2010/05/12 12:14 UTC

I would like to search a Lucene index by edit distance. For example, suppose each document has a field FIRST_NAME; I want all documents whose first name is within an edit distance of 1 of, say, 'john'.
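To make the requirement concrete, here is a minimal sketch of the matching rule I have in mind, written in plain Java with no Lucene involved. The class name `EditDistanceFilter`, the helper `within`, and the sample names are all made up for illustration, and I am assuming "edit distance" means the usual Levenshtein distance (insert, delete, substitute, each costing 1).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: filters names by Levenshtein distance.
class EditDistanceFilter {

    // Classic two-row dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the names whose distance to the query is at most maxDistance.
    static List<String> within(String query, List<String> names, int maxDistance) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (levenshtein(query, name) <= maxDistance) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("john", "jon", "joan", "jane", "johnny");
        System.out.println(within("john", names, 1));
    }
}
```

With a maximum distance of 1, 'jon' and 'joan' should match 'john', while 'johnny' (distance 2) should not. What I am after is the same behaviour from a Lucene query rather than a scan over every value.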

I know that Lucene supports fuzzy searches (FIRST_NAME:john~) and takes a number between 0 and 1 to control the fuzziness. The problem (for me) is that this number does not directly translate to an edit distance. Also, when the values in the documents are short strings (fewer than 3 characters), the fuzzy search has difficulty finding them. For example, if there is a document with FIRST_NAME 'J' and I search for FIRST_NAME:I~0.0, I don't get anything back.
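My guess at why short strings fail is sketched below. It rests on an assumption I have not verified against the Lucene source for my version: that the fuzzy score is computed as 1 - distance / min(query length, term length), and that a term matches only when the score is strictly greater than the number after the tilde. The class `FuzzySimilarity` and its methods are hypothetical names for illustration, not Lucene API.

```java
// Hypothetical model of the fuzzy score; NOT Lucene code.
class FuzzySimilarity {

    // ASSUMPTION: score = 1 - distance / min(queryLength, termLength).
    static float similarity(int distance, int queryLength, int termLength) {
        int shorter = Math.min(queryLength, termLength);
        if (shorter == 0) {
            return 0.0f; // simplification for empty strings in this sketch
        }
        return 1.0f - (float) distance / shorter;
    }

    // ASSUMPTION: a term matches only if its score is strictly above the threshold.
    static boolean matches(int distance, int queryLength, int termLength, float minSimilarity) {
        return similarity(distance, queryLength, termLength) > minSimilarity;
    }

    public static void main(String[] args) {
        // 'I' vs 'J': one substitution, both of length 1.
        System.out.println(similarity(1, 1, 1));
        System.out.println(matches(1, 1, 1, 0.0f));
        // 'john' vs 'joan': one substitution, both of length 4.
        System.out.println(similarity(1, 4, 4));
        System.out.println(matches(1, 4, 4, 0.5f));
    }
}
```

Under that assumption, 'I' against 'J' scores exactly 0, which is not strictly greater than 0.0, so the query returns nothing. It would also mean the same threshold allows a different number of edits depending on the term length, which is why the number does not map cleanly to an edit distance.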

© Stack Overflow or respective owner
