Search Lucene with precise edit distances
Posted by askullhead on Stack Overflow
Published on 2010-01-15T18:27:39Z
lucene | fuzzy-search
I would like to search a Lucene index by edit distance. For example, suppose each document has a field FIRST_NAME; I want all documents whose first name is at an edit distance of 1 from, say, 'john'.
I know that Lucene supports fuzzy searches (FIRST_NAME:john~), which take a number between 0 and 1 to control the fuzziness. The problem (for me) is that this number does not translate directly to an edit distance. Also, when the values in the documents are short strings (fewer than 3 characters), the fuzzy search has difficulty finding them. For example, if there is a document with FIRST_NAME 'J' and I search for FIRST_NAME:I~0.0, I get nothing back.
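To make the mismatch concrete, here is a minimal plain-Java sketch (no Lucene dependency) of what an exact edit-distance filter would compute, next to one plausible way a 0-to-1 similarity could be derived from it. The class and method names are hypothetical. The normalization `1 - distance / min(length)` and the idea that a term only matches when its score is strictly above the threshold are assumptions made for illustration, not something the post confirms about Lucene. Under those assumptions, 'I' versus 'J' scores exactly 0.0, which would explain why a threshold of 0.0 returns nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
class EditDistanceSketch {

    // Plain Levenshtein distance: insertions, deletions and substitutions each cost 1.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // ASSUMPTION: similarity normalized by the shorter string's length.
    // This is an illustrative formula, not a confirmed Lucene internal.
    static float similarity(String query, String term) {
        int shorter = Math.min(query.length(), term.length());
        if (shorter == 0) {
            return query.length() == term.length() ? 1.0f : 0.0f;
        }
        return 1.0f - (float) levenshtein(query, term) / shorter;
    }

    // Keeps the terms whose distance to the query is at most maxEdits.
    // Change <= to == to keep only terms at exactly that distance.
    static List<String> withinDistance(String query, List<String> terms, int maxEdits) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (levenshtein(query, term) <= maxEdits) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("john", "jon", "joan", "johnny", "jane");
        System.out.println(withinDistance("john", names, 1));
        System.out.println(similarity("john", "jon"));
        System.out.println(similarity("I", "J"));
    }
}
```

The point of the sketch is that the same edit distance of 1 maps to very different similarity values depending on string length, so a single fuzziness number cannot stand in for a fixed edit distance, and one-character values are hit hardest.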
© Stack Overflow or respective owner