Lucene wildcard queries

Posted by Javi on Stack Overflow
Published on 2010-03-12T11:59:41Z


Hello,

I have this question relating to Lucene.

I have a form that gives me some text, and I want to run a full-text search for it across several fields. Suppose the input text is "textToLook".

I have a Lucene Analyzer with several filters. One of them is LowerCaseFilter, so when I create the index, words are lowercased.

Imagine I want to search in two fields, field1 and field2, so the Lucene query would be something like this (note that 'textToLook' is now 'texttolook'):

field1:texttolook* field2:texttolook*

In my class I have something like this to create the query. It works when there is no wildcard.

String text = "textToLook";
String[] fields = {"field1", "field2"};
// the analyzer is the same one used for indexing
Analyzer analyzer = fullTextEntityManager.getSearchFactory().getAnalyzer("customAnalyzer");
MultiFieldQueryParser parser = new MultiFieldQueryParser(fields, analyzer);
org.apache.lucene.search.Query queryTextoLibre = parser.parse(text);

With this code the query would be:

field1:texttolook field2:texttolook

But if I set text to "textToLook*" I get

field1:textToLook* field2:textToLook*

which won't match anything, because the indexed terms are lowercase.
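
To make the mismatch concrete, here is a minimal plain-Java sketch. It does not use Lucene at all: the index is modeled as a list of already-lowercased terms, and `matchesPrefix` is a hypothetical stand-in for what a prefix query does, so treat it as an illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PrefixCaseDemo {
    // Hypothetical stand-in for a prefix query:
    // true if any indexed term starts with the given prefix (case-sensitive).
    static boolean matchesPrefix(List<String> indexedTerms, String prefix) {
        for (String term : indexedTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Terms as they would be stored after lowercasing at index time (assumed data).
        List<String> indexedTerms = Arrays.asList("texttolook", "texttolookup");

        // Raw user input keeps its capitals, so no indexed term starts with it.
        System.out.println(matchesPrefix(indexedTerms, "textToLook")); // false

        // Lowercasing the prefix first makes it comparable with the indexed terms.
        System.out.println(matchesPrefix(indexedTerms, "textToLook".toLowerCase(Locale.ROOT))); // true
    }
}
```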

I have read this on the Lucene website:

"Wildcard, Prefix, and Fuzzy queries are not passed through the Analyzer, which is the component that performs operations such as stemming and lowercasing"

My problem cannot be solved just by making the search case-insensitive, because my analyzer has other filters which, for example, remove some suffixes from words.

I think I can solve the problem by finding out what the text looks like after going through the filters of my analyzer; then I could append the "*" and build the Query with MultiFieldQueryParser. So in this example I would start with "textToLook", and after it has passed through these filters I would get "texttolook". After this I could build "texttolook*".
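
A rough sketch of that idea in plain Java, under stated assumptions: `lowerCase` and `stripSuffix` are hypothetical stand-ins for my analyzer's filters (the "ing" suffix rule is invented for illustration), and `toPrefixQuery` is a made-up helper, not a Lucene API. With a real analyzer, the normalized term would presumably have to be read from the TokenStream returned by `Analyzer.tokenStream(fieldName, reader)` instead.

```java
import java.util.Locale;

public class WildcardNormalizer {
    // Hypothetical filter 1: mimics a lowercasing filter.
    static String lowerCase(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // Hypothetical filter 2: mimics a suffix-removing filter (rule invented for this sketch).
    static String stripSuffix(String term) {
        return term.endsWith("ing") ? term.substring(0, term.length() - 3) : term;
    }

    // Apply the filters in the same order the analyzer would at index time.
    static String normalize(String term) {
        return stripSuffix(lowerCase(term));
    }

    // Normalize first, then append the wildcard, so the prefix matches indexed terms.
    static String toPrefixQuery(String[] fields, String userText) {
        String term = normalize(userText);
        StringBuilder query = new StringBuilder();
        for (String field : fields) {
            if (query.length() > 0) {
                query.append(' ');
            }
            query.append(field).append(':').append(term).append('*');
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"field1", "field2"};
        // prints: field1:texttolook* field2:texttolook*
        System.out.println(toPrefixQuery(fields, "textToLook"));
    }
}
```

The resulting string could then be handed to parser.parse(...) as before. The wildcard is added only after normalization, so the filters never see the "*".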

Is there any way to get the value of my text variable after it has gone through all of my analyzer's filters? How can I get all the filters of my analyzer? Is this possible?

Thanks

© Stack Overflow or respective owner