How to use NGramTokenizerFactory or NGramFilterFactory?
Posted by user572485 on Stack Overflow
Published on 2011-01-12T09:51:11Z
Hi,
Recently I have been studying how to store and index data with Solr. I want to do a facet.prefix search. With the whitespace tokenizer, "Where are you" is split into three words, each indexed separately, so if I search with facet.prefix="where are", no result is returned.
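The mismatch can be shown with a minimal Python sketch. This is not Solr code: `whitespace_tokenize` and `facet_prefix` are hypothetical helpers that only model the idea that a prefix facet is compared against individual indexed terms, and the lowercasing step assumes a lowercase filter in the analysis chain.

```python
def whitespace_tokenize(text):
    """Split on whitespace, roughly like a whitespace tokenizer.

    Lowercasing is an assumption (a lowercase filter in the chain).
    """
    return [token.lower() for token in text.split()]


def facet_prefix(terms, prefix):
    """Simplified model: keep the indexed terms that start with the prefix."""
    return [term for term in terms if term.startswith(prefix)]


terms = whitespace_tokenize("Where are you")
print(terms)                             # ['where', 'are', 'you']
print(facet_prefix(terms, "where are"))  # []
print(facet_prefix(terms, "wh"))         # ['where']
```

No single indexed term contains a space, so a prefix that spans two words cannot match anything in this model.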
I googled and found that NGramFilterFactory might help. But when I apply this filter factory, the result is "w, h, e, ..., wh, ..": it splits the sentence into character n-grams, not into word-level tokens.
I set the parameters minGramSize and maxGramSize to 1 and 3. Is NGramFilterFactory working correctly? Should I add other parameters? Are there other filter factories that could help me?
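For reference, character n-grams of sizes 1 to 3 look like the following minimal Python sketch. It is an illustration only, not the Lucene implementation: `char_ngrams` is a hypothetical helper, the sizes 1 and 3 are assumed to be the minimum and maximum gram sizes, and the order in which a real filter emits grams may differ.

```python
def char_ngrams(text, min_gram, max_gram):
    """Return every substring of text whose length is min_gram..max_gram."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        # Slide a window of the current size across the text.
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


print(char_ngrams("where", 1, 3))
# ['w', 'h', 'e', 'r', 'e', 'wh', 'he', 'er', 're', 'whe', 'her', 'ere']
```

This matches the "w, h, e, ..., wh, .." pattern described above: the grams are built from characters, so no gram corresponds to a sequence of whole words.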
Thanks!
© Stack Overflow or respective owner