Lucene DuplicateFilter question

Posted by chardex on Stack Overflow
Published on 2010-05-30T21:25:42Z

Hi,

Why doesn't DuplicateFilter work together with other filters? For example, if I slightly modify the test in DuplicateFilterTest, it looks as though DuplicateFilter is not applied to the output of the other filters, but instead trims the results first:

public void testKeepsLastFilter()
        throws Throwable {
    DuplicateFilter df = new DuplicateFilter(KEY_FIELD);
    df.setKeepMode(DuplicateFilter.KM_USE_LAST_OCCURRENCE);

    Query q = new ConstantScoreQuery(new ChainedFilter(new Filter[]{
            new QueryWrapperFilter(tq),
            // Works as expected with "out", which matches the last document:
            // new QueryWrapperFilter(new TermQuery(new Term("text", "out"))),
            // Does not work with "now", which matches the third document. Why?
            new QueryWrapperFilter(new TermQuery(new Term("text", "now")))
    }, ChainedFilter.AND));

    ScoreDoc[] hits = searcher.search(q, df, 1000).scoreDocs;

    assertTrue("Filtered searching should have found some matches", hits.length > 0);
    for (int i = 0; i < hits.length; i++) {
        Document d = searcher.doc(hits[i].doc);
        String url = d.get(KEY_FIELD);
        TermDocs td = reader.termDocs(new Term(KEY_FIELD, url));
        int lastDoc = 0;
        while (td.next()) {
            lastDoc = td.doc();
        }
        assertEquals("Duplicate urls should return last doc", lastDoc, hits[i].doc);
    }
}
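
To make the suspected behaviour concrete, here is a minimal sketch that does not use Lucene at all. It assumes (this is my reading of the symptom, not something confirmed from the Lucene source) that the duplicate filter picks the last document per key over the whole index, independently of the query, and that the search then simply intersects that set with the query matches. The class DuplicateFilterSketch, its methods, and the four-document toy index are all hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy model only: plain BitSets stand in for Lucene's filter and query results.
public class DuplicateFilterSketch {

    // Assumed filter behaviour: for every key, keep only the highest doc id
    // in the WHOLE index, without looking at any query.
    static BitSet keepLastPerKey(String[] keys) {
        Map<String, Integer> lastDocForKey = new HashMap<>();
        for (int doc = 0; doc < keys.length; doc++) {
            lastDocForKey.put(keys[doc], doc); // later docs overwrite earlier ones
        }
        BitSet bits = new BitSet(keys.length);
        for (int doc : lastDocForKey.values()) {
            bits.set(doc);
        }
        return bits;
    }

    // Assumed search behaviour: the result is the intersection of the
    // query matches and the filter bits.
    static BitSet search(BitSet queryMatches, BitSet filterBits) {
        BitSet result = (BitSet) queryMatches.clone();
        result.and(filterBits);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical index: docs 0 and 1 share key "a", docs 2 and 3 share key "b".
        String[] keys = {"a", "a", "b", "b"};
        BitSet filterBits = keepLastPerKey(keys);

        BitSet matchesLast = new BitSet();
        matchesLast.set(3); // query that hits the last document for key "b"

        BitSet matchesThird = new BitSet();
        matchesThird.set(2); // query that hits only an earlier duplicate

        System.out.println(search(matchesLast, filterBits));  // prints {3}
        System.out.println(search(matchesThird, filterBits)); // prints {}
    }
}
```

If that assumption holds, it would match what the test shows: a query that matches only the third document returns nothing, because the filter has already discarded that document in favour of a later one with the same key.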
