How does Lucene index documents?

Posted by Mehdi Amrollahi on Stack Overflow
Published on 2010-04-08T17:51:25Z Indexed on 2010/04/08 18:03 UTC

Hello,

I have read some documentation about Lucene, including the slides at this link (http://lucene.sourceforge.net/talks/pisa).

I don't really understand how Lucene indexes documents, or which algorithms it uses for indexing.

The slides at the above link say Lucene uses this algorithm for indexing:

  • incremental algorithm:
    • maintain a stack of segment indices
    • create index for each incoming document
    • push new indexes onto the stack
    • let b=10 be the merge factor; M=∞

for (size = 1; size < M; size *= b) {
    if (there are b indexes with size docs on top of the stack) {
        pop them off the stack;
        merge them into a single index;
        push the merged index onto the stack;
    } else {
        break;
    }
}
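To make the question concrete, here is a minimal Python sketch of the loop above as I read it. It is not Lucene's actual code: each segment index is modelled as a plain list of document ids, and the names `add_document`, `build` and `max_size` (standing in for M) are my own hypothetical choices.

```python
import math


def add_document(stack, doc, b=10, max_size=math.inf):
    """Push a one-document segment, then merge while the top b segments share one size.

    Hypothetical model: a "segment index" is just a list of document ids.
    """
    stack.append([doc])  # create an index for the incoming document and push it
    size = 1
    while size < max_size:
        top = stack[-b:]
        # Merge only if there are b segments on top, each holding `size` docs.
        if len(top) == b and all(len(seg) == size for seg in top):
            del stack[-b:]                            # pop them off the stack
            merged = [d for seg in top for d in seg]  # merge into a single index
            stack.append(merged)                      # push the merged index
        else:
            break
        size *= b
    return stack


def build(n_docs, b=10, max_size=math.inf):
    """Index n_docs documents one at a time and return the resulting stack."""
    stack = []
    for doc_id in range(n_docs):
        add_document(stack, doc_id, b, max_size)
    return stack


sizes = [len(seg) for seg in build(1234)]
print(sizes)  # [1000, 100, 100, 10, 10, 10, 1, 1, 1, 1]
```

With an unbounded M, 1234 documents end up in only 10 segments, and each document has been rewritten by a merge at most three times. With b=10, any M of 10 or less stops the loop after the first level, so segments never grow beyond 10 documents.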

How does this algorithm provide optimized indexing?

Does Lucene use a B-tree or a similar structure for indexing, or does it have its own particular algorithm?

Thank you for reading my post.

© Stack Overflow or respective owner