How does Lucene index documents?
- by Mehdi Amrollahi
Hello,
I have read some documentation about Lucene, including the talk at this link
(http://lucene.sourceforge.net/talks/pisa).
I still don't really understand how Lucene indexes documents, or which algorithms it uses for indexing.
The talk at the link above says Lucene uses this algorithm for indexing:
incremental algorithm:
maintain a stack of segment indices
create index for each incoming document
push new indexes onto the stack
let b=10 be the merge factor; M=∞
for (size = 1; size < M; size *= b) {
    if (there are b indexes with size docs on top of the stack) {
        pop them off the stack;
        merge them into a single index;
        push the merged index onto the stack;
    } else {
        break;
    }
}
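
To make the pseudocode above concrete, here is a minimal Python sketch of how I read it. It is only a toy model, not Lucene's actual code: a "segment index" is modelled as a plain list of document ids, and the names add_document, merge_factor and max_size are hypothetical (merge_factor stands in for b and max_size for M). The sketch assumes M is unbounded by default and that the merge factor is at least 2; with a finite M such as 8 and b=10, the loop would only ever run its first pass.

```python
def add_document(stack, doc_id, merge_factor=10, max_size=float("inf")):
    """Push a one-document segment, then merge equal-sized segments on top of the stack.

    Toy model: each segment is just a list of document ids.
    Assumes merge_factor >= 2, otherwise the loop below would never end.
    """
    stack.append([doc_id])  # "create index for each incoming document" and push it
    size = 1
    while size < max_size:
        top = stack[-merge_factor:]  # the (up to) b segments on top of the stack
        if len(top) == merge_factor and all(len(seg) == size for seg in top):
            del stack[-merge_factor:]                     # pop them off the stack
            merged = [d for seg in top for d in seg]      # merge into a single segment
            stack.append(merged)                          # push the merged segment
            size *= merge_factor                          # next pass looks at larger segments
        else:
            break
    return stack


# Index 25 documents and look at the segment sizes left on the stack.
stack = []
for doc_id in range(25):
    add_document(stack, doc_id)
print([len(seg) for seg in stack])  # [10, 10, 1, 1, 1, 1, 1]
```

In this model, every tenth document triggers a merge of the ten one-document segments, every hundredth document also merges ten ten-document segments, and so on, so the number of segments on the stack stays small compared with the number of documents.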
How does this algorithm provide optimized indexing?
Does Lucene use a B-tree or some similar structure for indexing,
or does it have its own particular algorithm?
Thank you for reading my post.