How does Lucene index documents?
Posted by Mehdi Amrollahi on Stack Overflow
Published on 2010-04-08T17:51:25Z
Indexed on 2010/04/08 18:03 UTC
Hello,
I have read some documentation about Lucene, including the document at this link (http://lucene.sourceforge.net/talks/pisa).
I don't really understand how Lucene indexes documents, or which algorithms it uses for indexing.
The link above says that Lucene uses this algorithm for indexing:
- incremental algorithm:
  - maintain a stack of segment indices
  - create index for each incoming document
  - push new indexes onto the stack
  - let b=10 be the merge factor; M=∞
    for (size = 1; size < M; size *= b) {
        if (there are b indexes with size docs on top of the stack) {
            pop them off the stack;
            merge them into a single index;
            push the merged index onto the stack;
        } else {
            break;
        }
    }
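
To make the quoted loop easier to follow, here is a minimal Python sketch that simulates it. It is not Lucene code: each segment is represented only by its document count, and the names `add_document`, `index_documents`, and `max_size` are hypothetical. `max_size` plays the role of M and defaults to unbounded, which is an assumption about what the slide intends.

```python
def add_document(stack, b=10, max_size=float("inf")):
    """Push a one-document segment, then merge as in the quoted loop.

    `stack` is a list of segment sizes (document counts); the end of the
    list is the top of the stack. Returns the number of merges performed.
    """
    stack.append(1)  # index the incoming document as its own segment
    merges = 0
    size = 1
    while size < max_size:
        top = stack[-b:]  # the (up to) b segments on top of the stack
        if len(top) == b and all(s == size for s in top):
            del stack[-b:]          # pop the b equal-sized segments
            stack.append(size * b)  # push the merged segment
            merges += 1
        else:
            break
        size *= b
    return merges


def index_documents(n, b=10):
    """Feed n documents through add_document; return (stack, total merges)."""
    stack = []
    total_merges = 0
    for _ in range(n):
        total_merges += add_document(stack, b)
    return stack, total_merges
```

For example, `index_documents(1234)` leaves segments of sizes 1000, 100, 100, 10, 10, 10, 1, 1, 1, 1 after 136 merges. In this model segment sizes are always powers of b, so a document that has been merged k times sits in a segment of size b^k. With N documents indexed, no document is rewritten more than about log_b(N) times, and the number of segments on the stack stays small.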
How does this algorithm provide optimized indexing?
Does Lucene use a B-tree or a similar algorithm for indexing, or does it have its own particular algorithm?
Thank you for reading my post.
© Stack Overflow or respective owner