How to optimize indexing of a large number of DB records using Zend_Lucene and Zend_Paginator

Posted by jdichev on Stack Overflow
Published on 2010-04-23T13:39:10Z

So I have this script that is deployed and run using Cron on a host and indexes all the records in a database table. The index is later used both for the front end of the site and for back-end operations.

After the operation, the index is about 3-4 MB.

The problem is it takes a lot of resources (CPU: 30+ and a good chunk of memory) and slows the machine down. My question is about how to optimize the operation described below:

First there is a select query built using the Zend Framework API. This query is then passed to a Paginator factory that returns a paginator, which I am using to limit the number of items being indexed at once so that I do not iterate over too many items. The script iterates over the current items in the paginator object using a foreach loop until it reaches the end of the page, then it fetches the items for the next page and starts the loop again.
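For illustration, here is a minimal sketch of how the loop described above might look. It is not the actual script: the function name, the `$buildDocument` callback and the page size of 500 are hypothetical, and the `commit()` call after each page is an assumption that the post does not mention. `$paginator` is expected to behave like a Zend_Paginator and `$index` like a Zend_Search_Lucene index.

```php
<?php
/**
 * Index every row of a paginated result set, one page at a time.
 *
 * $paginator is assumed to behave like Zend_Paginator and $index like
 * Zend_Search_Lucene. $buildDocument is a hypothetical callback that
 * turns one row into a document the index accepts.
 */
function indexAllPages($paginator, $index, $buildDocument, $itemsPerPage = 500)
{
    $paginator->setItemCountPerPage($itemsPerPage);

    // count() on a Zend_Paginator returns the number of pages, not rows.
    $pageCount = count($paginator);
    $indexed = 0;

    for ($page = 1; $page <= $pageCount; $page++) {
        $paginator->setCurrentPageNumber($page);

        foreach ($paginator->getCurrentItems() as $row) {
            $index->addDocument(call_user_func($buildDocument, $row));
            $indexed++;
        }

        // Assumption, not stated in the post: write the buffered documents
        // out after every page instead of once at the very end.
        $index->commit();
    }

    return $indexed;
}
```

With Zend Framework loaded, this would be called with the result of `Zend_Paginator::factory($select)` as the first argument and an index opened with `Zend_Search_Lucene::open()` or created with `Zend_Search_Lucene::create()` as the second. The function returns the number of rows it passed to the index.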

I suspect this overhead is caused by Zend_Lucene, but I have no idea how it could be improved.

© Stack Overflow or respective owner
