How to optimize indexing of a large number of DB records using Zend_Lucene and Zend_Paginator
- by jdichev
I have a script that is run by cron on a host and indexes all the records in a database table. The index is later used both for the front end of the site and for back-end operations.
After the operation, the index is about 3-4 MB.
The problem is that it takes a lot of resources (CPU: 30+ and a good chunk of memory) and slows the machine down. My question is how to optimize the operation described below:
First, a select query is built using the Zend Framework API. This query is then passed to the Paginator factory, which returns a paginator that I use to limit how many items are indexed at a time, so that I do not iterate over too many items at once.
The script iterates over the current page of items in the paginator object with a foreach loop. When it reaches the end of the page, it fetches the items for the next page and runs the loop again from the beginning, until all pages are done.
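For reference, the loop looks roughly like the sketch below. This is a simplified reconstruction, not the exact script: the connection settings, the table and column names, the index path, the choice of field types, and the page size of 500 are all placeholders I made up for illustration.

```php
<?php
// Simplified sketch of the cron script (Zend Framework 1).
// Everything marked "placeholder" is an assumption, not the real value.
require_once 'Zend/Db.php';
require_once 'Zend/Paginator.php';
require_once 'Zend/Search/Lucene.php';

// Placeholder connection settings.
$db = Zend_Db::factory('Pdo_Mysql', array(
    'host'     => 'localhost',
    'username' => 'user',
    'password' => 'secret',
    'dbname'   => 'site',
));

// Select query built with the Zend Framework API (placeholder table/columns).
$select = $db->select()->from('records', array('id', 'title', 'body'));

// The paginator limits how many rows are loaded per batch.
$paginator = Zend_Paginator::factory($select);
$paginator->setItemCountPerPage(500); // placeholder page size

// Create a fresh index at a placeholder path.
$index = Zend_Search_Lucene::create('/path/to/index');

// count() on a paginator returns the number of pages.
$pageCount = count($paginator);

for ($page = 1; $page <= $pageCount; $page++) {
    $paginator->setCurrentPageNumber($page);

    // Iterating the paginator yields the rows of the current page only.
    foreach ($paginator as $row) {
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::keyword('id', $row['id']));
        $doc->addField(Zend_Search_Lucene_Field::text('title', $row['title']));
        $doc->addField(Zend_Search_Lucene_Field::unStored('body', $row['body']));
        $index->addDocument($doc);
    }
}

// Flush any buffered documents to disk.
$index->commit();
```

Every record goes through addDocument() once per run, which is where I expect most of the time to be spent.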
I suspect the overhead is caused by Zend_Lucene (Zend_Search_Lucene), but I have no idea how it could be improved.