How can I make hundreds of simultaneously running processes communicate with a database through one
- by Olfan
Long speech short: How can I make hundreds of simultaneously running processes communicate with a database through one or few permanent sessions?
The whole story:
I once built a number crunching engine that handles vast amounts of large data files by forking off one child after another giving each a small number of files to work on. File locking, progress monitoring and result propagation happen in an Oracle database which all (sub-)processes access at various times using an application-specific module which encapsulates DBI.
This worked well at first, but now with higher volumes of input data, the number of database sessions (one per child, and they can be very short-lived) constantly being opened and closed is becoming an issue. I now want to centralise database access so that there are only one or few fixed database sessions which handle all database access for all the (sub-)processes. The presence of the database abstraction module should make the changes easy because the function calls in the worker instances can stay the same. My problem is that I cannot think of a suitable way to enhance said module in order to establish communication between all the processes and the database connector(s).
I thought of message queueing, but couldn't come up with a way of connecting a large herd of requestors with one or few database connectors in a way so that bidirectional communication is possible (for collecting the query result).
An asynchronous approach could help here in that all requests are written to the same queue and the database connector servicing the request will "call back" to submit the result. But my mind fails me in generating an image clear enough so that I can paint into code.
Threading instead of forking might have given me an easier start, but this would now require massive changes to the code base that I'm not prepared to do to a live system.
The more I think of it, the more the base idea looks like a pre-forked web server to me only that it doesn't serve web pages but database queries. Any ideas on what to dig into, and where? Sample (pseudo) code to inspire me, links to possibly related articles, ready solutions on CPAN maybe?