"It is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail." - Abraham Maslow
I need to write a tool to dump a large hierarchical (SQL) database to XML. The hierarchy consists of a Person table with subsidiary Address, Phone, etc. tables.
I have to dump thousands of rows, so I would like to do so incrementally and not keep the whole XML file in memory.
I would like to isolate the impure (side-effecting) code to a small portion of the application.
I am thinking that this might be a good opportunity to explore FP and concurrency in Clojure. I can also show the benefits of immutable data and multi-core utilization to my skeptical co-workers.
I'm not sure what the overall architecture of the application should look like. I am thinking that I can use an impure function to retrieve the database rows and return them as a lazy sequence, which can then be processed by a pure function that turns each row into an XML fragment.
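A minimal sketch of that split, assuming an in-memory stand-in for the database: `fetch-people`, `row->xml`, `escape-xml`, and `write-dump!` are hypothetical names made up for illustration, and a real `fetch-people` would read from a JDBC result set instead.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for the impure database read. A real version
;; would pull rows from a result set; here it is a lazy seq of maps.
(defn fetch-people []
  (map (fn [id] {:id id :name (str "Person " id)}) (range 1 4)))

;; Pure: escape the characters that are unsafe in XML text content.
(defn escape-xml [s]
  (-> (str s)
      (str/replace "&" "&amp;")
      (str/replace "<" "&lt;")
      (str/replace ">" "&gt;")))

;; Pure: turn one row (a map of column -> value) into an XML fragment.
(defn row->xml [tag row]
  (str "<" tag ">"
       (apply str (for [[k v] row]
                    (str "<" (name k) ">" (escape-xml v) "</" (name k) ">")))
       "</" tag ">"))

;; Lazy: no fragment is built until something consumes the sequence.
(def fragments (map #(row->xml "person" %) (fetch-people)))

;; Impure: write fragments one at a time, so the whole document is
;; never assembled as a single string.
(defn write-dump! [^java.io.Writer out frags]
  (.write out "<people>\n")
  (doseq [f frags]
    (.write out (str f "\n")))
  (.write out "</people>\n"))
```

Only `fetch-people` and `write-dump!` touch the outside world; everything between them is a plain function from data to strings. One caveat: `fragments` is a top-level `def` here only for demonstration, and it holds on to the head of the sequence; for a large dump the sequence would be passed straight to `write-dump!` instead.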
For each Person row, I can create a Future and have several processed in parallel (the output order does not matter).
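One way this could look, sketched under assumptions: `person->xml` is a hypothetical pure function, and `dump-parallel!` and the batch size are made up for illustration. Creating a future for every row up front would realize the whole lazy sequence, so this version starts one batch of futures at a time and writes the results from a single thread.

```clojure
;; Hypothetical pure function; the real one would build the nested XML.
(defn person->xml [person]
  (str "<person id=\"" (:id person) "\"/>"))

;; Process people in batches of batch-size. mapv is eager, so every
;; future in a batch starts immediately; deref (@) then blocks until
;; each result is ready. All writing happens on the calling thread,
;; so fragments never interleave in the output.
(defn dump-parallel! [^java.io.Writer out batch-size people]
  (doseq [batch (partition-all batch-size people)]
    (let [tasks (mapv #(future (person->xml %)) batch)]
      (doseq [t tasks]
        (.write out (str @t "\n"))))))
```

Called as `(dump-parallel! writer 8 people)`, this keeps at most one batch of results in memory. Because futures run on Clojure's agent thread pool, a standalone program usually calls `(shutdown-agents)` before exiting so the JVM does not linger. The built-in `pmap` is a shorter alternative that also limits how far ahead it computes.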
As each Person is processed, the task will retrieve the appropriate rows from the Address, Phone, etc. tables and generate the nested XML.
I can use a generic function to process most of the tables, relying on database metadata to get the column information, with special functions for the few tables that need custom processing. These functions could be listed in a map (table name -> function).
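A rough sketch of that dispatch map, assuming the column list has already been read from the database metadata. All names here (`generic-table->xml`, `phone->xml`, `custom-handlers`, `table->xml`) and the `phone` layout are hypothetical, and XML escaping is left out to keep the example short.

```clojure
;; Generic handler: one child element per column, in metadata order.
(defn generic-table->xml [table columns row]
  (str "<" table ">"
       (apply str (for [c columns]
                    (str "<" (name c) ">" (get row c) "</" (name c) ">")))
       "</" table ">"))

;; Custom handler for a table that needs a different layout.
;; It takes the same arguments so both kinds are interchangeable.
(defn phone->xml [_table _columns row]
  (str "<phone type=\"" (:type row) "\">" (:number row) "</phone>"))

;; Table name -> handler. Tables not listed use the generic handler.
(def custom-handlers {"phone" phone->xml})

(defn table->xml [table columns row]
  ((get custom-handlers table generic-table->xml) table columns row))
```

For example, `(table->xml "address" [:street :city] {:street "1 Main St" :city "Springfield"})` goes through the generic handler, while `(table->xml "phone" [:type :number] {:type "home" :number "555-0100"})` picks up the custom one. Adding another special case is then a one-line change to the map.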
Am I going about this in the right way? I can easily fall back to doing it in OO using Java, but that would be no fun.
BTW, are there any good books on FP patterns or architecture? I have several good books on Clojure, Scala, and F#, but although each covers its language well, none looks at the "big picture" of functional programming design.