How to split xml to header and items using smooks?
Posted
by
palto
on Stack Overflow
See other posts from Stack Overflow
or by palto
Published on 2012-03-01T15:03:11Z
Indexed on
2012/06/12
22:40 UTC
Read the original article
Hit count: 324
I have a xml file roughly like this:
<batch>
<header>
<headerStuff />
</header>
<contents>
<timestamp />
<invoices>
<invoice>
<invoiceStuff />
</invoice>
<!-- Insert 1000 invoice elements here -->
</invoices>
</contents>
</batch>
I would like to split that file to 1000 files with the same headerStuff and only one invoice. Smooks documentation is very proud of the possibilities of transformations, but unfortunately I don't want to do those.
The only way I've figured how to do this is to repeat the whole structure in freemarker. But that feels like repeating the structure unnecessarily. The header has like 30 different tags so there would be lots of work involved also.
What I currently have is this:
<?xml version="1.0" encoding="UTF-8"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
xmlns:calc="http://www.milyn.org/xsd/smooks/calc-1.1.xsd"
xmlns:frag="http://www.milyn.org/xsd/smooks/fragment-routing-1.2.xsd"
xmlns:file="http://www.milyn.org/xsd/smooks/file-routing-1.1.xsd">
<params>
<param name="stream.filter.type">SAX</param>
</params>
<frag:serialize fragment="INVOICE" bindTo="invoiceBean" />
<calc:counter countOnElement="INVOICE" beanId="split_calc" start="1" />
<file:outputStream openOnElement="INVOICE" resourceName="invoiceSplitStream">
<file:fileNamePattern>invoice-${split_calc}.xml</file:fileNamePattern>
<file:destinationDirectoryPattern>target/invoices</file:destinationDirectoryPattern>
<file:highWaterMark mark="10"/>
</file:outputStream>
<resource-config selector="INVOICE">
<resource>org.milyn.routing.io.OutputStreamRouter</resource>
<param name="beanId">invoiceBean</param>
<param name="resourceName">invoiceSplitStream</param>
<param name="visitAfter">true</param>
</resource-config>
</smooks-resource-list>
That creates files for each invoice tag, but I don't know how to continue from there to get the header also in the file.
EDIT:
The solution has to use Smooks. We use it in an application as a generic splitter and just create different smooks configuration files for different types of input files.
© Stack Overflow or respective owner