Java split xml file

Posted by CC on Stack Overflow
Published on 2010-05-11T12:56:09Z Indexed on 2010/05/11 13:14 UTC

Hi all,

I'm working on a piece of code that splits files. I want to split flat files (that already works fine) and XML files. The split is driven by the number of output files: given one input file, I want to split it into x files, where x is a parameter. I take the size of the input file and divide it by the number of output files to get the maximum size of each part. My solution was then to read the file through a BufferedReader, like this:

while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
    // write the n characters just read to the current output file
}
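
For reference, the size-based split of a flat file described above might look roughly like the following. This is only a minimal sketch, not the original code: the class name `FlatFileSplitter`, the helper `partSize`, and the `.partN` file suffix are all hypothetical, and it reads bytes through an `InputStream` (rather than characters through a `BufferedReader`) so that the part sizes match the file size in bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class FlatFileSplitter {

    /** Size of each part: the file size divided by the number of parts, rounded up. */
    static long partSize(long totalBytes, int parts) {
        return (totalBytes + parts - 1) / parts;
    }

    /** Writes <name>.part0, <name>.part1, ... next to the source file (hypothetical naming). */
    static void split(Path source, int parts) throws IOException {
        long maxPerPart = partSize(Files.size(source), parts);
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(source)) {
            for (int i = 0; i < parts; i++) {
                Path target = source.resolveSibling(source.getFileName() + ".part" + i);
                try (OutputStream out = Files.newOutputStream(target)) {
                    long remaining = maxPerPart;
                    int n;
                    // Never ask for more bytes than this part may still hold.
                    while (remaining > 0
                            && (n = in.read(buffer, 0, (int) Math.min(buffer.length, remaining))) != -1) {
                        out.write(buffer, 0, n);
                        remaining -= n;
                    }
                }
            }
        }
    }
}
```

Capping each `read` at the space left in the current part keeps every part at or below the computed size, so no data has to be pushed back into the stream.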

The main problem is that an XML file cannot be cut at an arbitrary position. It has to be split on blocks, each delimited by a start tag and an end tag:

<start tag>
bla bla xml stuff
</end tag>

So I cannot cut a block in the middle. If I am partway through a block when the size of the current output file exceeds my maximum, I have to keep reading until the end of that block's end tag and only then start the next file.
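
The cutting rule itself (once the maximum is reached, carry on to the end of the current block's end tag, then start a new part) can be illustrated on an in-memory string. This is a sketch under stated assumptions, not a definitive implementation: the names `EndTagCutter` and `splitAtEndTags` are hypothetical, the end tag is assumed to be a known, non-empty literal such as `</record>`, blocks are assumed not to nest inside blocks of the same name, sizes are counted in characters rather than bytes, and any root element or XML declaration around the blocks is ignored.

```java
import java.util.ArrayList;
import java.util.List;

class EndTagCutter {

    /**
     * Cuts xml into parts of at least maxChars characters (except possibly the
     * last one), cutting only right after a complete endTag.
     */
    static List<String> splitAtEndTags(String xml, String endTag, int maxChars) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        while (start < xml.length()) {
            // Position at which the current part reaches its maximum size.
            int limit = Math.min(start + maxChars, xml.length());
            // An end tag that finishes at or after the limit starts no earlier than this.
            int searchFrom = Math.max(start, limit - endTag.length());
            int tagPos = xml.indexOf(endTag, searchFrom);
            // No further end tag: the remainder becomes the last part.
            int cut = (tagPos == -1) ? xml.length() : tagPos + endTag.length();
            parts.add(xml.substring(start, cut));
            start = cut;
        }
        return parts;
    }
}
```

With the input `<r>a</r><r>bb</r><r>c</r>`, the end tag `</r>`, and a maximum of 10 characters, the limit falls inside the second block, so the first part runs to the end of that block and the third block forms the second part.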

The problem is that there are all sorts of cases, which makes it a bit difficult to find the end tag while still looping and reading the next buffer:

- the read buffer ends in the middle of the end tag
- the read buffer ends exactly at the end of the end tag, with no other character after it
- etc.

Sometimes the end tag only appears once the end of one buffer is concatenated with the start of the next one. I hope you get the idea.
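
One way to avoid treating these cases separately is to remember, across `read` calls, how many characters of the end tag have been matched so far; the buffer boundary then no longer matters. The following is a hedged sketch of that idea, not a tested production splitter. `StreamingXmlSplitter`, `EndTagMatcher`, and the `bufferSize` parameter are hypothetical names, parts are collected in memory only to keep the example short, sizes are counted in characters, and the end tag is assumed to be a known literal that does not also occur inside comments, CDATA sections, or nested blocks of the same name.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class StreamingXmlSplitter {

    /** Remembers how much of the end tag has been seen, even across read() calls. */
    static final class EndTagMatcher {
        private final String endTag;
        private int matched = 0; // characters of endTag matched so far

        EndTagMatcher(String endTag) {
            this.endTag = endTag;
        }

        /** Feeds one character; returns true when it completes the end tag. */
        boolean feed(char c) {
            if (c == endTag.charAt(matched)) {
                matched++;
            } else {
                // Simple restart is enough: '<' can only be the first character of a closing tag.
                matched = (c == endTag.charAt(0)) ? 1 : 0;
            }
            if (matched == endTag.length()) {
                matched = 0;
                return true;
            }
            return false;
        }
    }

    /** A real splitter would write each part to its own file instead of a StringBuilder. */
    static List<String> split(Reader in, String endTag, int maxChars, int bufferSize) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        EndTagMatcher matcher = new EndTagMatcher(endTag);
        char[] buffer = new char[bufferSize]; // bufferSize is assumed to be > 0
        try {
            int n;
            while ((n = in.read(buffer, 0, buffer.length)) != -1) {
                int from = 0; // start of the not-yet-copied region of this buffer
                for (int i = 0; i < n; i++) {
                    boolean blockEnded = matcher.feed(buffer[i]);
                    int sizeSoFar = current.length() + (i + 1 - from);
                    if (blockEnded && sizeSoFar >= maxChars) {
                        // The part is full and a block has just closed: cut here.
                        current.append(buffer, from, i + 1 - from);
                        parts.add(current.toString());
                        current.setLength(0);
                        from = i + 1;
                    }
                }
                current.append(buffer, from, n - from); // carry the rest into the next round
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }
}
```

Because the matcher keeps its state between buffers, an end tag that is cut in half by a buffer boundary is detected in exactly the same way as one that lies entirely inside a single buffer, and each character is examined only once.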

My question is: does anyone have an algorithm that does this more accurately and handles all the special cases?

The idea is to split the file as quickly as possible.

Thanks a lot.
