Java split xml file
Posted by CC on Stack Overflow
Published on 2010-05-11T12:56:09Z
Indexed on 2010/05/11 13:14 UTC
Hi all,
I'm working on a piece of code to split files. I want to split flat files (that part is working fine) and XML files. The split is driven by a target number of output files: I have one file and I want to split it into x files (x is a parameter). I take the size of the file and divide it by the number of output files to get the maximum size of each part. My solution was to use a BufferedReader, like this:
while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
    // write the n characters just read to the current output file
}
The main problem is that I cannot cut an XML file at an arbitrary position. I have to split it on blocks, each delimited by a start XML tag and an end XML tag:
<start tag>
bla bla xml stuff
</end tag>
So I cannot cut a block in the middle. If I am partway through a block when the new file reaches my maximum size, I have to keep reading until the end tag of that block, and only then start the next file.
The problem is that there are all sorts of cases, and finding the end tag is a bit difficult:
- the buffer I read stops in the middle of the end tag
- the buffer I read stops exactly at the end of the end tag, with no other character after it
- etc.
All of this has to be handled while looping and reading the next buffer. Sometimes the end tag only shows up once the end of one buffer is concatenated with the start of the next one. I hope you get the idea.
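To make the cases above concrete, here is a rough sketch of the behaviour I'm after, not a finished solution. It keeps a match counter for the end tag that survives from one read to the next, so a tag cut across two buffers is still detected. Everything named here is hypothetical: the class XmlBlockSplitter, the method split, and the nextPart supplier that hands out the writer for each new part. It assumes the end tag is known in advance as a literal, non-empty string, that sizes are counted in characters rather than bytes, and that the tag text never appears inside comments or CDATA. The simple reset of the counter is only valid because '<' occurs once, at the start of an end tag. It also ignores the XML declaration and any enclosing root element.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.function.Supplier;

public class XmlBlockSplitter {

    /**
     * Copies everything from 'in' into a series of parts. A new part is started
     * only right after a complete end tag, and only once the current part holds
     * at least maxChars characters. Returns the number of parts written.
     */
    static int split(Reader in, String endTag, long maxChars, Supplier<Writer> nextPart)
            throws IOException {
        char[] buffer = new char[8192];
        Writer out = null;  // opened lazily, so no empty trailing part is created
        long written = 0;   // characters already written to the current part
        int matched = 0;    // characters of endTag matched so far (kept across reads)
        int parts = 0;
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            int start = 0;  // first character of this buffer not yet written
            for (int i = 0; i < n; i++) {
                char c = buffer[i];
                if (c == endTag.charAt(matched)) {
                    matched++;
                } else {
                    // Safe only because '<' appears once, at index 0 of the end tag.
                    matched = (c == endTag.charAt(0)) ? 1 : 0;
                }
                if (matched == endTag.length()) {
                    matched = 0;
                    long size = written + (i + 1 - start);
                    if (size >= maxChars) {
                        // The block is complete and the part is big enough: cut here.
                        if (out == null) { out = nextPart.get(); parts++; }
                        out.write(buffer, start, i + 1 - start);
                        out.close();
                        out = null;
                        written = 0;
                        start = i + 1;
                    }
                }
            }
            if (start < n) {
                // Carry the rest of this buffer into the current (or a new) part.
                if (out == null) { out = nextPart.get(); parts++; }
                out.write(buffer, start, n - start);
                written += n - start;
            }
        }
        if (out != null) {
            out.close();
        }
        return parts;
    }
}
```

For real files, nextPart would open a BufferedWriter on the next output file (wrapping the checked IOException, since Supplier cannot throw it). Note that a part can end up larger than maxChars, because the cut waits for the end of the current block, and that whitespace following an end tag goes into the next part.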
My question is: does anyone have an algorithm that does this more accurately and handles all the special cases?
The idea is to split the file as quickly as possible.
Thanks a lot.
© Stack Overflow or respective owner