Cutting a large XML file into smaller pieces in C#
- by NDraskovic
I have been working on a problem for quite some time now. I have an XML file with over 50,000 records (each record has 3 levels). One of my applications uses this file to control document sending (among other information, each record holds the type of document that has to be sent to a certain person). In my application I load the XML file into an XmlDocument and then use the SelectNodes method to create an XmlNodeList from which I read the data I want.
The process is like this: our worker takes the person's ID card (a simple card with a barcode) and reads it with a barcode reader. Once the barcode value has been read, my application finds the person with that ID in the XML file and stores the document type in a string variable. Then the worker takes the document and reads its barcode. If the value of the document's barcode matches the value in the string variable, the application records that a document of type xxxxxxxx will be sent to the person with ID yyyyyyyyy. The code is very simple and works perfectly for now. This is how it looks:
In the textBox1_TextChanged event (the worker reads the person's ID):
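foreach (XmlNode node in NodeList)
{
    // Find the record whose ID attribute matches the scanned ID card.
    if (String.Compare(node.Attributes.GetNamedItem("ID").Value, textBox1.Text) == 0)
    {
        // ChildNodes (not ChildNode) is the XmlNode property that holds the children.
        ControlString = node.ChildNodes[3].FirstChild.Attributes.GetNamedItem("doctype").Value;
        break;
    }
}
textBox2.Focus();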
And in the textBox2_TextChanged event (the worker reads the document's barcode):
if (String.Compare(textBox2.Text, ControlString) == 0)
{
    // Create a record and insert it into a SQL database
}
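For the lookup itself, one option that might scale better than scanning the node list on every barcode read is to build a dictionary from ID to document type once, right after the file is loaded. The following is only a minimal sketch under assumptions: the class name DocTypeIndex, the method name Build and the recordXPath parameter are hypothetical, and the record layout (an "ID" attribute, with "doctype" on the first child of the fourth child) is taken from the loop above.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, not part of the original application.
static class DocTypeIndex
{
    // Builds an ID -> doctype map in a single pass over the records.
    // recordXPath is an assumption: pass whatever XPath already selects the records.
    public static Dictionary<string, string> Build(XmlDocument doc, string recordXPath)
    {
        var index = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (XmlNode node in doc.SelectNodes(recordXPath))
        {
            string id = node.Attributes?["ID"]?.Value;
            // Same path as in the loop above: fourth child, then its first child.
            string docType = node.ChildNodes[3]?.FirstChild?.Attributes?["doctype"]?.Value;

            // Skip records that do not have the expected shape.
            if (id != null && docType != null)
            {
                index[id] = docType; // a later duplicate ID overwrites an earlier one
            }
        }
        return index;
    }
}
```

With such an index, the textBox1_TextChanged handler could call index.TryGetValue(textBox1.Text, out ControlString) instead of looping, so the cost of each scan would no longer grow with the number of records. The whole file still has to fit in memory as an XmlDocument while the index is built.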
My question is: how will my application perform with larger XML files (I was told the XML file might contain up to 500,000 records)? Will this approach still be valid, or will I need to cut the file into smaller files? If I have to cut it, please give me an idea with some code samples. I have tried to do it like this:
Reading an entire record and storing it in a string:
private void WriteXml(XmlNode record)
{
    tempXML = record.InnerXml;
    // Rebuild the record's opening tag, keeping only its "code" attribute.
    temp = "<" + record.Name + " code=\"" + record.Attributes.GetNamedItem("code").Value + "\">" + Environment.NewLine;
    temp += tempXML + Environment.NewLine;
    temp += "</" + record.Name + ">";
    SmallerXMLDocument += temp + Environment.NewLine;
    temp = "";
    i++;
}
tempXML, temp and SmallerXMLDocument are all string variables.
Then, in the button_Click method, I load the XML file into an XmlNodeList (again using the XmlDocument.SelectNodes method) and try to build one big string value that holds all the records, like this:
foreach (XmlNode node in nodes)
{
    // Keep only the records whose document type matches doctype1.
    if (String.Compare(node.ChildNodes[3].FirstChild.Attributes.GetNamedItem("doctype").Value, doctype1) == 0)
    {
        WriteXml(node);
    }
}
My idea was to build a string value (in this case called SmallerXMLDocument) and, once I have passed through the entire XML file, simply write the value of that string into a new file. This works, but only for files with up to 2000 records (and mine has far more than that). So, if I need to cut the file into smaller pieces, what would be the best way to do it (keep in mind that there could be up to half a million records in an XML file)?
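One direction that seems worth considering is to avoid the big string entirely and stream the records from an XmlReader straight into an XmlWriter, starting a new output file every N records. The following is a rough sketch under assumptions rather than a tested solution for this file: the class name XmlSplitter, the method name Split and all of its parameters are hypothetical, and it assumes that the record elements share one name that differs from the name of the root element.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of the original application.
static class XmlSplitter
{
    // Copies every <recordName> element from inputPath into numbered files
    // (outputPrefix + "1.xml", "2.xml", ...), each wrapped in a <rootName>
    // element and holding at most recordsPerFile records.
    // Returns the number of files written.
    public static int Split(string inputPath, string outputPrefix,
                            string rootName, string recordName, int recordsPerFile)
    {
        if (recordsPerFile < 1) throw new ArgumentOutOfRangeException(nameof(recordsPerFile));

        int fileCount = 0;
        int recordsInFile = 0;
        XmlWriter writer = null;
        try
        {
            using (XmlReader reader = XmlReader.Create(inputPath))
            {
                while (!reader.EOF)
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == recordName)
                    {
                        if (writer == null)
                        {
                            // Start the next output file lazily, only when a record arrives.
                            fileCount++;
                            writer = XmlWriter.Create(outputPrefix + fileCount + ".xml");
                            writer.WriteStartElement(rootName);
                        }

                        // Copies the whole record and leaves the reader on the node
                        // after it, so Read() must not be called in this branch.
                        writer.WriteNode(reader, false);
                        recordsInFile++;

                        if (recordsInFile == recordsPerFile)
                        {
                            writer.WriteEndElement();
                            writer.Dispose();
                            writer = null;
                            recordsInFile = 0;
                        }
                    }
                    else
                    {
                        reader.Read();
                    }
                }
            }

            // Close the root element of the last, partially filled file.
            if (writer != null) writer.WriteEndElement();
        }
        finally
        {
            if (writer != null) writer.Dispose();
        }
        return fileCount;
    }
}
```

A call such as XmlSplitter.Split("big.xml", "part", "records", "record", 50000) would then produce part1.xml, part2.xml and so on (the file and element names here are placeholders). Because only one record is held in memory at a time, memory use should stay roughly flat regardless of how many records the input contains. The same loop could also filter by doctype before copying, although that would require inspecting each record before writing it.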
Thanks