XDocument + IEnumerable is causing out of memory exception in System.Xml.Linq.dll
- by Manatherin
Basically I have a program which, when it starts loads a list of files (as FileInfo) and for each file in the list it loads a XML document (as XDocument).
The program then reads data out of it into a container class (storing as IEnumerables), at which point the XDocument goes out of scope.
The program then exports the data from the container class to a database. After the export the container class goes out of scope, however, the garbage collector isn't clearing up the container class which, because its storing as IEnumerable, seems to lead to the XDocument staying in memory (Not sure if this is the reason but the task manager is showing the memory from the XDocument isn't being freed).
As the program is looping through multiple files eventually the program is throwing a out of memory exception. To mitigate this ive ended up using
System.GC.Collect();
to force the garbage collector to run after the container goes out of scope. this is working but my questions are:
Is this the right thing to do? (Forcing the garbage collector to run seems a bit odd)
Is there a better way to make sure the XDocument memory is being disposed?
Could there be a different reason, other than the IEnumerable, that the document memory isnt being freed?
Thanks.
Edit: Code Samples:
Container Class:
public IEnumerable<CustomClassOne> CustomClassOne { get; set; }
public IEnumerable<CustomClassTwo> CustomClassTwo { get; set; }
public IEnumerable<CustomClassThree> CustomClassThree { get; set; }
...
public IEnumerable<CustomClassNine> CustomClassNine { get; set; }</code></pre>
Custom Class:
public long VariableOne { get; set; }
public int VariableTwo { get; set; }
public DateTime VariableThree { get; set; }
...
Anyway that's the basic structures really. The Custom Classes are populated through the container class from the XML document. The filled structures themselves use very little memory.
A container class is filled from one XML document, goes out of scope, the next document is then loaded e.g.
public static void ExportAll(IEnumerable<FileInfo> files)
{
foreach (FileInfo file in files)
{
ExportFile(file);
//Temporary to clear memory
System.GC.Collect();
}
}
private static void ExportFile(FileInfo file)
{
ContainerClass containerClass = Reader.ReadXMLDocument(file);
ExportContainerClass(containerClass);
//Export simply dumps the data from the container class into a database
//Container Class (and any passed container classes) goes out of scope at end of export
}
public static ContainerClass ReadXMLDocument(FileInfo fileToRead)
{
XDocument document = GetXDocument(fileToRead);
var containerClass = new ContainerClass();
//ForEach customClass in containerClass
//Read all data for customClass from XDocument
return containerClass;
}
Forgot to mention this bit (not sure if its relevent), the files can be compressed as .gz so I have the GetXDocument() method to load it
private static XDocument GetXDocument(FileInfo fileToRead)
{
XDocument document;
using (FileStream fileStream = new FileStream(fileToRead.FullName, FileMode.Open, FileAccess.Read, FileShare.Read))
{
if (String.Compare(fileToRead.Extension, ".gz", true) == 0)
{
using (GZipStream zipStream = new GZipStream(fileStream, CompressionMode.Decompress))
{
document = XDocument.Load(zipStream);
}
}
else
{
document = XDocument.Load(fileStream);
}
return document;
}
}
Hope this is enough information.
Thanks
Edit: The System.GC.Collect() is not working 100% of the time, sometimes the program seems to retain the XDocument, anyone have any idea why this might be?