iText PdfReader Extremely Slow to Open
- by Wbmstrmjb
I have some code that combines a few pages of AcroForms (with the AcroFields intact) and then, at the end, writes some JavaScript to the entire document.
The PdfReader in the function that adds the JS takes extremely long to instantiate (about 12 seconds for a 1 MB file).
Here is the code (pretty simple):
public static byte[] AddJavascript(byte[] document, string js)
{
    PdfReader reader = new PdfReader(new RandomAccessFileOrArray(document), null);
    MemoryStream msOutput = new MemoryStream();
    PdfStamper stamper = new PdfStamper(reader, msOutput);
    PdfWriter writer = stamper.Writer;
    writer.AddJavaScript(js);
    stamper.Close();
    reader.Close();
    byte[] withJS = msOutput.GetBuffer();
    return withJS;
}
I have benchmarked the above and the line that is slow is the first one. I have tried reading it from a file instead of memory and tried using a MemoryStream instead of the RandomAccessFileOrArray. Nothing makes it any faster.
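For anyone who wants to reproduce the measurement, here is a minimal sketch of how a single step can be timed in isolation with System.Diagnostics.Stopwatch. StepTimer and TimeMs are hypothetical names, not part of iTextSharp, and the Thread.Sleep call is only a stand-in for the PdfReader constructor call so that the snippet runs without the library.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class StepTimer
{
    // Hypothetical helper: runs one step and returns its wall-clock time in milliseconds.
    public static long TimeMs(Action step)
    {
        Stopwatch sw = Stopwatch.StartNew();
        step();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Stand-in workload. In AddJavascript, the step would instead be
        // the statement that constructs the PdfReader.
        long ms = TimeMs(() => Thread.Sleep(50));
        Console.WriteLine("Step took " + ms + " ms");
    }
}
```

Wrapping each statement of AddJavascript in its own TimeMs call is one way to confirm which line dominates the total time.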
If I add JS to a single page document, it is very fast. So my thought is that the code that combines the pages is somehow making the PDF slow to read for the PdfReader.
Here is the combine code:
public static byte[] CombineFiles(List<byte[]> sourceFiles)
{
    MemoryStream output = new MemoryStream();
    PdfCopyFields copier = new PdfCopyFields(output);
    try
    {
        output.Position = 0;
        foreach (var fileBytes in sourceFiles)
        {
            PdfReader fileReader = new PdfReader(fileBytes);
            copier.AddDocument(fileReader);
        }
    }
    catch (Exception exception)
    {
        //throw
    }
    finally
    {
        copier.Close();
    }
    byte[] concat = output.GetBuffer();
    return concat;
}
I am using PdfCopyFields because I need to preserve the form fields, so I cannot use PdfCopy or PdfSmartCopy. This combine code is very fast (a few ms) and produces working documents. The AddJS code above is called after it, and opening the PdfReader is the slow piece.
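One detail that may be worth ruling out: both methods above return MemoryStream.GetBuffer(), which hands back the stream's entire internal buffer, including any unused capacity past the written data, whereas ToArray() returns only the bytes actually written. The sketch below uses only System.IO to show the difference in length. BufferDemo is a hypothetical name, and whether the trailing bytes have anything to do with the slow PdfReader open is an assumption, not something confirmed here.

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Writes 301 bytes in two steps and returns { ToArray length, GetBuffer length }.
    public static int[] Lengths()
    {
        MemoryStream ms = new MemoryStream();
        ms.Write(new byte[300], 0, 300);
        ms.WriteByte(1); // a second write forces the internal buffer to grow

        byte[] exact = ms.ToArray();  // only the bytes that were written
        byte[] raw = ms.GetBuffer();  // whole internal buffer, may include unused capacity

        return new int[] { exact.Length, raw.Length };
    }

    public static void Main()
    {
        int[] lengths = Lengths();
        Console.WriteLine("ToArray length:   " + lengths[0]);
        Console.WriteLine("GetBuffer length: " + lengths[1]);
    }
}
```

If the two lengths differ for the combined document, swapping GetBuffer() for ToArray() in CombineFiles and AddJavascript would be a cheap experiment to see whether the open time changes.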
Any ideas?