
All,

I have the code below for transforming an XML document with an XSLT. The problem is that when the XML document is around 12 MB, the C# code runs out of memory. Is there a different way of doing the transform that does not consume that much memory?

public string Transform(XPathDocument myXPathDoc, XslCompiledTransform myXslTrans)
{
    try
    {
        var stm = new MemoryStream();
        myXslTrans.Transform(myXPathDoc, null, stm);
        var sr = new StreamReader(stm);
        return sr.ReadToEnd();
    }
    catch (Exception e)
    {
        //Log the Exception
    }
}

Here is the stack trace:

at System.String.GetStringForStringBuilder(String value, Int32 startIndex, Int32 length, Int32 capacity)
at System.Text.StringBuilder.GetNewString(String currentString, Int32 requiredLength)
at System.Text.StringBuilder.Append(Char[] value, Int32 startIndex, Int32 charCount)
at System.IO.StreamReader.ReadToEnd()
at Transform(XPathDocument myXPathDoc, XslCompiledTransform myXslTrans)
  • Can you please provide the complete exception details, i.e. the output of e.ToString() in the catch block? Can you also show your transform and a (reduced) sample input document? Commented Sep 23, 2010 at 10:20
  • And something else: When it does work with a 10MB input, how big is the resulting string? Commented Sep 23, 2010 at 11:05
  • It could be that you've got a problem with the XSLT resulting in massive output. Have you tried running the transform against the XML in Visual Studio or another tool, outside the context of the code provided? Commented Sep 23, 2010 at 14:56

6 Answers

4

The first thing I would do is to isolate the problem. Take the whole MemoryStream business out of play and stream the output to a file, e.g.:

using (XmlReader xr = XmlReader.Create(new StreamReader("input.xml")))
using (XmlWriter xw = XmlWriter.Create(new StreamWriter("output.xml")))
{
   xslt.Transform(xr, xw);
}

If you still get an out-of-memory exception (I'd bet folding money that you will), that's a pretty fair indication that the problem's not with the size of the output but rather with something in the transform itself, e.g. something that recurses infinitely like:

<xsl:template match="foo">
   <bar>
      <xsl:apply-templates select="."/>
   </bar>
</xsl:template>

3

The MemoryStream + ReadToEnd combination means you need 2 copies of the output in memory at that point. You could reduce that to 1 copy by using a StringWriter object as the target (replacing the MemoryStream + StreamReader) and calling writer.ToString() when you're done.

But that would only save you about 24 MB at best, which is still far too little to explain running out of memory. Something else must be going on.
It is impossible to say what from here; maybe your XSLT is too complicated or inefficient.


var writer = new StringWriter();
//var stm = new MemoryStream();
myXslTrans.Transform(myXPathDoc, null, writer);
//var sr = new StreamReader(stm);
//return sr.ReadToEnd();
return writer.ToString();

3 Comments

I assume that the exception already happens earlier, i.e. in myXslTrans.Transform. But without a stack trace we can only guess.
Added stack trace in the original post
Could you please provide an example of how to replace it?
2

You need

stm.Position = 0

to reset the memory stream to the beginning before reading the contents with the StreamReader. Otherwise you are trying to read content from past the end of the stream.
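
To illustrate the effect in isolation, here is a minimal sketch that does not involve XSLT at all. The class name RewindDemo, the method WriteThenRead, and the sample payload are made up for the illustration; only the MemoryStream and StreamReader behaviour is the point.

```csharp
using System.IO;
using System.Text;

static class RewindDemo
{
    // Writes a small payload into a MemoryStream, optionally rewinds it,
    // and returns whatever a StreamReader can read from the current position.
    public static string WriteThenRead(bool rewind)
    {
        using (var stm = new MemoryStream())
        {
            byte[] payload = Encoding.UTF8.GetBytes("<out/>");
            stm.Write(payload, 0, payload.Length); // Position is now at the end

            if (rewind)
            {
                stm.Position = 0; // back to the first byte
            }

            using (var sr = new StreamReader(stm))
            {
                return sr.ReadToEnd();
            }
        }
    }
}
```

Calling RewindDemo.WriteThenRead(false) returns an empty string, because the reader starts at the end of the stream, while RewindDemo.WriteThenRead(true) returns the payload.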

1 Comment

I actually had this one, but it didn't make any difference.
0

The ReadToEnd() function loads the entire stream into memory. You are better off using an XmlReader to stream the document in chunks and then running the XSLT against smaller fragments. You may also want to consider parsing the document entirely with XmlReader and not using XSLT, which is less suited to streaming data and scales less well for large files.
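
A rough sketch of the chunked approach, under some loud assumptions: the input is a long list of repeated elements that can be transformed independently, they are not nested inside each other, and the stylesheet is written to handle one such element at a time. The names ChunkedTransform, TransformEach, and elementName are hypothetical. Whether this fits depends entirely on the shape of your document and transform.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class ChunkedTransform
{
    // Streams through the input and transforms each element named
    // `elementName` on its own, so only one fragment is loaded at a time.
    // Assumption: the matching elements are independent and not nested.
    public static void TransformEach(
        XmlReader reader, XslCompiledTransform xslt,
        TextWriter output, string elementName)
    {
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element &&
                reader.LocalName == elementName)
            {
                // ReadSubtree exposes just this element and its descendants.
                using (XmlReader fragment = reader.ReadSubtree())
                {
                    xslt.Transform(fragment, null, output);
                }
                // Closing the subtree reader leaves `reader` at the end of
                // the element, so the next Read() moves past it.
            }
        }
    }
}
```

Passing a StreamWriter over a file as `output` keeps the result out of memory as well. Each call to Transform serializes its own result, so the stylesheet's xsl:output settings (for example, omitting the XML declaration) decide whether the concatenated output is usable as-is.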


0

It may or may not be related but you need to make sure you dispose your stream and reader objects. I have also added in the position = 0 that Nick Jones pointed out.

public string Transform(XPathDocument myXPathDoc, XslCompiledTransform myXslTrans)
{
    try
    {
        using (var stm = new MemoryStream())
        {
            myXslTrans.Transform(myXPathDoc, null, stm);
            // Rewind so the reader starts at the beginning of the output
            stm.Position = 0;
            using (var sr = new StreamReader(stm))
            {
                return sr.ReadToEnd();
            }
        }
    }
    catch (Exception e)
    {
        //Log the Exception
        // Without a return (or rethrow) here the method does not compile
        return null;
    }
}

4 Comments

While it is a good habit, a MemoryStream doesn't actually need Disposing.
Yes and no. As far as I understand it if any of the async methods have been called (BeginRead, BeginWrite) and have not finished you could leak event handles albeit unlikely. As you said it is good practice.
@Henk The point of implementing IDisposable is so that callers know to always dispose objects as soon as possible to release resources. IMO I don't think there's ever an argument for not doing this, or a reason not to. If you look at the implementation of MemoryStream.Dispose in reflector there are consequences to not doing this, however small. I would always consider not disposing a disposable object as a bug.
+1 Good point but I think Henk's comment was more in the context of the question pointing out disposing of the MemoryStream would have little impact in this instance.
0

Make sure you don't have any JavaScript, otherwise there is a known memory leak.

My response has validity and can avoid many errors and memory leaks. A user voted me down because he did not understand that JavaScript can be embedded in XSLT as an extension.

Here is an old article that explains how to do it. http://msdn.microsoft.com/en-us/magazine/cc302079.aspx

.NET classes hosted on a web server have a known memory leak when the XslTransform class is used with JavaScript embedded in the XSLT document via an extension. JavaScript was used to get things like dates and to do more dynamic processing. That is why I am warning those who use the JavaScript extension. This is the most likely reason for a memory leak.

Another warning: be careful with the newer XslCompiledTransform class. With my large XSLT documents, profiling showed it using about 4 times the processor time of the XslTransform class and twice its memory.
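
If you do use XslCompiledTransform, one way to make the "no script" rule explicit is to load the stylesheet with XsltSettings.Default, which leaves embedded scripts and the document() function disabled. This is only a sketch: ScriptFreeLoader and stylesheetPath are hypothetical names, and it concerns XslCompiledTransform rather than the older XslTransform class discussed above.

```csharp
using System.Xml;
using System.Xml.Xsl;

static class ScriptFreeLoader
{
    // Loads a stylesheet with the default settings, which leave both
    // embedded script blocks and the document() function disabled.
    public static XslCompiledTransform Load(string stylesheetPath)
    {
        var xslt = new XslCompiledTransform();
        xslt.Load(stylesheetPath, XsltSettings.Default, new XmlUrlResolver());
        return xslt;
    }
}
```

Scripts only become available if you pass settings with EnableScript set to true, so a code review can check for that in one place.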

