I am writing a program that parses large, unpredictable files. That part works. I have been using the code below, looping over ReadLine until the end of the file to keep the memory footprint low. My problem is an OutOfMemoryException when a single line is simply too long.

System.IO.StreamReader casereader = new System.IO.StreamReader(dumplocation);
string line;
while ((line = casereader.ReadLine()) != null)
{
    foreach (Match m in linkParser.Matches(line))
    {
        Console.Write(displaytext);
        Console.WriteLine(m.Value);
        XMLWrite.Start(m.Value, displaytext, dumplocation, line);
    }
}

XMLWrite just writes any strings that match my regex to an XML document. The regex is a simple email search. The issue occurs when ReadLine is called and the application hits an extremely long line in the file I am reading (I can see this because the memory usage in Task Manager climbs and climbs as the string 'line' is populated). Eventually the application runs out of memory and crashes.

What I want to do is read predefined blocks (e.g. 8,000 characters) and run them through the same process one at a time. That way I will always know the length of the string 'line' (8,000 chars) and should not get an out-of-memory exception. Does my logic seem sound? I am looking for the best way to implement ReadBlock, as I am currently unable to get it working.

Any help much appreciated!

  • Is the problem that you're still getting out of memory? If not, what is your question? You imply that the first block of code has the OOM problem and reading a predefined block fixes that. Commented Aug 7, 2012 at 15:42
  • And what's happening? Errors? Commented Aug 7, 2012 at 15:42
  • What error do you get? What is the problem that you have? I did not try your code in Visual Studio, but the logic seems good... Is the compiler complaining? Commented Aug 7, 2012 at 15:42
  • Your plan won't work because you may chop a word in half that was needed by your regex. You're going to have to use a parser, or change the process to 64-bit. Commented Aug 7, 2012 at 15:43
  • You probably want to take a look at this post, since you are trying to use ReadBlock and expecting full lines: social.msdn.microsoft.com/Forums/pl/csharpgeneral/thread/… Commented Aug 7, 2012 at 15:46

3 Answers

You can try this code, which reads the file in fixed-size blocks:

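            // Requires: using System; using System.IO;
            using (StreamReader sr = new StreamReader(yourPath))
            {
                // 5 is an arbitrary block size for this example.
                char[] c = new char[5];
                int read;

                // Read returns the number of characters actually read (0 at end of stream).
                while ((read = sr.Read(c, 0, c.Length)) > 0)
                {
                    // Print only the characters read, not stale buffer contents.
                    Console.WriteLine(new string(c, 0, read));
                }
            }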

Link : http://msdn.microsoft.com/en-us/library/9kstw824.aspx
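As one of the comments on the question points out, fixed-size blocks can cut an email address in half. A possible workaround, sketched below under stated assumptions, is to carry the tail of each block over into the next one so that a match straddling a boundary is still seen whole. The names ChunkedScanner, FindMatches, blockSize and overlap are hypothetical and not from the original post, and the sketch assumes no single match is longer than overlap characters.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answers.
public static class ChunkedScanner
{
    // Scans a reader in fixed-size blocks while keeping memory bounded to
    // roughly blockSize + overlap characters.
    // Assumption: no match is longer than 'overlap' characters.
    public static IEnumerable<string> FindMatches(
        TextReader reader, Regex pattern, int blockSize = 8000, int overlap = 320)
    {
        var buffer = new char[blockSize];
        string carry = "";

        while (true)
        {
            // ReadBlock keeps reading until the buffer is full or the input ends.
            int read = reader.ReadBlock(buffer, 0, blockSize);
            bool atEnd = read < blockSize;
            string window = carry + new string(buffer, 0, read);

            // Matches starting at or after 'cutoff' might be truncated by the
            // block boundary, so they are deferred to the next window.
            int cutoff = atEnd ? window.Length : Math.Max(0, window.Length - overlap);
            int carryStart = cutoff;

            foreach (Match m in pattern.Matches(window))
            {
                if (m.Index >= cutoff) break;
                yield return m.Value;
                // Never restart inside a match that was already reported.
                carryStart = Math.Max(carryStart, m.Index + m.Length);
            }

            if (atEnd) yield break;
            carry = window.Substring(carryStart);
        }
    }
}
```

With a StreamReader over the dump file and the existing email regex, the loop body from the question could then run once per returned match instead of once per line. The overlap value is a guess and would need to be at least as long as the longest address expected.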


2 Comments

I came across this example but have not had any luck implementing it. Thanks anyway.
@BenCollins Show some effort. Where did that simple example fail?

The statement line = buffer.ToString(); is to blame. buffer is a char array, and its ToString() method just returns the type name, "System.Char[]", not the characters it contains.

Use line = new string(buffer); instead.
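To make the difference between the two conversions concrete, here is a minimal sketch. BufferToStringDemo and ReadChunk are hypothetical names used only for illustration. It also uses the new string(buffer, 0, count) overload with the count returned by ReadBlock, on the assumption that the final block of a file is usually only partly filled.

```csharp
using System;
using System.IO;

// Hypothetical demo class, not from the original post.
public static class BufferToStringDemo
{
    // Reads up to buffer.Length characters and returns only what was read.
    public static string ReadChunk(TextReader reader, char[] buffer)
    {
        int read = reader.ReadBlock(buffer, 0, buffer.Length);
        return new string(buffer, 0, read);
    }

    public static void Main()
    {
        char[] buffer = new char[8];
        using (var reader = new StringReader("hello"))
        {
            // ToString() on an array gives the type name, not the contents.
            Console.WriteLine(buffer.ToString());         // prints "System.Char[]"
            Console.WriteLine(ReadChunk(reader, buffer)); // prints "hello"
        }
    }
}
```

Note that new string(buffer) without the offset and count would include the three unused '\0' characters at the end of the 8-character buffer in this example.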
