2

I am trying to capture text with a regular expression. My code is as follows:

Pattern p = Pattern.compile("<@a>(?:.|\\s)+?</@a>");
Matcher m = p.matcher(fileContents.toString());
while (m.find()) {
    // Error will be thrown at this point
    System.out.println(m.group());
}

If the text I want to capture is too long, a StackOverflowError is thrown. Otherwise, the code works fine. How can I solve this problem?

1
  • How large is fileContents.toString()? Commented Mar 3, 2010 at 9:44

2 Answers

3

The dot and \s both match whitespace characters. That might lead to unnecessary backtracking. What do you want to match? Probably any character, including linebreaks?

Then just use the lazy dot with the dot-matches-newlines option enabled:

Pattern p = Pattern.compile("<@a>.+?</@a>", Pattern.DOTALL);

You are aware that you'll run into trouble if <@a> tags can be nested in your input?
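
For illustration, here is a minimal, self-contained sketch of the DOTALL approach. The class name `TagMatcher`, the helper `findBlocks`, and the generated test input are assumptions made up for this example; they are not from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class TagMatcher {
    // DOTALL makes '.' match line terminators as well, so the
    // (?:.|\\s) alternation from the question is no longer needed.
    private static final Pattern TAG =
            Pattern.compile("<@a>.+?</@a>", Pattern.DOTALL);

    // Collects every <@a>...</@a> block found in the input.
    static List<String> findBlocks(CharSequence input) {
        List<String> blocks = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Build a long multi-line block to mimic a large file (assumed size).
        StringBuilder sb = new StringBuilder("<@a>");
        for (int i = 0; i < 100_000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        sb.append("</@a>");

        List<String> blocks = findBlocks(sb);
        System.out.println(blocks.size() + " block(s) found");
    }
}
```

On the nesting caveat: for an input such as `<@a>outer<@a>inner</@a>tail</@a>`, the lazy quantifier stops at the first closing tag, so the only match is `<@a>outer<@a>inner</@a>`, which is probably not what you want.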


2 Comments

In fact, the backtrack points are probably what is causing the stack overflow.
That's what I meant to imply (without having seen the text that is to be matched). Thanks for the clarification :)
-1
I always search text case-insensitively:
Pattern p = Pattern.compile("<@a>.+?</@a>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
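
A short sketch of what the extra flag changes, assuming a hypothetical class `CaseInsensitiveTags` and helper `firstBlock` (neither is from the thread). In this pattern the only letter is the `a` in the tags, so `CASE_INSENSITIVE` simply lets `<@A>` and `</@A>` match as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class CaseInsensitiveTags {
    // Flags are combined with bitwise OR; DOTALL still handles line breaks.
    private static final Pattern TAG = Pattern.compile(
            "<@a>.+?</@a>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns the first matching block, or null if there is none.
    static String firstBlock(CharSequence input) {
        Matcher m = TAG.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The opening tag uses an upper-case 'A', the closing tag a lower-case one.
        System.out.println(firstBlock("before <@A>Mixed\nCase</@a> after"));
    }
}
```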
