59

I have a problem with replaceAll on a multiline string:

String regex = "\\s*/\\*.*\\*/";
String testWorks = " /** this should be replaced **/ just text";
String testIllegal = " /** this should be replaced \n **/ just text";

testWorks.replaceAll(regex, "x"); 
testIllegal.replaceAll(regex, "x"); 

The above works for testWorks, but not for testIllegal. Why is that, and how can I overcome it? I need to replace something like a comment /* ... */ that spans multiple lines.

2
  • And what about this string: "String s = \"/*\"; /* comment */" Commented Nov 11, 2010 at 12:28
  • Well, the point is that the regex should match only at the beginning of the string. Now it looks like this: (?s)^\\s*/\*.*\*/ I'm not sure, though, whether to make it reluctant: (?s)^\\s*/\*.*?\*/ Commented Nov 11, 2010 at 12:41

3 Answers

103

You need to use the Pattern.DOTALL flag so that the dot also matches line terminators, e.g.:

Pattern.compile(regex, Pattern.DOTALL).matcher(testIllegal).replaceAll("x")

Alternatively, specify the flag in the pattern itself using (?s), e.g.:

String regex = "(?s)\\s*/\\*.*\\*/";
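Both variants can be tried side by side. The following is a minimal sketch; the class name DotAllDemo and its method names are made up for illustration, while the pattern and the test string are taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex and the input come from the question.
public class DotAllDemo {
    // Same pattern as in the question: optional whitespace, then /* ... */
    static final String REGEX = "\\s*/\\*.*\\*/";

    // Variant 1: pass the DOTALL flag when compiling the pattern.
    static String replaceWithFlag(String input) {
        return Pattern.compile(REGEX, Pattern.DOTALL).matcher(input).replaceAll("x");
    }

    // Variant 2: embed the flag in the pattern with (?s) and call String.replaceAll.
    static String replaceWithInlineFlag(String input) {
        return input.replaceAll("(?s)" + REGEX, "x");
    }

    public static void main(String[] args) {
        String testIllegal = " /** this should be replaced \n **/ just text";
        // replaceAll returns a new string; the original is left unchanged.
        System.out.println(replaceWithFlag(testIllegal));
        System.out.println(replaceWithInlineFlag(testIllegal));
    }
}
```

With the input from the question, both calls should print "x just text".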

2 Comments

This is the best solution because it does not interact with the regex string itself, you just specify a flag. I did not know that, Thanks!
If you have multiple "multi-line" comments, this method will remove text between those comments as well. Use the method posted by Boris instead.
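The effect described in this comment can be reproduced with a short sketch. It assumes the reluctant quantifier .*? as the fix, which is one common approach and not necessarily the method the comment refers to; the class name, method names, and input string are made up for illustration.

```java
// Hypothetical demo class comparing a greedy and a reluctant quantifier.
public class ReluctantDemo {
    // Greedy: .* runs to the LAST */ in the input, swallowing text between comments.
    static String stripGreedy(String input) {
        return input.replaceAll("(?s)\\s*/\\*.*\\*/", "x");
    }

    // Reluctant: .*? stops at the FIRST */ after each /*.
    static String stripReluctant(String input) {
        return input.replaceAll("(?s)\\s*/\\*.*?\\*/", "x");
    }

    public static void main(String[] args) {
        String twoComments = "/* one */ keep me /* two\n */ tail";
        System.out.println(stripGreedy(twoComments));    // "keep me" is lost
        System.out.println(stripReluctant(twoComments)); // "keep me" survives
    }
}
```

Note that the leading \s* also consumes the whitespace in front of the second comment, so the replacement ends up directly after "keep me".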
16

Add Pattern.DOTALL to the compile, or (?s) to the pattern.

This would work:

String regex = "(?s)\\s*/\\*.*\\*/";

See Match multiline text using regular expression

1 Comment

Unfortunately, this does not work in combination with String.replaceAll. :(
6

By default, the metacharacter . matches any character except line terminators. That is why your regex does not work in the multiline case.

To fix this, replace . with [\d\D], which matches any character, including newlines.
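A minimal sketch of that substitution, using the pattern and test string from the question; the class and method names are made up for illustration.

```java
// Hypothetical demo class; [\d\D] matches a digit or a non-digit, i.e. any character.
public class CharClassDemo {
    static String strip(String input) {
        // No flag needed: the character class already covers line terminators.
        return input.replaceAll("\\s*/\\*[\\d\\D]*\\*/", "x");
    }

    public static void main(String[] args) {
        String testIllegal = " /** this should be replaced \n **/ just text";
        System.out.println(strip(testIllegal));
    }
}
```

For the string from the question this should print "x just text". Like the original pattern, the quantifier here is greedy.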

Code In Action

1 Comment

Swapping in [\d\D] for . (which normally means [^\n], at least in Pattern.UNIX_LINES mode) strikes me as inappropriate because it is not obvious what it is doing, because it is 6 chars for 1, and because there are other ways of doing this.
