2

I'm new to regular expressions and Java, so please bear with my beginner question.

I want to do the following:

If I have a string:

"I like ice cream only if it is chocolate ice cream. Chocolate cream" 

and a pattern like

"chocolate ice cream" 

I want to find every run of words from the pattern and surround it with #, like this:

"I like #ice cream# only if it is #chocolate ice cream#. #Chocolate cream#"

I'm using Java's regex API and I understand that I can use Matcher.replaceAll, but I'm having trouble coming up with a proper regex. I came up with chocolate*\\s*ice*\\s*cream*, but it only matches the whole substring, i.e. "chocolate ice cream". I think something like this could work:

chocolate|ice|cream|chocolate ice|ice cream|chocolate cream|chocolate ice cream

and so on, i.e. all combinations of the words, but this becomes cumbersome as the phrase grows.

I would appreciate any ideas on proceeding in the right direction.

1
  • Does order matter? Do you want ice chocolate to match? Commented Jul 6, 2011 at 8:55

4 Answers 4

7

Use the pattern:

(?i)\b((?:chocolate|ice|cream)(?:\s+(?:chocolate|ice|cream))*)\b

and replace it with:

#$1#

Demo:

String s = "I like ice cream only if it is chocolate ice cream. Chocolate cream";
s = s.replaceAll("(?i)\\b((?:chocolate|ice|cream)(?:\\s+(?:chocolate|ice|cream))*)\\b", "#$1#");
System.out.println(s);

The word boundaries cause "creamy" (and other such words) not to be replaced.

Note that this will also change "ice ice" into "#ice ice#" (i.e. a word can occur more than once), as @stema mentioned in the comments.
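Since the question worries about the pattern becoming cumbersome as the phrase grows, here is a minimal sketch that builds the same kind of regex from the phrase itself. The class name PhraseHighlighter and the methods buildPattern and highlight are made up for this illustration; only Pattern, Matcher.replaceAll and the stream helpers come from the Java API (Java 8 or later assumed).

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: names are illustrative, not part of any library.
public class PhraseHighlighter {

    // Turns "chocolate ice cream" into an alternation of its words and wraps it
    // in the same structure as the pattern above: word (whitespace word)*.
    static Pattern buildPattern(String phrase) {
        String words = Arrays.stream(phrase.trim().split("\\s+"))
                .map(Pattern::quote)              // treat each word literally
                .collect(Collectors.joining("|"));
        String regex = "\\b((?:" + words + ")(?:\\s+(?:" + words + "))*)\\b";
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    // Surrounds every matched run of words with '#'.
    static String highlight(String text, String phrase) {
        return buildPattern(phrase).matcher(text).replaceAll("#$1#");
    }

    public static void main(String[] args) {
        String s = "I like ice cream only if it is chocolate ice cream. Chocolate cream";
        // Prints: I like #ice cream# only if it is #chocolate ice cream#. #Chocolate cream#
        System.out.println(highlight(s, "chocolate ice cream"));
    }
}
```

As with the hand-written pattern, order and repetition are not enforced, so "cream ice cream" would also be wrapped as a single match.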


1 Comment

Nice pattern if it is OK to match stuff like cream ice cream
3
(?:chocolate|ice|cream)(?:\s+(?:chocolate|ice|cream))*

This will match one or more of the specified words, separated by whitespace.
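A small sketch of how this pattern behaves when used from Java, compared with a variant that adds \b word boundaries. The class BoundaryDemo and its constant and method names are invented for this example; the capturing group and the (?i) flag are added here only so the replacement can reference $1 and ignore case.

```java
// Hypothetical demo class: names are illustrative only.
public class BoundaryDemo {
    static final String WORDS = "(?:chocolate|ice|cream)";

    // The pattern from this answer, wrapped in a capturing group.
    static final String UNBOUNDED = "(?i)(" + WORDS + "(?:\\s+" + WORDS + ")*)";

    // The same pattern with word boundaries on both ends.
    static final String BOUNDED = "(?i)\\b(" + WORDS + "(?:\\s+" + WORDS + ")*)\\b";

    // Surrounds every match of the given regex with '#'.
    static String wrap(String text, String regex) {
        return text.replaceAll(regex, "#$1#");
    }

    public static void main(String[] args) {
        String s = "ice creamy";
        System.out.println(wrap(s, UNBOUNDED)); // #ice cream#y
        System.out.println(wrap(s, BOUNDED));   // #ice# creamy
    }
}
```

Which of the two results is preferable depends on whether partial-word matches are acceptable for the text being processed.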

3 Comments

And maybe make it case-insensitive to get the exact output shown in the question: (?i:chocolate|ice|cream)(?i:\s+(?:chocolate|ice|cream))*
Note that "ice creamy" will now be changed into: "#ice cream#y". Perhaps that is what the OP wants, but perhaps not.
@Bart true, +1 to your solution for catching that :-)
0

Try this:

final String source = "I like ice cream only if it is chocolate ice cream. Chocolate cream";
final String result = source.replaceAll("((?:[Cc]hocolate )?(?:[Ii]ce )?cream)", "#$1#");

// Prints I like #ice cream# only if it is #chocolate ice cream#. #Chocolate cream#
System.out.println(result);

See Optional Items for more information.
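One consequence of the optional-items approach is that "cream" is mandatory in every match, so "ice" or "chocolate" on their own are left untouched. The sketch below illustrates this with a different input string; the class OptionalItemsDemo and the method highlight are names invented for the example.

```java
// Hypothetical demo class: names are illustrative only.
public class OptionalItemsDemo {

    // Same regex as in this answer: optional "chocolate ", optional "ice ", then "cream".
    static String highlight(String source) {
        return source.replaceAll("((?:[Cc]hocolate )?(?:[Ii]ce )?cream)", "#$1#");
    }

    public static void main(String[] args) {
        // Only the trailing "cream" is wrapped; "ice" and "chocolate" alone do not match.
        // Prints: I like ice and chocolate, but not #cream#
        System.out.println(highlight("I like ice and chocolate, but not cream"));
    }
}
```

If single words such as "ice" should also be wrapped, an alternation-based pattern like the ones in the other answers is a closer fit.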


0

You may also find MessageFormat from the Java API interesting:

import java.text.MessageFormat;

// Long.valueOf replaces the deprecated new Long(3) constructor
Object[] testArgs = {Long.valueOf(3), "MyDisk"};

MessageFormat form = new MessageFormat(
  "The disk \"{1}\" contains {0} file(s).");

System.out.println(form.format(testArgs));

// output, with different testArgs
// output: The disk "MyDisk" contains 0 file(s).
// output: The disk "MyDisk" contains 1 file(s).
// output: The disk "MyDisk" contains 1,273 file(s).

