I tried the following regex replacement:

Regex.Replace("one, two, three, ", @",([.*?]),\s$", ", and$1.");

Which returns

"one, and two, three."

Looking for:

"one, two, and three."

I have a regex that can do this. I don't need help there.

My question: Doesn't the lazily quantified .*? mean it will match as few as possible? If it did (obviously it didn't), it would stop matching at the comma after "two". Does it instead find the first match possible starting from the start of the string?

Update:

first line should read:

Regex.Replace("one, two, three, ", @",(.*?),\s$", ", and$1.");

2 Answers

To begin with, [.*?] is incorrect. A character class defines a set of characters and means "match exactly one character from this set". Inside the brackets, ., * and ? lose their special meaning, so [.*?] matches a single literal ., * or ?. You can't wrap a class around .*? and keep the quantifier behavior, which is why your regex does not do what you expect.
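A quick way to check the character class behavior is to test it against a few inputs. This is a minimal sketch written as top-level statements (which assumes C# 9 or later); the variable names are only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Inside [...], the characters ., * and ? are literals, not metacharacters.
bool matchesQuestionMark = Regex.IsMatch("?", @"^[.*?]$"); // one literal '?'
bool matchesLetter = Regex.IsMatch("a", @"^[.*?]$");       // 'a' is not in the set
bool matchesTwoChars = Regex.IsMatch("..", @"^[.*?]$");    // a class matches one character only

Console.WriteLine(matchesQuestionMark); // True
Console.WriteLine(matchesLetter);       // False
Console.WriteLine(matchesTwoChars);     // False

// The pattern with [.*?] therefore finds nothing to replace in this input.
string unchanged = Regex.Replace("one, two, three, ", @",([.*?]),\s$", ", and$1.");
Console.WriteLine(unchanged == "one, two, three, "); // True
```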

Instead, use a negated character class inside the group. [^,]* cannot cross a comma, so the group can only capture the last item before the trailing comma, and the match can no longer start at the first comma and run to the end of the string.

String result = Regex.Replace("one, two, three, ", @"([^,]*),\s$", " and$1.");
Console.WriteLine(result); //=> "one, two, and three."

Note: *? does mean a non-greedy match: "zero or more, preferably as few as possible". But laziness only controls how far a match extends, not where it starts. The engine tries each starting position from left to right and returns the first match it finds. Here the first comma is the leftmost position where a match can begin, so .*? expands one character at a time from there until the rest of the pattern, ,\s$, can match, and that only happens at the end of the string.
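To see the difference between where a match starts and how far it extends, you can compare the lazy pattern with and without the $ anchor. This is a small sketch using top-level statements (assuming C# 9 or later); the variable names are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "one, two, three, ";

// Without the $ anchor, the lazy group stops at the first comma it reaches.
Match unanchored = Regex.Match(input, @",(.*?),");
Console.WriteLine(unanchored.Index);                        // 3
Console.WriteLine("[" + unanchored.Groups[1].Value + "]");  // [ two]

// With the anchor, the match still starts at index 3 (the leftmost comma),
// and .*? has to keep growing until ,\s$ can match at the end of the string.
Match anchored = Regex.Match(input, @",(.*?),\s$");
Console.WriteLine(anchored.Index);                          // 3
Console.WriteLine("[" + anchored.Groups[1].Value + "]");    // [ two, three]
```

In both cases the match begins at the first comma. The lazy quantifier only decides where the group ends.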

1 Comment

Whoops, I was using [^,] and just replaced the "^," with ".*?". Sounds like it does just start at the beginning and return the first match, then. That's what I was wondering. Thanks for the help.

,.*?,\s$ matches everything from the first comma to the last because . also matches a comma. Use a negated character class instead:

,([^,]*),\s$

  • , - the leading comma in your regex ,.*?,\s$ matches the first comma in the string.
  • .*? - then does a non-greedy match of all the characters up to
  • ,\s$ - a comma and a whitespace character followed by the end of the line. So the match runs from the first comma to the last.

2 Comments

"*?" is supposed to be lazy. Doesn't that mean as few as possible? Starting the match at the first comma is not the fewest possible.
Note that ^.*?$ and ^.*$ both match the whole line. What follows the .*? in the pattern plays the main role here.
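The point in that last comment can be checked directly. Here is a minimal sketch (top-level statements, assuming C# 9 or later, with illustrative variable names) showing that the anchor after .*? determines how much the lazy quantifier ends up consuming.

```csharp
using System;
using System.Text.RegularExpressions;

string line = "one, two, three";

// Greedy and lazy versions are both forced to reach the end of the line by $.
string greedy = Regex.Match(line, @"^.*$").Value;
string lazy = Regex.Match(line, @"^.*?$").Value;
Console.WriteLine(greedy == line); // True
Console.WriteLine(lazy == line);   // True

// Drop the trailing $ and the lazy version is free to match nothing at all.
string lazyNoAnchor = Regex.Match(line, @"^.*?").Value;
Console.WriteLine(lazyNoAnchor.Length); // 0
```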
