"ungreesy" should be "non-greedy" or better yet "lazy", also spelling
phk

It's not the shortest possible match, just a short match. Both modes start at the leftmost position where a match is possible. From there, greedy mode takes the last possible end of the match, lazy mode the first possible end. But the first match found this way is not necessarily the shortest one in the string.

Take the input string foobarbaz and the regexp o.*a (greedy) or o.*?a (lazy).

The shortest possible match in this input string would be oba.

However, the regex engine looks for matches from left to right, so the o in the pattern matches the first o in foobarbaz. And if the rest of the pattern produces a match from there, that's where it stays; later starting positions are never tried.
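The left-to-right behaviour can be checked with a small sketch. Python's re module is an assumption here, since the answer isn't tied to a particular language, and the variable names are only for illustration.

```python
import re

text = "foobarbaz"

# search() scans left to right and reports the first position
# at which the whole pattern can match.
m = re.search(r"o.*?a", text)
start = m.start()  # index of the o that the match starts at

# The second o (index 2) could also start a match, but search()
# never gets there because index 1 already succeeded.
# Anchoring the attempt at index 2 by hand shows what it would find.
later = re.compile(r"o.*?a").match(text, 2)

print(start)          # 1
print(later.group())  # oba
```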

Following the first o, .* (greedy) eats obarbaz (the entire rest of the string) and then backtracks one character at a time until the rest of the pattern (a) matches. Thus it finds the last a, the one in baz, and ends up matching oobarba.

Following the first o, .*? (lazy) doesn't eat the rest of the string; instead it looks for the first occurrence of the rest of the pattern. First it sees the second o, which doesn't match a; then it sees b, which doesn't match a; then it sees a, which matches a, and because it's lazy, that's where it stops. The result is ooba, not oba.
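To see both results side by side, here is a minimal sketch in Python (an assumed language choice; any backtracking regex engine should behave the same way for this pattern):

```python
import re

text = "foobarbaz"

# Greedy: .* grabs as much as it can, then gives characters back
# until the final a can match.
greedy = re.search(r"o.*a", text).group()

# Lazy: .*? grabs as little as it can and extends only when needed.
lazy = re.search(r"o.*?a", text).group()

print(greedy)  # oobarba
print(lazy)    # ooba
```

Both matches begin at the same o; only the end of the match differs.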

So while it's not THE shortest possible one, it's a shorter one than the greedy version.
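If the shortest match really is what's wanted, one possible sketch is to try the pattern at every start position and keep the shortest result. The helper name shortest_match is hypothetical, not part of Python's re module, and the approach assumes a simple pattern like this one, where the lazy match from a fixed start is also the shortest match from that start.

```python
import re

def shortest_match(pattern, text):
    """Try the pattern at every start position and keep the shortest hit.

    Hypothetical helper for illustration; returns None if nothing matches.
    """
    regex = re.compile(pattern)
    best = None
    for start in range(len(text) + 1):
        # match() with a position only succeeds if the pattern
        # matches beginning exactly at that index.
        m = regex.match(text, start)
        if m and (best is None or len(m.group()) < len(best)):
            best = m.group()
    return best

result = shortest_match(r"o.*?a", "foobarbaz")
print(result)  # oba
```

This does one match attempt per character of the input, so it trades speed for the guarantee that later start positions are considered.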

frostschutz
