I am using std::regex and want to find the last position in a string that matches some user defined regular expression string.

For example, given the regex :.* and the string "test:55:last", I want to find ":last", not ":55:last".

To clarify: the regex is user-provided, so all I get is their regex plus a "reverse" checkbox. I can alter the regex, but only programmatically.

  • Match it this way: :[^:]+$ Commented Oct 4, 2016 at 21:02
  • @revo: If you are posting an answer please do. I already wrote it, but saw your comment. Commented Oct 4, 2016 at 21:03
  • I will, if this is what the OP is looking for. Commented Oct 4, 2016 at 21:04
  • Well, there is no other way. Surely, we can use ^.*(:.*)$ but it is not that cool. Commented Oct 4, 2016 at 21:05
  • @FireLancer: At any rate, there is no option to make C++ std::regex analyze the string from right to left (as in .NET). Commented Oct 4, 2016 at 21:08

1 Answer

If you have a user-provided regex that you cannot change but still need the rightmost match, wrap the pattern in ^.*( and ) (or in ^[\s\S]*( and ) to match across line breaks) and take the contents of capture group 1:

"^.*(:.*)"

See the regex demo

The pattern above matches:

  • ^ - the start of string
  • .* - matches any 0+ characters other than linebreak characters, as many as possible because * is a greedy quantifier (if you use [\s\S]*, linebreak characters are matched too)
  • (:.*) - a capturing group that matches : and then any 0+ characters other than linebreak characters.

Note that the first .* initially grabs as many characters as possible, up to the end of the line (which is the end of the string when there are no line breaks). The regex engine then backtracks, giving characters back one at a time until the subsequent subpattern (here, the user pattern) can match. As a result, the user subpattern is captured at the rightmost position where it can start.

An example (basic) C++ program showing how this can work:

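#include <iostream>
#include <regex>
#include <string>

int main() {
    const std::string user_pattern(":.*");  // the pattern supplied by the user
    const std::string s("test:55:last");
    // Wrap the user pattern: the greedy ^.* pushes group 1 as far right as possible.
    const std::regex r("^.*(" + user_pattern + ")");
    std::smatch matches;
    if (std::regex_search(s, matches, r)) {
        std::cout << matches[1].str() << '\n';  // prints ":last"
    }
    return 0;
}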
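
Since the question asks for the last position rather than just the matched text, the same wrapping idea can be sketched as a small helper. This is only an illustration under a few assumptions: the names LastMatch and find_last_match are hypothetical and not part of std::regex, and the across_lines flag simply switches between the [\s\S]* and .* prefixes described above.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type (not part of the standard library).
struct LastMatch {
    bool found;        // false if the user pattern does not match anywhere
    std::size_t pos;   // offset of the rightmost match in the input
    std::string text;  // text of the rightmost match
};

// Hypothetical helper: wraps user_pattern in a greedy prefix plus a capture
// group and reports where group 1 ended up.
LastMatch find_last_match(const std::string& s,
                          const std::string& user_pattern,
                          bool across_lines = true) {
    // [\s\S]* also consumes line breaks; .* stops at the first line break.
    const std::string prefix = across_lines ? "^[\\s\\S]*(" : "^.*(";
    const std::regex r(prefix + user_pattern + ")");
    std::smatch m;
    if (!std::regex_search(s, m, r)) {
        return LastMatch{false, 0, std::string()};
    }
    return LastMatch{true, static_cast<std::size_t>(m.position(1)), m[1].str()};
}
```

For "test:55:last" and the user pattern :.* this should report offset 7 and the text ":last". One caveat: because the wrapper adds a capture group in front, any numbered groups or backreferences inside the user pattern shift by one.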

5 Comments

Might be a good idea to explain about the backtracking "trick" used.
The [\s\S]* version worked in all the cases I could think of right now (so in a multi-line document, starting from the last line rather than the first).
I tried the .* pattern, but it causes C++ regex_search to fall into infinite recursion on certain architectures. The Cray XC supercomputer suffers from this behavior.
@Bobby another solution is to use :[^:]*$.
@Mehrdad The OP's problem with the :.* pattern is complex since .* matches :, too. The usual approach to matching the last occurrence of a pattern is to match all occurrences and then keep only the last one. Here, that won't work since .* will consume the whole rest of the string. A generic regex wrapper is not always very efficient; custom patterns are more efficient. :[^:]*$ (17 steps) will perform better than .*(:.*) (70 steps).
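
To make the "match everything and keep the last item" approach from the comment above concrete, here is a minimal sketch using std::sregex_iterator. The function name last_iterated_match is hypothetical, and an empty return value is used for "no match", which is a simplification (it cannot be told apart from an empty match).

```cpp
#include <regex>
#include <string>

// Hypothetical helper: walk the non-overlapping matches from left to right
// and remember only the most recent one.
std::string last_iterated_match(const std::string& s,
                                const std::string& pattern) {
    const std::regex r(pattern);
    std::string last;  // stays empty if nothing matches
    for (std::sregex_iterator it(s.begin(), s.end(), r), end; it != end; ++it) {
        last = it->str();
    }
    return last;
}
```

With the string "test:55:last", the pattern :.* yields a single match, ":55:last", because .* swallows the rest of the string, so iterating does not help. A custom pattern such as :[^:]* or :[^:]*$ ends on ":last" instead.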
