I need to match strings in C++ code of the form

L, N{1, 3}, N{1, 3}, N{1, 3} 

where, in the above pseudo-code, L is always a letter (upper or lower case) or a full stop (the . character) and N is always a digit [0-9].

So explicitly, we might have B, 999, 999, 999 or ., 8, 8, 8 but the number of numeric characters is always the same after each , and is either 1, 2 or 3 digits in length; so D, 23, 232, 23 is not possible.

In C# I would match this as follows

string s = "   B,801, 801, 801 other stuff";
Regex reg = new Regex(@"[\.\w],\s*\d{1,3},\s*\d{1,3},\s*\d{1,3}");
Match m = reg.Match(s);

Great. However, I need a similar regex using boost::regex. I have attempted

std::string s = "   B,801, 801, 801 other stuff";
boost::regex regex("[\\.\w],\s*\d{1,3},\s*\d{1,3},\s*\d{1,3}");
boost::match_results<std::string::const_iterator> results;
boost::regex_match(s, results, regex);

but this is giving me 'w' : unrecognized character escape sequence and the same for s and d. But from the documentation I was under the impression I can use \d, \s and \w without issue.

What am I doing wrong here?


Edit: I have switched to std::regex as per a comment. Presumably the regex is the same. The following compiles, but the regex does not match:

std::string p = "XX";
std::string s = "    B,801, 801, 801 other stuff";
std::regex regex(R"del([\.\w],\s*\d{1,3},\s*\d{1,3},\s*\d{1,3})del");
if (std::regex_match(s, regex))
   p = std::regex_replace(s, regex, "");
    C++ has escape characters, just like C# does. It also has raw string literals, just like C# has verbatim strings. For what good it does, C++ also has a standard regular expressions library. Commented Jun 18, 2014 at 16:03

1 Answer

You can use \w, \s, and \d in your regular expressions. However, that's not what you're doing; you're writing \w as an escape sequence inside a C++ string literal, and the compiler does not recognize it. For the regex engine to see a \ followed by a w, you need to escape the \ in the literal (same for s and d, of course):

boost::regex regex("[\\.\\w],\\s*\\d{1,3},\\s*\\d{1,3},\\s*\\d{1,3}");

As of C++11, you can use raw string literals to make your code even more similar to the C# version:

boost::regex regex(R"del([\.\w],\s*\d{1,3},\s*\d{1,3},\s*\d{1,3})del");
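
To illustrate that the two spellings are equivalent, here is a minimal sketch. It uses std::regex rather than boost::regex so that it builds with only the standard library; that swap is an assumption made for the example, and the names kEscapedPattern, kRawPattern and contains_record are hypothetical.

```cpp
#include <regex>
#include <string>

// Ordinary literal: every regex backslash is doubled so the compiler
// turns "\\" into a single '\' before the regex engine sees it.
const std::string kEscapedPattern =
    "[\\.\\w],\\s*\\d{1,3},\\s*\\d{1,3},\\s*\\d{1,3}";

// Raw literal (C++11): the text between R"del( and )del" is taken verbatim,
// so the pattern can be written exactly as in the C# verbatim string.
const std::string kRawPattern =
    R"del([\.\w],\s*\d{1,3},\s*\d{1,3},\s*\d{1,3})del";

// Hypothetical helper: true if the pattern occurs anywhere in `text`.
bool contains_record(const std::string& text) {
    static const std::regex re(kRawPattern);
    return std::regex_search(text, re);
}
```

Both constants hold the same characters, so either can be passed to the regex constructor. Calling contains_record("   B,801, 801, 801 other stuff") should report a match.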

4 Comments

Thanks for your answer. I have run the regex using both boost and std::regex, but neither matches with the above regex...
@Killercam Does replacing [\\.\\w] with [\\._[:alnum:]] (with std::regex) work for you?
No, if I have std::string s = " B,801, 801, 801 other stuff"; and std::regex regex("\\w"); then std::regex_match(s, regex) returns false!?
@Killercam Correct. std::regex_match attempts to match the entire string. I think what you want is std::regex_search.
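
The difference described in the last comment can be sketched as follows, assuming a standard library with a working std::regex. The names kRecord, matches_whole, matches_anywhere and strip_record are hypothetical and chosen only for this illustration.

```cpp
#include <regex>
#include <string>

// Same pattern as in the question, written as a raw string literal.
const std::regex kRecord(R"del([\.\w],\s*\d{1,3},\s*\d{1,3},\s*\d{1,3})del");

// regex_match succeeds only if the pattern covers the whole string.
bool matches_whole(const std::string& s) {
    return std::regex_match(s, kRecord);
}

// regex_search succeeds if the pattern is found anywhere in the string.
bool matches_anywhere(const std::string& s) {
    return std::regex_search(s, kRecord);
}

// regex_replace already works on sub-matches, so no prior
// regex_match check is needed before calling it.
std::string strip_record(const std::string& s) {
    return std::regex_replace(s, kRecord, "");
}
```

With the string from the question, matches_whole should return false because of the leading spaces and the trailing "other stuff", while matches_anywhere should return true. In the edited code of the question, replacing std::regex_match with std::regex_search in the if condition is probably all that is needed.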
