I'm trying to replace all occurrences of names within a given string. I'm using regex, since a simple substring match won't work in this case and I need to match full words.

My problem is that I can only match words that are surrounded by blanks. For example, I cannot replace a name when it is followed by something other than a blank, such as parentheses:

toReplace()

with: theReplacement()

My regex replace method looks like this:

void replaceWord(std::string &str, const std::string& search, const std::string& replace)
{
    // Regular expression to match words beginning with 'search'
    //    std::regex e ("(\\b("+search+"))([^,. ]*)");
    //    std::regex e ("(\\b("+search+"))\\b)");
    std::regex e("(\\b("+search+"))([^,.()<>{} ]*)");
    str = std::regex_replace(str,e,replace) ;
}

What should the regex look like in order to ignore leading and trailing non-alphanumeric characters?

  • Could you please add some test cases that show what the program does incorrectly, and what it should do instead? Commented Aug 28, 2020 at 16:49
  • std::regex e("(\\b("+search+")\\b)"); Commented Aug 28, 2020 at 16:59
  • @Eljay I've tried that (see commented code above). It doesn't work for cases like this: ideone.com/iMFKfL Commented Aug 28, 2020 at 17:47
  • @benjist • it worked on my machine, where \\b correctly broke only on a word boundary. Commented Aug 28, 2020 at 18:07

1 Answer

You need to

  • Escape all special characters in the regex pattern with std::regex_replace(search, std::regex(R"([.^$|{}()[\]*+?/\\])"), std::string(R"(\$&)"))
  • Escape the $ characters in the replacement pattern with std::regex_replace(replace, std::regex("[$]"), std::string("$$$$")). This matters when the replacement contains literal text such as $1: a literal $ is written as $$ in a replacement pattern, so producing $$ requires $$$$ here.
  • Wrap your search pattern with unambiguous word boundaries, i.e. "(\\W|^)("+search+")(?!\\w)"
  • When you replace, add $1 at the start of the replacement pattern to keep the character before the match (a non-word character, if one was matched and captured into the first group by the (\W|^) pattern).

See C++ sample code:

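#include <regex>
#include <string>

// search and replace are taken by value so the caller's strings are not modified
std::string replaceWord(const std::string& str, std::string search, std::string replace)
{
    // Escape the literal regex pattern
    search = std::regex_replace(search, std::regex(R"([.^$|{}()[\]*+?/\\])"), std::string(R"(\$&)"));
    // Escape the literal replacement pattern
    replace = std::regex_replace(replace, std::regex("[$]"), std::string("$$$$"));
    // Match only when preceded by a non-word character (or the start) and not followed by a word character
    std::regex e("(\\W|^)("+search+")(?!\\w)");
    // $1 restores the character captured before the match
    return std::regex_replace(str, e, std::string("$1") + replace);
}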

Then,

std::string text("String toReplace()");
std::string s("toReplace()");
std::string r("theReplacement()");
std::cout << replaceWord(text, s, r);   
// => String theReplacement()

2 Comments

Thanks for that. Though it does not work when trying to match a word (without brackets), and when there are non-alpha characters before it. I've made an example: ideone.com/iMFKfL
@benjist Then use unambiguous word boundaries rather than whitespace boundaries. See this C++ demo.
