I am trying to replace occurrences of CR (\r), LF (\n), and combinations of CR and LF as follows:

  1. Search for the pattern ([\r\n]+) // the match can be '\r', '\r\r\r\n\n', '\r\n\r\n', or any combination.
  2. If the length is 1 and the character is CR, replace with LF. // match is '\r'
  3. If the length is 2 and the two characters are different, replace with LF. // match is '\r\n' or '\n\r'
  4. Otherwise replace with 2 LF. // any pattern longer than 2 characters

std::regex search_exp("([\r\n]+)");
auto replace_func = [](std::string& str_mat) -> std::string {
    std::string ret = "";
    if (str_mat.length() == 1) {
        if (str_mat == "\r")
            ret = "\n";
    } else if (str_mat.length() == 2 && (str_mat.at(0) != str_mat.at(1))) {
        ret = "\n";
    } else {
        ret = "\n\n";
    }
    return ret;
};
auto str = std::regex_replace(str, search_exp, replace_func);

But std::regex_replace does not take a lambda function. :(

Edit: Example: "\rThis is just an example\n Learning CPP \r\n Stuck at a point \r\n\n\r C++11 onwards \r\r\r\n\n"

Any suggestions?

  • Why don't you describe what you are trying to accomplish? It looks perhaps like you are trying to remove \r when it immediately precedes or follows a \n, but if it is by itself you want to convert it to a \n. Also, regex_replace is powerful enough that you should not need to construct a replacement string: for example, regex_replace (str, "\n?\r|\r\n?", "\n") should perform what I just described. Commented Oct 21, 2021 at 4:11
  • @ChrisMaurer Thanks for the comment, added more description. Commented Oct 21, 2021 at 6:29

1 Answer

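The problem with what you've written is that std::regex_replace has no overload that accepts a callable. Its replacement is a single format string, and the search & replace is repeated with that same string for each occurrence. Your rules need one \n for some matches and two for others, so you can't really tailor the replacement per match in that way.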
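If a per-match callback is wanted anyway, one way is to walk the matches by hand with std::sregex_iterator and build the output string yourself. The following is only a minimal sketch: regex_replace_with and normalize_newlines are hypothetical names introduced for illustration (they are not part of the standard library), and the sketch assumes that a lone LF should stay a single LF.

```cpp
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper (not a standard function): replaces every match of
// `re` in `input` with whatever `fn` returns for that match.
std::string regex_replace_with(
    const std::string& input, const std::regex& re,
    const std::function<std::string(const std::smatch&)>& fn) {
    std::string out;
    auto last = input.cbegin();  // end of the previous match
    for (std::sregex_iterator it(input.cbegin(), input.cend(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);  // copy the text between matches
        out += fn(m);                  // per-match replacement
        last = m[0].second;
    }
    out.append(last, input.cend());  // copy the text after the last match
    return out;
}

// The question's rules, assuming a lone LF is kept as a single LF.
std::string normalize_newlines(const std::string& input) {
    static const std::regex runs("[\r\n]+");
    return regex_replace_with(
        input, runs, [](const std::smatch& m) -> std::string {
            const std::string s = m.str();
            if (s.length() == 1) return "\n";                  // "\r" or "\n"
            if (s.length() == 2 && s[0] != s[1]) return "\n";  // "\r\n" or "\n\r"
            return "\n\n";  // "\r\r", "\n\n", or any longer run
        });
}
```

With the example string from the question, normalize_newlines returns "\nThis is just an example\n Learning CPP \n Stuck at a point \n\n C++11 onwards \n\n".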

You can, however, get clever with lookaheads that won't consume the final CR or LF. Then you rely on the repeated processing of regex_replace to add the second \n.

std::regex_replace(str, std::regex("(\r\n?|\n\r)(?![\n\r])|[\n\r]+(?=[\n\r])"), "\n")

This is pretty daunting, but you can take it apart into two pieces. The first half is (\r\n?|\n\r)(?![\n\r]), which looks for CR, CRLF, or LFCR and looks ahead to make sure that no CR or LF follows. The second half is [\n\r]+(?=[\n\r]), which matches a run of CR or LF characters in any combination but does not include the last one. That ensures the next repeat will deal with the leftover character and supply the second LF: a leftover CR is converted by the first half, and a leftover LF matches nothing and is simply left in place.
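To try the pattern out, it can be wrapped in a small function. This is a sketch only: collapse_newlines is a hypothetical name, and the pattern is wrapped in a std::regex object because std::regex_replace expects one rather than a string literal.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper name; the pattern is the one discussed above.
std::string collapse_newlines(const std::string& input) {
    // Alternative 1: CR, CRLF or LFCR with no CR/LF after it -> one LF.
    // Alternative 2: a longer run minus its last character -> one LF;
    // the leftover character is handled by the next search step.
    static const std::regex re("(\r\n?|\n\r)(?![\n\r])|[\n\r]+(?=[\n\r])");
    return std::regex_replace(input, re, "\n");
}
```

For instance, collapse_newlines("a\r\n\n\rb") gives "a\n\nb": the second alternative turns "\r\n\n" into one LF, and the first alternative then turns the remaining "\r" into the second LF.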
