Split string using multiple string delimiters

Question

Suppose I have a string like

   Harry potter was written by J. K. Rowling

How can I split this string using "was" and "by" as delimiters and get the result in a vector in C++?

I know how to split using multiple char delimiters, but not using multiple string delimiters.

What about the regex token iterator? en.cppreference.com/w/cpp/regex/regex_token_iterator
– tgmath, Commented Apr 2, 2014 at 10:53
I am trying to avoid regex as far as possible, but a regex answer for the given example would also be helpful.
– psyche, Commented Apr 2, 2014 at 10:57

tgmath · Accepted Answer · 2014-04-02 10:57:23Z

3

If you use C++11 and clang, there is a solution using a regex token iterator:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
   std::string text = "Harry potter was written by J. K. Rowling.";

   // Match either delimiter word; submatch index -1 selects the text between matches.
   std::regex delim_re("(was)|(by)");
   std::copy( std::sregex_token_iterator(text.begin(), text.end(), delim_re, -1),
              std::sregex_token_iterator(),
              std::ostream_iterator<std::string>(std::cout, "\n"));
}

The output is:

Harry potter 
 written 
 J. K. Rowling.

Sadly, GCC 4.8 does not have <regex> fully implemented, but clang compiles and links this correctly.
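
Since the question asks for the result in a vector, here is a minimal sketch of the same token-iterator technique that collects the pieces instead of printing them. The helper name split_by_regex and the pattern used in the usage note are assumptions for illustration, not part of the answer above; the pattern adds word boundaries and swallows surrounding whitespace so the tokens come out trimmed.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): splits `text` wherever
// `delim_pattern` (an ECMAScript regex) matches and returns the pieces.
std::vector<std::string> split_by_regex(const std::string& text,
                                        const std::string& delim_pattern)
{
    const std::regex delim_re(delim_pattern);
    std::vector<std::string> tokens;

    // Submatch index -1 makes the iterator yield the text between matches.
    std::sregex_token_iterator it(text.begin(), text.end(), delim_re, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it)
    {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

Calling split_by_regex("Harry potter was written by J. K. Rowling", R"(\s*\b(?:was|by)\b\s*)") should yield the three tokens "Harry potter", "written" and "J. K. Rowling". The \b anchors keep "by" from matching inside a longer word, which the plain "(was)|(by)" pattern would do.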

answered Apr 2, 2014 at 10:57


1 Comment

psyche Over a year ago

Boost is available; if you can give an example using Boost, that would also be helpful.

Ferenc Deak · Accepted Answer · 2014-04-02 12:08:11Z

1

Brute-force approach, no Boost, no C++11; optimizations more than welcome:

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

/** Split the string s by the delimiters, place the result in the
    outgoing vector result. Delimiters are matched against whole
    whitespace-separated words. */
void split(const std::string& s, const std::vector<std::string>& delims,
           std::vector<std::string>& result)
{
    // split the string into whitespace-separated words
    std::stringstream ss(s);
    std::istream_iterator<std::string> begin(ss);
    std::istream_iterator<std::string> end;
    std::vector<std::string> splits(begin, end);

    // then append the words together, starting a new token at each delimiter
    std::string current;
    for(std::size_t i=0; i<splits.size(); i++)
    {
        if(std::find(delims.begin(), delims.end(), splits[i]) != delims.end())
        {
            // tokens pushed here keep their trailing space
            result.push_back(current);
            current = "";
        }
        else
        {
            current += splits[i] + " " ;
        }
    }

    // only the last token has its trailing space removed
    result.push_back(current.substr(0, current.size() - 1));
}

edited Apr 2, 2014 at 12:08

answered Apr 2, 2014 at 11:34


1 Comment

tgmath Over a year ago

The resulting whitespace at the end and beginning of tokens is not correct. E.g. for the example I get 'Harry potter ','written ','J. K. Rowling ' instead of 'Harry potter ',' written ',' J. K. Rowling'.
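
The whitespace difference noted in this comment comes from tokenizing on whitespace before looking for delimiters. A possible alternative, sketched here under the assumption that delimiters should be matched as plain substrings, searches for the earliest occurrence of any delimiter and cuts the string there, leaving the surrounding whitespace untouched. The name split_on_any is hypothetical, and the code avoids both regex and Boost.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `s` at every occurrence of any string in
// `delims`, matched as plain substrings (no word-boundary check).
std::vector<std::string> split_on_any(const std::string& s,
                                      const std::vector<std::string>& delims)
{
    std::vector<std::string> result;
    std::size_t start = 0;

    while (true)
    {
        // Find the earliest delimiter at or after `start`;
        // on a tie, prefer the longer delimiter.
        std::size_t best_pos = std::string::npos;
        std::size_t best_len = 0;
        for (std::size_t i = 0; i < delims.size(); ++i)
        {
            if (delims[i].empty())
                continue;  // an empty delimiter would never advance
            const std::size_t pos = s.find(delims[i], start);
            if (pos == std::string::npos)
                continue;
            if (pos < best_pos || (pos == best_pos && delims[i].size() > best_len))
            {
                best_pos = pos;
                best_len = delims[i].size();
            }
        }

        if (best_pos == std::string::npos)
            break;  // no delimiter left

        result.push_back(s.substr(start, best_pos - start));
        start = best_pos + best_len;
    }

    // Whatever follows the last delimiter is the final token.
    result.push_back(s.substr(start));
    return result;
}
```

With the delimiters "was" and "by", the example string "Harry potter was written by J. K. Rowling." should split into "Harry potter ", " written " and " J. K. Rowling.", the same tokens as the regex answer. Passing " was " and " by " (with the spaces included) gives trimmed tokens instead. Because matching is by substring, "by" would also split a word such as "baby"; including the spaces in the delimiters is one way to avoid that.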
