13

I'm trying to extract submatches from a text file using Boost.Regex. Currently I only get the first valid line, and I get the full line rather than just the valid email address. I tried using the iterator and submatches, but without success. Here is the current code:

if (Myfile.is_open()) {
    // Backslashes are doubled so the regex engine sees \. (a literal dot).
    boost::regex pattern("^[_a-z0-9-]+(\\.[_a-z0-9-]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*(\\.[a-z]{2,4})$");
    while (getline(Myfile, line)) {
        string::const_iterator start = line.begin();
        string::const_iterator end = line.end();
        boost::sregex_token_iterator i(start, end, pattern);
        boost::sregex_token_iterator j;
        while (i != j) {
            cout << *i++ << endl;
        }
    }
    Myfile.close();
}

4 Answers

19

Use boost::smatch.

boost::regex pattern("what(ever) ...");
boost::smatch result;
if (boost::regex_search(s, result, pattern)) {
    string submatch(result[1].first, result[1].second);
    // Do whatever ...
}

1 Comment

Perhaps my Regex is wrong but that's not yielding proper results for me.
17
const string pattern = "(abc)(def)";  
const string target = "abcdef"; 

boost::regex regexPattern(pattern, boost::regex::extended); 
boost::smatch what; 

bool isMatchFound = boost::regex_match(target, what, regexPattern); 
if (isMatchFound) 
{ 
    for (unsigned int i=0; i < what.size(); i++) 
    { 
        cout << "WHAT " << i << " " << what[i] << endl; 
    } 
} 

This produces the following output:

WHAT 0 abcdef 
WHAT 1 abc 
WHAT 2 def 

Boost captures parenthesized subexpressions as submatches; submatch 0 is always the entire matched string, and the parenthesized groups follow from index 1. regex_match succeeds only if the pattern matches the entire input. If you are trying to match a substring, use regex_search instead.

The example above uses the POSIX extended regex syntax, selected by the boost::regex::extended flag. Omitting that flag selects the default Perl-style syntax. Other syntax options are available.


0

This line:

string submatch(result[1].first, result[1].second);

causes errors in Visual C++ (I tested against 2012, but I expect earlier versions do too).

See https://groups.google.com/forum/?fromgroups#!topic/cpp-netlib/0Szv2WcgAtc for analysis.


0

The simplest way to convert a boost::sub_match to a std::string:

boost::smatch result;
// regex_search or regex_match ...
string s = result[1];
