
I've been stumped on a problem for a while. I can't figure out how to check the words in a text file against a set of excluded words before inserting them into a map container. I've tried many things but can't seem to solve it. I'm new to C++ and have just started learning the STL and containers.

using namespace std;
//checking I know is wrong but I do not know how to compare the pair with the set.

bool checking(pair<string, int> const & a, set<string> const &b) {
    return a.first != b;
}

void print(pair<string, int> const & a) {cout << a.first << "  " << a.second << endl;}

int main() {

    ifstream in("document.txt");
    ifstream exW("excluded.txt");

    map<string, int> M;
    set<string> words;

    copy(istream_iterator<string>(exW),
         istream_iterator<string>(),
         inserter(words, begin(words)));

    // Need to exclude certain words before copying into a map
    // CANNOT USE A FOR LOOP
    // I can't seem to get the predicate right.
    copy_if(istream_iterator<string>(in),
            istream_iterator<string>(),
    [&](const string & s) { M[s]++;},
    checking);

    for_each(begin(M),
             end(M),
             [](pair<string, int> const & a) 
             {
                 cout << a.first << "  " <<  a.second << endl;
             }
    );

    return 0;
}

Any tips or advice would be great!

  • What exactly are you trying to achieve? You cannot "compare" a string with a set; they represent different concepts altogether. Are you trying to see if the string belongs to the set? Commented Nov 24, 2015 at 23:02
  • @vsoftco I'm trying to read sample.txt and use copy_if to copy only the words that are not excluded into a map container. Commented Nov 24, 2015 at 23:19
  • The copy_if predicate needs to return true if you want the string copied, and false if not. Commented Nov 24, 2015 at 23:21

2 Answers


I'd do it like this, using a lambda expression as your test, so this can get you started:

#include <set>
#include <string>
#include <fstream>
#include <iostream>
#include <algorithm>
#include <iterator>

using namespace std;

int main() 
{
    ifstream in("document.txt");
    ifstream exW("excluded.txt");

    set<string> words{istream_iterator<string>(exW),{}}; // here we store the excluded words

    copy_if(istream_iterator<string>(in),
            istream_iterator<string>(), // can also use just {} instead
            ostream_iterator<string>(std::cout," "), // output to std::cout
            [&words](const std::string& word) // this is how the predicate should look
            {
                return words.find(word) == words.end(); // true if not found
            }
            );
}

Note that I output directly to std::cout in the std::copy_if call. You can of course use an iterator into some container instead (your std::map, for example). Also note that the predicate takes a std::string as input (that's the value being tested) and checks whether it belongs to the std::set of excluded words, returning a bool: true if the word should be copied. Finally, words needs to be captured by the lambda. I capture it by reference so you don't end up with an additional copy.


2 Comments

@vsoftco Thanks, vsoft! Is there any source you recommend for learning about and better understanding STL containers and iterators?
@Nahniv stackoverflow.com/questions/388242/… The ultimate standard library book (IMO) is The C++ Standard Library by Nicolai Josuttis.

If you need to use a standard algorithm instead of a loop, I suggest the standard algorithm std::accumulate, declared in the header <numeric>.

Here is a demonstration program. Instead of files, I am using string streams.

#include <iostream>
#include <set>
#include <map>
#include <string>
#include <sstream>
#include <numeric>
#include <iterator>

int main( void )
{
    std::istringstream exclude( "two four six" );
    std::set<std::string> words( ( std::istream_iterator<std::string>( exclude ) ),
                                 std::istream_iterator<std::string>() ); 

    for ( const auto &t : words ) std::cout << t << ' ';
    std::cout << std::endl;

    std::cout << std::endl;

    std::map<std::string, int> m;

    std::istringstream include( "one two three four five six five four one one" );

    std::accumulate( std::istream_iterator<std::string>( include ),
                     std::istream_iterator<std::string>(),
                     &m,
                     [&]( std::map<std::string, int> *acc, const std::string &t )
                     {
                         if ( !words.count( t ) ) ++( *acc )[t];
                         return acc;
                     } );

    for ( const auto &p : m ) std::cout << p.first << '\t' << p.second << std::endl;                     
}

The program output is

four six two 

five    2
one     3
three   1

For readability, the lambda definition can be placed outside the algorithm call. For example:

auto add_if_not_in_set = [&]( std::map<std::string, int> *acc, const std::string &t )
{
    if ( !words.count( t ) ) ++( *acc )[t];
    return acc;
};

//...

std::accumulate( std::istream_iterator<std::string>( include ),
                 std::istream_iterator<std::string>(),
                 &m, add_if_not_in_set );

Or, as @T.C. pointed out, a simpler approach is to use the standard algorithm std::for_each.

For example:

#include <iostream>
#include <set>
#include <map>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>

int main( void )
{
    std::istringstream exclude( "two four six" );
    std::set<std::string> words( ( std::istream_iterator<std::string>( exclude ) ),
                                 std::istream_iterator<std::string>() ); 

    for ( const auto &t : words ) std::cout << t << ' ';
    std::cout << std::endl;

    std::cout << std::endl;

    std::map<std::string, int> m;


    std::istringstream include( "one two three four five six five four one one" );

    std::for_each( std::istream_iterator<std::string>( include ),
                   std::istream_iterator<std::string>(),
                   [&m, &words]( const std::string &s )
                   {
                       if ( !words.count( s ) ) ++m[s];
                   } );

    for ( const auto &p : m ) std::cout << p.first << '\t' << p.second << std::endl;                     
}

Usually the same task can be done in several ways using different algorithms.:)

2 Comments

Why are you using accumulate and passing an accumulator that doesn't get changed ever? std::for_each( std::istream_iterator<std::string>( include ), std::istream_iterator<std::string>(), [&]( const std::string &t ) { if ( !words.count( t ) ) ++m[t]; } ); is shorter and arguably easier to understand.
@T.C. I did not use std::for_each because I thought that it is the same as the range-based for loop.:) So I thought about some other algorithm.:)
