0

This is probably a simple question, but I can't get it to work. I've searched and have tried all the suggestions people gave for the OR operator ||, but my code just won't properly use it.

So here's what I've got: This part of my code looks at a vector (currently around ~20,000 entries long, but later will be in the millions) and erases all elements that contain the keywords: " event", " /event", " rwgt", and " /rwgt". It works perfectly fine when I use four for loops, one for each keyword like so:

for (int j = 0; j< myvec.size()-1;j++) {

    if(myvec[j] == "  <event>") {    //erase all instances of "<event>"
        myvec.erase(myvec.begin()+j);
    }
}

for (int j = 0; j< myvec.size()-1; j++) {

    if(myvec[j] == "  </event>") {   //erase all instances of "</event>"
        myvec.erase(myvec.begin()+j);
    }
}

for (int j = 0; j< myvec.size()-1; j++) {

    if(myvec[j] == "  <rwgt>") {   //erase all instances of "<rwgt>"
        myvec.erase(myvec.begin()+j);
    }
}

for (int j = 0; j< myvec.size()-1; j++) {

    if(myvec[j] == "  </rwgt>") {   //erase all instances of "</rwgt>"
        myvec.erase(myvec.begin()+j);
    }
}

However this is getting fairly computationally expensive (it takes a few minutes right now with only 20,000 entries; I can't imagine it when we get to the millions!) So I wanted to combine all four keywords into one for loop using the || (OR) operator like so:

for (int j = 0; j< myvec.size()-1;j++) {

if(myvec[j] == "  <event>" || myvec[j]  == "  </event>" || myvec[j] == "  <rwgt>" || myvec[j] == "  </rwgt>") {  //erase all instances of "<event>"
    myvec.erase(myvec.begin()+j);
}
}

However this is taking even longer than the original four for loops, and it ends up giving me a segmentation fault (core dumped) error, which to my knowledge means the vector is now empty somehow.

Does anyone have any ideas on how to fix this?

Thanks in advance!

3
  • 2
    C++ have many nice algorithmic function in the standard library, including a function to remove entries from a range. Use it with the erase-remove idiom. Commented Jun 16, 2015 at 13:32
  • 3
    Also, if you are in fact parsing some XML, use a specialized parser. Commented Jun 16, 2015 at 13:34
  • 1
    Please note your code as written is buggy and will skip some elements. This is due to you erasing an item and then moving on. If, for example, you erase something at position '4' you'll then check position 5. However, when you erase position 4 the rest of the vector will move up putting something new in position 4 but you wont run the check on this... Commented Jun 16, 2015 at 14:06

3 Answers 3

3

You can use functor and std::remove_if() to do this.

#include <iostream>
#include <vector>
#include <string>
#include <algorithm>


struct removeTag
{
    bool operator()(const std::string& tag) const
    {
        return tag == "  <event>" || tag == "  </event>" || tag == "  <rwgt>" || tag == "  </rwgt>";
    }
};

int main()
{
    std::vector<std::string> data = { "  <event>", "  </event>", "  <rwgt>", "  </rwgt>" };
    auto it = std::remove_if(data.begin(), data.end(), removeTag());
    data.erase(it, data.end());
    std::cout << data.size();
    std::cin.get();
    return 0;
}

Ouput:

0
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you for your input! I tried using the method you gave but my compiler didn't like it, it seems like I need to install c++11 in order for it to work. Guess that's what I'll be doing next!
Yes std::vector<std::string> data = { " <event>", " </event>", " <rwgt>", " </rwgt>" }; requires C++11 as it uses an initializer list.
2

Use erase-remove idiom.

myvec.erase(
    std::remove_if(myvec.begin(), myvec.end(), [](std::string const &e) {
      return e == " blabla" || e == "blasd";
    }), myvec.end());

Comments

1

I tried your code, and it worked fine after I replaced

for(int j = 0; j < myvec.size()-1; j++)

with

for(int j = 0; j < myvec.size(); j++)

and

myvec.erase(myvec.begin() + j);

with

myvec.erase(myvec.begin() + j--);

My Testprogramm looked like this:

#include <iostream>
#include <string>
#include <vector>

using namespace std;

int main()
{
   vector<string> myvec = {"<event>", "nichts", "</event>", "<rwgt>", "</rwgt>"};

   for (int j = 0; j < myvec.size(); j++) {

       if(myvec[j] == "<event>" || myvec[j] == "</event>" || 
             myvec[j] == "<rwgt>" || myvec[j] == "</rwgt>") {   

          myvec.erase(myvec.begin() + j--);
       }
   }

   for(int i = 0; i < myvec.size(); i++) {
      cout<<myvec[i]<<endl;
   }
   return 0;
}

1 Comment

@khfrekek This solution is far less efficient than the ones presented in the other two answers, which is why you're seeing such poor performance. Think (and learn) about what happens in which case, and don't use this code in practice.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.