
I have a structure:

struct wordItem
{
   string word;
   int count;
};

I'm reading in a text file with many different words and storing them into an array.

    ifstream inputFile("data.txt");
    if(inputFile.is_open())
    {
        while(getline(inputFile, data, ' '))
        {
            wordItemList[i].word = data;
            i++;
        }
    }

My question is: what is the best way to count how many times each word appears in the array? For example, if my data.txt file was

the fox jumped over the fence

I want to store how many times each word appears in the struct's "int count;" member.

  • You need a data structure that maps a word to a counter. In C++ we have std::map. If this is homework, you should tell us how far along you are; maybe std::map is not yet an option for you. Commented Sep 5, 2016 at 20:55
  • std::map<std::string, int> or std::unordered_map<std::string, int>. Commented Sep 5, 2016 at 20:57
  • Concur with Cornstalks. Something like this. Apologies in advance for any syntax errors. (Change every unordered_map to map if you want ordering.) Yes, it really is that simple. In short, you don't really need that structure; the map will hold the count for you as the mapped-to value. Commented Sep 5, 2016 at 21:06
  • C++ does not have structures. You have a class. Commented Sep 5, 2016 at 21:14

2 Answers

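ifstream inputFile("data.txt");
if(!inputFile.is_open()) {
    cerr << "Can't open data.txt\n";
    exit(1); // non-zero exit status signals failure
}

map<string, int> freq; // word -> number of occurrences
string word;
// operator>> splits on any whitespace, so words at line ends are handled too
while(inputFile >> word)
    ++freq[word]; // operator[] inserts a missing word with count 0 first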


Use a std::multiset or std::unordered_multiset. Which one performs better depends on your data set, so measure both to find the better fit in practice. Something like this would work (adapt it to your file-reading code):

#include <iostream>
#include <string>
#include <unordered_set>

int main() {

    // A multiset keeps every inserted copy, so duplicates can be counted.
    std::unordered_multiset<std::string> dict;

    for (auto&& s : {"word1", "word2", "word1"}) {
       dict.insert(s);
    }

    std::cout << dict.count("word1") << std::endl; // prints 2
    return 0;
}

Depending on the data set and its size, you could also use a data structure better optimised for storing and comparing strings, such as a trie. However, a trie is not available in the standard library or in Boost, and most of the time it is a bit of overkill IMHO (although you can find some implementations).
