0

I want to check for a word contained within a bigger string, but not necessarily in the same order. Example: The program will check if the word "car" exists in "crqijfnsa". In this case, it does, because the second string contains c, a, and r.

2
  • In worst case, you could call std::find multiple times or just loop yourself. Not sure if there is an algorithm ready for this. Commented Apr 5, 2014 at 22:37
  • A naive O(n+m) way of doing it (where n is the total length of the two strings and m is the number of unique characters in the string to search for), as @AdrianKrupa suggests, is to convert both strings into a map of character counts, then check to see if the searched for string's character counts can exist in the jumbled string's character counts. Commented Apr 5, 2014 at 22:45

5 Answers 5

1

You could build a map containing the letters "car" with the values set to 0. Cycle through the array with all the letters and if it is a letter in the word "car" change the value to 1. If all the keys in the map have a value greater than 0, than the word can be constructed. Try implementing this.

Sign up to request clarification or add additional context in comments.

1 Comment

Assuming the OP needs to make the general case and not just car, and could have duplicate letters in the search string, it would be better to make the search map a count of the search string's characters and then decrement them as you go through the jumble-string. If all are less than zero, you've got a match.
1

An anagram is a type of word play, the result of rearranging the letters of a word or phrase to produce a new word or phrase, using all the original letters exactly once;

So, actually what you are looking for is an algorithm to check if two words are "Anagrams" are not.

Following thread provides psuedocode that might be helpful Finding anagrams for a given word

Comments

0

A very primitive code would be something like this:

for ( std::string::iterator it=str.begin(); it!=str.end(); ++it)
    for ( std::string::iterator it2=str2.begin(); it2!=str2.end(); ++it2) {
        if (*it == *it2) {
            str2.erase(it);
            break;
        }
    }

if (str2.empty())
    found = true;

1 Comment

This code has issues with multiple occurences of the same character though like "ccar", but it could be a good basis, however.
0

You could build up a table of count of characters of each letter in the word you are searching for, then decrement those counts as you work through the search string.

bool IsWordInString(const char* word, const char* str)
{
    // build up table of characters in word to match
    std::array<int, 256> cword = {0};
    for(;*word;++word) {
        cword[*word]++;
    }
    // work through str matching characters in word
    for(;*str; ++str) {
        if (cword[*str] > 0) {
            cword[*str]--;
        }
    }
    return std::accumulate(cword.begin(), cword.end(), 0) == 0;
}

It's also possible to return as soon as you find a match, but the code isn't as simple.

bool IsWordInString(const char* word, const char* str)
{
    // empty string
    if (*word == 0)
        return true;
    // build up table of characters in word to match
    int unmatched = 0;
    char cword[256] = {0};
    for(;*word;++word) {
        cword[*word]++;
        unmatched++;
    }
    // work through str matching characters in word
    for(;*str; ++str) {
        if (cword[*str] > 0) {
            cword[*str]--;
            unmatched--;
            if (unmatched == 0)
                return true;
        }
    }
    return false;
}

Some test cases

"" in "crqijfnsa" => 1
"car" in "crqijfnsa" => 1
"ccar" in "crqijfnsa" => 0
"ccar" in "crqijfnsac" => 1

Comments

0

I think the easiest (and probably fastest, test that youself :) ) implementation would be done with std::includes:

std::string testword {"car"};
std::string testarray {"crqijfnsa"};

std::sort(testword.begin(),testword.end());
std::sort(testarray.begin(),testarray.end());

bool is_in_array = std::includes(testarray.begin(),testarray.end(),
    testword.begin(),testword.end());

This also handles all cases of duplicate letters correctly. The complexity of this approach should be O(n * log n) where n is the length of testarray. (sort is O(n log n) and includes has linear complexity.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.