0

Here is an algorithm counting occurrences of anagrams of one string (search_word) in the other (text):

#include<iostream>
#include<algorithm>
#include<string>
#include<deque>
using namespace std;

int main()
{
    string text = "forxxorfxdofr";
    string search_word = "for";
    deque<char> word;
    word.insert(word.begin(), text.begin(), text.begin() +  search_word.size());
    int ana_cnt = 0;

    for (int ix = 3; ix <= text.size(); ++ix)
    {   
            deque<char> temp = word;
            sort(word.begin(), word.end());
            if (string(word.begin(), word.end()) == search_word)
                    ++ana_cnt; 
            word = temp;    
            word.pop_front();
            word.push_back(text[ix]);
    }   
    cout << ana_cnt << endl;
}

What's the complexity of this algorithm?

I think it's O(n) algorithm, where n is the length o text. This is because the amount of time needed to execute what is inside for loop is independent of the lenght of n. However, some think it is not O(n). They say the sorting algorithm also counts when computing complexity.

16
  • 10
    Its impossible that it was O(n) cos any sorting algorithm has at least O(nlogn) complexity, and you are doing that in the body of a loop, so the complexity of that algorithm is O(n^2logn) at least Commented Sep 15, 2013 at 16:09
  • 1
    @us2012 ok, its only other variable to consider: O(n*mlogm) which is O(n^2logn) complexity in any practical sense Commented Sep 15, 2013 at 16:13
  • 3
    Side remark: text.begin() + 3 should probably better be text.begin() + search_word.length(). Commented Sep 15, 2013 at 16:19
  • 2
    @us2012 They are not independent. m is bounded by n, i.e. m = O(n). Therefore O(nm log m) = O(n^2 log n). So yes, you are wrong. You’d be right for Θ, but O is an upper bound. Commented Sep 15, 2013 at 16:24
  • 3
    @us2012 You’re correct there. For that reason I intensely dislike the convention of equalling O-notation in computer science. More mathematically correct would be to use “∈” instead of “=” to make it clear that they are not equivalent. And yes, you should generally give bounds as sharply as possible. Regarding the case m>n, in a sane algorithm this would be caught with a simple check in O(1), or in the worst case after probing n characters. Commented Sep 15, 2013 at 16:36

1 Answer 1

1

It's O(n) if you only consider the string text with length n as input.

Proof: You're looping over ix from 3 (probably search_word.size(), isn't it?) to text.size(), so asymptotically you execute the loop body n times (since there is no break, continue or modification of ix in the loop body).

The loop body is independent of n. It sorts a queue of fixed size, namely m = search_word.size(), that is O(m log(m)) in the average case (worst case O(m^2)). As this is independent of n we're done with a total of O(n).

It's not O(n): If you want to be a little bit more precise, you'd probably count search_word with length m as input and this comes to a total of O(n m log(m)) on average, O(n m^2) in the worst case.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.