What's the complexity of this algorithm?

Question

Here is an algorithm counting occurrences of anagrams of one string (search_word) in the other (text):

#include<iostream>
#include<algorithm>
#include<string>
#include<deque>
using namespace std;

int main()
{
    string text = "forxxorfxdofr";
    string search_word = "for";
    deque<char> word;
    word.insert(word.begin(), text.begin(), text.begin() +  search_word.size());
    int ana_cnt = 0;

    for (int ix = 3; ix <= text.size(); ++ix)
    {   
            deque<char> temp = word;
            sort(word.begin(), word.end());
            if (string(word.begin(), word.end()) == search_word)
                    ++ana_cnt; 
            word = temp;    
            word.pop_front();
            word.push_back(text[ix]);
    }   
    cout << ana_cnt << endl;
}

What's the complexity of this algorithm?

I think it's O(n) algorithm, where n is the length o text. This is because the amount of time needed to execute what is inside for loop is independent of the lenght of n. However, some think it is not O(n). They say the sorting algorithm also counts when computing complexity.

Its impossible that it was O(n) cos any sorting algorithm has at least O(nlogn) complexity, and you are doing that in the body of a loop, so the complexity of that algorithm is O(n^2logn) at least — Manu343726
– Manu343726, Commented Sep 15, 2013 at 16:09
@us2012 ok, its only other variable to consider: O(n*mlogm) which is O(n^2logn) complexity in any practical sense — Manu343726
– Manu343726, Commented Sep 15, 2013 at 16:13
Side remark: text.begin() + 3 should probably better be text.begin() + search_word.length(). — dyp
– dyp, Commented Sep 15, 2013 at 16:19
@us2012 They are not independent. m is bounded by n, i.e. m = O(n). Therefore O(nm log m) = O(n^2 log n). So yes, you are wrong. You’d be right for Θ, but O is an upper bound. — Konrad Rudolph
– Konrad Rudolph, Commented Sep 15, 2013 at 16:24
@us2012 You’re correct there. For that reason I intensely dislike the convention of equalling O-notation in computer science. More mathematically correct would be to use “∈” instead of “=” to make it clear that they are not equivalent. And yes, you should generally give bounds as sharply as possible. Regarding the case m>n, in a sane algorithm this would be caught with a simple check in O(1), or in the worst case after probing n characters. — Konrad Rudolph
– Konrad Rudolph, Commented Sep 15, 2013 at 16:36

stefan · Accepted Answer · 2013-09-15 16:48:57Z

1

It's O(n) if you only consider the string text with length n as input.

Proof: You're looping over ix from 3 (probably search_word.size(), isn't it?) to text.size(), so asymptotically you execute the loop body n times (since there is no break, continue or modification of ix in the loop body).

The loop body is independent of n. It sorts a queue of fixed size, namely m = search_word.size(), that is O(m log(m)) in the average case (worst case O(m^2)). As this is independent of n we're done with a total of O(n).

It's not O(n): If you want to be a little bit more precise, you'd probably count search_word with length m as input and this comes to a total of O(n m log(m)) on average, O(n m^2) in the worst case.

answered Sep 15, 2013 at 16:48

stefan

10.4k4 gold badges55 silver badges91 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

What's the complexity of this algorithm?

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related