What is the time complexity of the following code- find smallest missing number in an array

Question

the problem statement is:

given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A.

For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.

Given A = [1, 2, 3], the function should return 4.

Given A = [−1, −3], the function should return 1.

Can anyone tell me what the time-complexity of the following solution to the code be:

def solution(A):
    m = max(A)
    if m < 1:
       return 1

    A = set(A)
    B = set(range(1, m + 1))
    D = B - A
    if len(D) == 0:
        return m + 1
    else:
        return min(D)

It has to be O(n) or greater because we are finding the max. I had a doubt if creating a set from range 1 to the max element will be nlogn as the result will be a sorted set of elements..?

no, it's not, a number could be skipped like the first example — Karina
– Karina, Commented Aug 26, 2021 at 18:30
You might get a better answer if you ask a more focused question. It seems like you're most interested in set(range(1, m + 1)), though I think set(A) and B - A are also very interesting for this question. — Kyle Parsons
– Kyle Parsons, Commented Aug 26, 2021 at 18:35
set(B) is not sorted, because sets are not sorted in python. Consider solution([1,2,3]) and solution([1,2,1_000_000]): 1.21µs vs 107ms. Your algorithm's runtime depends mostly on the largest number in your array, because of set(range(1, m+1)). — Michael Szczesny
– Michael Szczesny, Commented Aug 26, 2021 at 18:41

user1984 · Accepted Answer · 2021-08-27 07:14:14Z

1

The time complexity of this solution is linear: (O(m + n)) where m is the maximum element in A and n is the length of the input array A).

Here's a break down:

max(A) is O(n) since you have to look at each element of A.
set(A) is O(n) since you have to look at each lement of A.
set(range(1, m + 1)) is O(m) since you iterate over every element between 1 and m.

The result is O(m + n). While we can ignore any multiple of n or m (you don't say O(2n)), we can't ignore the minimum of n and m. The reason is that there is no guaranteed relation between n and m. They can be as different as 1 and 1e64, or vice versa.

On a sidenote, this algorithm is really really bad. You can solve this problem in O(n) time and O(n) additional space.

A solution with O(n) time and space complexity could be as follows:

def solve(A):
  seen = set()
  res = 1
  for el in A:
    seen.add(el):
    while res in seen:
      res += 1
  return res

It works as follows:

Init res to the best possible result (1)
keep track of seen elements with a set for constant time lookups.
iterate over As elements
add them to the set.
increment res while it is present in seen

The reason this is of O(n) time complexity is that in the worst case the elements of the input array don't have any gap between them, start at one, and we have to increment res n times.

If there's any gap, we've found it already and sit there not incrementing further.

edited Aug 27, 2021 at 7:14

answered Aug 26, 2021 at 20:36

user1984

6,9383 gold badges17 silver badges35 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Stef Over a year ago

In the literal sense, "O(m + n) where m is the maximum element in A" is exponential, not linear.

prison-mike Over a year ago

Thank you so much. So what would a better algorithm be?

user1984 Over a year ago

added a better algorithm @prison-mike

Collectives™ on Stack Overflow

What is the time complexity of the following code- find smallest missing number in an array

1 Answer 1

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related