You've got an algorithm for "k-combinations": given n items, select k of them, treating ordering as irrelevant. From the ancients, we know how many combinations to expect:
n!
-----------
(n - k)! k!
For a given n (for example, 10), that expression is maximized when k equals half of n (5). As k approaches either extreme (1 or n), the number of combinations gets much smaller.
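As a quick sanity check of that claim, here's a sketch using Python's math.comb (available in Python 3.8+):

```python
from math import comb

# Number of k-combinations of n = 10 items, for every k from 0 to 10.
counts = [comb(10, k) for k in range(11)]
print(counts)                      # [1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1]

# The count peaks at k = 5, i.e. half of n.
print(counts.index(max(counts)))   # 5
```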
With a little bit of reorganizing and simplifying, we can rewrite your code so that the number of calls to combos() is roughly equal to the number of combinations in the worst case. Interestingly, the number of calls and the number of combinations have a nicely symmetrical inverse relationship.
Most important, both are bounded above by that formula evaluated at the worst case. That is effectively the O() bound you're asking for, though perhaps not exactly, because the rewritten code makes fewer subroutine calls than your code even though both produce the same results. The short-circuiting logic in the example below prevents the extra calls and thus allows the worst-case argument to operate cleanly.
If that formula is your worst-case bound, does your algorithm run in polynomial time? I'm more intuitive than expert on such matters, but I think the answer is no. The worst case is when k = n / 2, which gives you the following simplification. Even though the denominator gets really big really fast, it pales in comparison to the Chuck-Norris rate of growth of the numerator.
n!
-------------
(n/2)! (n/2)!
# For example, when n = 40.

product(1..40)                    product(21..40)    # Eat my dust, Homer!
-----------------------------  =  ---------------
product(1..20) product(1..20)     product(1..20)     # Doh!

# Q.E.D.
An empirical illustration for many values of n and k:
from itertools import combinations
from math import factorial

n_calls = 0

def combos(vals, size):
    # Track the number of calls.
    global n_calls
    n_calls += 1
    # Basically your algorithm, but simplified
    # and written as a generator.
    for i in range(0, len(vals) - size + 1):
        v = vals[i]
        if size == 1:
            yield [v]
        else:
            for c in combos(vals[i + 1:], size - 1):
                yield [v] + c

def expected_n(n, k):
    # The mathematical formula for expected N of k-combinations.
    return factorial(n) // (factorial(n - k) * factorial(k))

def main():
    global n_calls
    # Run through a bunch of values for n and k.
    max_n = 15
    for n in range(1, max_n + 1):
        # Worst case is when k is half of n.
        worst_case = expected_n(n, n // 2)
        for k in range(1, n + 1):
            # Get the combos and count the calls.
            n_calls = 0
            vs = list(range(n))
            cs = list(combos(vs, k))
            # Our result agrees with:
            # - itertools.combinations
            # - the math
            # - the worst-case analysis
            assert cs == [list(c) for c in combinations(vs, k)]
            assert len(cs) == expected_n(n, k)
            assert n_calls <= worst_case
            assert len(cs) <= worst_case
            # Inspect the numbers for one value of n.
            if n == max_n:
                print([n, k, len(cs), n_calls])

main()
Output:
[15, 1, 15, 1]
[15, 2, 105, 15]
[15, 3, 455, 105]
[15, 4, 1365, 455]
[15, 5, 3003, 1365]
[15, 6, 5005, 3003]
[15, 7, 6435, 5005]
[15, 8, 6435, 6435]
[15, 9, 5005, 6435]
[15, 10, 3003, 5005]
[15, 11, 1365, 3003]
[15, 12, 455, 1365]
[15, 13, 105, 455]
[15, 14, 15, 105]
[15, 15, 1, 15]
The algorithm generates \binom{len(list)}{size} elements, so the complexity can't be less than that (and if my envelope-less back-of-envelope calculations are right, that's the complexity you have). For fixed size, it's polynomial in len(list). combo(list, len(list)//2) is O(2^len(list) / sqrt(len(list))).
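That 2^n / sqrt(n) figure comes from Stirling's approximation, which gives C(n, n/2) ~ 2^n / sqrt(pi * n / 2) for even n. A quick empirical check of that asymptotic (using math.comb, Python 3.8+):

```python
from math import comb, pi, sqrt

# Stirling: C(n, n // 2) ~ 2**n / sqrt(pi * n / 2) for even n.
# The ratio of exact to approximate should approach 1 from below.
for n in (10, 100, 1000):
    exact = comb(n, n // 2)
    approx = 2 ** n / sqrt(pi * n / 2)
    print(n, exact / approx)
```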