Advanced combinatorics in Python: binom(n,2) subsets of binom(n,3) combinations without repetition

Question

I have a list d_n of n integers and the results Z_klm of a function fun(dk, dl, dm) for all binom(n, 3) combinations without repetition (k, l, m) out of d_n indices.

Now, for all binom(n, 2) combinations without repetitions (s, t) of d_n indices, I need to take the T_st partial sums of Z_klm where (s, t) is a subset of (k, l, m).

Here's an example with a small n.

import numpy as np
from itertools import combinations
from scipy.special import binom

# generate random data
n = 5
d_n = np.random.randint(-30, 30, 5)

# define the function
def fun(dk, dl, dm):
    return np.sign(dk - 2*dl + dm)

# calculate Z_klm and store (k,l,m) indices
klm = []
Z_klm = []
for k, l, m in combinations(range(n), 3):
    Z_klm.append(fun(d_n[k], d_n[l], d_n[m]))
    klm.append([k, l, m])

# calculate the partial sums T_st
st = {}
T_st = np.zeros(shape=int(binom(n, 2)))
for h, (s, t) in enumerate(combinations(range(n), 2)):
    st.update({f"({s},{t})": []})
    for i, _klm_ in enumerate(klm):
        if s in _klm_ and t in _klm_:
            T_st[h] += Z_klm[i]
            st[f"({s},{t})"].append(_klm_)

T_st, st

(array([ 1.,  1.,  2.,  2.,  1.,  1.,  1.,  1.,  1., -2.]),

 {'(0,1)': [[0, 1, 2], [0, 1, 3], [0, 1, 4]],
  '(0,2)': [[0, 1, 2], [0, 2, 3], [0, 2, 4]],
  '(0,3)': [[0, 1, 3], [0, 2, 3], [0, 3, 4]],
  '(0,4)': [[0, 1, 4], [0, 2, 4], [0, 3, 4]],
  '(1,2)': [[0, 1, 2], [1, 2, 3], [1, 2, 4]],
  '(1,3)': [[0, 1, 3], [1, 2, 3], [1, 3, 4]],
  '(1,4)': [[0, 1, 4], [1, 2, 4], [1, 3, 4]],
  '(2,3)': [[0, 2, 3], [1, 2, 3], [2, 3, 4]],
  '(2,4)': [[0, 2, 4], [1, 2, 4], [2, 3, 4]],
  '(3,4)': [[0, 3, 4], [1, 3, 4], [2, 3, 4]]})

For example T_{2,4} is the sum of Z_{0,2,4}, Z_{1,2,4}, and Z_{2,3,4}.

My rough implementation is working only because there are very few observations here. But in case of real sample sizes (can usually be up to n = 1000), it would take a lifetime to iterate all over the binom(n,2) and the binom(n,3).

Any suggestions to speed it up with an efficient algorithm or more advanced iteration tools?

P.S.: I do not need to store all (k, l, m) and (s, t) indices; I only did it here to demonstrate how it works and to implement the algorithm for calculating the T_st partial sums.

You are asking how to do this for any fun, correct? And are you only interested in the '2 out of 3' case or are you looking for a solution where the tuple-size might vary? — Grismar
– Grismar, Commented Oct 31 at 23:10
Hi @Grismar. Yes, for any fun taking 3 integer arguments and always '2 out of 3' case. — Max Pierini
– Max Pierini, Commented Nov 1 at 9:45
what is a "(s, t) is a subset of (k, l, m)"? For a single (k, l, m) you get the followings (k, l), (k, m), (l, m)? This will possibly generate repeated pairs — cards
– cards, Commented Nov 1 at 9:54
@cards I mean, for each (s,t) combination without repetition of n elements, where (k,l)=(s,t) or (k,m)=(s,t) or (l,m)=(s,t). I know it isn't a rigorous definition. If it could help, I'll post a LaTeX formula. — Max Pierini
– Max Pierini, Commented Nov 1 at 10:47

Omgitsbees · Accepted Answer · 2025-11-01 00:37:15Z

1

For large n (like 1000), you will want to avoid the nested loop structure. Here is my recommended code for better performance with n=1000:

def compute_partial_sums_optimal(d_n, fun):
    n = len(d_n)
    pairs_list = list(combinations(range(n), 2))
    n_pairs = len(pairs_list)
    
    T_st = np.zeros(n_pairs)
    
    # For each pair (s,t), sum over all third indices
    for i, (s, t) in enumerate(pairs_list):
        # All triples containing (s,t)
        for r in range(n):
            if r != s and r != t:
                k, l, m = sorted([s, t, r])
                T_st[i] += fun(d_n[k], d_n[l], d_n[m])
    
    return T_st

# Benchmark
import time

n = 100  # Start with n=100 for testing
d_n = np.random.randint(-30, 30, n)

start = time.time()
result = compute_partial_sums_optimal(d_n, fun)
print(f"Time for n={n}: {time.time() - start:.3f}s")

answered Nov 1 at 0:37

Omgitsbees

2564 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Max Pierini Nov 1 at 9:51

Nice idea, thanks. I'd only avoid the pairs_list = list(combinations(range(n), 2)) cause it could be extremely large. We know n_pairs = int(binom(n, 2)) and the cycle could be for i, (s, t) in enumerate(combinations(range(n), 2)):. What do you think about it?

Stef Nov 3 at 14:40

Yes, you can replace len(list(itertools.combinations(range(n), k))) with the much more efficient math.comb(n, k), and then iterate directly using iterators, for i,(s, t) in enumerate(combinations(range(n), 2)): and DO NOT call list( ) on it

Collectives™ on Stack Overflow

Advanced combinatorics in Python: binom(n,2) subsets of binom(n,3) combinations without repetition

1 Answer 1

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related