I have a list d_n of n integers and the results Z_klm of a function fun(dk, dl, dm) for all binom(n, 3) combinations without repetition (k, l, m) out of d_n indices.
Now, for all binom(n, 2) combinations without repetitions (s, t) of d_n indices, I need to take the T_st partial sums of Z_klm where (s, t) is a subset of (k, l, m).
Here's an example with a small n.
import numpy as np
from itertools import combinations
from scipy.special import binom
# generate random data
n = 5
d_n = np.random.randint(-30, 30, 5)
# define the function
def fun(dk, dl, dm):
return np.sign(dk - 2*dl + dm)
# calculate Z_klm and store (k,l,m) indices
klm = []
Z_klm = []
for k, l, m in combinations(range(n), 3):
Z_klm.append(fun(d_n[k], d_n[l], d_n[m]))
klm.append([k, l, m])
# calculate the partial sums T_st
st = {}
T_st = np.zeros(shape=int(binom(n, 2)))
for h, (s, t) in enumerate(combinations(range(n), 2)):
st.update({f"({s},{t})": []})
for i, _klm_ in enumerate(klm):
if s in _klm_ and t in _klm_:
T_st[h] += Z_klm[i]
st[f"({s},{t})"].append(_klm_)
T_st, st
(array([ 1., 1., 2., 2., 1., 1., 1., 1., 1., -2.]),
{'(0,1)': [[0, 1, 2], [0, 1, 3], [0, 1, 4]],
'(0,2)': [[0, 1, 2], [0, 2, 3], [0, 2, 4]],
'(0,3)': [[0, 1, 3], [0, 2, 3], [0, 3, 4]],
'(0,4)': [[0, 1, 4], [0, 2, 4], [0, 3, 4]],
'(1,2)': [[0, 1, 2], [1, 2, 3], [1, 2, 4]],
'(1,3)': [[0, 1, 3], [1, 2, 3], [1, 3, 4]],
'(1,4)': [[0, 1, 4], [1, 2, 4], [1, 3, 4]],
'(2,3)': [[0, 2, 3], [1, 2, 3], [2, 3, 4]],
'(2,4)': [[0, 2, 4], [1, 2, 4], [2, 3, 4]],
'(3,4)': [[0, 3, 4], [1, 3, 4], [2, 3, 4]]})
For example T_{2,4} is the sum of Z_{0,2,4}, Z_{1,2,4}, and Z_{2,3,4}.
My rough implementation is working only because there are very few observations here. But in case of real sample sizes (can usually be up to n = 1000), it would take a lifetime to iterate all over the binom(n,2) and the binom(n,3).
Any suggestions to speed it up with an efficient algorithm or more advanced iteration tools?
P.S.: I do not need to store all (k, l, m) and (s, t) indices; I only did it here to demonstrate how it works and to implement the algorithm for calculating the T_st partial sums.
fun, correct? And are you only interested in the '2 out of 3' case or are you looking for a solution where the tuple-size might vary?funtaking 3 integer arguments and always '2 out of 3' case.(k, l, m)you get the followings(k, l),(k, m),(l, m)? This will possibly generate repeated pairs(s,t)combination without repetition ofnelements, where(k,l)=(s,t)or(k,m)=(s,t)or(l,m)=(s,t). I know it isn't a rigorous definition. If it could help, I'll post a LaTeX formula.