1

Input:

import numpy as np
import itertools
a = np.array([2,  3,  5, 13, 14, 16, 19, 20, 22, 29, 30, 34, 41, 43, 45, 49, 50,
       54, 60, 63, 66, 73, 75, 79, 82, 93]).astype(np.uint8)
b = np.array([ 9, 10, 11, 13, 14, 16, 17, 20, 22, 28, 30, 35, 36, 41, 45, 47, 49,
       54, 58, 63, 64, 66, 68, 73, 74, 76, 78, 83, 84, 86, 87, 92, 94, 95]).astype(np.uint8)
c = np.array([ 8, 11, 15, 17, 28, 32, 36, 38, 39, 42, 47, 52, 53, 58, 61, 62, 64,
       68, 69, 70, 71, 74, 78, 79, 80, 81, 83, 84, 85, 86, 89, 92]).astype(np.uint8)
d = np.array([ 6,  8, 12, 15, 21, 23, 25, 26, 31, 32, 37, 38, 44, 51, 52, 53, 56,
       57, 61, 62, 65, 67, 70, 71, 72, 81, 85, 89, 90]).astype(np.uint8)
e = np.array([ 0,  1,  4,  7, 18, 21, 23, 24, 25, 26, 27, 33, 40, 44, 46, 48, 51,
       55, 56, 57, 59, 65, 67, 77, 88, 90, 91]).astype(np.uint8)

Current Solution:

# Get product and ensure no element is repeated
main_combos = np.array([i for i in itertools.product(a, b, c, d, e) if len(set(i)) == 5])

# Remove duplicates (should not return both [13, 14, 8, 6, 0], [14, 13, 8, 6, 0])
u = np.sort(main_combos, axis=1)
_, idx = np.unique(u, axis=0, return_index=True)
main_combos = main_combos[idx]

This currently takes ~41 seconds on my machine to get all 14,133,909 rows, I was wondering if the performance of this can be improved.

The order of each individual row needs to stay the same for example the first element of each array [2, 9, 8, 6, 0] should be in this same order in the final output.

1 Answer 1

1

Your solution takes 58 seconds on my machine.

If you dont require the use of Numpy, the following runs in 23 seconds on my machine, using pure Python

main_combos = {tuple(set(i)): i for i in itertools.product(a, b, c, d, e)}
main_combos = [val for key, val in main_combos.items() if len(key)==5]
Sign up to request clarification or add additional context in comments.

8 Comments

I do not require numpy, but the issue with this is the order does not stay the same, for example with this the first row returned is (36, 77, 85, 22, 30) and I need the order to be consistent (a, b, c, d, e) but 36 does not appear in a
@radio23 I edited my post so it returns them in the consistent order
I suspect this will run even faster if you check len(key) == 5 when making the set. You can do that with the walrus operator.
@radio23 try running both codes on a small example to see what the difference is, and which one is correct.
It looks like my initial code was incorrect, it was dropping both [13, 14, 8, 6, 0], [14, 13, 8, 6, 0] instead of keeping one of the two
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.