1

I have two arrays like this:

[('a', 'beta'), ('b', 'alpha'), ('c', 'beta'), .. ]

[('b', 37), ('c', 22), ('j', 93), .. ] 

I want to produce something like:

[('b', 'alpha', 37), ('c', 'beta', 22), .. ]

Is there an easy way to do this?

  • Have you tried anything? Commented May 9, 2017 at 12:55
  • @depperm I considered a for loop to check if matching and push to a new array but I thought there might be some built in functions that could make it easier. Commented May 9, 2017 at 12:56
  • Check this thread: stackoverflow.com/questions/7776907/… Commented May 9, 2017 at 12:57
  • stackoverflow.com/questions/17682721/… Commented May 9, 2017 at 13:13

2 Answers

1

I would suggest a method similar to a hash (discriminative) join:

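from itertools import product

l = [('a', 'beta'), ('b', 'alpha'), ('c', 'beta')]
r = [('b', 37), ('c', 22), ('j', 93)]
# d maps each key to a pair of lists: (values from l, values from r)
d = {}
for t in l:
    d.setdefault(t[0], ([], []))[0].append(t[1:])
for t in r:
    d.setdefault(t[0], ([], []))[1].append(t[1:])
# A key present in only one list has an empty side, so its product is empty.
ans = [(k,) + lv + rv for k, v in d.items() for lv, rv in product(*v)]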

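results in (the order of the elements follows the dictionary's iteration order, so it can differ between Python versions):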

[('c', 'beta', 22), ('b', 'alpha', 37)]

This runs in roughly O(n+m) time (plus the size of the output) rather than O(nm), because it never computes the full product(l,r) and then filters it, as the naive method would.

Mostly from: Fritz Henglein's Relational algebra with discriminative joins and lazy products

It can also be written as:

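from itertools import product

def accumulate(it):
    # Group the tuples by their first element: [(a, b, ...)] -> {a: [(b, ...)]}
    d = {}
    for e in it:
        d.setdefault(e[0], []).append(e[1:])
    return d

l = accumulate([('a', 'beta'), ('b', 'alpha'), ('c', 'beta')])
r = accumulate([('b', 37), ('c', 22), ('j', 93)])
# Join only on the keys that occur in both dictionaries.
ans = [(k,) + lv + rv for k in set(l) & set(r) for lv, rv in product(l[k], r[k])]

This accumulates both lists separately (turning [(a,b,...)] into {a:[(b,...)]}) and then computes the intersection of their sets of keys. This looks cleaner. Because & is not supported between dictionaries, the intersection is taken as set(l) & set(r). The loop variables lv and rv are named differently from l and r so that the dictionaries are not shadowed inside the comprehension.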
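
The product only matters when a key occurs more than once on either side. A small sketch of that case, wrapping the steps above in a hypothetical helper named hash_join (the helper name and the sample data with repeated keys are illustrative assumptions, not part of the original answer):

```python
from itertools import product

def accumulate(it):
    """Group tuples by their first element: [(a, b, ...)] -> {a: [(b, ...)]}."""
    d = {}
    for e in it:
        d.setdefault(e[0], []).append(e[1:])
    return d

def hash_join(left, right):
    """Join two lists of tuples on their first element (hypothetical helper)."""
    lg, rg = accumulate(left), accumulate(right)
    # Walk the left keys in insertion order and skip keys missing on the right.
    return [(k,) + lv + rv
            for k in lg if k in rg
            for lv, rv in product(lg[k], rg[k])]

# 'b' appears twice on each side, so it yields 2 x 2 = 4 joined rows.
left = [('b', 'alpha'), ('b', 'gamma'), ('c', 'beta')]
right = [('b', 37), ('b', 38), ('j', 93)]
for row in hash_join(left, right):
    print(row)
```

Iterating over the left dictionary's keys instead of a set intersection keeps the output order stable, at the cost of one membership test per key.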

1

There is no built-in method. Adding a package like numpy would give extra functionality, I assume.

But if you want to solve it without using any extra packages, you can use a one-liner like this:

ar1 = [('a', 'beta'), ('b', 'alpha'), ('c', 'beta')]
ar2 = [('b', 37), ('c', 22), ('j', 93)]
final_ar = [tuple(list(i)+[j[1]]) for i in ar1 for j in ar2 if i[0]==j[0]]
print(final_ar)

Output:

[('b', 'alpha', 37), ('c', 'beta', 22)]
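
The comprehension above compares every pair of tuples, so it does O(nm) work. If each key occurs at most once in the second list, a dictionary lookup should give the same result in a single pass. A minimal sketch under that assumption (the names lookup and joined are illustrative):

```python
ar1 = [('a', 'beta'), ('b', 'alpha'), ('c', 'beta')]
ar2 = [('b', 37), ('c', 22), ('j', 93)]

# Build the lookup table once. This assumes keys in ar2 are unique;
# with duplicates, a later entry would silently overwrite an earlier one.
lookup = dict(ar2)

# One pass over ar1, keeping only the keys that also appear in ar2.
joined = [(k, v, lookup[k]) for k, v in ar1 if k in lookup]
print(joined)  # [('b', 'alpha', 37), ('c', 'beta', 22)]
```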
