Checking to see if array elements are equal

Question

In Python, how would I do this?

Say I have:

a = [[1, 5], [2, 6], [3, 3], [4, 2]]
b = [[3, 1], [4, 2], [1, 8], [2, 4]]

Now I want to do an operation with the second column values IF the first column values match.

E.g.

a has an entry [1,5]. Going through b, I find it has an entry [1,8], so I want to divide 5/8 and store that value in, say, an array c. Next would be matching [2,6] and [2,4], giving the next value in c: 6/4.

so:

c = [5/8, 6/4, 3/1, 2/2]

I hope this makes sense given the above example. I would like to do this with NumPy and Python.

Is the first column of a always sorted? Does every first-column number in a appear in b? Are they the same size?
– kennytm, Commented May 16, 2016 at 18:14
Are duplicates allowed in the first position within each list?
– hilberts_drinking_problem, Commented May 16, 2016 at 18:16

Divakar · Accepted Answer · 2016-05-16 18:40:13Z

You can use np.searchsorted to find, for each first-column element of a, the row of b whose first-column element matches, and then use those rows to pick b's second-column elements for the division that gives c. Assuming a and b are NumPy arrays and every first-column element of a occurs exactly once in b, a vectorized implementation would be -

b0 = b[:,0]
sidx = b0.argsort()
c = np.true_divide(a[:,1],b[sidx[np.searchsorted(b0,a[:,0],sorter=sidx)],1])

The approach listed above works for the generic case, where the first-column elements of a and b are not necessarily sorted. But if a's first column is sorted, as in the listed sample, and a and b hold the same set of unique first-column elements, sorting b by its first column is enough to line the rows up, which gives a simplified solution -

c = np.true_divide(a[:,1],b[b[:,0].argsort(),1])

Sample run -

In [35]: a
Out[35]: 
array([[1, 5],
       [2, 6],
       [3, 3],
       [4, 2]])

In [36]: b
Out[36]: 
array([[3, 1],
       [4, 2],
       [1, 8],
       [2, 4]])

In [37]: b0 = b[:,0]

In [38]: sidx = b0.argsort()

In [39]: np.true_divide(a[:,1],b[sidx[np.searchsorted(b0,a[:,0],sorter=sidx)],1])
Out[39]: array([ 0.625,  1.5  ,  3.   ,  1.   ])
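
As a hedged, self-contained sketch of the searchsorted lookup: the helper below (the name divide_matching is hypothetical, not part of the answer) finds the row of b that matches each key in a and raises if a key has no match. It assumes b is non-empty and that the keys in b are unique.

```python
import numpy as np

def divide_matching(a, b):
    """Divide a[:, 1] by the b[:, 1] value whose first-column key matches.

    Hypothetical helper; assumes b is non-empty and its keys are unique.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    b_keys = b[:, 0]
    order = b_keys.argsort()                       # indices that sort b's keys
    pos = np.searchsorted(b_keys, a[:, 0], sorter=order)
    pos = np.clip(pos, 0, len(order) - 1)          # keep too-large keys indexable
    rows = order[pos]                              # candidate row of b for each row of a
    if not np.array_equal(b_keys[rows], a[:, 0]):
        raise ValueError("some keys in a have no match in b")
    return np.true_divide(a[:, 1], b[rows, 1])

a = [[1, 5], [2, 6], [3, 3], [4, 2]]
b = [[3, 1], [4, 2], [1, 8], [2, 4]]
c = divide_matching(a, b)
```

The explicit equality check is a design choice: searchsorted returns an insertion position even when the key is absent, so a missing key would otherwise silently pair with the wrong row.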

hilberts_drinking_problem · Accepted Answer · 2016-05-16 18:29:04Z

Given all of the assumptions in the comment section, this will work:

from __future__ import division  # __future__ imports must come before any other import
from operator import itemgetter

a = [[1, 5], [2,6], [3,3], [4,2]]
b = [[3, 1], [4,2], [1,8], [2,4]]

result = [x / y for (_, x), (_, y) in zip(a, sorted(b, key=itemgetter(0)))]

Assumptions: lists have equal lengths, elements in the first position are unique for each list, first list is sorted by first element, every element that occurs in the first position in a also occurs in the first position in b.
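
A minimal sketch of how those assumptions could be checked before dividing. The function name divide_sorted_pairs and the error messages are hypothetical, not part of the answer above.

```python
from operator import itemgetter

def divide_sorted_pairs(a, b):
    """Hypothetical helper: verify the stated assumptions, then divide pairwise."""
    keys_a = [key for key, _ in a]
    keys_b = [key for key, _ in b]
    if len(a) != len(b):
        raise ValueError("lists differ in length")
    if len(set(keys_a)) != len(keys_a) or len(set(keys_b)) != len(keys_b):
        raise ValueError("keys are not unique within a list")
    if keys_a != sorted(keys_a):
        raise ValueError("a is not sorted by its first element")
    if set(keys_a) != set(keys_b):
        raise ValueError("a and b do not contain the same keys")
    b_sorted = sorted(b, key=itemgetter(0))        # line b up with the sorted a
    # float() gives true division on Python 2 as well as Python 3
    return [float(x) / y for (_, x), (_, y) in zip(a, b_sorted)]

a = [[1, 5], [2, 6], [3, 3], [4, 2]]
b = [[3, 1], [4, 2], [1, 8], [2, 4]]
result = divide_sorted_pairs(a, b)
```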

4 Comments

trans1st0r: Does this assume that every first column entry in a has a corresponding entry in b?

Eric: Possibly needing a from __future__ import division

Eric: @trans1st0r: Yes, because that is one of "the assumptions in the comment section"

hilberts_drinking_problem: @trans1st0r you are correct; I added explicit assumptions. Eric, good point, I will make an edit.

trans1st0r · Accepted Answer · 2016-05-16 18:20:54Z

You can use a simple O(n^2) way with nested loops:

c = []

for x in a:
    for y in b:
        if x[0] == y[0]:
            # float() keeps true division under Python 2 as well
            c.append(float(x[1]) / y[1])
            break

The above is fine when the lists are small. For large lists, consider a dictionary-based approach, which brings the complexity down to O(n) on average at the cost of some extra space.
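
One possible sketch of the dictionary-based approach, assuming the first-column values in b are unique. The helper name divide_by_key is hypothetical. Rows of a without a match in b are skipped, as in the nested loops.

```python
def divide_by_key(a, b):
    """Hypothetical helper: average O(n) matching via a dict built from b."""
    # Map each key in b to its value; assumes keys in b are unique
    # (with duplicates, the last one would win).
    b_lookup = {key: value for key, value in b}
    c = []
    for key, value in a:
        if key in b_lookup:                        # skip keys with no match in b
            # float() keeps true division under Python 2 as well
            c.append(float(value) / b_lookup[key])
    return c

a = [[1, 5], [2, 6], [3, 3], [4, 2]]
b = [[3, 1], [4, 2], [1, 8], [2, 4]]
c = divide_by_key(a, b)
```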

Bi Rico · Accepted Answer · 2016-05-16 21:00:56Z

I humbly propose that you're using the wrong data structure. Notice that if you have an array column that has unique values between 1 and N (an index column), you could encode the same data simply by re-ordering your other columns. Once you've re-ordered your data, not only can you drop the "index" column, but it also becomes easier to operate on the remaining data. Let me demonstrate:

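import numpy as np

N = 5  # one more than the largest key, so keys 1..4 are valid positions
a = np.array([[1, 5], [2, 6], [3, 3], [4, 2]])
b = np.array([[3, 1], [4, 2], [1, 8], [2, 4]])

# Place each second-column value at the position given by its key.
a_trans = np.ones(N)
a_trans[a[:, 0]] = a[:, 1]

b_trans = np.ones(N)
b_trans[b[:, 0]] = b[:, 1]

# Position 0 is unused padding (1 / 1); the results are in c[1:].
c = a_trans / b_trans
print(c)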

Depending on the nature of your problem, you can sometimes use an implicit index from the beginning, but sometimes an explicit index can be very useful. If you need an explicit index, consider using something like pandas.DataFrame with better support for index operations.
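
As a rough sketch of the explicit-index idea in pandas: the example below uses pandas.Series rather than a full DataFrame, which is my simplification. The variable names are illustrative. Arithmetic between two Series aligns on index labels, so the rows do not need to be in the same order.

```python
import numpy as np
import pandas as pd

a = np.array([[1, 5], [2, 6], [3, 3], [4, 2]])
b = np.array([[3, 1], [4, 2], [1, 8], [2, 4]])

# Use the first column as an explicit index and the second as the values.
a_series = pd.Series(a[:, 1], index=a[:, 0])
b_series = pd.Series(b[:, 1], index=b[:, 0])

# Division aligns on index labels, not on position; a label present
# on only one side would produce NaN.
c = a_series / b_series
c = c.reindex(a_series.index)   # present the result in a's key order
print(c)
```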
