Numpy simple join 2 arrays

Question

How can I join two NumPy arrays on the values in their first columns, filling in zero where there is no match? The arrays need not be the same length. In each array the first column holds a unique ID and the second column holds its count, both obtained from np.unique. The ultimate goal is simply to subtract the counts for each unique ID.

a = np.array([[1,2],
              [3,4]])
b = np.array([[1,10],
              [2,20],
              [3,30]])

Desired output:

c = np.array([[1,10,2],
              [2,20,0],
              [3,30,4]])

Can you please explain some more how array c is obtained from a and b? Your description about common values is not clear to me.
– user6764549, Commented Nov 23, 2017 at 21:52
Possible duplicate of SQL join or R's merge() function in NumPy?
– hancar, Commented Nov 23, 2017 at 21:57
I suppose that works, although it seems like something that should be possible using only vanilla numpy functions/slicing, rather than either recfunctions or pandas.
– phloem7, Commented Nov 23, 2017 at 22:14
As can be seen in recfunctions.join_by code, there are lots of choices to deal with - the type of join, dealing with missing values, etc. A plain vanilla numpy function with the same generality would be just as fiddly.
– hpaulj, Commented Nov 24, 2017 at 18:50

Pulsar · Accepted Answer · 2017-11-23 23:33:07Z

If b contains every ID from 1 to n with no gaps (so that ID k sits in row k - 1):

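# Append a zero column to b to hold the counts from a.
z = np.zeros((b.shape[0], 1), dtype=int)
c = np.hstack((b, z))
# ID k sits in row k - 1, so a's IDs map directly to row indices.
ai = a[:, 0] - 1
c[ai, 2] = a[:, 1]
print(c)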
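
The indexing `a[:, 0] - 1` only works when the IDs in b are exactly 1, 2, ..., n. As a rough sketch of a variant that relaxes this, np.searchsorted can look up the row of each ID instead. It assumes that b's IDs are sorted (as np.unique returns them) and that every ID in a also occurs in b.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[1, 10],
              [2, 20],
              [3, 30]])

# Append a zero column to b to hold the counts from a.
c = np.hstack((b, np.zeros((b.shape[0], 1), dtype=b.dtype)))

# Row position in b of each ID in a (requires b[:, 0] to be sorted
# and to contain every ID of a).
rows = np.searchsorted(b[:, 0], a[:, 0])
c[rows, 2] = a[:, 1]
print(c)
```

If a may contain IDs that are missing from b, this lookup is no longer valid and the union-based approach is the safer choice.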

A more general solution, if both a and b have missing rows:

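# Sorted union of all IDs from both arrays.
d = np.union1d(a[:, 0], b[:, 0]).reshape(-1, 1)
z = np.zeros((d.shape[0], 2), dtype=int)
c = np.hstack((d, z))
# Rows of c whose ID occurs in b (np.isin replaces the older np.in1d).
# The assignment assumes b's IDs are sorted and unique, as np.unique returns them.
mask = np.isin(c[:, 0], b[:, 0])
c[mask, 1] = b[:, 1]
# Same for a.
mask = np.isin(c[:, 0], a[:, 0])
c[mask, 2] = a[:, 1]
print(c)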
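
For the stated goal of subtracting the counts, the general approach can be wrapped in a small function. This is only a sketch: `join_counts` is a hypothetical helper name, not a NumPy function, and it assumes each input has unique IDs in its first column. It uses np.searchsorted instead of boolean masks, so the rows of a and b do not need to be sorted.

```python
import numpy as np

def join_counts(a, b):
    """Outer-join two (id, count) arrays on the id column.

    Hypothetical helper: returns rows of (id, count_b, count_a),
    with 0 where an ID is missing from one of the inputs.
    """
    ids = np.union1d(a[:, 0], b[:, 0])  # sorted union of all IDs
    c = np.zeros((ids.size, 3), dtype=np.result_type(a, b))
    c[:, 0] = ids
    # Every ID is present in `ids`, so searchsorted returns its exact row.
    c[np.searchsorted(ids, b[:, 0]), 1] = b[:, 1]
    c[np.searchsorted(ids, a[:, 0]), 2] = a[:, 1]
    return c

# Example where each array is missing an ID that the other has.
a = np.array([[1, 2], [3, 4], [5, 7]])
b = np.array([[1, 10], [2, 20], [3, 30]])

c = join_counts(a, b)
diff = c[:, 1] - c[:, 2]  # count in b minus count in a, per ID
print(c)
print(diff)
```

With these inputs, ID 2 gets a zero in the column for a and ID 5 gets a zero in the column for b, so the subtraction is defined for every ID in either array.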

edited Nov 23, 2017 at 23:33

answered Nov 23, 2017 at 22:58

1 Comment

phloem7 Over a year ago

Thanks, this is quite instructional for me, particularly the boolean masking syntax (e.g. c[mask,1]). I'm just really surprised there is no numpy "join this to that" functionality, as in pandas.
