0

i have two 1D numpy arrays. The lengths are unequal. I want to make pairs (array1_elemnt,array2_element) of the elements which are close to each other. Lets consider following example

    a = [1,2,3,8,20,23]
    b = [1,2,3,5,7,21,35]

The expected result is

    [(1,1), 
    (2,2), 
    (3,3), 
    (8,7),
    (20,21),
    (23,25)]

It is important to note that 5 is left alone. It could easily be done by loops but I have very large arrays. I considered using nearest neighbor. But felt like killing a sparrow with a canon.

Can anybody please suggest any elegant solution.

Thanks a lot.

5
  • 2
    what will you expect for: [1,3,5],[2,4] is it ambigious? or is there more information for tie-breaker? Commented Sep 12, 2011 at 7:03
  • @amit, very nice point... In this case... the order will play the role [(1,2),(3,4)] will be the result. thanks Commented Sep 12, 2011 at 7:10
  • Where does the number 8.7 in your expected result come from? Are your input arrays always pre-sorted like this? Commented Sep 12, 2011 at 9:50
  • @wim well... yes the array will be sorted like this always...sorry its (8,7). I will correct it... thanks Commented Sep 12, 2011 at 10:17
  • If I understand, you want to choose a pairing such that len(result) == min(len(a), len(b)) and sum([abs(p[1]-p[0]) for p in result]) is minimised. Is this correct? Commented Sep 12, 2011 at 11:52

5 Answers 5

3

How about using the Needleman-Wunsch algorithm? :)

The scoring matrix would be trivial, as the "distance" between two numbers is just their difference.

But that will probably feel like killing a sparrow with a tank ...

Sign up to request clarification or add additional context in comments.

1 Comment

@jellybean... cool.... this could be one thing.... lets see if someone comes up with something else. Thanks anyways
1

You could use the built in map function to vectorize a function that does this. For example:

ar1 = np.array([1,2,3,8,20,23])
ar2 = np.array([1,2,3,5,7,21,35])
def closest(ar1, ar2, iter):
    x = np.abs(ar1[iter] - ar2)
    index = np.where(x==x.min())
    value = ar2[index]
    return value

def find(x):
    return closest(ar1, ar2, x)
c = np.array(map(find, range(ar1.shape[0])))

In the example above, it looked like you wanted to exclude values once they had been paired. In that case, you could include a removal process in the first function like this, but be very careful about how array 1 is sorted:

 def closest(ar1, ar2, iter):
    x = np.abs(ar1[iter] - ar2)
    index = np.where(x==x.min())
    value = ar2[index]
    ar2[ar2==value] = -10000000
    return value

Comments

0

The best method I can think of is use a loop. If loop in python is slow, you can use Cython to speedup you code.

Comments

0

I think one can do it like this:

  1. create two new structured arrays, such that there is a second index which is 0 or 1 indicating to which array the value belongs, i.e. the key
  2. concatenate both arrays
  3. sort the united array along the first field (the values)
  4. use 2 stacks: go through the array putting elements with key 1 on the left stack, and when you cross an element with key 0, put them in the right stack. When you reach the second element with key 0, for the first with key 0 check the top and bottom of the left and right stacks and take the closest value (maybe with a maximum distance), switch stacks and continue.

sort should be slowest step and max total space for the stacks is n or m.

Comments

0

You can do the following:

a = np.array([1,2,3,8,20,23])
b = np.array([1,2,3,5,7,21,25])

def find_closest(a, sorted_b):
    j = np.searchsorted(.5*(sorted_b[1:] + sorted_b[:-1]), a, side='right')
    return b[j]

b.sort()  # or, b = np.sort(b), if you don't want to modify b in-place
print np.c_[a, find_closest(a, b)]

# ->
# array([[ 1,  1],
#        [ 2,  2],
#        [ 3,  3],
#        [ 8,  7],
#        [20, 21],
#        [23, 25]])

This should be pretty fast. How it works is that searchsorted will find for each number a the index into the b past the midpoint between two numbers, i.e., the closest number.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.