I have an array of strings:

s = np.array(['a', 'b', 'c'])

and I want a function, array_equal_to_scalar, that compares s with the string 'a' and writes the result to a preallocated array (I need this to be fast):

mask = np.empty(s.shape, dtype=bool)
np.array_equal_to_scalar(s, 'a', out=mask)

So I expect mask to be

> [True False False]

Is there any way to get something like array_equal_to_scalar?

  • What is wrong with 'a' == np.array(['a', 'b', 'c'])? The output should be array([ True, False, False]); if you want mask to be a list, just convert the result into a list. Commented Mar 17, 2018 at 11:37
  • Are you sure allocation is your performance bottleneck? Commented Mar 17, 2018 at 11:38
  • @droooze I need this to be fast; 'a' == np.array(['a', 'b', 'c']) will create a new array in memory. Commented Mar 17, 2018 at 11:38
  • I am not convinced that saving memory here would result in a performance boost. Commented Mar 17, 2018 at 11:41
  • Use == and break the comparison up into smaller chunks, then store the result of each chunk in a pre-allocated array/list, unless you want to add functionality to np.equal to handle strings. Commented Mar 17, 2018 at 11:57
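
The chunked approach from the last comment could look roughly like the sketch below. The helper name equal_to_scalar_chunked and the chunk_size parameter are made up for illustration and are not part of NumPy; the sketch assumes a 1-D input and a preallocated boolean output of the same length. Each chunk still creates a small temporary array, so this bounds the temporary memory rather than eliminating it.

```python
import numpy as np

def equal_to_scalar_chunked(arr, value, out, chunk_size=65536):
    """Hypothetical helper: fill `out` with (arr == value), one chunk at a time.

    Assumes `arr` and `out` are 1-D with the same length and `out` is boolean.
    """
    n = arr.shape[0]
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # The comparison allocates a temporary of at most chunk_size elements,
        # which is then copied into the matching slice of the preallocated array.
        out[start:stop] = arr[start:stop] == value
    return out

s = np.array(['a', 'b', 'c'])
mask = np.empty(s.shape, dtype=bool)
equal_to_scalar_chunked(s, 'a', mask, chunk_size=2)
print(mask)  # [ True False False]
```

Whether this is actually faster than a plain s == 'a' would need to be measured; as noted in the comments, allocation may not be the bottleneck.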

1 Answer

What you're looking for is the numpy.equal ufunc, which accepts an out argument. Unfortunately, it doesn't seem to work for your use case with string arrays.

In order to use it in the way you want, we need to explicitly broadcast the scalar to be compared into a numpy array of an appropriate shape:

import numpy as np

a = np.array(['a','b','c'])
res = np.empty(a.shape, dtype=bool)
np.equal(a, np.broadcast_to(['a'], a.shape), out=res)

Unfortunately, the above call (1) ignores the broadcast and returns a single value rather than an elementwise result, and (2) that value is NotImplemented. Allocating a full comparison array to force an elementwise comparison doesn't help either:

>>> compare = np.full(a.shape, 'a')
>>> np.equal(a, compare)
NotImplemented

It seems that efficient implementations via numpy ufuncs are only provided for numeric types (I haven't had time to look into the source yet), and I don't expect higher-level functions to be able to work directly with your pre-allocated array as an output buffer. With a compiled ufunc, I could imagine the out keyword argument letting you bypass the creation of a temporary array, but I don't think there's another alternative for you here.
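
For the specific data in the question, where every element is a single character, one possible workaround is to reinterpret the string array as numeric code points so that the numeric np.equal loop (and its out argument) can be used. This is only a sketch under that assumption: it relies on the array having a one-character unicode dtype (4 bytes per element) and a memory layout that view accepts, and it does not generalize directly to longer strings.

```python
import numpy as np

s = np.array(['a', 'b', 'c'])            # unicode dtype, one character per element
mask = np.empty(s.shape, dtype=bool)     # preallocated output

# Assumption: each element is exactly one UCS-4 code point (4 bytes).
assert s.dtype.kind == 'U' and s.dtype.itemsize == 4

# Reinterpret the same memory as unsigned 32-bit integers (no copy is made).
codes = s.view(np.uint32)

# Numeric comparison against the code point of 'a', written straight into mask.
np.equal(codes, ord('a'), out=mask)
print(mask)  # [ True False False]
```

Here np.equal operates on integers, so the result is written into mask without an intermediate boolean array.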
