48

Is this a bug?

import numpy as np
a1=np.array(['a','b'])
a2=np.array(['E','F'])

In [20]: add(a1,a2)
Out[20]: NotImplemented

I am trying to do element-wise string concatenation. I thought add() was the way to do it in NumPy, but it is clearly not working as expected.

4
  • 1
    As the name implies, numpy is for numbers. Python itself has pretty good string operations. Why not just use that? "".join(["a", "b"]) works fine. Commented Mar 31, 2012 at 18:29
  • 1
    I was looking at this docs.scipy.org/doc/numpy/reference/routines.char.html Commented Mar 31, 2012 at 18:39
  • 2
    That's cool. But: "All of them are based on the string methods in the Python standard library.". So if you just use the standard library you can write code that doesn't depend on numpy. Commented Mar 31, 2012 at 18:44
  • 1
    The add operation does not do the same thing as join. numpy's add can be useful for multidimensional arrays or nested lists. Commented Dec 3, 2015 at 17:50

7 Answers

83

This can be done using numpy.char.add. Here is an example:

>>> import numpy as np
>>> a1 = np.array(['a', 'b'])
>>> a2 = np.array(['E', 'F'])
>>> np.char.add(a1, a2)
array(['aE', 'bF'], 
      dtype='<U2')

(This was previously known as numpy.core.defchararray.add, and that name is still usable, but numpy.char.add is the preferred alias now.)

There are other useful string operations available for NumPy data types.
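As a small illustration of the above, np.char.add also follows the usual broadcasting rules, so a single string can be appended to every element. This is only a minimal sketch; the variable names and the '_x' suffix are made up for the example.

```python
import numpy as np

# Illustrative data; the names and the suffix are not from the question.
names = np.array(['a', 'b'])

# A scalar string is broadcast against every element of the array.
suffixed = np.char.add(names, '_x')

# Arrays with compatible shapes are broadcast against each other too:
# a (2, 1) column combined with a (2,) row gives a (2, 2) result.
grid = np.char.add(names.reshape(2, 1), np.array(['E', 'F']))

print(suffixed)
print(grid)
```

Here suffixed holds one string per input element, and grid holds every pairwise combination of the two inputs.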


1 Comment

As noted in the docstring of the module, "the preferred alias for defchararray is numpy.char", so you can just say np.char.add.
15

You can use the chararray subclass to perform array operations with strings:

a1 = np.char.array(['a', 'b'])
a2 = np.char.array(['E', 'F'])

a1 + a2
#chararray(['aE', 'bF'], dtype='|S2')

Another nice example, repeating each string by an integer count:

b = np.array([2, 4])
a1*b
#chararray(['aa', 'bbbb'], dtype='|S4')


8

This can (and should) be done in pure Python, as numpy also uses the Python string manipulation functions internally:

>>> a1 = ['a','b']
>>> a2 = ['E','F']
>>> map(''.join, zip(a1, a2))
['aE', 'bF']

1 Comment

This seems to just return a map object, not a list
5

Another solution is to convert the string arrays into arrays of Python objects, so that str.__add__ is called for each element:

>>> import numpy as np
>>> a = np.array(['a', 'b', 'c', 'd'], dtype=object)
>>> a + a
array(['aa', 'bb', 'cc', 'dd'], dtype=object)

This is not that slow (less than twice as slow as adding integer arrays).
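If the arrays already exist with a string dtype, the same idea can be applied by casting them first. The following is a rough sketch, reusing a1 and a2 from the question; the cast back to str at the end is an optional step I am assuming you might want, not something the approach requires.

```python
import numpy as np

a1 = np.array(['a', 'b'])
a2 = np.array(['E', 'F'])

# Cast to object so that + dispatches to Python's str concatenation.
joined = a1.astype(object) + a2.astype(object)

# Optionally cast back to a fixed-width unicode array.
joined_str = joined.astype(str)

print(joined)
print(joined_str.dtype)
```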


2

One more basic, elegant and fast solution:

In [11]: np.array([x1 + x2 for x1,x2 in zip(a1,a2)])
Out[11]: array(['aE', 'bF'], dtype='<U2')

It is very fast for smaller arrays.

In [12]: %timeit np.array([x1 + x2 for x1,x2 in zip(a1,a2)])
3.67 µs ± 136 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [13]: %timeit np.core.defchararray.add(a1, a2)
6.27 µs ± 28.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [14]: %timeit np.char.array(a1) + np.char.array(a2)
22.1 µs ± 319 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

For larger arrays, the time difference between the three approaches is small.

In [15]: b1 = np.full(10000,'a')    
In [16]: b2 = np.full(10000,'b')    

In [189]: %timeit np.array([x1 + x2 for x1,x2 in zip(b1,b2)])
6.74 ms ± 66.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [188]: %timeit np.core.defchararray.add(b1, b2)
7.03 ms ± 419 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [187]: %timeit np.char.array(b1) + np.char.array(b2)
6.97 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


1

Adding to Niklas B.'s answer: in Python 3, map returns a lazy map object rather than a list (I saw this on Python 3.10).

To get a list, wrap the call in list():

>>> a1 = ['a','b']
>>> a2 = ['E','F']
>>> list(map(''.join, zip(a1, a2)))  # <--- See here we have added list()
['aE', 'bF']


0

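To convert an array of integers such as [10, 20, 30] to a list of strings ["10k", "20k", "30k"], I did the following:

import numpy as np
b = np.arange(10, 100, 10)   # 10, 20, ..., 90
d = []
for i in b:
    # Convert each integer to a string and append the "k" suffix.
    c = str(i) + "k"
    d.append(c)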
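For comparison, the same result can probably be had without an explicit loop by combining astype(str) with np.char.add from the other answers. This is a sketch of that variant, not a claim that it is faster; the name labels is just illustrative.

```python
import numpy as np

b = np.arange(10, 100, 10)

# Convert the integers to strings, then append the suffix element-wise.
labels = np.char.add(b.astype(str), 'k')

print(labels.tolist())
```

Unlike the loop, this yields a NumPy string array; calling tolist() gives a plain Python list if that is what you need.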

1 Comment

Thank you for your interest in contributing to the Stack Overflow community. This question already has quite a few answers—including one that has been extensively validated by the community. Are you certain your approach hasn’t been given previously? If so, it would be useful to explain how your approach is different, under what circumstances your approach might be preferred, and/or why you think the previous answers aren’t sufficient. Can you kindly edit your answer to offer an explanation?
