This is my very first question on Stack Overflow. So far all my questions had already been asked, but even after much research I couldn't find an answer to this one. So here goes:
I would like to do mathematical operations on NumPy arrays that have a structured dtype (named fields of mixed types). This would be trivial in R but seems complicated in Python.
import numpy as np
from StringIO import StringIO
test = "a,1,2\nb,3,4"
data = np.genfromtxt(StringIO(test), delimiter=",", dtype=None)
This gives me:
print data
#array([('a', 1, 2), ('b', 3, 4)],
# dtype=[('f0', '|S1'), ('f1', '<i8'), ('f2', '<i8')])
But then if I try to perform any mathematical operation on the numerical subset of data I get error messages:
subData = data[['f1','f2']]
print subData
# [(1, 2) (3, 4)]
subData+1
#TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'int'
or even:
subData + subData
#TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray'
The only solution I came up with is neither elegant nor practical, because I lose the column names and types as well as the original shape:
subData.view(int) + 1
Thanks a lot in advance.
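One way to keep the field names and types is to do the arithmetic one field at a time, since each individual field is an ordinary numeric array. The following is only a minimal sketch, assuming Python 3 and a NumPy version whose genfromtxt accepts the encoding argument; the names numeric_fields and result are made up for illustration:

```python
import numpy as np
from io import StringIO

test = "a,1,2\nb,3,4"
# encoding=None asks genfromtxt to return str rather than bytes for text columns
data = np.genfromtxt(StringIO(test), delimiter=",", dtype=None, encoding=None)

# A single field such as data["f1"] is a plain integer array, so
# arithmetic works on it even though it fails on the multi-field view.
numeric_fields = ["f1", "f2"]  # hypothetical list of the numeric columns

result = data.copy()           # same structured dtype, same shape
for name in numeric_fields:
    result[name] = data[name] + 1

print(result.dtype.names)
print(result["f1"], result["f2"])
```

The copy keeps the original array untouched, and the string field f0 is carried over unchanged.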
pandas is a much better choice for this, though. It's meant for "spreadsheet-like" data.
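As a rough illustration of the pandas route, here is a small sketch using the same sample text. It assumes Python 3, and the column names label, x and y are invented here because the sample has no header row:

```python
import pandas as pd
from io import StringIO

test = "a,1,2\nb,3,4"
# header=None: the first line is data, not column names;
# the names below are hypothetical labels chosen for this example
df = pd.read_csv(StringIO(test), header=None, names=["label", "x", "y"])

numeric = df[["x", "y"]]      # select the numeric columns by name
plus_one = numeric + 1        # element-wise; column names are preserved
doubled = numeric + numeric   # DataFrame + DataFrame also works

print(plus_one)
print(doubled)
```

Both results are still DataFrames with the x and y columns, so the names, types and shape are not lost the way they are with subData.view(int).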