7

I have a list, my_list, with mixed data types that I want to convert into a numpy array. However, I get the error TypeError: expected a readable buffer object. See code below. I've tried to base my code on the NumPy documentation.

my_list = [['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5, 6, 0, 1.0], ['User_0', '2012-3', 0, 0, 4, 1.0]]
my_np_array = np.array(my_list, dtype='S30, S8, i4, i4, f32')   

2 Answers 2

13

Why don't use dtype=object?

In [1]: my_list = [['User_0', '2012-2', 1, 6, 0, 1.0], ['User_0', '2012-2', 5,
6, 0, 1.0], ['User_0', '2012-3', 0, 0, 4, 1.0]]
In [2]: my_np_array = np.array(my_list, dtype=object)
In [3]: my_np_array
Out[3]:
array([['User_0', '2012-2', 1, 6, 0, 1.0],
       ['User_0', '2012-2', 5, 6, 0, 1.0],
       ['User_0', '2012-3', 0, 0, 4, 1.0]], dtype=object)

Note It's about memory usage, when you specify the dtype of each column, memory allocated to your ndarray will be less than when you use dtype=object which contain all possible type in python so the memory allocated for each column will be maximal.

Sign up to request clarification or add additional context in comments.

2 Comments

What is the pros and cons of using dtype=object vs. specifying each datatype?
Memory addresses are of fixed size regardless of the type of object they point to. An array of pointers has constant-width elements, even if the objects they point differ vastly in size. Your comments about memory usage make no sense to me.
5

Your nested items should be tuple also you omitted one i4 in your types :

>>> my_np_array = np.array(map(tuple,my_list), dtype='|S30, |S8, i4, i4, i4, f32')  
>>> my_np_array
array([('User_0', '2012-2', 1, 6, 0, 1.0),
       ('User_0', '2012-2', 5, 6, 0, 1.0),
       ('User_0', '2012-3', 0, 0, 4, 1.0)], 
      dtype=[('f0', 'S30'), ('f1', 'S8'), ('f2', '<i4'), ('f3', '<i4'), ('f4', '<i4'), ('f5', '<f4')])

As far as is know since numpy use tuples to preserve its types when you used multiple type for array items you need to convert your sub arrays to tuple like dtype elements.

3 Comments

Yes, the data for a structure array (complex dtype like this) is supposed to be a list of tuples. The data isn't actually stored as tuples, but they chose the tuple notation for input and display. This is distinct from the usual list of lists used for nd arrays.
@hpaulj Indeed. its like so!
TypeError: data type "f32" not understood

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.