I have a CSV file that I have read into a list. I turned that list into an array and saved the array to a MATLAB file with the following function.

def save_array(arr,filename):
    import scipy.io
    out_dict={}
    out_dict[filename]=arr
    scipy.io.savemat(filename + '.mat',out_dict)

However, when I open the MATLAB file, something goes wrong. When I load it back in Python, I get the following output:

{'M': array([[u'153  ', u'81   ', u'0.28 ', ..., u'0.19 ', u'-0.07', u'1    '],
   [u'168  ', u'76   ', u'0.08 ', ..., u'0.98 ', u'0.42 ', u'0    '],
   [u'184  ', u'92   ', u'0.18 ', ..., u'0.92 ', u'0.75 ', u'0    '],
   ..., 
   [u'183  ', u'62   ', u'0.57 ', ..., u'0.87 ', u'0.31 ', u'0    '],
   [u'181  ', u'72   ', u'0.48 ', ..., u'0.91 ', u'1.2  ', u'0    '],
   [u'158  ', u'77   ', u'1.01 ', ..., u'0.99 ', u'0.88 ', u'0    ']], 
  dtype='<U5'),
 '__globals__': [],
 '__header__': 'MATLAB 5.0 MAT-file Platform: posix, Created on: Tue Nov  5 15:28:57 2013',
 '__version__': '1.0'}

Why is there a u at the beginning of each element? How can I rectify this?

  • The u indicates that it's a unicode string. I haven't worked with MATLAB files in 6 years, but I don't imagine that's the problem. Are the data in your array strings or floats? What's arr.dtype? Commented Nov 5, 2013 at 20:37
  • I agree with @Paul: the fact that the strings are unicode isn't the issue. The real issue is that you have an array of strings. If arr.dtype is <U5 before you call save_array(arr, fname), then that's the problem. Commented Nov 5, 2013 at 20:49
  • The first row of the csv is a list of names, which are strings. Would that do anything? Commented Nov 5, 2013 at 21:04

1 Answer

I see that you are reading the CSV file and getting an array of strings. You can convert them to an array of floating point numbers before saving them:

import numpy as np
out_dict[filename]=np.array(arr, dtype=np.float64)
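As a rough sketch of how that conversion could fit into the asker's workflow: the helper name rows_to_float_array, the has_header flag, and the column names below are made up for illustration, and the header row is skipped on the assumption (from the comments) that the first CSV row holds names rather than numbers.

```python
import numpy as np

def rows_to_float_array(rows, has_header=True):
    """Convert CSV rows (lists of strings) into a 2-D float64 array.

    Hypothetical helper: drops the first row when it holds column names.
    """
    data_rows = rows[1:] if has_header else rows
    # float() ignores surrounding whitespace, so padded cells like '0.28 ' parse
    return np.array([[float(cell) for cell in row] for row in data_rows],
                    dtype=np.float64)

def save_array(arr, filename):
    # Same idea as the function in the question; scipy is imported only on use
    import scipy.io
    scipy.io.savemat(filename + '.mat', {filename: arr})

# Made-up sample rows standing in for the parsed CSV
rows = [['height', 'weight', 'score'],
        ['153', '81', '0.28 '],
        ['168', '76', '0.08 ']]
arr = rows_to_float_array(rows)
print(arr.dtype)
```

Calling save_array(arr, 'M') on the converted array should then write numeric data rather than an array of unicode strings.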

5 Comments

Or just don't read the CSV as a list in the first place, just use arr = np.genfromtxt('data.csv', delimiter=',') which will automatically give an appropriate dtype.
This worked. Thank you. @askewchan, I would do it that way, but this is for a class.
@askewchan, correct. That would be the more efficient way. Either way, you can always change the dtype of the data before you save it to your hard drive: if space is an issue, you can try np.float32, np.int32, np.int16, and so on.
Now my data is showing a lot of decimal places. How can I truncate them? np.float32?
Well, why do you want to truncate? For display purposes, with a 1-D arr, you can write something like: print((len(arr)*'%.2f\t') % tuple(arr))
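To illustrate the two suggestions from the comments (loading with np.genfromtxt and limiting decimals for display only), here is a minimal sketch. The CSV text is made-up sample data, and skip_header=1 is an assumption based on the asker's remark that the first row holds names.

```python
import io
import numpy as np

# Made-up CSV content; in practice a filename such as 'data.csv' would be passed
csv_text = u"height,weight,score\n153,81,0.28\n168,76,0.08\n"

# skip_header=1 drops the row of names so every remaining cell parses as a float
arr = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1)

# Affects only how arrays are printed; the stored values keep full precision
np.set_printoptions(precision=2, suppress=True)
print(arr)
```

Switching to np.float32 would change the storage size and precision, not how many digits are shown, so print options are probably the better tool when the concern is only the display.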