
I have a simple ASCII .dat file which I want to import into Python as a NumPy array. The file (a.dat) looks like:

1.0000000e+00   2.0000000e+00
3.0000000e+00   4.0000000e+00

The issue is that when I use pandas.read_table to import the file

a=pd.read_table('a.dat',header=None)

and then convert it to an array with a.values, the result is

array([['   1.0000000e+00   2.0000000e+00'],
       ['   3.0000000e+00   4.0000000e+00']], dtype=object)

The problem is that the floats are interpreted as strings. My actual data file is much larger than this simple matrix, so post-processing the strings into floats may not be very efficient.

Strangely, I cannot even specify dtype=np.float, as it raises:

TypeError: Cannot cast array from dtype('O') to dtype('float64') according to the rule 'safe'

So is there a direct way to import this kind of matrix-like .dat file into a NumPy float array?

Any comments and ideas are appreciated. Thanks!


1 Answer


The default separator for read_table is TAB, not space. Tell it to split on runs of whitespace instead (the r'\s+' regex matches one or more whitespace characters, so the repeated spaces between columns collapse into a single delimiter):

pd.read_table('a.dat', header=None, sep=r'\s+')
#     0    1
#0  1.0  2.0
#1  3.0  4.0
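If a NumPy float array is the end goal, numpy.loadtxt is also worth considering: it splits on any run of whitespace by default and returns float64 directly, with no DataFrame step. A minimal sketch, recreating the a.dat file from the question:

```python
import numpy as np

# Recreate the sample file from the question
with open('a.dat', 'w') as f:
    f.write('1.0000000e+00   2.0000000e+00\n'
            '3.0000000e+00   4.0000000e+00\n')

# loadtxt's default delimiter is any whitespace, and the
# result is already a float64 ndarray -- no string conversion needed
arr = np.loadtxt('a.dat')
```

For very large files pandas' parser is usually faster, but loadtxt avoids the pandas dependency entirely.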
Sign up to request clarification or add additional context in comments.

1 Comment

I got it. Thank you! I used pd.read_table('a.dat', header=None, sep='\s+') and it works. If I only use sep='\s', it shows something like 1. NaN NaN 2. I guess that is because there are multiple spaces between the numbers.
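Putting the pieces together, a sketch of the full round trip from the question: read with the whitespace regex, then pull out a float array (to_numpy() is the modern spelling of .values; the a.dat filename is the one from the question):

```python
import numpy as np
import pandas as pd

# Recreate the sample file from the question
with open('a.dat', 'w') as f:
    f.write('1.0000000e+00   2.0000000e+00\n'
            '3.0000000e+00   4.0000000e+00\n')

# sep=r'\s+' treats any run of whitespace as one delimiter,
# so each column is parsed as a float rather than one big string
df = pd.read_table('a.dat', header=None, sep=r'\s+')

# Already float64 -- no object-to-float cast required
mat = df.to_numpy()
```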
