pandas.read_csv turns strings into 'numbers' in scientific notation (which I don't want)

Question

I have a dataset where some of the sample identifiers (found in the index column) can be interpreted as numbers. Examples: 20010104123140E5 and 2001010412314529. I tried to specify that the index column has type string, but pandas.read_csv still converts the identifiers to floats. See example below.

Does anyone know how I can get around this? Or am I doing something wrong here?

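import pandas as pd

# Write a small tab-separated file whose ids look like numbers
with open('test.data', mode='w') as outfile:
    outfile.write('id\tval\n20010104123140E5\t1\n2001010412314529\t2')

# Request 'id' as a string while also using it as the index
df = pd.read_csv('test.data', dtype={'id': 'str', 'val': 'float'}, sep='\t', index_col='id')
print(df)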

Prashant Sinha · Accepted Answer · 2020-06-18 15:27:46Z

Use df.index = df.index.astype(str)
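
Note that casting after the fact can only convert whatever read_csv already produced: if an identifier was parsed as a float, the original text is not recoverable from it. A minimal sketch of an alternative, assuming the goal is to keep the identifiers exactly as written: read 'id' as an ordinary string column first, then promote it to the index with set_index. The in-memory StringIO buffer is only a stand-in for the test.data file from the question.

```python
from io import StringIO

import pandas as pd

# Same content as test.data in the question, kept in memory for the sketch
raw = 'id\tval\n20010104123140E5\t1\n2001010412314529\t2'

# Read 'id' as a regular column so the requested string dtype applies to it
df = pd.read_csv(StringIO(raw), sep='\t', dtype={'id': str, 'val': float})

# Only now make 'id' the index; the values are already strings
df = df.set_index('id')

print(df.index.tolist())  # ['20010104123140E5', '2001010412314529']
```

This avoids relying on how a particular pandas version treats dtype for a column that is also named in index_col.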
