I have a dataset in which some of the sample identifiers (stored in the index column) can be interpreted as numbers, for example 20010104123140E5 and 2001010412314529. I specify that the index column has type string, but pandas.read_csv still converts the identifiers to floats. See the example below.

Does anyone know how I can get around this? Or am I doing something wrong here?

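import pandas as pd

# Write a two-row, tab-separated test file whose ids look numeric.
with open('test.data', mode='w') as outfile:
    outfile.write('id\tval\n20010104123140E5\t1\n2001010412314529\t2')

# Request 'id' as a string and use it as the index.
df = pd.read_csv('test.data', dtype={'id': 'str', 'val': 'float'}, sep='\t', index_col='id')
print(df)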

1 Answer

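Use df.index = df.index.astype(str) to convert the index to strings after loading. Note that if read_csv has already parsed the identifiers as floats, this yields the string form of those floats rather than the original identifier text.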
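
An alternative that should keep the identifiers exactly as written, offered as a minimal sketch: it assumes the problem is that the dtype mapping is not applied to the column named in index_col (this may depend on the pandas version). The idea is to read id as an ordinary string column and only then promote it to the index with set_index. Here io.StringIO stands in for test.data so the snippet is self-contained.

```python
import io

import pandas as pd

# Same two rows as in the question, held in memory instead of test.data.
raw = 'id\tval\n20010104123140E5\t1\n2001010412314529\t2'

# Read 'id' as a plain string column (no index_col), so the dtype mapping
# applies to it like any other column.
df = pd.read_csv(io.StringIO(raw), sep='\t', dtype={'id': str, 'val': float})

# Promote the already-string column to the index.
df = df.set_index('id')

ids = df.index.tolist()
print(ids)
```

With a real file, replace io.StringIO(raw) with the path 'test.data'; the rest stays the same.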
