When the pandas CSV reader function 'read_csv' is used to parse StringIO values, strange characters ('.1') are appended to the end of the second field for certain inputs. The first test below gives the desired result, but not all of my fields have a space after the delimiter (','). Splitting "1.5M, 1.5M" should always return "1.5M" for both fields, but when there is no space the second field comes back as "1.5M.1" (with '.1' added at the end). Is there a way to resolve this issue?

>>> import pandas as pd
>>> from io import StringIO
>>> pd.read_csv(StringIO("1.5M, 1.5M"))
Empty DataFrame
Columns: [1.5M,  1.5M]
Index: []
>>> pd.read_csv(StringIO("1.5M,1.5M"))
Empty DataFrame
Columns: [1.5M, 1.5M.1]
Index: []

1 Answer

Notice that in the first example, with the space, your DataFrame has zero rows and the second column name includes the leading space.

 df = pd.read_csv(StringIO("1.5M, 1.5M"))
 df.columns

 Index(['1.5M', ' 1.5M'], dtype='object')

In the second case there are also zero rows, but without the space the two column names are duplicates.

 df = pd.read_csv(StringIO("1.5M,1.5M"))
 df.columns

 Index(['1.5M', '1.5M.1'], dtype='object')

Hence, because read_csv makes duplicate column names unique by default, pandas appends '.1' to the duplicated column name.
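The renaming is easier to see with more than two repeated names. The following is a minimal sketch; the header "a,a,a" and the data row are made up for illustration and are not from the question.

```python
import pandas as pd
from io import StringIO

# Hypothetical input: three columns that share the same header name.
df = pd.read_csv(StringIO("a,a,a\n1,2,3"))

# By default, read_csv de-duplicates repeated header names by
# appending a numeric suffix to each repeat.
columns = list(df.columns)
print(columns)  # ['a', 'a.1', 'a.2']
```

The first occurrence keeps its name and each later occurrence gets the next suffix, which is why only the second "1.5M" in the question was changed.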

However, if you want '1.5M' to be data in the DataFrame rather than a column heading, pass header=None:

df = pd.read_csv(StringIO("1.5M, 1.5M"), header=None)

Or, without the space (the '.1' suffix does not appear in either case):

df = pd.read_csv(StringIO("1.5M,1.5M"), header=None)

Output:

      0     1
0  1.5M  1.5M
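One detail worth checking when the input has a space after the comma: with header=None the space stays in the value itself. The sketch below assumes you want both fields to come out identical and uses read_csv's skipinitialspace option for that; whether stripping is desirable depends on your data.

```python
import pandas as pd
from io import StringIO

# Without skipinitialspace, the space after the delimiter is kept
# as part of the second value.
raw = pd.read_csv(StringIO("1.5M, 1.5M"), header=None)
raw_values = raw.iloc[0].tolist()

# skipinitialspace=True tells the parser to skip spaces that
# immediately follow the delimiter.
clean = pd.read_csv(StringIO("1.5M, 1.5M"), header=None, skipinitialspace=True)
clean_values = clean.iloc[0].tolist()

print(raw_values)    # ['1.5M', ' 1.5M']
print(clean_values)  # ['1.5M', '1.5M']
```

With skipinitialspace=True, the spaced and unspaced inputs produce the same values.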
2 Comments

Thank you Scott for the breakdown.
@Scott Boston: Could you please look into my question: stackoverflow.com/questions/72490042/…
