This is a small example of a big csv file. Imagine this csv file contais a text like this one below:
import io
import pandas as pd
text="""A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P
"2019,01/2019,Brazil,0,""166,229"",0,""22578,86292"",""47,417"",,,,,,,,"
"2019,01/2019,Brazil,95,0,""50,34563"",0,""5,137"",,,,,,,,"
2019,01/2019,Brazil,0,0,0,0,0,,,,,,,,"""
I would like to read it as a dataframe and the first line should be the column names. My problem occurs with the lines 2 and 3 because of the quotation marks separating float numbers. If I use the code below, the last line is perfect because there aren't any quotation marks separating the numbers.
df = pd.read_csv(io.StringIO(text), sep=',', engine="python", encoding='utf-8', decimal=",")
df
Out[73]:
A B ... O P
0 2019,01/2019,Brazil,0,"166,229",0,"22578,86292... None ... NaN NaN
1 2019,01/2019,Brazil,95,0,"50,34563",0,"5,137",... None ... NaN NaN
2 2019 01/2019 ... NaN NaN
[3 rows x 16 columns]
Could you show me how can I read this properly, separating each value by column? Thanks