I have a .csv file in which some column values contain commas. Here are some examples:
Header: ID Value Content Date
1 34 "market, business" 12/20/2013
2 15 "market, business", yesterday, metric 11/21/2014
3 18 "market," business and yesterday 10/20/2014
4 19 yesterday, today, 11/22/2014
This is the format of the .csv file. When I open it in Sublime Text, it looks like this:
1, 34, "market, business", 12/20/2013
2, 15, "market, business", "yesterday, metric, 11/21/2014
3, 18, "market," business and yesterday, 10/20/2014
4, 19, yesterday, today, 11/22/2014
But what I want after running it through the Python csv reader is:
[1, 34, "market, business", 12/20/2013]
[2, 15, "market, business" "yesterday metric, 11/21/2014]
[3, 18, "market," business and yesterday, 10/20/2014]
[4, 19, yesterday today, 11/22/2014]
This is just sample data. The "Content" column is the headache here, because the csv module uses "," as the separator. I used:
reader = csv.reader(f, skipinitialspace=True)
It works for the first row, where the whole string is inside one pair of double quotes. But it doesn't work for the second and third rows, where there are commas outside the quotes (single or double).
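A minimal way to reproduce this, assuming Python 3. The sample string is copied from the rows above, and io.StringIO only stands in for my real file object f:

```python
import csv
import io

# Rows 1, 3 and 4 as they appear in the text editor.
sample = (
    '1, 34, "market, business", 12/20/2013\n'
    '3, 18, "market," business and yesterday, 10/20/2014\n'
    '4, 19, yesterday, today, 11/22/2014\n'
)

# io.StringIO is a stand-in for the opened .csv file.
rows = list(csv.reader(io.StringIO(sample), skipinitialspace=True))

for row in rows:
    # Print the field count next to the parsed fields.
    print(len(row), row)
```

The first row comes back as four fields with the quoted comma kept inside "market, business", but the last row is split into five fields because its extra comma is not quoted.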
How can I solve this problem? I'm just using the standard csv module in Python right now. Can pandas handle this?
Thanks.
Update: I think what I want is a way to tell the parser which commas are separators and which are not. Now that I've pasted the data here, that seems unreasonable, because I can't find any way inside the csv module to tell a separator "," from a "," inside a field. Even Excel can't...
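The only workaround I can think of is to skip the csv module for these lines and split each line from both ends. This is a rough sketch and rests on an assumption I have not verified for the whole file: that ID, Value and Date never contain a comma, so everything between them must be Content. The name split_row and its parameters are made up for illustration:

```python
def split_row(line, n_leading=2, n_trailing=1):
    """Split on commas, keeping everything between the leading and
    trailing fields together as a single middle field."""
    # Peel off the first n_leading fields; the last part is the remainder.
    *leading, rest = line.strip().split(",", n_leading)
    # Peel off the last n_trailing fields from the right-hand side.
    parts = rest.rsplit(",", n_trailing)
    middle, trailing = parts[0], parts[1:]
    # Strip the padding spaces around each field.
    return (
        [field.strip() for field in leading]
        + [middle.strip()]
        + [field.strip() for field in trailing]
    )


lines = [
    '2, 15, "market, business", "yesterday, metric, 11/21/2014',
    "4, 19, yesterday, today, 11/22/2014",
]
parsed = [split_row(line) for line in lines]
for row in parsed:
    print(row)
```

This keeps the Content text verbatim, including any quotes and inner commas, so it would still need cleaning up afterwards, and it breaks as soon as one of the other columns contains a comma.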
Any ideas?