0

When I print a row from my data I get this output:

"SL","2018-09-30","89.7","-2.1644","5.9884","Max Kepler",596146,518858,"field_out","hit_into_play",,,,,"9","Max Kepler grounds out, shortstop Leury Garcia to first baseman Matt Davidson.  ","R","L","R","MIN","CWS","X",6,"ground_ball",1,2,2018,"0.0456","0.4072","0.7193","1.8204","null","null","null",2,8,"Bot","118.48","141.29",,,456078,,"180930_215102","6.8928","-130.3241","-5.0488","-0.9006","26.5098","-26.9291","3.5258","1.6280","60","63.4","6.3970","88.0450","2186","5.2660",531825,518858,456078,571602,660162,570560,544725,547170,641477,594953,"55.2332","0.176","0.145","0.00",1,0,0,2,67,"5","Slider",5,4,5,4,4,5,5,4,"Strategic","Standard"

Then when I use this line of code to split it with csv reader:

csvRow = list(csv.reader(data.splitlines()[1]))
print(csvRow)

These are my results:

[['SL'], ['', ''], ['2018-09-30'], ['', ''], ['89.7'], ['', ''], ['-2.1644'], ['', ''], ['5.9884'], ['', ''], ['Max Kepler'], ['', ''], ['5'], ['9'], ['6'], ['1'], ['4'], ['6'], ['', ''], ['5'], ['1'], ['8'], ['8'], ['5'], ['8'], ['', ''], ['field_out'], ['', ''], ['hit_into_play'], ['', ''], ['', ''], ['', ''], ['', ''], ['', ''], ['9'], ['', ''], ['Max Kepler grounds out, shortstop Leury Garcia to first baseman Matt Davidson.  '], ['', ''], ['R'], ['', ''], ['L'], ['', ''], ['R'], ['', ''], ['MIN'], ['', ''], ['CWS'], ['', ''], ['X'], ['', ''], ['6'], ['', ''], ['ground_ball'], ['', ''], ['1'], ['', ''], ['2'], ['', ''], ['2'], ['0'], ['1'], ['8'], ['', ''], ['0.0456'], ['', ''], ['0.4072'], ['', ''], ['0.7193'], ['', ''], ['1.8204'], ['', ''], ['null'], ['', ''], ['null'], ['', ''], ['null'], ['', ''], ['2'], ['', ''], ['8'], ['', ''], ['Bot'], ['', ''], ['118.48'], ['', ''], ['141.29'], ['', ''], ['', ''], ['', ''], ['4'], ['5'], ['6'], ['0'], ['7'], ['8'], ['', ''], ['', ''], ['180930_215102'], ['', ''], ['6.8928'], ['', ''], ['-130.3241'], ['', ''], ['-5.0488'], ['', ''], ['-0.9006'], ['', ''], ['26.5098'], ['', ''], ['-26.9291'], ['', ''], ['3.5258'], ['', ''], ['1.6280'], ['', ''], ['60'], ['', ''], ['63.4'], ['', ''], ['6.3970'], ['', ''], ['88.0450'], ['', ''], ['2186'], ['', ''], ['5.2660'], ['', ''], ['5'], ['3'], ['1'], ['8'], ['2'], ['5'], ['', ''], ['5'], ['1'], ['8'], ['8'], ['5'], ['8'], ['', ''], ['4'], ['5'], ['6'], ['0'], ['7'], ['8'], ['', ''], ['5'], ['7'], ['1'], ['6'], ['0'], ['2'], ['', ''], ['6'], ['6'], ['0'], ['1'], ['6'], ['2'], ['', ''], ['5'], ['7'], ['0'], ['5'], ['6'], ['0'], ['', ''], ['5'], ['4'], ['4'], ['7'], ['2'], ['5'], ['', ''], ['5'], ['4'], ['7'], ['1'], ['7'], ['0'], ['', ''], ['6'], ['4'], ['1'], ['4'], ['7'], ['7'], ['', ''], ['5'], ['9'], ['4'], ['9'], ['5'], ['3'], ['', ''], ['55.2332'], ['', ''], ['0.176'], ['', ''], ['0.145'], ['', ''], ['0.00'], ['', ''], ['1'], ['', ''], ['0'], ['', ''], ['0'], ['', ''], ['2'], ['', ''], ['6'], ['7'], ['', ''], ['5'], ['', ''], ['Slider'], ['', ''], ['5'], ['', ''], ['4'], ['', ''], ['5'], ['', ''], ['4'], ['', ''], ['4'], ['', ''], ['5'], ['', ''], ['5'], ['', ''], ['4'], ['', ''], ['Strategic'], ['', ''], ['Standard']]

As you can see the commas are being put into their own indexes in the list. Obviously this isn't what I want. I can't use just a split function because some of the columns have commas within their actaul data.

3
  • what is your data source, list, csv or text? Commented Dec 17, 2018 at 4:12
  • Data is a csv file being downloaded from URL and decoded. Commented Dec 17, 2018 at 4:25
  • See also stackoverflow.com/a/53643923/235698 Commented Dec 17, 2018 at 5:21

3 Answers 3

1

csv.reader takes an interable that returns lines. Passing a string (an iterable that returns a single character), makes it behave as if each character is on a line by itself, except when grouped by the default quote character. When it gets to a "line" that is a single quote character, it looks like two blank columns: ['', ''].

You haven't shown all your code, but just do something like:

with open('data.csv',newline='') as f: # newline per csv documentation.
    for line in csv.reader(f):
        print(line)

Or if you have your data in one continuous string, use io.StringIO to return a line-at-a-time:

import csv
import io

data = '''\
"SL","2018-09-30","89.7","-2.1644","5.9884","Max Kepler",596146,518858,"field_out","hit_into_play",,,,,"9","Max Kepler grounds out, shortstop Leury Garcia to first baseman Matt Davidson.  ","R","L","R","MIN","CWS","X",6,"ground_ball",1,2,2018,"0.0456","0.4072","0.7193","1.8204","null","null","null",2,8,"Bot","118.48","141.29",,,456078,,"180930_215102","6.8928","-130.3241","-5.0488","-0.9006","26.5098","-26.9291","3.5258","1.6280","60","63.4","6.3970","88.0450","2186","5.2660",531825,518858,456078,571602,660162,570560,544725,547170,641477,594953,"55.2332","0.176","0.145","0.00",1,0,0,2,67,"5","Slider",5,4,5,4,4,5,5,4,"Strategic","Standard"
'''

for line in csv.reader(io.StringIO(data)):
    print(line)

Output:

['SL', '2018-09-30', '89.7', '-2.1644', '5.9884', 'Max Kepler', '596146', '518858', 'field_out', 'hit_into_play', '', '', '', '', '9', 'Max Kepler grounds out, shortstop Leury Garcia to first baseman Matt Davidson.  ', 'R', 'L', 'R', 'MIN', 'CWS', 'X', '6', 'ground_ball', '1', '2', '2018', '0.0456', '0.4072', '0.7193', '1.8204', 'null', 'null', 'null', '2', '8', 'Bot', '118.48', '141.29', '', '', '456078', '', '180930_215102', '6.8928', '-130.3241', '-5.0488', '-0.9006', '26.5098', '-26.9291', '3.5258', '1.6280', '60', '63.4', '6.3970', '88.0450', '2186', '5.2660', '531825', '518858', '456078', '571602', '660162', '570560', '544725', '547170', '641477', '594953', '55.2332', '0.176', '0.145', '0.00', '1', '0', '0', '2', '67', '5', 'Slider', '5', '4', '5', '4', '4', '5', '5', '4', 'Strategic', 'Standard']
Sign up to request clarification or add additional context in comments.

Comments

0

Try using data frame:

replace test.csv with your csv source

import pandas as pd
data = pd.read_csv('test.csv', sep=',', delimiter=None, skip_blank_lines=True)
data = data.transpose()

Comments

0

The reader needs the delimiter changed for CSV files. Specify the delimiter as a string with a single space in your code.

csvRow = list(csv.reader(data.splitlines()[1], delimiter=' '))
print(csvRow)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.