Split CSV column entry on space using Python

Question

I am trying to create a new CSV file using Python. The new file should be the same as the original, except that one entry is split on a space delimiter.

My method is to open the input and output files with read and write access respectively, skip over the headers, then write out the specific column headings I want in the new CSV.

Then I iterate over each row, amending the appropriate cell and writing the row to the new file using the .writerow function.

One iteration over the rows produces ['data1', 'data2', 'data3 data4', 'data5', 'data6', 'data7' etc. ]

So in this case I'm using row[2] to select the 'data3 data4' part and trying to split it, to create a list that looks like ['data1', 'data2', 'data3', 'data4', 'data5', 'data6', 'data7' etc. ]

I have tried .split, which gives me a list within a list. I've tried slicing, which means I can show either data3 or data4. I've also tried .replace, which gives me ['data1', 'data2', 'data3,data4', etc.]. I'm quite frustrated and wondering if anyone might give me a hint as to the probably quite simple solution that I'm missing. Full code is below.

import csv

with open('filepath', mode="rU") as infile:
    with open('filepath', mode="w") as outfile:

        csv_f = csv.reader(infile)
        next(csv_f, None)  # skip the headers

        writer = csv.writer(outfile)
        writer.writerow(['dataheader1', 'dataheader2', 'dataheader3', 'dataheader4', 'dataheader5', 'dataheader6', 'dataheader7' etc. ])

        for row in csv_f:
            row[2] = row[2].replace(' ', ',')
            print row

dsh · Accepted Answer · 2015-10-26 20:58:38Z

2

row[2:3] = row[2].split(' ')

Demonstration:

>>> row = ['a', 'b', 'c d e f', 'g', 'h']
>>> row[2:3] = row[2].split(' ')
>>> row
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
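To show how the slice assignment might fit into the read/write loop from the question, here is a minimal sketch. It assumes Python 3 and uses in-memory `io.StringIO` objects in place of the real files; the helper name `split_column` and the header values are hypothetical, not taken from the question.

```python
import csv
import io

def split_column(rows, index):
    """Yield each row with the cell at `index` replaced by its space-separated parts."""
    for row in rows:
        row = list(row)  # work on a copy so the caller's row is not mutated
        row[index:index + 1] = row[index].split(' ')
        yield row

# In-memory stand-ins for the input and output files (assumption for the demo).
infile = io.StringIO("h1,h2,h3,h4\ndata1,data2,data3 data4,data5\n")
outfile = io.StringIO()

reader = csv.reader(infile)
next(reader, None)  # skip the original header row

writer = csv.writer(outfile, lineterminator="\n")
# Hypothetical new header with one extra column for the split cell.
writer.writerow(['dataheader1', 'dataheader2', 'dataheader3', 'dataheader4', 'dataheader5'])
writer.writerows(split_column(reader, 2))

print(outfile.getvalue())
```

With real files on Python 3, you would open both with `newline=''`, as the `csv` module documentation recommends.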


4 Comments

hselbie Over a year ago

You are my hero. Would you mind telling me what the row[2:3] bit did?

dsh Over a year ago

That is the slice: start at index 2 and go to (but not including) index 3. A slice can be assigned to just like a single index.

Adam Smith Over a year ago

It reassigns the slice [2:3] in row, which is just the single element at index 2. If you used row[2] instead, you'd get ['a', 'b', ['c', 'd', 'e', 'f'], 'g', 'h']. Assigning to the slice tells the interpreter that you want the new items to become part of the existing list.

hselbie Over a year ago

Thank you very much, so close, yet so far.
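The difference discussed in the comments above, index assignment versus slice assignment, can be seen side by side in a short sketch. The variable names `nested` and `flat` are just for illustration.

```python
row = ['a', 'b', 'c d e f', 'g', 'h']

# Assigning to a single index stores the whole list as ONE element.
nested = list(row)
nested[2] = row[2].split(' ')

# Assigning to a one-element slice splices the pieces into the list.
flat = list(row)
flat[2:3] = row[2].split(' ')

print(nested)  # ['a', 'b', ['c', 'd', 'e', 'f'], 'g', 'h']
print(flat)    # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```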

Adam Smith · Accepted Answer · 2015-10-26 21:00:27Z

0

If you don't know in advance which cells contain spaces, you can split every cell and flatten the result with itertools.chain.from_iterable.

import csv
import itertools

# Python 2 file modes, as in the question; on Python 3, open both files
# in text mode with newline='' instead.
with open('filepath', mode='rU') as infile, \
     open('filepath2', mode='wb') as outfile:  # this changed slightly, look!
    csv_f = csv.reader(infile)
    writer = csv.writer(outfile)
    next(csv_f)  # skip headers
    row = next(csv_f)
    # row looks like
    # ['one', 'two', 'three four', 'five', ...]

    rewritten_row = itertools.chain.from_iterable(
        [cell.split() for cell in row])  # or map(str.split, row)
    # rewritten_row is an iterator that yields
    # 'one', 'two', 'three', 'four', 'five', ...

    writer.writerow(rewritten_row)
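The snippet above rewrites a single row. As a rough sketch of how the same idea could be applied to every row, assuming Python 3 and in-memory `io.StringIO` stand-ins for the files (the helper name `split_all_cells` is hypothetical):

```python
import csv
import io
from itertools import chain

def split_all_cells(row):
    """Split every cell on whitespace and flatten the pieces into one list."""
    return list(chain.from_iterable(cell.split() for cell in row))

# In-memory stand-ins for the input and output files (assumption for the demo).
infile = io.StringIO("h1,h2,h3\none,two,three four\nfive six,seven,eight\n")
outfile = io.StringIO()

reader = csv.reader(infile)
next(reader, None)  # skip headers

writer = csv.writer(outfile, lineterminator="\n")
for row in reader:
    writer.writerow(split_all_cells(row))

print(outfile.getvalue())
```

One caveat with this approach: `''.split()` returns an empty list, so empty cells disappear from the output and later columns shift left. If empty cells need to be preserved, the slice-assignment approach on a known column is probably the safer choice.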
