
I am trying to create a new CSV file using Python. The new file should be the same as the original, except that one entry is split on a space delimiter.

My method is to open the files with read and write access respectively, skip over the headers, then write out the specific column headings I want in the csv.

Then I iterate over each row, amending the appropriate cell and writing the row to the new file using the .writerow function.

One iteration produces a row like ['data1', 'data2', 'data3 data4', 'data5', 'data6', 'data7' etc. ]

So in this case I'm selecting row[2] to select the 'data3 data4' part and trying to split these to create a list that looks like ['data1', 'data2', 'data3', 'data4', 'data5', 'data6', 'data7' etc. ]

I have tried using .split, which gives me a list within a list, and I've tried slicing, which means I can show either data3 or data4. I've also tried .replace, which gives me ['data1', 'data2', 'data3,data4', etc.]. I'm quite frustrated and wondering if anyone might give me a hint as to the probably quite simple solution that I'm missing. Full code is below.

import csv

with open('filepath', mode="rU") as infile:
    with open('filepath', mode="w") as outfile:

        csv_f = csv.reader(infile)
        next(csv_f, None)  # skip the headers

        writer = csv.writer(outfile)
        writer.writerow(['dataheader1', 'dataheader2', 'dataheader3', 'dataheader4', 'dataheader5', 'dataheader6', 'dataheader7' etc. ])

        for row in csv_f:
            row[2] = row[2].replace(' ', ',')
            print row

2 Answers

row[2:3] = row[2].split(' ')

Demonstration:

>>> row = ['a', 'b', 'c d e f', 'g', 'h']
>>> row[2:3] = row[2].split(' ')
>>> row
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
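
Applied to the loop in the question, the same slice assignment can be done once per row before the row is written out. The following is only a minimal sketch, assuming Python 3 and using in-memory `io.StringIO` buffers in place of the real files; the sample data, the header names, and the `split_column` helper are hypothetical and not part of the original code.

```python
import csv
import io


def split_column(rows, index):
    """Yield each row with the cell at `index` replaced by its space-separated parts."""
    for row in rows:
        row = list(row)  # work on a copy so the caller's row is not mutated
        row[index:index + 1] = row[index].split(' ')  # slice assignment splices the parts in
        yield row


# Hypothetical sample input standing in for the real input file.
infile = io.StringIO("h1,h2,h3,h4\na,b,c d,e\nf,g,h i,j\n")
outfile = io.StringIO()

reader = csv.reader(infile)
next(reader, None)  # skip the original header row

writer = csv.writer(outfile, lineterminator="\n")
writer.writerow(["h1", "h2", "h3a", "h3b", "h4"])  # hypothetical new headers
writer.writerows(split_column(reader, 2))

result = outfile.getvalue()
print(result)
```

With real files on Python 3, both would normally be opened in text mode with `newline=''`, as the `csv` module documentation recommends.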

4 Comments

You are my hero, would you mind telling me what the row[2:3] bit did?
That is the slice: start at index 2 and go to (but not including) index 3. A slice can be assigned to just like a single index.
It reassigns the slice [2:3] in row, which covers just the single element at index 2. If you assigned to row[2] instead, you'd get ['a', 'b', ['c', 'd', 'e', 'f'], 'g', 'h']. Assigning to the slice tells the interpreter that you want the new items to become part of the existing list.
Thank you very much, so close, yet so far.
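
The difference between index assignment and slice assignment described in these comments can be seen side by side. This is a small illustrative sketch, assuming Python 3; the variable names `nested` and `flat` are made up for the example.

```python
row = ['a', 'b', 'c d', 'e']

# Index assignment: the single slot at index 2 now holds a list.
nested = list(row)
nested[2] = nested[2].split(' ')

# Slice assignment: the one-element slice [2:3] is replaced by two items,
# so the list grows by one and stays flat.
flat = list(row)
flat[2:3] = flat[2].split(' ')

print(nested)
print(flat)
```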

If you don't know where the cells with spaces are, then you're looking for itertools.chain.from_iterable.

import csv
import itertools

with open('filepath', mode='rU') as infile, \
     open('filepath2', mode='wb') as outfile:  # this changed slightly, look!
    csv_f = csv.reader(infile)
    writer = csv.writer(outfile)
    next(csv_f)  # skip headers
    row = next(csv_f)
    # row looks like
    # ['one', 'two', 'three four', 'five', ...]

    rewritten_row = itertools.chain.from_iterable(
        [cell.split() for cell in row])  # or map(str.split, row)
    # rewritten_row looks like
    # ['one', 'two', 'three', 'four', 'five', ...]

    writer.writerow(rewritten_row)
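
The snippet above rewrites a single row. Extending it to every row might look like the sketch below, which assumes Python 3 and uses in-memory `io.StringIO` buffers with made-up sample data instead of real files.

```python
import csv
import io
from itertools import chain

# Hypothetical sample input standing in for the real input file.
infile = io.StringIO("h1,h2,h3\none,two,three four\nfive six,seven,eight\n")
outfile = io.StringIO()

reader = csv.reader(infile)
writer = csv.writer(outfile, lineterminator="\n")
next(reader, None)  # skip headers

for row in reader:
    # Split every cell on whitespace and flatten the pieces into one row.
    # Note: an empty cell splits to [] and therefore disappears from the output.
    writer.writerow(list(chain.from_iterable(cell.split() for cell in row)))

flattened = outfile.getvalue()
print(flattened)
```

Because every cell is split, rows can end up with different lengths depending on how many cells contain spaces, so this approach suits data where that is acceptable.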
