Writing output to csv file [in correct format]

Question

I realize this question has been asked a million times and there is a lot of documentation on it. However, I am unable to output the results in the correct format.

The code below was adapted from: Replacing empty csv column values with a zero

# Save below script as RepEmptyCells.py 
# Add #!/usr/bin/python to script 
# Make executable by chmod +x prior to running the script on desired .csv file 

# Below code will look through your .csv file and replace empty spaces with 0s
# This can be particularly useful for genetic distance matrices 

import csv
import sys

reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for i, x in enumerate(row):
        if len(x) < 1:
            x = row[i] = 0
    # join() needs strings, so convert every cell with str()
    print(','.join(str(x) for x in row))

Currently, to get the correct output .csv file [i.e. in correct format] one can run the following command in bash:

 #After making the script executable        
./RepEmptyCells.py input.csv > output.csv # this produces the correct output

I've tried to use the csv.writer function to produce the correctly formatted output.csv file (similar to ./RepEmptyCells.py input.csv > output.csv) without much luck.

I'd like to learn how to add this last part to the code to automate the process without having to do it in bash.

What I have tried:

f = open('output2.csv', 'w')

import csv
import sys

reader = csv.reader(open(sys.argv[1], "rb"))
for row in reader:
    for i, x in enumerate(row):
        if len(x) < 1:
            x = row[i] = 0
    f.write(','.join(str(x) for x in row))

f.close()

When looking at the raw files from this code and the one before, they look the same.

However, when I open them in either Excel or iNumbers, the latter (i.e. output2.csv) shows only a single row of the data.

It's important that both output.csv and output2.csv can be opened in Excel.

Remi Guan · Accepted Answer · 2015-11-12 02:13:53Z

3

Two options:

1) Add an f.write('\n') after your current f.write statement, so that every row ends with a newline.

2) Use csv.writer. You mention it, but it isn't in your code.

writer = csv.writer(f)
...
writer.writerow([int(x) for x in row])  # Note difference in parameter format
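
To see why the missing newline collapses the file into a single row, here is a minimal Python 3 sketch. The helper name `join_rows` and the sample rows are made up for illustration and are not part of the question. `print` appends a newline after each row on its own, which is why the shell redirect worked, while `f.write` writes exactly what it is given.

```python
def join_rows(rows, terminator=""):
    """Join each row with commas and append `terminator` after every row."""
    return "".join(",".join(str(x) for x in row) + terminator for row in rows)

rows = [[1, 2, 0], [7, 0, 8]]

# What f.write(','.join(...)) alone produces: the rows run together.
without_newline = join_rows(rows)

# What f.write(...) followed by f.write('\n') produces: one line per row.
with_newline = join_rows(rows, "\n")

print(repr(without_newline))
print(repr(with_newline))
```

In the first string the last cell of one row fuses with the first cell of the next, so a spreadsheet sees one long row.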

edited Nov 12, 2015 at 2:13

Remi Guan

answered Nov 12, 2015 at 2:00

jeff carey

2 Comments

Novice Over a year ago

Thanks. That did it! So you just had to add a newline ('\n')! 1) works. 2) still doesn't, but that's okay.

Cilyan Over a year ago

Beware: I'm surprised 1) works, since on Unix '\n' is written as LF, and I was pretty sure Excel only accepts CSV files whose lines end in CRLF. In fact, this is a feature of the CSV format: a single LF denotes a line break inside a cell. That's why you open the files in binary mode ('rb' or 'wb') in Python 2 and with newline='' in Python 3: the csv module handles line endings itself and would be disturbed by Python's default newline translation.
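
A small Python 3 sketch of the line-ending point, writing to an in-memory buffer instead of a file so the raw characters can be inspected. The sample values are made up. It only shows what csv.writer emits by default and says nothing about what a given Excel version accepts.

```python
import csv
import io

# newline="" keeps the buffer from translating whatever the writer emits,
# the same argument the csv documentation recommends for open() in Python 3.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)  # the default dialect is "excel"
writer.writerow(["1", "2", "0"])
writer.writerow(["7", "0", "8"])

raw = buffer.getvalue()
print(repr(raw))
```

The default "excel" dialect terminates each row with '\r\n' regardless of platform, so the writer produces CRLF endings as long as the file object does not translate them.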

Cilyan · Accepted Answer · 2015-11-12 02:16:49Z

A humble proposition

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import csv
import sys

# Use a with statement so both files are closed properly
# newline='' is the right option for the csv module on Python 3.x
with open(sys.argv[1], 'r', newline='') as fin, open(sys.argv[2], 'w', newline='') as fout:
    reader = csv.reader(fin)
    # You may need to redefine the dialect for some versions of Excel that
    # split cells on semicolons (for _Comma_ Separated Values, yes...)
    writer = csv.writer(fout, dialect="excel")
    for row in reader:
        # Write while reading and let the OS handle the caching.
        # A generator processes each cell of the row as it comes in:
        # if a cell is empty, the `or` returns "0".
        # Keep everything as strings: int() would fail on a non-integer cell,
        # the writer would convert it back to str anyway, and Excel makes no
        # difference between the two when loading.
        writer.writerow(cell or "0" for cell in row)

Sample in.csv

1,2,3,,4,5,6,
7,,8,,9,,10

Output out.csv

1,2,3,0,4,5,6,0
7,0,8,0,9,0,10
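
The semicolon remark in the script's comments could be handled by overriding the delimiter on the writer. Below is a minimal Python 3 sketch run on the sample data above, using in-memory buffers instead of files. `fill_empty` is a hypothetical helper name, and whether your Excel expects semicolons depends on its regional settings, so treat the `delimiter=";"` call as an assumption to verify.

```python
import csv
import io

def fill_empty(src, dst, delimiter=","):
    """Copy CSV rows from src to dst, writing "0" in place of empty cells."""
    reader = csv.reader(src)
    # Keyword arguments such as delimiter override the named dialect.
    writer = csv.writer(dst, dialect="excel", delimiter=delimiter)
    for row in reader:
        writer.writerow(cell or "0" for cell in row)

sample = "1,2,3,,4,5,6,\n7,,8,,9,,10\n"

out = io.StringIO(newline="")
fill_empty(io.StringIO(sample), out, delimiter=";")
print(out.getvalue())
```

With the default `delimiter=","` the same helper reproduces the out.csv shown above.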

Corey Goldberg · Accepted Answer · 2015-11-12 02:20:29Z

0

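import csv
import sys

rows = []
# Python 2: csv files are opened in binary mode
with open(sys.argv[1], 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        # row is a list, which has no replace(); fix each cell instead
        row = [cell or '0' for cell in row]
        rows.append(row)
        print(','.join(row))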

I don't understand your need for using the shell and redirecting. A csv writer is just:

with open('output.csv', 'wb') as f:
    writer = csv.writer(f)
    # rows: the processed rows, each one a list of cell values
    writer.writerows(rows)
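
For readers on Python 3, here is a sketch of the same read-everything-then-writerows idea. It uses a temporary directory so it runs on its own. The function name `zero_fill_file`, the file names, and the sample data are all illustrative.

```python
import csv
import os
import tempfile

def zero_fill_file(in_path, out_path):
    """Read all rows, replace empty cells with "0", then write them in one call."""
    with open(in_path, newline="") as fin:
        rows = [[cell or "0" for cell in row] for row in csv.reader(fin)]
    with open(out_path, "w", newline="") as fout:
        csv.writer(fout).writerows(rows)
    return rows

workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "in.csv")
out_path = os.path.join(workdir, "out.csv")

# Create a small input file with some empty cells.
with open(in_path, "w", newline="") as f:
    f.write("1,,3\r\n,5,\r\n")

rows = zero_fill_file(in_path, out_path)

# Read the result back without newline translation to inspect it.
with open(out_path, newline="") as f:
    written = f.read()
print(rows)
```

This holds the whole file in memory, which is fine for small matrices. For large files, writing row by row while reading avoids that.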

answered Nov 12, 2015 at 2:20

Corey Goldberg
