I realize this question has been asked a million times and there is a lot of documentation on it. However, I am unable to output the results in the correct format.
The code below was adapted from: Replacing empty csv column values with a zero
# Save the script below as RepEmptyCells.py
# Add #!/usr/bin/python as the first line of the script
# Make it executable with chmod +x before running it on the desired .csv file
# The code below looks through your .csv file and replaces empty cells with 0s
# This can be particularly useful for genetic distance matrices
import csv
import sys

# Python 3: open in text mode with newline="" (Python 2 used "rb" here)
reader = csv.reader(open(sys.argv[1], newline=""))
for row in reader:
    for i, x in enumerate(row):
        if len(x) < 1:
            x = row[i] = 0
    # str(x), not int(x): join() only accepts strings
    print(','.join(str(x) for x in row))
Currently, to get the correct output .csv file [i.e. in correct format] one can run the following command in bash:
#After making the script executable
./RepEmptyCells.py input.csv > output.csv # this produces the correct output
I've tried to use the csv.writer function to produce a correctly formatted output.csv file (the same result as ./RepEmptyCells.py input.csv > output.csv), without much luck.
I'd like to learn how to add this last part to the code to automate the process without having to do it in bash.
What I have tried:
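f = open('output2.csv', 'w')
import csv
import sys

# Python 3: open in text mode with newline="" (Python 2 used "rb" here)
reader = csv.reader(open(sys.argv[1], newline=""))
for row in reader:
    for i, x in enumerate(row):
        if len(x) < 1:
            x = row[i] = 0
    # str(x), not int(x): join() only accepts strings
    f.write(','.join(str(x) for x in row))
f.close()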
When looking at the raw files from this code and the one before, they look the same.
However, when I open them in either Excel or Numbers, the latter (i.e. output2.csv) shows only a single row of data.
It's important that both output.csv and output2.csv can be opened in Excel.
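For reference, here is a minimal sketch of the csv.writer approach described above, assuming Python 3. The function name replace_empty_cells and the fill parameter are hypothetical names chosen for illustration, not part of the original script. One likely difference from the f.write attempt is that print() ends each row with a newline while a bare f.write() does not, which may be why the spreadsheet shows a single row; writerow() adds the row terminator itself.

```python
import csv


def replace_empty_cells(in_path, out_path, fill="0"):
    """Copy in_path to out_path, replacing empty cells with `fill`.

    Hypothetical helper for illustration; the name and the `fill`
    parameter are assumptions, not part of the original script.
    """
    # newline="" lets the csv module manage line endings on both files
    with open(in_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # writerow() appends the row terminator that a bare
            # f.write(','.join(...)) leaves out
            writer.writerow([cell if cell else fill for cell in row])
```

It could be called from the script as replace_empty_cells(sys.argv[1], "output2.csv"), which would remove the need for the shell redirection.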