Modifying a CSV file in Python

Question

I know it's usually not feasible to modify a CSV file while you are reading from it, so you need to create a new CSV file and write to that instead. The problem I'm having is preserving the original order of the data.

The input CSV file looks like this:

C1       C2         C3
apple    BANANA     Mango
pear     PineApple  StRaWbeRRy

I want to turn all the data into lower case and output a new csv file that looks like:

C1       C2         C3
apple    banana     mango
pear     pineapple  strawberry

So far I can iterate through the input csv file and turn all the values into lower case but I don't know how to rewrite it back into a csv file in that format. The code I have is:

import csv

def clean(input):
    aList = []
    with open(input, "r") as file:
        reader = csv.reader(file, delimiter=',')
        next(reader, None)  # Skip the header, but I want to preserve it in the output csv file
        for row in reader:
            for col in row:
                aList.append(col.lower())

So now I have a list with all the lowercase data. How do I write it back into a CSV file of the same format (same number of rows and columns) as the input, including the header row that I skipped in the code?

Don't bother saving the lines to a list. Just open both your input and output files at the same time, so you can write each modified line as you create it. In fact, I wouldn't even bother using the csv module for this. It's a pity you need to preserve the case of the header line, otherwise you could just process the whole file with the tr program (if you're using a Unix-like OS).
– PM 2Ring, Commented Nov 7, 2017 at 6:41
With pandas: pd.read_csv(input).apply(lambda col: col.str.lower()).to_csv(input, index=False)
– cs95, Commented Nov 7, 2017 at 6:43
I just noticed that your code specifies , as the delimiter, but your sample data uses whitespace. Please explain!
– PM 2Ring, Commented Nov 7, 2017 at 6:43
@PM2Ring You could still use command line tools if you use the head command to grab the header.
– Tim, Commented Nov 7, 2017 at 6:45
@PM2Ring I was just representing the data that way here. The input is in a CSV file with those rows and columns. Having said that, I don't know why the , delimiter works either, but it does! It was a mistake initially, but it works just fine.
– move_slow_break_things, Commented Nov 7, 2017 at 6:45

Nabin · Accepted Answer · 2020-08-15 12:47:23Z

12

Pandas way:

Read the file with pandas to get a DataFrame, then apply lower() to the column you want to change:

import pandas as pd

def conversion(text):
    return text.lower()


# file_path and column_name are placeholders for your CSV path and column label
df = pd.read_csv(file_path)
df[column_name] = df[column_name].map(conversion)

Or even as a one-liner:

# If the column has NaN or other non-string values, you may need str(x).lower() instead
df[column_name] = df[column_name].apply(lambda x: x.lower())

Then you can save it with the to_csv function.
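To lowercase every column rather than a single one, the steps above could be combined roughly as follows. This is only a sketch: the function name lowercase_csv and its two path parameters are made up for illustration, and it assumes every cell should be treated as text (hence dtype=str and keep_default_na=False).

```python
import pandas as pd


def lowercase_csv(src_path, dst_path):
    """Lowercase every cell in a CSV, keeping the header and row order."""
    # dtype=str keeps every cell as text; keep_default_na=False stops
    # empty cells from becoming NaN, so .str.lower() only sees strings.
    df = pd.read_csv(src_path, dtype=str, keep_default_na=False)
    # Only the values are changed; the column labels (header) stay as read.
    df = df.apply(lambda col: col.str.lower())
    # index=False avoids writing the row index as an extra first column.
    df.to_csv(dst_path, index=False)
    return df
```

Rows come back in the same order they were read, so the output has the same shape as the input. Without index=False, to_csv would add a column of row numbers that the original file did not have.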

edited Aug 15, 2020 at 12:47

answered Nov 7, 2017 at 7:12


Tim · Answer · 2017-11-07 06:47:41Z

6

If all you want to do is change the case of the data and preserve everything else, you might be best off skipping the csv module and just treating the input as a plain text file, e.g.:

# Open both files
with open("infile.csv") as f_in, open("outfile.csv", 'w') as f_out:
    # Write header unchanged
    header = f_in.readline()
    f_out.write(header)

    # Transform the rest of the lines
    for line in f_in:
        f_out.write(line.lower())
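One detail that may matter if "preserve everything else" includes line endings: in default text mode Python translates newlines on read and write, so a file with "\r\n" endings can come out with different ones. A possible variant, assuming the header sits entirely on the first physical line, passes newline="" to both open calls. The function name lowercase_body is just for illustration.

```python
def lowercase_body(src_path, dst_path):
    """Copy src to dst, lowercasing every line except the first."""
    # newline="" turns off newline translation in both directions, so
    # "\r\n" endings in the source are written back exactly as read.
    with open(src_path, newline="") as f_in, \
            open(dst_path, "w", newline="") as f_out:
        # Header line is copied as-is (an empty string if the file is empty).
        f_out.write(f_in.readline())
        for line in f_in:
            f_out.write(line.lower())
```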

edited Nov 7, 2017 at 6:47

answered Nov 7, 2017 at 6:39


sonus21 · Answer · 2017-11-07 06:54:46Z

3

If you want to use the csv module for both reading and writing, use the following code snippet.

import os
import csv


def clean(input):
    tmpFile = "tmp.csv"
    with open(input, "r") as file, open(tmpFile, "w") as outFile:
        reader = csv.reader(file, delimiter=',')
        writer = csv.writer(outFile, delimiter=',')
        header = next(reader)
        writer.writerow(header)
        for row in reader:
            colValues = []
            for col in row:
                colValues.append(col.lower())
            writer.writerow(colValues)
    os.rename(tmpFile, input)
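A slightly more defensive take on the same idea is sketched below. It is not the answer's code: the name clean_in_place is made up, the temporary file comes from tempfile.mkstemp instead of a fixed "tmp.csv", and os.replace is used because os.rename fails on Windows when the destination already exists.

```python
import csv
import os
import tempfile


def clean_in_place(path):
    """Lowercase all data rows of a CSV file, keeping the header row."""
    # Put the temporary file next to the original so the final
    # os.replace stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    try:
        # newline="" is what the csv module documentation recommends
        # for files handed to csv.reader and csv.writer.
        with os.fdopen(fd, "w", newline="") as dst, \
                open(path, newline="") as src:
            reader = csv.reader(src)
            writer = csv.writer(dst)
            header = next(reader, None)  # None if the file is empty
            if header is not None:
                writer.writerow(header)
            writer.writerows([col.lower() for col in row] for row in reader)
        # os.replace overwrites an existing destination on all platforms.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise
```

Note that csv.writer ends each row with "\r\n" by default; pass lineterminator="\n" if the output should use plain "\n" instead. The replaced file also gets the permissions of the temporary file, which may differ from the original's.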

edited Nov 7, 2017 at 6:54

answered Nov 7, 2017 at 6:42


3 Comments

sonus21 Over a year ago

That's correct, then we need to create another file and copy the file finally.

Tim Over a year ago

You should fix the whitespace indent. You are using 1, 2, 3 and 4 spaces at different points. Python will not like this!

sonus21 Over a year ago

@Tim I'm using ideone which has a problem. I've fixed this using PyCharm.

rawwar · Answer · 2017-11-07 07:03:03Z

0

The easiest way I found is as follows. Let the initial CSV file be named test.csv:

with open('test.csv','r') as f:
    with open('cleaned.csv','w') as ff:
        ff.write(f.readline())
        ff.write(f.read().lower())

The above code creates a new file, cleaned.csv, with the header line unchanged and all other data in lower case.

edited Nov 7, 2017 at 7:03

answered Nov 7, 2017 at 6:44


4 Comments

PM 2Ring Over a year ago

Ok, that works properly now. But like your earlier version, it unnecessarily reads the whole file into a string. Plus it uses more RAM to do the string concatenation, as Tim mentions. But I guess that's probably ok unless the file is huge, and changing the case of the whole file at once is more efficient than doing it line by line.

Tim Over a year ago

You would want to avoid the string concatenation. If this is a large file, you are going to have to allocate enough memory for the whole file, and then a second time to concatenate the header.

rawwar Over a year ago

So, instead of concatenating, I should write it directly to the file? @Tim

Tim Over a year ago

@user8898218 Yes. Strings are immutable in Python, so concatenation creates a new str and copies the contents of both strings into it.
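A middle ground between the two positions in these comments (lowercasing the whole file at once versus line by line) might be to read fixed-size chunks, which keeps memory bounded while still calling lower() on large strings. The sketch below is illustrative only: lowercase_in_chunks and chunk_chars are made-up names, and it assumes that lowercasing a chunk gives the same result as lowercasing the whole text. That holds for ASCII data, but a few context-sensitive characters, such as Greek capital sigma, could come out differently at a chunk boundary.

```python
def lowercase_in_chunks(src_path, dst_path, chunk_chars=64 * 1024):
    """Copy the header unchanged, then lowercase the rest chunk by chunk."""
    with open(src_path, newline="") as f_in, \
            open(dst_path, "w", newline="") as f_out:
        f_out.write(f_in.readline())  # header, written as-is
        while True:
            chunk = f_in.read(chunk_chars)  # at most chunk_chars characters
            if not chunk:  # an empty string means end of file
                break
            # Each chunk is written straight out, so no large string is
            # ever built up by concatenation.
            f_out.write(chunk.lower())
```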
