I have several hundred csv files that I would like to search for the string "Keyed,Bet" and change it to "KeyedBet". The string may or may not be within the file, and may be in different columns in different files.

I cobbled together the script below, but it doesn't work. I am definitely using replace() incorrectly, but can't quite figure out how. I am also creating a new file when I don't really need to: if the script simply updated the current file and saved it under the same name, that would be fine (but that is beyond my beginner capabilities).

Where did I go wrong here? Thanks for the help!

import os 
import csv


path='.'

filenames = os.listdir(path)

for filename in filenames:

    if filename.endswith('.csv'):
        r=csv.reader(open(filename))
        new_data = []
        for row in r:
            replace("Keyed,Bet","KeyedBet")
        new_data.append(row)   

    newfilename = "".join(filename.split(".csv")) + "_edited3.csv"
    with open(newfilename, "w") as f:
        writer = csv.writer(f)
        writer.writerows(new_data)
  • "it doesn't work." Why not? What's it doing incorrectly? Any errors? Commented May 1, 2015 at 20:28
  • Honestly this sounds like a one-line job for the sed shell command (not python). Commented May 1, 2015 at 20:29
  • Why reinvent the wheel? Just download sed + its dependencies, then sed -i 's/Keyed,Bet/KeyedBet/ig' *.csv Commented May 1, 2015 at 20:30
  • @rojo make your comment an answer. Commented May 1, 2015 at 20:31
  • @Andy Well, the first issue is my misunderstanding of replace(). I get that it needs to have a defined string to call, and I'm not doing that, but I can't figure out how to have it look at the rows in the csv as strings to search. Commented May 1, 2015 at 20:34

2 Answers


Why reinvent the wheel? Just download sed + its dependencies, then

sed -i 's/Keyed,Bet/KeyedBet/ig' *.csv

Edit: The command above should work fine on Linux. Windows sed may require double quotes around the expression rather than single quotes:

sed -i "s/Keyed,Bet/KeyedBet/ig" *.csv

5 Comments

Well, that's pretty amazing.
Well, when I ran this command through my Cygwin terminal on a test file, it just emptied the file. Clearly I did something wrong, but I used the exact command above. Any ideas why that would have happened?
Ah, Windows sed might require double quotes for quoted tokens instead of single. Try sed -i "s/Keyed,Bet/KeyedBet/ig" *.csv. Sorry, I should've thought of that before.
Yep, that was it! Worked beautifully, thank you. One final question: why did you opt to use /ig rather than /g in the command? Most of the sed tutorials I looked up to try and understand it better used /g. What's the difference? Thanks again.
/ig is case-insensitive. It will also replace keyed,bet with KeyedBet. i is insensitive, g is global (not stopping after the first replacement).
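
For comparison, the same distinction can be sketched in Python, the language of the question. str.replace is case-sensitive, re.sub replaces every match by default (like sed's g flag), and re.IGNORECASE plays the role of i. The sample line is made up for illustration.

```python
import re

# Made-up sample line containing both spellings
line = "keyed,bet,5,Keyed,Bet\n"

# Case-sensitive: only the exact "Keyed,Bet" is replaced
exact = line.replace("Keyed,Bet", "KeyedBet")

# Case-insensitive, like sed's i flag; re.escape keeps the pattern literal
pattern = re.escape("Keyed,Bet")
any_case = re.sub(pattern, "KeyedBet", line, flags=re.IGNORECASE)

# count=1 stops after the first match, like sed without the g flag
first_only = re.sub(pattern, "KeyedBet", line, count=1, flags=re.IGNORECASE)

print(exact, end="")       # keyed,bet,5,KeyedBet
print(any_case, end="")    # KeyedBet,5,KeyedBet
print(first_only, end="")  # KeyedBet,5,Keyed,Bet
```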

If you want to change the original files, you can use fileinput.input with inplace=True to modify them in place. glob will find all the csv files in the given directory:

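# Python 2 only: add `from __future__ import print_function` as the first line
import fileinput
import os
from glob import iglob

path = '.'

# inplace=True redirects print() into the file currently being read
for line in fileinput.input(iglob(os.path.join(path, "*.csv")), inplace=True):
    # end="" because each line already ends with its own newline
    print(line.replace("Keyed,Bet", "KeyedBet"), end="")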

Not quite one line but a lot less than 15.
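
Because in-place editing overwrites the originals, it may be worth keeping a backup copy on the first run. The following is a minimal sketch, assuming Python 3. The function name replace_in_csvs and the ".bak" suffix are illustrative choices rather than part of the loop above; the sketch relies on the backup parameter of fileinput.input.

```python
import fileinput
import glob
import os


def replace_in_csvs(path, old="Keyed,Bet", new="KeyedBet", backup=".bak"):
    """Replace `old` with `new` in every .csv file directly under `path`.

    Hypothetical helper: the name and defaults are illustrative.
    Returns the list of files that were processed.
    """
    files = sorted(glob.glob(os.path.join(path, "*.csv")))
    if not files:
        # fileinput falls back to stdin when given an empty list, so stop here
        return files
    # backup=".bak" keeps each original next to the edited file as <name>.csv.bak
    with fileinput.input(files, inplace=True, backup=backup) as lines:
        for line in lines:
            # While inplace=True, print() writes into the file being read
            print(line.replace(old, new), end="")
    return files
```

Calling replace_in_csvs('.') would process the current directory. Once the results look right, the .bak files can be deleted.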

To create new files:

import os
from glob import iglob

path = '.'

for filename in iglob(os.path.join(path, "*.csv")):
    # iglob already returns paths that include `path`, so no second join is needed
    new_name = os.path.splitext(filename)[0] + "_edited3.csv"
    with open(filename) as f, open(new_name, "w") as f2:
        for line in f:
            f2.write(line.replace("Keyed,Bet", "KeyedBet"))

Since you are only replacing a string, it is easier to open the files without the csv module and use str.replace. If you knew the string always appeared in the same column, the csv module would be a better option, but it seems the substring can appear anywhere.
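
To see why the csv module gets in the way here, consider what csv.reader does with a line that contains the unquoted text: it splits on the comma, so "Keyed" and "Bet" land in separate fields and no single cell contains "Keyed,Bet". A small sketch with a made-up sample line (the real files may look different):

```python
import csv
import io

# Made-up sample line; the real files may look different
sample = "1,Keyed,Bet,20\n"

# csv.reader splits on the comma, so no single field holds "Keyed,Bet"
row = next(csv.reader(io.StringIO(sample)))
cell_hits = [cell for cell in row if "Keyed,Bet" in cell]

# Treating the line as plain text finds and replaces the substring directly
fixed = sample.replace("Keyed,Bet", "KeyedBet")

print(row)            # ['1', 'Keyed', 'Bet', '20']
print(cell_hits)      # []
print(fixed, end="")  # 1,KeyedBet,20
```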

5 Comments

The last line of the first option gives me "TypeError: replace() takes no keyword arguments". Any idea why that would be?
@datahappy, yep, my mistake: I had end="" inside the replace call instead of the print call. If you are using Python 2, add from __future__ import print_function to the top of your file.
Thanks for the edit. Well, I got a syntax error from the new version, so I looked it up, and I guess ,end="" is Python 3.x syntax? So, I tried the Python 2.7 syntax of just ("Keyed,Bet","KeyedBet"), ) from the documentation, and that just emptied the entire file- took it from a 500M file down to 0kb. Can't figure out why that would happen, but I got the same result when I tried the sed solution from the answer above in Cygwin.
You need to add from __future__ import print_function. There is no way this would delete anything: it either replaces the string when it finds a match or just writes the line back unchanged.
Upload a smaller sample of the file you are testing and I will run it locally.
