
I have to skip some lines and replace some words in a CSV file. I tried the following check to skip lines, but it seems to be ignored:

if "myword" not in line:

To replace text I used:

csv_writer.writerow(line.replace("oldword", "newword")) 

but this raises an error. Does anyone know why? Edit: here is my code.

import csv

with open(r'excelfile.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)

    with open('new_names.csv', 'w', newline='') as new_file:
        fieldnames = ['value', 'type', 'description']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter=',', extrasaction='ignore')

        csv_writer.writeheader()

        for line in csv_reader:
            csv_writer.writerow(line)
  • Please show the full contents of line, and create a minimal reproducible example that we can just cut and paste to run on our own machines. Without seeing at least a minimal example of your data, I don't think we can answer this question. Commented Feb 16, 2022 at 16:42
  • I edited the post with the code. Commented Feb 16, 2022 at 16:52
  • This is the code without the 2 lines that I need to add to make it work. Commented Feb 16, 2022 at 16:52
  • OK, that's helpful, now can you show where you want to put the broken lines, the actual contents of line when they don't work, and what you would like them to do? I'd like to be able to reproduce the incorrect behaviour you're having trouble with. Commented Feb 16, 2022 at 16:54
  • I want to drop the lines that contain a word I don't need, and replace some words, while copying the CSV to a new one, so I wanted to put the 2 lines of code after "for line in csv_reader:". Commented Feb 16, 2022 at 16:59

1 Answer

Because you're using a csv.DictReader, each row is a dictionary that maps column names to values, not a string. For example, if the file has the header value,type,description and a line that reads:

has myword,and oldword,foo

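the line variable will contain (shown here as an OrderedDict, which is what Python 3.6 and 3.7 return; Python 3.8 and later return a plain dict with the same contents):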

OrderedDict([('value', 'has myword'), ('type', 'and oldword'), ('description', 'foo')])

Just like @furas suggested in the comments, I determined this by adding print(line) in the for line... loop.

So your tests and replacements have to operate on the dictionary's values rather than on line itself.
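To make the point concrete, here is a small sketch of why the two lines from the question misbehave. It assumes the header value,type,description and uses an in-memory io.StringIO as a hypothetical stand-in for excelfile.csv; the variable names are only for illustration.

```python
import csv
import io

# Hypothetical in-memory stand-in for excelfile.csv (assumed header and row).
sample = io.StringIO("value,type,description\nhas myword,and oldword,foo\n")
line = next(csv.DictReader(sample))

# `in` on a dict tests the keys (the column names), not the cell values,
# so this is True even though one cell contains "myword".
keys_only_check = "myword" not in line

# Dicts have no .replace() method, so the call from the question fails.
try:
    line.replace("oldword", "newword")
    replace_error = None
except AttributeError as exc:
    replace_error = exc

# Looking inside the cell values does find the word.
found_in_values = any("myword" in value for value in line.values())
```

This is why the if test appears to be ignored (no row is ever skipped) and why the replace call raises an AttributeError.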

Here are a few examples that you can adapt to fit your exact purpose:

import csv

with open(r'excelfile.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)

    with open('new_names.csv', 'w', newline='') as new_file:
        fieldnames = ['value', 'type', 'description']
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter=',', extrasaction='ignore')

        csv_writer.writeheader()

        for line in csv_reader:
            # Substitute oldword by newword in all values in line
            line = {k: v.replace("oldword", "newword") for k, v in line.items()}

            # Use only ONE of the two tests below; with both active,
            # a row that passes both would be written twice.

            # Option 1: reject line if any value in line is exactly "myword":
            # if "myword" not in line.values():
            #     csv_writer.writerow(line)

            # Option 2: reject line if any value in line contains "myword" as a substring:
            if not any("myword" in value for value in line.values()):
                csv_writer.writerow(line)

3 Comments

Thanks everyone for the help!
OK, this worked, thanks. But what if I wanted to reject more words?
Matching multiple strings has been asked before. For example, stackoverflow.com/q/4953272/3216427 might help, or maybe stackoverflow.com/q/54481198/3216427
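
For the follow-up about rejecting several words, one possible approach is to loop over a collection of words inside the same any() test. The sketch below is only an illustration: REJECTED_WORDS, keep_row and filter_rows are hypothetical names, "otherword" is a made-up second word, and the in-memory io.StringIO objects stand in for the real files. It also assumes every row has the same number of fields as the header, so that every value is a string.

```python
import csv
import io

# Hypothetical collection of words; a row is dropped if any cell contains one.
REJECTED_WORDS = ("myword", "otherword")


def keep_row(row, rejected=REJECTED_WORDS):
    """Return True if no cell in the row contains any rejected word."""
    return not any(word in value for value in row.values() for word in rejected)


def filter_rows(src, dst, fieldnames):
    """Copy rows from src to dst, replacing oldword and dropping rejected rows."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    for row in reader:
        row = {k: v.replace("oldword", "newword") for k, v in row.items()}
        if keep_row(row):
            writer.writerow(row)


# Small in-memory demonstration instead of real files.
src = io.StringIO(
    "value,type,description\n"
    "has myword,and oldword,foo\n"
    "keep me,oldword here,bar\n"
    "x,otherword,baz\n"
)
dst = io.StringIO()
filter_rows(src, dst, ['value', 'type', 'description'])
result = dst.getvalue()
```

With real files, the same function can be called with the two file objects opened as in the answer above. Note that this is a substring test, so a rejected word also matches inside longer words.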
