I've been trying to remove the numbering from the following lines using a Python script.

jokes.txt:

  1. It’s hard to explain puns to kleptomaniacs because they always take things literally.

  2. I used to think the brain was the most important organ. Then I thought, look what’s telling me that.

When I run this Python script:

import re
with open('jokes.txt', 'r+') as original_file:
    modfile = original_file.read()
    modfile = re.sub("\d+\. ", "", modfile)
    original_file.write(modfile)

The numbers are still there and it gets appended like this:

  1. It’s hard to explain puns to kleptomaniacs because they always take things literally.

  2. I used to think the brain was the most important organ. Then I thought, look what’s telling me that.1. It’s hard to explain puns to kleptomaniacs because they always take things literally.਍ഀ਍ഀ2. I used to think the brain was the most important organ. Then I thought, look what’s telling me that.

As I understand it, re.sub("\d+\. ", "", modfile) finds every run of digits that is followed by a period and a space, and replaces each match with an empty string.

As a novice, I'm not sure where I messed up. I'd like to know why this happens and how to fix it.

1 Answer

You've opened the file for both reading and writing, but after reading it you start writing without saying where to write. The write therefore begins where the read left off, which is the end of the file, so the new text is appended instead of replacing the old text.
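
To make the file position visible, here is a minimal sketch. It uses a scratch file in a temporary directory (the name demo.txt and its contents are made up for illustration) rather than your real jokes.txt.

```python
import os
import tempfile

# Hypothetical scratch file so the demo never touches a real jokes.txt.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("1. first\n2. second\n")

with open(path, "r+", encoding="utf-8") as f:
    start = f.tell()      # position before reading: the start of the file
    text = f.read()       # reading everything moves the position to the end
    f.write("APPENDED")   # so this write lands after the existing text

with open(path, encoding="utf-8") as f:
    result = f.read()

print(start)
print(repr(result))
```

The old contents are still there and the new text follows them, which is the same pattern as the output in the question.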

Besides closing the file and reopening it just for writing, you can rewind the file and clear it before writing:

import re
with open('jokes.txt', 'r+') as original_file:
    modfile = original_file.read()
    # Raw string, so the backslashes reach the regex engine unchanged
    modfile = re.sub(r"\d+\. ", "", modfile)
    original_file.seek(0) # Return to start of file
    original_file.truncate() # Clear out the old contents
    original_file.write(modfile)
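
The other route, closing the file and reopening it for writing, might look like the sketch below. The scratch file name and the sample jokes are assumptions for the demo. Opening with 'w' truncates the file, so no seek() or truncate() call is needed.

```python
import os
import re
import tempfile

# Hypothetical scratch file standing in for jokes.txt.
path = os.path.join(tempfile.mkdtemp(), "jokes_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("1. First joke.\n\n2. Second joke.\n")

# Step 1: read the whole file, then let the with-block close it.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Step 2: strip the numbering in memory.
cleaned = re.sub(r"\d+\. ", "", text)

# Step 3: reopen in write mode, which empties the file before writing.
with open(path, "w", encoding="utf-8") as f:
    f.write(cleaned)

with open(path, encoding="utf-8") as f:
    result = f.read()

print(repr(result))
```

This costs one extra open() call, but the file position never needs to be managed by hand.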

I don't know why the numbers were still there in the part that was appended, as this worked just fine for me. You might want to add a caret (^) to the start of your regex and pass the re.MULTILINE flag, resulting in re.sub(r"^\d+\. ", "", modfile, flags=re.MULTILINE). With that flag a caret matches at the start of every line; without it, a caret matches only at the very start of the string. That way, if one of your jokes happens to contain something like 1. in its text, the number at the beginning of the line is removed but the number inside the joke is not.
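
A small sketch of the difference the caret and the flag make. The sample text is invented so that each line contains a second number followed by a period.

```python
import re

# Made-up lines: each has numbering at the start and another "N. " inside.
text = (
    "1. Step 2. of the plan went wrong.\n"
    "2. I counted to 10. Then I stopped.\n"
)

# No anchor: every "digits, period, space" run is removed, even mid-sentence.
anywhere = re.sub(r"\d+\. ", "", text)

# Caret without the flag: only the very start of the string can match.
no_flag = re.sub(r"^\d+\. ", "", text)

# Caret with re.MULTILINE: the start of each line can match.
line_start = re.sub(r"^\d+\. ", "", text, flags=re.MULTILINE)

print(anywhere)
print(no_flag)
print(line_start)
```

If the lines in your file really begin with spaces, as they appear to in the question, the pattern would also need to allow for them, for example with r"^[ \t]*\d+\. ". That is an assumption about your file, not something I can verify from here.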

1 Comment

Thanks. It worked for me... I'm wondering why the numbers weren't removed in the appended part too...
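
One possible explanation, offered as a guess rather than a diagnosis: the odd characters between the two appended jokes in the question look like line breaks from a UTF-16 file that were rewritten as single bytes. If jokes.txt was saved as UTF-16 and Python read it with a single-byte default encoding, every character would be followed by a NUL character, and the pattern would never see a digit directly followed by a period. The sketch below only illustrates that mechanism; latin-1 stands in for whatever default encoding was actually used, and the sample line is made up.

```python
import re

pattern = r"\d+\. "
line = "1. A joke."

# Encode as UTF-16 (little-endian), then decode one byte per character,
# as a single-byte default encoding would. latin-1 is an assumed stand-in.
misread = line.encode("utf-16-le").decode("latin-1")

matches_clean = re.search(pattern, line) is not None
matches_misread = re.search(pattern, misread) is not None

print(repr(misread[:6]))
print(matches_clean, matches_misread)
```

If this is what happened, passing the file's real encoding to open(), for example encoding='utf-16', should let the substitution match.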
