
I have a task to replace "O" (capital O) with "0" in a text file using Python. One condition is that I have to preserve ordinary words like Over, NATO, etc. I only have to replace strings like 9OO with 900, 2OO6 with 2006, and so on. I have tried a lot but have not succeeded yet. My code is given below. Can anyone please help? Thanks in advance.

import re

srcpatt = 'O'
rplpatt = '0'
cre = re.compile(srcpatt)

with open('myfile.txt', 'r') as file:
    content = file.read()

wordlist = re.findall(r'(\d+O|O\d+)',str(content))
print(wordlist)

for word in wordlist:
    subcontent = cre.sub(rplpatt, word)
    newrep = re.compile(word)
    newcontent = newrep.sub(subcontent,content)

with open('myfile.txt', 'w') as file:
    file.write(newcontent)

print('"',srcpatt,'" is successfully replaced by "',rplpatt,'"')
  • Is wordlist coming out OK? If that much of it works, we can focus on the rest. Commented Jun 13, 2013 at 21:21
  • Do you have to replace O123 with a leading 0, giving 0123? Commented Jun 13, 2013 at 21:24
  • 'wordlist' is just for seeing the output, @Brian. Thanks for the edit. Commented Jun 13, 2013 at 21:40
  • As you'll see, regexes are easy to get wrong. "And so on" is no substitute for having a bunch more test cases. Commented Jun 13, 2013 at 21:45

3 Answers

1

re.sub can take in a replacement function, so we can pare this down pretty nicely:

import re
with open('myfile.txt', 'r') as file:
    content = file.read()
with open('myfile.txt', 'w') as file:
    file.write(re.sub(r'\d+[\dO]+|[\dO]+\d+', lambda m: m.group().replace('O', '0'), content))
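To see what the pattern does without touching a file, here is a minimal sketch that applies the same substitution to the sample strings from the question and comments. The helper name `fix_zeros` and the sample list are assumptions made for illustration; only the regex and the lambda come from the answer.

```python
import re

# Same pattern as in the answer: a run of digits and capital O's
# that either starts with a digit or ends with a digit.
PATTERN = re.compile(r'\d+[\dO]+|[\dO]+\d+')

def fix_zeros(text):
    """Hypothetical helper: turn capital O into 0 inside matched runs."""
    return PATTERN.sub(lambda m: m.group().replace('O', '0'), text)

# Sample strings taken from the question and the comments.
samples = ['9OO', '2OO6', 'O4:O5', 'Over', 'NATO']
for s in samples:
    print(s, '->', fix_zeros(s))
```

Note that the pattern has no word boundaries, so an O that directly touches a digit inside a longer word (for example "NATO2") would also be rewritten. Whether that matters depends on the input text.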

3 Comments

This will not work for the string "9OO". You could change \d+O+\d+ to \d+O+\d*.
Yes, thanks a lot! But what about a string like O4:O5, which should become 04:05?
Edited the regex, should be fine now.
0
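import re

srcpatt = 'O'
rplpatt = '0'
# Note: \bO\b also matches a standalone "O", so a lone capital O is replaced too.
reg = r'\b(\d*)O(O*\d*)\b'

with open('input', 'r') as f:
    for line in f:
        # re.search, not re.match: re.match only tests the start of the line.
        # Each pass replaces one O per matching word, so loop until none remain.
        while re.search(reg, line):
            line = re.sub(reg, r'\g<1>0\2', line)
        # The line already ends with a newline, so do not add another.
        print(line, end='')

print('"', srcpatt, '" is successfully replaced by "', rplpatt, '"')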


0

You can probably get away with matching just a leading digit followed by one or more O's. This won't handle OO7, but it works nicely with 8O8O, for example, which answers that rely on matching trailing digits may not handle. If you want to handle a leading O as well, you need to use a lookahead match.

re.sub(r'(\d)(O+)', lambda m: m.groups()[0] + '0'*len(m.groups()[1]), content)
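One possible reading of the lookahead suggestion, sketched here as an assumption rather than as the answerer's own code: match a run of O's that is either preceded by a digit (lookbehind) or followed by a digit (lookahead), and replace the run with the same number of zeros. The helper name `fix_zeros_lookaround` is hypothetical.

```python
import re

# A run of O's directly after a digit, or a run of O's directly before a digit.
# (?<=\d) is a lookbehind and (?=\d) a lookahead; neither consumes the digit.
LOOKAROUND = re.compile(r'(?<=\d)O+|O+(?=\d)')

def fix_zeros_lookaround(text):
    """Hypothetical helper: replace each matched run of O's with zeros."""
    return LOOKAROUND.sub(lambda m: '0' * len(m.group()), text)

for s in ['9OO', 'OO7', '8O8O', 'O4:O5', 'Over', 'NATO']:
    print(s, '->', fix_zeros_lookaround(s))
```

Like the other patterns on this page, this sketch does not check word boundaries, so "NATO2" would still become "NAT02".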
