
I have this Perl regular expression and I want to convert it to Python.

The regex I want is a search and replace that finds text and converts it to upper case. It must also apply only to the first occurrence on each line. Perl regex:

open FILE, "C:/thefile.txt";
while (<FILE>){
    # Converts "foo yadayada bar yadayada"
    #       to "FOO  bar yadayada"
    s/(^.*?)(yadayada)/\U$1/;
    print;
}

The Python regex I have is not working correctly:

import re
lines = open('C:\thefile.txt','r').readlines()
for line in lines:
    line = re.sub(r"(yadayada)","\U\g<1>", line, 1)
    print line

I realize that \U\g<1> is what isn't working, because Python doesn't support \U for uppercase. So what do I use?

2
  • Documented what the Perl code does for the Python programmers who aren't familiar enough with Perl. Commented May 17, 2012 at 18:14
  • Are you sure that's not supposed to be s/(yadayada)/\U$1/? Commented May 17, 2012 at 18:19

2 Answers

3

re.sub can take a function, which processes each match object and returns a string. So you can do it like this:

In [4]: def uppergrp(match):
   ...:     return match.group(1).upper()
   ...: 

In [5]: re.sub("(yada)", uppergrp, "abcyadadef", count=1)
Out[5]: 'abcYADAdef'

Working with regexes in Python is less convenient than in Perl, but Python programmers also tend to be less keen on regexes than Perl coders.
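For reference, here is a rough sketch of the same idea applied line by line, the way the loop in the question does. The sample text and the use of io.StringIO in place of the real file are assumptions for illustration only. Note that this uppercases the matched word itself, as the Python attempt in the question does, not the text in front of it.

```python
import io
import re

def uppergrp(match):
    # Return the first captured group in upper case.
    return match.group(1).upper()

# Stand-in for the file in the question; the contents are made up.
sample = io.StringIO("foo yadayada bar yadayada\nnothing to change\n")

results = []
for line in sample:
    line = line.rstrip("\n")
    # count=1 limits the replacement to the first occurrence on each line.
    results.append(re.sub(r"(yadayada)", uppergrp, line, count=1))

print("\n".join(results))
```

Lines without a match come back unchanged, so no extra check is needed before calling re.sub.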


4 Comments

That works perfectly but I am confused as to how you would use it with multiple replacements. How could I include multiple functions in re.sub ? How could I concatenate words to the end of the uppercase words?
@user1399782 Do you want to do a series of things to each replacement? You can make the function as complex as you want. Do you want to do different replacements of different parts? You can either call re.sub several times, or build a more complex regex and check the match inside the function. To add something after the uppercase word, you can just make the function return match.group(1).upper() + 'something'.
For that input, the Perl code outputs ABCdef, not abcYADAdef.
@ikegami: If you want that, make the regex (^.*?)(yada), so that group 1 is 'abc'
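The points raised in these comments could be sketched roughly as follows. The helper names, the "!" suffix, and the extra word "blah" are made up for illustration; only re.sub and the match object methods come from the re module.

```python
import re

def shout_with_suffix(match):
    # Hypothetical helper: uppercase the match, then append a fixed string.
    return match.group(1).upper() + "!"

def dispatch(match):
    # One pattern with alternatives; decide per match what to return.
    word = match.group(0)
    if word == "yada":
        return word.upper()
    return word.capitalize()  # only "blah" reaches this branch

text = "abc yada blah yada"

# Append something after the uppercased word, first occurrence only.
suffixed = re.sub(r"(yada)", shout_with_suffix, text, count=1)

# Different replacements for different words in a single pass.
dispatched = re.sub(r"yada|blah", dispatch, text)

# Capturing the prefix as group 1 and dropping group 2 mirrors the
# Perl substitution s/(^.*?)(yada)/\U$1/ from the question.
perl_like = re.sub(r"(^.*?)(yada)", lambda m: m.group(1).upper(), "abcyadadef", count=1)

print(suffixed)
print(dispatched)
print(perl_like)
```

Calling re.sub several times with different patterns would also work; the single-pattern version just avoids rescanning the string.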
2

The second argument to re.sub can also be a function, meaning that if Python's regex replacement syntax cannot accomplish what you want (or at least makes it very difficult), you can define your own function to use instead.

e.g.

re.sub(pattern, lambda x: x.group(1).upper(), string)

edit: The function gets passed a MatchObject
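To make that a bit more concrete, here is a small sketch of what the callback can read from the match object. The function name and the input string are invented for the example; group() and span() are standard match object methods.

```python
import re

seen = []

def record_and_upper(m):
    # m is the match object: group(1) is the captured text,
    # span() is the (start, end) position of the whole match.
    seen.append((m.group(1), m.span()))
    return m.group(1).upper()

# No count given, so the function is called once per match.
result = re.sub(r"(yada)", record_and_upper, "abcyadadefyada")

print(result)
print(seen)
```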

2 Comments

We hit essentially the same answer within a few seconds of each other. A win for 'one obvious way to do it'. ;-)
Indeed, though you seem to have just beaten me to it.
