7

How can I extract the substring that follows the keyword am, is, or are in a string, without including the keyword itself?

string = 'I am John'

I used:

re.findall('(?<=(am|is|are)).*', string)

This raises an error:

re.error: look-behind requires fixed-width pattern

What is the correct approach?


3 Answers

11
import re

s = 'I am John'

g = re.findall(r'(?:am|is|are)\s+(.*)', s)
print(g)

Prints:

['John']
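For context, the original attempt fails because Python's re module requires every look-behind to match a fixed number of characters, and the alternatives in (am|is|are) have different lengths (2, 2 and 3). If a look-behind is still preferred, one possible workaround is to give each keyword its own look-behind. The sketch below assumes the keyword is followed by exactly one whitespace character and adds \b so that "is" inside a word like "This" is not picked up:

```python
import re

s = 'I am John'

# Each look-behind has its own fixed width (3, 3 and 4 characters),
# so the re module accepts the pattern.
pattern = r'(?:(?<=\bam\s)|(?<=\bis\s)|(?<=\bare\s)).*'

result = re.findall(pattern, s)
print(result)  # ['John']
```

The capturing-group version above is usually easier to read; this variant is only worth it when the match itself must exclude the keyword.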

3

In cases like this I like to use finditer because the match objects it returns are easier to manipulate than the strings returned by findall. You can continue to match am/is/are, but also match the rest of the string with a second subgroup, and then extract only that group from the results.

>>> import re
>>> string = 'I am John'
>>> [m.group(2) for m in re.finditer("(am|is|are)(.*)", string)]
[' John']

Based on the structure of your pattern, I'm guessing you only want at most one match out of the string. Consider using re.search instead of either findall or finditer.

>>> re.search("(am|is|are)(.*)", string).group(2)
' John'
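One caveat: re.search returns None when nothing matches, so calling .group() directly raises AttributeError for a string without am/is/are. A minimal sketch of a safer wrapper follows; the function name text_after_keyword is made up for illustration, and the pattern adds \b and \s* (not part of the snippet above) to skip keywords embedded in longer words and to drop the leading space:

```python
import re

def text_after_keyword(text):
    """Return the text following the first am/is/are, or None if absent."""
    # \b keeps "is" inside "This" from matching; \s* swallows the space
    # between the keyword and the text we want.
    m = re.search(r"\b(?:am|is|are)\b\s*(.*)", text)
    return m.group(1) if m else None

print(text_after_keyword('I am John'))    # John
print(text_after_keyword('Hello world'))  # None
```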

If you're thinking "actually I want to match every instance of a word following am/is/are, not just the first one", that's a problem, because your .* component will match the entire rest of the string after the first am/is/are. E.g. for the string "I am John and he is Steve", it will match ' John and he is Steve'. If you want John and Steve separately, perhaps you could limit the character class that you want to match. \w seems sensible:

>>> string = "I am John and he is Steve"
>>> [m.group(2) for m in re.finditer(r"(am|is|are) (\w*)", string)]
['John', 'Steve']
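Note that without word boundaries the keyword can also match inside a longer word: for "This is Anna", the "is" at the end of "This" matches first and the captured word becomes "is". A hedged variant that anchors the keyword with \b is sketched below; the helper name words_after_keywords is hypothetical:

```python
import re

def words_after_keywords(text):
    """Return the word following each standalone am/is/are in text."""
    # \b on both sides means only whole-word keywords count;
    # \w+ requires at least one word character after the whitespace.
    return [m.group(1) for m in re.finditer(r"\b(?:am|is|are)\b\s+(\w+)", text)]

print(words_after_keywords("I am John and he is Steve"))  # ['John', 'Steve']
print(words_after_keywords("This is Anna and I am Bob"))  # ['Anna', 'Bob']
```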


-1

One solution is to use the str.partition method. Here is an example:

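string = 'I am John'
words = ['am', 'is', 'are']

for word in words:
    # partition returns (before, separator, after); separator is '' if word is absent
    before, sep, after = string.partition(word)
    if sep:  # print only when the keyword was actually found
        print(after)
        break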

Output:

 John
