Python 3 Regex Last Match

Question

How do I grab the 123 part of the following string using Python 3's re module?

....XX (a lot of HTML characters)123

Here the ... part denotes a long string consisting of HTML characters, words and numbers.

The number 123 is a characteristic of XX. A general method in which XX can be any letters, such as AA or AB, would be more helpful.

Side Note:
I thought of using Perl's \G anchor: first find XX in the string, then find the first number that appears after XX. But it seems the \G anchor doesn't work in Python 3.

My code:

import re
source='abcd XX blah blah 123 more blah blah'
grade=str(input('Which grade?'))
#here the user inputs XX

match=re.search(grade,source)
match=re.search('\G\D+',source)
#Trying to use the \G operator to continue from the end of the last match. Doesn't work.

match=re.search('\G\d+',source)
#Trying to get the next number after XX.
print(match.group())

Could you show your attempt so the problem becomes clearer?
– jamylak, Commented Jun 8, 2013 at 4:52
What do you mean by "grab" it? How about just if '123' in text: print '123'?
– John Zwinck, Commented Jun 8, 2013 at 5:03
You can specify the starting position: match = re.search(grade, source); match = re.compile(r'\d+').search(source, match.end()); print(match.group())
– falsetru, Commented Jun 8, 2013 at 5:18
A compiled regular expression's search method accepts an optional pos parameter. docs.python.org/2/library/re.html#re.RegexObject.search
– falsetru, Commented Jun 8, 2013 at 5:29
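
The pos-based approach from the comments can be sketched as a small function. This is only an illustration, not the commenter's exact code: the name first_number_after is hypothetical, and re.escape is added on the assumption that the user's input should be treated as literal text rather than as a pattern.

```python
import re

def first_number_after(label, source):
    """Return the first run of digits after `label`, or None if not found."""
    # Treat the user-supplied label as literal text (assumption).
    label_match = re.search(re.escape(label), source)
    if label_match is None:
        return None
    # A compiled pattern's search() takes an optional start position,
    # which plays the role Perl's \G would have played here.
    number_match = re.compile(r"\d+").search(source, label_match.end())
    return number_match.group() if number_match else None

source = "abcd XX blah blah 123 more blah blah"
print(first_number_after("XX", source))
```

Note that this version does not check word boundaries, so a label such as XX would also be found inside a longer word.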

Community · Accepted Answer · 2020-06-20 09:12:55Z

1

Description

This regex matches the string XX, which can be replaced with the user's input. It also requires XX to be preceded by whitespace or the start of the sample text and followed by whitespace, which avoids the edge case where XX appears inside a word such as EXXON. The first capture group holds XX and the second holds the first whitespace-delimited number that follows it.

(?<=\s|^)\b(xx)\b\s.*?\s\b(\d+)\b(?=\s|$)

Code Example:

I don't know Python well enough to offer a proper Python example, so I'm including a PHP example simply to show how the regex works and what the capture groups contain.

<?php
$sourcestring="EXXON abcd XX blah blah 123 more blah blah";
preg_match('/(?<=\s|^)\b(xx)\b\s.*?\s\b(\d+)\b(?=\s|$)/im',$sourcestring,$matches);
echo "<pre>".print_r($matches,true);
?>
 
$matches Array:
(
    [0] => XX blah blah 123
    [1] => XX
    [2] => 123
)

If you need the actual string position, then in PHP that would look like

$position = strpos($sourcestring, $matches[0]);

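For readers who want a Python version, here is a rough adaptation rather than a direct port. Python's re module rejects the lookbehind (?<=\s|^) because its alternatives have different widths, so this sketch swaps in (?<!\S) and (?!\S), which express the same "whitespace or edge of text" condition. The function name find_number_after is hypothetical, and re.escape is an added assumption so that user input is matched literally.

```python
import re

def find_number_after(label, text):
    """Return a match whose group(1) is the label and group(2) the number, or None."""
    pattern = re.compile(
        # (?<!\S): the label must start the text or follow whitespace.
        # (?!\S): the number must end the text or be followed by whitespace.
        r"(?<!\S)(" + re.escape(label) + r")\s.*?\s(\d+)(?!\S)",
        re.IGNORECASE,
    )
    return pattern.search(text)

source = "EXXON abcd XX blah blah 123 more blah blah"
match = find_number_after("XX", source)
if match:
    # match.start() gives the position directly, so no strpos-style lookup is needed.
    print(match.group(1), match.group(2), match.start())
```

As in the original expression, at least two whitespace characters must separate the label from the number, so "XX 123" alone would not match.
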
edited Jun 20, 2020 at 9:12

CommunityBot

answered Jun 8, 2013 at 15:35

Ro Yo Mi

2 Comments

korylprince Over a year ago

Just curious. What did you use to generate the image?

Ro Yo Mi Over a year ago

@korylprince, I'm using debuggex.com. Although it doesn't support lookbehinds or atomic groups, it's still handy for understanding the expression flow. There is also regexper.com, which does a pretty good job too, but it doesn't update in real time as you type.