Python regex multiple matches with grouping

Question

Input String

<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>

What I want

(1111,2222)

If I use findall, this is what I get:

>>> import re
>>> print(re.findall("<(msgCode|errorId)>([0-9]+)</(msgCode|errorId)>", "<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>"))
[('msgCode', '1111', 'msgCode'), ('errorId', '2222', 'errorId')]

What I hope for is

[('1111','2222')]

Is there an easy way to do this using re alone, instead of post-processing the output?

Yes, let's all pontificate using the same thread over and over again, even though the OP might be certain that his XML/HTML will never contain tags nested within themselves.
– Vasili Syrakis, Commented Jan 31, 2014 at 3:25

Guy Gavriely · Accepted Answer · 2014-01-31 03:52:43Z

2

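Consider using XPath instead. Note that lxml's HTML parser lowercases tag names, so the expression refers to msgcode and errorid: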

>>> from lxml import html
>>> root = html.fromstring('<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>')
>>> root.xpath('//*[self::msgcode or self::errorid]/text()')
['1111', '2222']
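
If installing lxml is not an option, the same parser-based idea can be sketched with the standard library's html.parser module (Python 3). This swaps XPath for a small event-driven parser and is not what the answer above uses; the class name TagTextCollector and its attributes are made up for illustration. Like lxml's HTML parser, HTMLParser reports tag names in lowercase.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect the text found inside a chosen set of tags (hypothetical helper)."""

    def __init__(self, wanted):
        super().__init__()
        # HTMLParser lowercases tag names, so normalise the wanted names too.
        self.wanted = {name.lower() for name in wanted}
        self.inside = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted:
            self.inside = True

    def handle_endtag(self, tag):
        if tag in self.wanted:
            self.inside = False

    def handle_data(self, data):
        # Keep text only while we are inside one of the wanted tags.
        if self.inside:
            self.values.append(data)


collector = TagTextCollector(["msgCode", "errorId"])
collector.feed("<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>")
collector.close()
print(collector.values)  # ['1111', '2222']
```

This sketch assumes the wanted tags are not nested inside one another; nested tags would need a depth counter instead of a single flag.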

edited Jan 31, 2014 at 3:52

answered Jan 31, 2014 at 3:42

Guy Gavriely


1 Comment

Ajeet Ganga Over a year ago

This is the reason why I post SO questions, even when I have a crude workaround such as post-processing the regex matches. :) :)

Vasili Syrakis · Answer · 2014-01-31 03:11:58Z

-1

Use a non-capturing group for the tag names, (?:msgCode|errorId), so that findall returns only the captured digits:

>>> import re
>>> subject = "<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>"
>>> result = re.findall("<(?:msgCode|errorId)>([0-9]+)</(?:msgCode|errorId)>", subject)
>>> print(result)
['1111', '2222']
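
The pattern above would also accept a mismatched pair such as <msgCode>1</errorId>, and it returns a flat list rather than the single tuple the question asks for. A possible variation, offered as a sketch rather than as part of the original answer, captures the tag name so that a backreference can require the same closing tag, then keeps only the digits. The names pattern and values are illustrative.

```python
import re

subject = "<msgCode>1111</msgCode>asdasdad<errorId>2222</errorId>"

# Group 1 captures the tag name so \1 can insist on the matching closing tag;
# group 2 captures the digits we actually want.
pattern = re.compile(r"<(msgCode|errorId)>([0-9]+)</\1>")

# finditer yields match objects, so we can pick group 2 and ignore the tag name.
values = tuple(m.group(2) for m in pattern.finditer(subject))
print([values])  # [('1111', '2222')]
```

Collecting the values into a tuple still happens outside the regex itself, since findall always returns one entry per match.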

answered Jan 31, 2014 at 3:11

Vasili Syrakis


1 Comment

Ajeet Ganga Over a year ago

In my case this happened to be HTML, hence I accepted the other answer. But thank you for your answer, Vasili. :)
