Python2 regular expressions seem faulty

Question

I'm using Python 2.7.3 on Linux. Here is an interactive session, verbatim:

>>> f = open("feed.xml")
>>> text = f.read()
>>> import re
>>> regexp1 = re.compile(r'</?item>')
>>> regexp2 = re.compile(r'<item>.*</item>')
>>> regexp1.findall(text)
['<item>', '</item>', '<item>', '</item>', '<item>', '</item>', '<item>', '</item>']
>>> regexp2.findall(text)
[]

Is this a bug, or is there something I'm not understanding about Python regular expressions?

chepner · Accepted Answer · 2012-07-30 15:39:49Z

5

By default, '.' does not match a newline, so '.*' cannot span the line breaks between <item> and </item> in your file. Try compiling with re.DOTALL:

regexp2 = re.compile(r'<item>.*</item>', re.DOTALL)
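To see the difference without the original file, here is a minimal sketch. The `text` string is a made-up stand-in for feed.xml (the real file's contents are not shown in the question); it only assumes that each item spans several lines. It also shows a side effect worth knowing about: with re.DOTALL a greedy `.*` will run from the first <item> to the last </item>, so a non-greedy `.*?` is probably what you want if the goal is one match per item.

```python
import re

# Hypothetical stand-in for feed.xml: every <item> spans several lines.
text = (
    "<rss>\n"
    "<item>\n<title>A</title>\n</item>\n"
    "<item>\n<title>B</title>\n</item>\n"
    "</rss>\n"
)

# Default flags: '.' stops at '\n', so nothing can match across lines.
default_matches = re.findall(r'<item>.*</item>', text)

# re.DOTALL: '.' also matches '\n'. Greedy '.*' grabs everything from the
# first <item> to the last </item> as a single match.
greedy_matches = re.findall(r'<item>.*</item>', text, re.DOTALL)

# Non-greedy '.*?' stops at the nearest </item>, giving one match per item.
lazy_matches = re.findall(r'<item>.*?</item>', text, re.DOTALL)

print(len(default_matches))  # 0
print(len(greedy_matches))   # 1
print(len(lazy_matches))     # 2
```

The same flag can also be written inline as `(?s)` at the start of the pattern.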


Claudiu · Answer · 2012-07-30 15:37Z (edited by Community, 2017-05-23 11:48:14Z)

0

Here is the best answer to this question: Don't use regular expressions to parse non-regular languages such as XML. It drove one S-O user insane. Another relevant link.
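For what it's worth, the parser route does not have to be much longer than the regex. Below is a rough sketch using the standard library's xml.etree.ElementTree. The sample document and the `title` child element are assumptions made up for illustration, since the actual structure of feed.xml is not shown in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed contents; the real feed.xml may be structured differently.
xml_text = (
    "<rss><channel>\n"
    "<item>\n<title>A</title>\n</item>\n"
    "<item>\n<title>B</title>\n</item>\n"
    "</channel></rss>"
)

root = ET.fromstring(xml_text)

# iter('item') walks the whole tree, so the nesting depth of <item> does not matter.
items = list(root.iter('item'))

# findtext returns the text of the first matching child, or None if it is absent.
titles = [item.findtext('title') for item in items]

print(len(items))  # 2
print(titles)      # ['A', 'B']
```

To read from the file directly, `ET.parse("feed.xml").getroot()` would replace the `ET.fromstring` call.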


answered Jul 30, 2012 at 15:37

Claudiu


5 Comments

chepner Over a year ago

This doesn't address his misunderstanding of regular expressions, however.

Jangler Over a year ago

A valid point, but I'm only using this code for a quick hack and thus don't want or need to learn any new APIs.

chepner Over a year ago

I finally followed the link to the insane S-O user. I'd retract my downvote for that if I could :)

Fred Foo Over a year ago

@chepner: made a trivial (whitespace only) edit so you can retract the downvote.

Claudiu Over a year ago

@Jangler: Quick hacks often become scripts that you rely on. If you learn the new API, then you can do a quick hack with the new API.
