I'm having a problem writing a regular expression for matching HTML tags. I found a similar entry here, but this didn't quite work in my case.

Here's my test string:

<div id="div0" class="myclass">here's some text
that may include whitespace</div><div id="div1" class="myclass">
and some more here
</div>

And here's my regex based on the aforementioned entry:

<div[^>]*class="myclass">[^~]*?<\/div>

Note that I need to match the first instance of <div /> with class of "myclass." The content may have carriage returns. These <div> tags won't be nested.

Here's a rubular page for testing: http://rubular.com/r/vlfcikKMXk

  • Just thought you should know: in most regex flavors that *? would be helping, but apparently in Ruby it doesn't do anything. Commented Jun 17, 2010 at 5:46
  • So you already found a similar question (there are tons of them). The first answer was to use a real HTML parser, and yet you want to continue to use regular expressions for this? :) Commented Jun 17, 2010 at 5:47
  • @Kerry: You're right. Thanks for pointing me in the right direction. @Lukáš Lalinský: Yes, I have a reason not to use a parser in this case. Thanks for your 2c still. Commented Jun 17, 2010 at 6:03

1 Answer

That regex tester is not great. Your regex is in fact matching as you want it to, but the tester reports every match (two separate matches here) without showing where one ends and the next begins. You only want the first match.

Go here: http://gskinner.com/RegExr/

Test it there with the 'global' option turned off and you will see it working.
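To check the same thing outside a web tester, here is a minimal Ruby sketch, assuming the test string and regex from the question are used unchanged. The variable names (html, pattern, first, all) are only illustrative. String#[] returns just the first match, while String#scan returns every match, which is roughly what a "global" tester displays.

```ruby
# Test string from the question, including the embedded line breaks.
html = <<~HTML
  <div id="div0" class="myclass">here's some text
  that may include whitespace</div><div id="div1" class="myclass">
  and some more here
  </div>
HTML

# The regex from the question. [^~] matches any character except "~",
# newlines included, and the lazy *? stops at the first closing </div>.
pattern = /<div[^>]*class="myclass">[^~]*?<\/div>/

first = html[pattern]       # String#[] with a Regexp: first match only
all   = html.scan(pattern)  # String#scan: every non-overlapping match

puts first
puts "total matches: #{all.length}"
```

If only the first <div> is needed, html[pattern] (or html.match(pattern)) is enough; scan is shown only to illustrate why a tester can appear to match too much.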
