PHP regex to match HTML tags that contain a certain class name

Question

I need a regular expression that matches any tag with the class name "share".

I think I'm very close with this:

class=".*share.*"

I want it to match these:

<div class="share"></div>
<div class="sdfsd share sdfsdfsdf"></div>

But not these:

<div class="sdfsd dfdgdg" share></div>
<a class="icon-share export-to-csv-button"/>
<a class="fxac link" href="/share"/>

Please visit: https://regex101.com/r/uU6dU0/2

Other than the obligatory "don't parse HTML with regex", something like this should work in most cases: /<[^>]*class="[^"]*\bshare\b[^"]*"[^>]*>/si. It only works if the class attribute is surrounded by double quotes.
– Jonathan Kuhn, Commented Feb 17, 2015 at 22:40
Regex is not the way to go. What do you want to do once you have them?
– Casimir et Hippolyte, Commented Feb 17, 2015 at 22:41
So, are you trying to match the tag, as your question says, or just the other classes, as your example shows?
– Jonathan Kuhn, Commented Feb 17, 2015 at 22:56
I'm using regex in IntelliJ search to match tags that have the class name "share": cl.ly/ZpTY
– goksel, Commented Feb 17, 2015 at 23:00

ByteHamster · Accepted Answer · 2018-11-04 21:49:57Z

3

Use this regex:

class=([^=]*)([^(a-z|A-Z|0-9|\-|_)])share("|([^(a-z|A-Z|0-9|\-|_)]).*")

https://regex101.com/r/uU6dU0/4

Edit: This one is easier and will not match multiple tags:

class=("|"([^"]*)\s)share("|\s([^"]*)")

https://regex101.com/r/uU6dU0/5

Edit 2: An improved version that also matches class attributes delimited by single quotes:

class=(("|')|("|')([^"']*)\s)share(("|')|\s([^"']*)("|'))
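To check how the Edit 2 pattern behaves, here is a minimal sketch that runs it (with "share" as the class name) against the sample tags from the question plus two single-quoted variants. The helper name `hasShareClass`, the `~` delimiters, and the single-quoted samples are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// The "Edit 2" pattern wrapped in ~ delimiters; a nowdoc avoids escaping the quotes.
$pattern = <<<'REGEX'
~class=(("|')|("|')([^"']*)\s)share(("|')|\s([^"']*)("|'))~
REGEX;

// Hypothetical helper: true when the pattern finds a "share" class in $html.
function hasShareClass(string $html, string $pattern): bool
{
    return preg_match($pattern, $html) === 1;
}

// Each sample tag is paired with the result the question asks for.
$samples = [
    ['<div class="share"></div>', true],
    ['<div class="sdfsd share sdfsdfsdf"></div>', true],
    ['<div class="sdfsd dfdgdg" share></div>', false],
    ['<a class="icon-share export-to-csv-button"/>', false],
    ['<a class="fxac link" href="/share"/>', false],
    ["<div class='share'></div>", true],
    ["<div class='sdfsd share sdfsdfsdf'></div>", true],
];

foreach ($samples as [$html, $expected]) {
    $actual = hasShareClass($html, $pattern);
    printf("%-5s %s\n", $actual === $expected ? 'ok' : 'FAIL', $html);
}
```

Because the pattern requires whitespace or a quote on both sides of the name, "icon-share" is rejected, which a plain \b word boundary would not do.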

edited Nov 4, 2018 at 21:49

answered Feb 17, 2015 at 22:48

ByteHamster


6 Comments

Jonathan Kuhn Over a year ago

You would want to fix your asterisks to be non-greedy by adding a question mark after them. Else you will possibly match multiple tags.

hwnd Over a year ago

I do not see how your regex does what OP expects

ByteHamster Over a year ago

@hwnd: It is exactly what he wanted to get

Jeffrey Simon Over a year ago

I am wondering why this slightly simpler version would not also work: class=("|"[^"]*\s)line("|\s[^"]*"). It seems to work for me. All it does is remove the grouping parentheses around the optional additional class names before and after the desired class.

Jeffrey Simon Over a year ago

A further improvement on my previous suggestion, which only works for class attributes that use double quotes. To handle either quote style, use the following: class=(("|')|("|')([^"']*)\s)share(("|')|\s([^"']*)("|')). Note that it also matches attributes with mismatched quotes; however, that is not a problem for the question as asked.
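If mismatched quotes do matter, one possible variant captures the opening quote and requires the same character to close the attribute through a backreference. This is a sketch under assumptions, not a pattern from the thread: the pattern itself, the `~` delimiters, and the helper name `hasShareClassStrict` are all introduced here for illustration.

```php
<?php
// Variant pattern (an assumption, not from the thread):
// (["']) captures the opening quote and \1 requires the same quote to close.
$strictPattern = <<<'REGEX'
~class=(["'])(?:[^"']*\s)?share(?:\s[^"']*)?\1~
REGEX;

// Hypothetical helper: true when $html has a "share" class with matching quotes.
function hasShareClassStrict(string $html, string $pattern): bool
{
    return preg_match($pattern, $html) === 1;
}

var_dump(hasShareClassStrict('<div class="a share b">', $strictPattern));    // bool(true)
var_dump(hasShareClassStrict("<div class='share'>", $strictPattern));        // bool(true)
var_dump(hasShareClassStrict("<div class=\"share'>", $strictPattern));       // bool(false): mismatched quotes
var_dump(hasShareClassStrict('<div class="icon-share x">', $strictPattern)); // bool(false): different class name
```

The two optional groups replace the alternations of the original pattern, so the name may stand alone or be separated from its neighbours by whitespace.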


Jonathan Kuhn · Accepted Answer · 2015-02-17 22:48:59Z

2

As I posted in my comment, you don't want to parse HTML with a regex. There are VERY few cases where you should. Typically you would use DOMDocument and XPath to query for elements (similar to CSS selectors). This gives you access to the inner text, nested elements, and more, which regular expressions can't handle well or easily.
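As a rough illustration of the DOMDocument and XPath route, the sketch below collects the class attribute of every element whose class list contains "share" as a whole token. The function name `findClassAttributes` is hypothetical, the `<a .../>` samples from the question are written with explicit closing tags so the HTML parser does not nest them, and the class name is assumed to be a plain token without quotes.

```php
<?php
// Sample markup from the question (anchors given explicit closing tags).
$html = <<<'HTML'
<div class="share"></div>
<div class="sdfsd share sdfsdfsdf"></div>
<div class="sdfsd dfdgdg" share></div>
<a class="icon-share export-to-csv-button"></a>
<a class="fxac link" href="/share"></a>
HTML;

// Hypothetical helper: class attribute values of all elements carrying $class.
function findClassAttributes(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Common XPath 1.0 idiom: pad the class list with spaces, then look for " name ".
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $found = [];
    foreach ($xpath->query($query) as $element) {
        $found[] = $element->getAttribute('class');
    }
    return $found;
}

print_r(findClassAttributes($html, 'share'));
// Array
// (
//     [0] => share
//     [1] => sdfsd share sdfsdfsdf
// )
```

Padding both the attribute value and the name with spaces is what keeps "icon-share" from matching, and it works regardless of which quote style the markup uses.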

However, if you need to, this should work:

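<?php
// Sample markup from the question; only the first two tags should match.
$text = <<<HTML
<div class="share"></div>
<div class="sdfsd share sdfsdfsdf"></div>
<div class="sdfsd dfdgdg" share></div>
<a class="icon-share export-to-csv-button"/>
<a class="fxac link" href="/share"/>
HTML;

// Lookarounds replace \b here because \b treats the hyphen in "icon-share" as a word boundary.
preg_match_all('/<[^>]*class="[^"]*(?<![\w-])share(?![\w-])[^"]*"[^>]*>/i', $text, $matches);
// Escape the matches so the tags show up as text in a browser.
echo '<pre>'.htmlentities(print_r($matches, true)).'</pre>';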

Outputs:

Array
(
    [0] => Array
        (
            [0] => <div class="share">
            [1] => <div class="sdfsd share sdfsdfsdf">
        )

)

which you can see in action here: http://codepad.viper-7.com/UjBvT8

answered Feb 17, 2015 at 22:48

Jonathan Kuhn


2 Comments

goksel Over a year ago

I'm using regex in IntelliJ search to match tags that have the class name "share" (cl.ly/ZpTY), and yours worked too.

Nikolay Sergeevich Over a year ago

Thank you, I used this as 0 !== preg_match('/<[^>]*class=["\'][^"]*\bfa\b[^"]*["\'][^>]*>/i', BREADCRUMBS_SPACER). It works!
