I need a regular expression that matches any tag that has the class name "share".

I think I'm very close with this:

class=".*share.*"

I want it to match these:

<div class="share"></div>
<div class="sdfsd share sdfsdfsdf"></div>

But not these:

<div class="sdfsd dfdgdg" share></div>
<a class="icon-share export-to-csv-button"/>
<a class="fxac link" href="/share"/>

Please visit: https://regex101.com/r/uU6dU0/2

  • @CasimiretHippolyte not that too Commented Feb 17, 2015 at 22:36
  • Other than the obligatory "don't parse HTML with regex", something like this should work for most cases: /<[^>]*class="[^"]*\bshare\b[^"]*"[^>]*>/si. It only works, though, if the class attribute is surrounded by double quotes. Commented Feb 17, 2015 at 22:40
  • Regex is not the way to go. What do you want to do once you have them? Commented Feb 17, 2015 at 22:41
  • So, are you trying to match the tag, as your question says, or just the other classes, as your example shows? Commented Feb 17, 2015 at 22:56
  • I'm using regex in IntelliJ search to match tags that have the class name "share": cl.ly/ZpTY Commented Feb 17, 2015 at 23:00

2 Answers

Use this regex:

class=([^=]*)([^(a-z|A-Z|0-9|\-|_)])share("|([^(a-z|A-Z|0-9|\-|_)]).*")

https://regex101.com/r/uU6dU0/4

Edit: This one is easier and will not match multiple tags:

class=("|"([^"]*)\s)share("|\s([^"]*)")

https://regex101.com/r/uU6dU0/5
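
To see how this differs from the attempt in the question, both patterns can be run against the five sample tags. The following is a minimal Python sketch rather than part of the original answer; the names SAMPLES, QUESTION_PATTERN, ANSWER_PATTERN and matching_indexes are made up for the illustration, and it assumes Python's re module treats these particular patterns the same way the regex101 and IntelliJ engines do.

```python
import re

# The five sample tags from the question.
SAMPLES = [
    '<div class="share"></div>',
    '<div class="sdfsd share sdfsdfsdf"></div>',
    '<div class="sdfsd dfdgdg" share></div>',
    '<a class="icon-share export-to-csv-button"/>',
    '<a class="fxac link" href="/share"/>',
]

# Pattern from the question: ".*" may run across quote characters.
QUESTION_PATTERN = re.compile(r'class=".*share.*"')

# Pattern from this answer: "share" must be delimited by a quote or whitespace.
ANSWER_PATTERN = re.compile(r'class=("|"([^"]*)\s)share("|\s([^"]*)")')


def matching_indexes(pattern):
    """Return the positions in SAMPLES where the pattern finds a match."""
    return [i for i, tag in enumerate(SAMPLES) if pattern.search(tag)]


print(matching_indexes(QUESTION_PATTERN))  # [0, 1, 3, 4]
print(matching_indexes(ANSWER_PATTERN))    # [0, 1]
```

The question's pattern also hits the icon-share and href="/share" tags because .* is free to cross the closing quote and to sit next to a hyphen; restricting the filler to [^"]* and requiring whitespace or a quote around the name removes both false positives.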

Edit 2: An improved version that also accepts single quotes around the attribute value:

class=(("|')|("|')([^"']*)\s)share(("|')|\s([^"']*)("|'))

6 Comments

You would want to make your asterisks non-greedy by adding a question mark after them. Otherwise you may match across multiple tags.
I do not see how your regex does what the OP expects.
@hwnd: It is exactly what he wanted to get.
I am wondering why this slightly simpler version would not also work: class=("|"[^"]*\s)line("|\s[^"]*"). It seems to work for me. All it does is remove the grouping parentheses around the potential additional class names before and after the desired one.
A further improvement on my previous suggestion, which only works for class attributes using double quotes. To handle either kind of quote, use the following: class=(("|')|("|')([^"']*)\s)share(("|')|\s([^"']*)("|')). Note that it would also find cases where the quotes are mismatched; however, that is not a problem per the question.
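
If mismatched quotes ever do matter, one possible tightening is to capture the opening quote and require the same character to close the value. This is only a sketch in Python, not something proposed in the thread; SHARE_CLASS and has_share_class are hypothetical names, and it assumes the engine supports backreferences inside lookaheads, as Python's re does.

```python
import re

# (["']) captures the opening quote; \1 requires the same character to close.
# (?:(?!\1).)* consumes any run of characters that are not that quote.
SHARE_CLASS = re.compile(
    r'''class=(["'])(?:(?:(?!\1).)*\s)?share(?:\s(?:(?!\1).)*)?\1'''
)


def has_share_class(tag):
    """True if the tag text has a class attribute listing "share" as a whole name."""
    return SHARE_CLASS.search(tag) is not None


print(has_share_class('<div class="sdfsd share sdfsdfsdf">'))  # True
print(has_share_class('''<div class="share'>'''))              # False
```

The optional groups on either side of share keep the requirement from the comment above: the name must touch either the quote itself or a whitespace character.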

As I said in my comment, you generally don't want to parse HTML with a regex; there are very few cases where you should. Typically you would use DOMDocument and XPath to query for elements (similar to CSS selectors). That also gives you access to inner text, nested elements, and other things that regular expressions handle poorly.
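
To illustrate the parser-based idea in a self-contained way, here is a rough sketch that swaps PHP's DOMDocument and XPath for Python's standard html.parser module. It is not the code this answer refers to; ClassFinder and find_tags_with_class are hypothetical names, and the class attribute is simply split on whitespace to get the individual class names.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Collects (tag, class attribute) for start tags that list a given class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name == "class" and value and self.wanted in value.split():
                self.hits.append((tag, value))


def find_tags_with_class(html, wanted):
    """Hypothetical helper: parse html and return the matching (tag, class) pairs."""
    finder = ClassFinder(wanted)
    finder.feed(html)
    finder.close()
    return finder.hits


sample = '<div class="sdfsd share sdfsdfsdf"></div><a class="icon-share"/>'
print(find_tags_with_class(sample, "share"))
```

Because the comparison is made on whole class names, hyphenated names such as icon-share and values in other attributes such as href never come into play.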

However, if you need to, this should work:

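<?php
// Sample markup from the question: only the first two tags carry the class "share".
$text = <<<HTML
<div class="share"></div>
<div class="sdfsd share sdfsdfsdf"></div>
<div class="sdfsd dfdgdg" share></div>
<a class="icon-share export-to-csv-button"/>
<a class="fxac link" href="/share"/>
HTML;

// (?<![\w-]) and (?![\w-]) are used instead of \b because \b also matches
// next to a hyphen, which would let "icon-share" through.
preg_match_all('/<[^>]*class="[^"]*(?<![\w-])share(?![\w-])[^"]*"[^>]*>/i', $text, $matches);
echo '<pre>'.htmlentities(print_r($matches, true)).'</pre>';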

Outputs:

Array
(
    [0] => Array
        (
            [0] => <div class="share">
            [1] => <div class="sdfsd share sdfsdfsdf">
        )

)

which you can see in action here: http://codepad.viper-7.com/UjBvT8
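
One thing worth checking with a pattern built on \bshare\b, such as the one suggested in the comments on the question: \b also matches between a hyphen and a letter, so a complete tag whose class is icon-share is reported as a hit. The Python sketch below compares that form with a lookaround variant; the names TAG, WORD_BOUNDARY and STRICT are made up for the illustration, and the lookaround version is an assumed tightening, not something taken from the thread.

```python
import re

TAG = '<a class="icon-share export-to-csv-button"/>'

# \b sits between "-" and "s", so the hyphenated class is reported as a hit.
WORD_BOUNDARY = re.compile(r'<[^>]*class="[^"]*\bshare\b[^"]*"[^>]*>', re.I)

# Lookarounds that also reject "-" next to "share" (an assumed tightening).
STRICT = re.compile(
    r'<[^>]*class="[^"]*(?<![\w-])share(?![\w-])[^"]*"[^>]*>', re.I
)

print(bool(WORD_BOUNDARY.search(TAG)))  # True
print(bool(STRICT.search(TAG)))         # False
print(bool(STRICT.search('<div class="sdfsd share sdfsdfsdf">')))  # True
```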

2 Comments

I'm using regex in IntelliJ search to match tags that have the class name "share" (cl.ly/ZpTY), and yours worked too.
Thank you, I used this as 0 !== preg_match('/<[^>]*class=["\'][^"]*\bfa\b[^"]*["\'][^>]*>/i', BREADCRUMBS_SPACER). It works!
