PHP Regex match all HTML tags

Question

I am reading the contents of an HTML page to extract some details. I'm searching for every occurrence of a string that appears within a tag, and I want to read only that string.

Example:

<a href="http://www.example.com/search?la=en&q=javascript">javascript</a>
<a href="http://www.example.com/search?la=en&q=PHP">PHP</a>

I want to read the TEXT of every such tag, selected on the basis of its href attribute, which must contain this: http://www.example.com/search?la=en&q=

Any idea?

karim79 · Accepted Answer · 2009-08-17 08:43:07Z

4

SimpleHtmlDom example (isn't it pretty?):

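// Requires the third-party Simple HTML DOM library
include 'simple_html_dom.php';

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all links
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
    echo $element->plaintext . '<br>'; // the link text: this is what you want
}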


David Z · Accepted Answer · 2009-08-17 08:44:59Z

0

If the HTML page you're reading is very regular (for instance, machine-generated according to predictable patterns), something like this would work:

// preg_match_all finds every occurrence; $matches[1] then holds each link text
preg_match_all('|<a\s+href="http://www\.example\.com/search\?la=en&q=(\w+)"\s*>\1</a>|', $page, $matches);

But if it gets any more complicated than that, regular expressions probably won't be enough for the job - you'd be better off using a full HTML parser to extract the links and check them one-by-one to find the text you want.
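
For what it's worth, here is a minimal sketch of that parser-based approach using PHP's built-in DOM extension rather than a third-party library. The function name linkTextsByHrefPrefix and the sample $page (including the extra "About" link) are made up for illustration; only the search URL prefix comes from the question. It assumes the DOM extension is enabled, which it normally is.

```php
<?php
// Hypothetical helper (not from the original answers): returns the text of
// every <a> element whose href starts with $prefix.
function linkTextsByHrefPrefix(string $html, string $prefix): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        // strncmp also works on PHP versions that lack str_starts_with().
        if (strncmp($href, $prefix, strlen($prefix)) === 0) {
            $texts[] = trim($link->textContent);
        }
    }
    return $texts;
}

// Sample input: the two links from the question plus one unrelated link.
$page = '<a href="http://www.example.com/search?la=en&q=javascript">javascript</a>'
      . '<a href="http://www.example.com/search?la=en&q=PHP">PHP</a>'
      . '<a href="http://www.example.com/about">About</a>';

$texts = linkTextsByHrefPrefix($page, 'http://www.example.com/search?la=en&q=');
print_r($texts);
```

Unlike the regex, this does not care about attribute order, extra attributes, or whitespace inside the tag, and the link text does not have to equal the q parameter.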
