16

Possible Duplicates:
crawling a html page using php?
Best methods to parse HTML

I have a string variable in my PHP script that contains an HTML page. How can I extract DOM elements from this string?

For example, from the string '<div class="someclass">text</div>', I want to get the value 'text'. How can I do this?

1

2 Answers

32

You need to use the DOMDocument class and, more specifically, its loadHTML method to load your HTML string into a DOM object.

For example:

$string = <<<HTML
<p>test</p>
<div class="someclass">text</div>
<p>another</p>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($string);

After that, you'll be able to manipulate the DOM, using for instance the `DOMXPath` class to run XPath queries on it.

In your case, you could use something based on this portion of code:

$xpath = new DOMXPath($dom);
$result = $xpath->query('//div[@class="someclass"]');
if ($result->length > 0) {
    var_dump($result->item(0)->nodeValue);
}

Which, here, would get you the following output (as formatted by Xdebug; stock PHP prints `string(4) "text"` instead):

string 'text' (length=4)
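
One caveat with the query above: `@class="someclass"` only matches when the attribute is exactly that string, so `class="someclass other"` would be missed. Below is a minimal sketch of a workaround, assuming you want token-style matching. The helper name `textByClass` is made up for illustration, and it assumes `$class` contains no quote characters, since it is interpolated into the XPath expression.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the text of
// every element whose class attribute contains $class as a whole token.
// Assumption: $class holds a plain class name without quotes.
function textByClass(string $html, string $class): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is often imperfect.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Pad the attribute with spaces so "someclass" does not match "someclass2".
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = $node->nodeValue;
    }
    return $texts;
}

$texts = textByClass('<p>test</p><div class="someclass other">text</div>', 'someclass');
```

Here `$texts` should hold a single entry, the text of the matching `div`.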

As an alternative, instead of `DOMDocument`, you could also use `simplexml_load_string` and `SimpleXMLElement::xpath` -- but for complex manipulations, I generally prefer using `DOMDocument`.
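
A rough sketch of that SimpleXML route could look like this. Note that `simplexml_load_string` expects well-formed XML with a single root element, so the fragment is wrapped in a made-up `<root>` element here. Arbitrary real-world HTML will often fail to parse this way.

```php
<?php
// SimpleXML needs well-formed XML with one root element, so the sample
// fragment is wrapped in an illustrative <root> element (an assumption,
// not something the original answer shows).
$string = '<root><p>test</p><div class="someclass">text</div><p>another</p></root>';

$xml = simplexml_load_string($string);
if ($xml === false) {
    throw new RuntimeException('The fragment is not well-formed XML.');
}

// xpath() returns an array of SimpleXMLElement objects.
$matches = $xml->xpath('//div[@class="someclass"]');

// Casting a SimpleXMLElement to string yields its text content.
$text = $matches ? (string) $matches[0] : null;
```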

2 Comments

@Gordon done (yeah, that's kind of a multi-time-duplicate)
how fast is DOMDocument?
5

Have a look at DOMDocument and DOMXPath.

$DOM = new DOMDocument();
$DOM->loadHTML($str);

$xpath = new DOMXPath($DOM);
$someclass_elements = $xpath->query('//*[@class = "someclass"]');
// ...
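
To flesh out the elided part, here is one way the result could be consumed, as a sketch only. The sample value of `$str` is invented for illustration. Note the `*` node test after `//`, which XPath requires in order to match elements of any name.

```php
<?php
// Sample input, made up for this sketch.
$str = '<div class="someclass">first</div><span class="someclass">second</span>';

$DOM = new DOMDocument();
$DOM->loadHTML($str);

$xpath = new DOMXPath($DOM);
// "*" matches any element name; the predicate filters on the class attribute.
$someclass_elements = $xpath->query('//*[@class = "someclass"]');

// DOMNodeList is traversable, so foreach visits each matching node in
// document order.
$found = [];
foreach ($someclass_elements as $element) {
    $found[] = $element->nodeName . ': ' . $element->nodeValue;
}
```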
