I want to extract all links matched by a complex selector such as .timestream .ui-ContentBottom h1 a. I know how to do it for a simple, single selector such as a:

 $dom = new DOMDocument;
 $dom->loadHTML($html);
 $xpath = new DOMXPath($dom);
 $nodes = $xpath->query('//a/@href');
 foreach($nodes as $href) {
   echo $href->nodeValue;
 }

I am new to XPath, so any help would be appreciated.

2 Answers

5

The following XPath expression should work for you:

//*[contains(@class, "timestream")]//*[contains(@class, "ui-ContentBottom")]//h1//a/@href

The problem here is that XPath does not have a native class selector. In other words, contains(@class, "smth") is not exactly the same as .smth: it is a substring test, so it would also match a class such as smth-else. In practice, though, it usually works for matching a single class in a multi-valued class attribute value.
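
For a stricter match, a commonly used XPath 1.0 idiom pads the normalized class attribute with spaces and tests for the space-padded class name, so only whole class tokens match. Below is a minimal sketch of building such an expression from a selector. It assumes the selector consists only of bare tag names and single .class steps joined by descendant combinators. The helper names classTest and selectorToXPath are hypothetical and not part of any library. The result is a plain XPath string, so it should also be usable with DOMXPath::query in PHP.

```javascript
// Hypothetical helper: XPath 1.0 predicate that matches one whole class token.
function classTest(name) {
  return 'contains(concat(" ", normalize-space(@class), " "), " ' + name + ' ")';
}

// Hypothetical helper: converts a descendant-only selector such as
// ".timestream .ui-ContentBottom h1 a" into an XPath expression.
// Assumption: every step is either a bare tag name or a single ".class".
function selectorToXPath(selector, attribute) {
  const steps = selector.trim().split(/\s+/).map(function (step) {
    if (step.startsWith(".")) {
      return "//*[" + classTest(step.slice(1)) + "]";
    }
    return "//" + step;
  });
  const path = steps.join("");
  // Optionally select an attribute of the final step, e.g. "href".
  return attribute ? path + "/@" + attribute : path;
}

console.log(selectorToXPath(".timestream .ui-ContentBottom h1 a", "href"));
```

Selectors with child combinators, ids, or compound steps such as a.external are outside what this sketch handles.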

2 Comments

Could you please provide a link where I can read more about writing complex XPath queries? Thanks :)
@SanJeetSingh I cannot recall anything specific; just search for XPath tutorials and practice, practice, practice! Glad to help.
0

XPath lets you search a document such as an XML or HTML file.

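The XPath that the browser generates for an element does not use classes, but it does use an id when the element has one. In XPath, the @ symbol refers to an attribute, so @id tests the id attribute.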

The XPath can be obtained in a few ways. One way in Chrome is to inspect an element in the developer tools, right-click its node and choose Copy XPath.

When I do this on the textarea I am writing this answer in, I get the following XPath:

//*[@id="wmd-input"]

Do not let that confuse you, though. Here is a simpler example:

/html/body

This is the XPath of the body element.

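I wrote a small function that turns an XPath expression into an array of the matching elements.

// Evaluate an XPath expression against the current page
// and return every matching node as an array.
function xpath(path) {
    var iterator = document.evaluate(path, document, null, XPathResult.ANY_TYPE, null);
    var result = [];
    var found;
    // iterateNext() returns null once all matches have been consumed.
    while ((found = iterator.iterateNext())) {
        result.push(found);
    }
    return result;
}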

Running this function against the textarea produces the following:

xpath('//*[@id="wmd-input"]');
[<textarea id="wmd-input" class="wmd-input processed" name="post-text" cols="92" rows="15" tabindex="101" data-min-length></textarea>]

Now that you have the element, you can modify it, for example:

var test = xpath('/html/body');
test[0].innerHTML='bye';
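
The same idea can be applied to the links in the question. The sketch below collects the string values of every node an expression matches, which for an expression ending in /@href means the link targets. It assumes a DOM Document is passed in (document in a browser). The function name collectValues and its doc parameter are hypothetical.

```javascript
// Hypothetical helper: returns the string value of every node matched by `path`.
// `doc` is assumed to be a DOM Document, e.g. `document` in a browser.
function collectValues(doc, path) {
  // 5 is the numeric value of XPathResult.ORDERED_NODE_ITERATOR_TYPE,
  // which yields the matches in document order.
  const ORDERED_NODE_ITERATOR_TYPE = 5;
  const iterator = doc.evaluate(path, doc, null, ORDERED_NODE_ITERATOR_TYPE, null);
  const values = [];
  let node;
  // iterateNext() returns null once all matches have been consumed.
  while ((node = iterator.iterateNext())) {
    values.push(node.nodeValue); // for an attribute node this is the attribute's value
  }
  return values;
}

// Browser usage (not executed here):
// collectValues(document, '//*[contains(@class, "timestream")]//*[contains(@class, "ui-ContentBottom")]//h1//a/@href');
```

Passing the document as an argument, instead of relying on the global, keeps the helper usable with any Document object, such as one parsed from a string.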
