xPath/HTML: Select node based on related node

Question

<html>
    <body>
        <table>
            <tr>
                <th>HeaderA</th>
                <th>HeaderB</th>
                <th>HeaderC</th>
                <th>HeaderD</th>
            </tr>
            <tr>
                <td>ContentA</td>
                <td>ContentB</td>
                <td>ContentC</td>
                <td>ContentD</td>
            </tr>
         </table>
    </body>
</html>

I am looking for the most efficient way to select the content 'td' node based on the heading in the corresponding 'th' node.

My current XPath expression:

/html/body/table/tr/td[count(/html/body/table/tr/th[text() = 'HeaderA']/preceding-sibling::*)+1]

Some questions:

Can you use relative paths (../..) inside count()?
Are there other ways to find the current node's position for td[?], or is count(preceding-sibling::*)+1 the most efficient?

Harmen · Accepted Answer · 2009-12-28 13:25:18Z

3

Yes, it is possible to use relative paths inside count().
I have never heard of another way to find the node number.

Here is the expression with a relative path inside count():

/html/body/table/tr/td[count(../../tr/th[text()='HeaderC']/preceding-sibling::*)+1]

It is not much shorter, though. In my opinion it won't get shorter than this:

//td[count(../..//th[text()='HeaderC']/preceding-sibling::*)+1]
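
To make the position arithmetic concrete, here is a minimal Python sketch of what count(.../preceding-sibling::*)+1 computes. It is not the XPath expression itself. Python's standard xml.etree.ElementTree supports only a subset of XPath (no count() and no preceding-sibling axis), so the sketch finds the header's 1-based position in plain Python and then uses the positional predicate td[N]. The function name cells_under is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# The table from the question (well-formed, so an XML parser accepts it)
HTML = """<html>
    <body>
        <table>
            <tr>
                <th>HeaderA</th>
                <th>HeaderB</th>
                <th>HeaderC</th>
                <th>HeaderD</th>
            </tr>
            <tr>
                <td>ContentA</td>
                <td>ContentB</td>
                <td>ContentC</td>
                <td>ContentD</td>
            </tr>
        </table>
    </body>
</html>"""


def cells_under(root, header):
    """Return the text of every td in the column whose th text equals header."""
    headers = root.findall("./body/table/tr/th")
    # Same number as count(th[text()=header]/preceding-sibling::*) + 1
    position = next(
        (i for i, th in enumerate(headers, start=1) if th.text == header),
        None,
    )
    if position is None:
        return []
    # td[N] picks the N-th td of each tr (1-based, as in XPath)
    return [td.text for td in root.findall(f"./body/table/tr/td[{position}]")]


root = ET.fromstring(HTML)
print(cells_under(root, "HeaderC"))
```

With a full XPath 1.0 engine the single expression above does the same work in one step. The sketch only shows that the predicate boils down to "position of the matching th, then the td at that position".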

edited Dec 28, 2009 at 13:25

answered Dec 28, 2009 at 10:35


1 Comment

chameleon95 Over a year ago

Excellent. I am not so much looking for the shortest way to write the expression as for the most efficient one, to minimise the internal lookups.

Mads Hansen · Accepted Answer · 2009-12-28 19:22:00Z

2

Harmen's answer is exactly what you need for a pure XPath solution.

If you are really concerned with performance, then you could define an XSLT key:

<xsl:key name="columns" match="/html/body/table/tr/th" use="text()"/>

and then use the key in your predicate filter:

/html/body/table/tr/td[count(key('columns', 'HeaderC')/preceding-sibling::th)+1]

However, I suspect you probably won't be able to see a measurable difference in performance unless you need to filter on columns a lot (e.g. for-each loops with checks for every row for a really large document).
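
As a rough analogue of what the key buys you (this is not XSLT, and the table and names below are hypothetical), the following Python sketch builds the header lookup once and reuses it for every row, instead of re-scanning the th elements for each td:

```python
import xml.etree.ElementTree as ET

# Hypothetical table with several data rows, so repeated lookups matter
TABLE = """<table>
  <tr><th>HeaderA</th><th>HeaderB</th><th>HeaderC</th></tr>
  <tr><td>a1</td><td>b1</td><td>c1</td></tr>
  <tr><td>a2</td><td>b2</td><td>c2</td></tr>
</table>"""

table = ET.fromstring(TABLE)

# Built once, like the key: header text -> 0-based column index
columns = {th.text: i for i, th in enumerate(table.findall("./tr/th"))}


def column_values(header):
    """Collect one column from every data row using the prebuilt index."""
    i = columns[header]
    rows = (tr.findall("td") for tr in table.findall("tr"))
    # The header row has no td children, so the length check skips it
    return [cells[i].text for cells in rows if len(cells) > i]


print(column_values("HeaderC"))
```

Whether an XSLT processor gains as much from a key depends on the processor and the document size, so this only illustrates the idea of paying for the lookup once.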

answered Dec 28, 2009 at 19:22


Eran Medan · Accepted Answer · 2009-12-28 09:32:42Z

1

I would leave XPath aside. Since I assume the document has already been parsed into a DOM, I'd use a Map data structure and match the nodes manually, either client side or server side (JavaScript / Java).

It seems to me XPath is being stretched beyond its limits here.
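
A minimal sketch of that Map idea, written in Python rather than Java or JavaScript and assuming the table is well-formed enough for an XML parser. The function name rows_as_maps and the sample string are made up for illustration:

```python
import xml.etree.ElementTree as ET


def rows_as_maps(table_xml):
    """Turn each data row into a dict mapping header text to cell text."""
    table = ET.fromstring(table_xml)
    headers = [th.text for th in table.findall("./tr/th")]
    result = []
    for tr in table.findall("tr"):
        cells = [td.text for td in tr.findall("td")]
        if cells:  # skip the header row, which has no td children
            result.append(dict(zip(headers, cells)))
    return result


# Compact copy of the table from the question
SAMPLE = (
    "<table>"
    "<tr><th>HeaderA</th><th>HeaderB</th><th>HeaderC</th><th>HeaderD</th></tr>"
    "<tr><td>ContentA</td><td>ContentB</td><td>ContentC</td><td>ContentD</td></tr>"
    "</table>"
)

rows = rows_as_maps(SAMPLE)
print(rows[0]["HeaderC"])
```

After the one pass that builds the maps, every lookup by header name is a plain dictionary access with no XPath evaluation.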

answered Dec 28, 2009 at 9:32


2 Comments

Eran Medan Over a year ago

I still think XPath is not the best solution here, voting down won't change my mind... or the facts...

chameleon95 Over a year ago

I understand and appreciate your comment. What I am looking for is the most efficient method using XPath. I can then run real-world benchmarks in my environment using all available options (XPath, Java, JavaScript, etc.) to settle on a final solution. Thanks for your comment.

mst · Accepted Answer · 2009-12-28 16:30:05Z

0

Perhaps you want position() and XPath axes?

answered Dec 28, 2009 at 16:30
