
I'm stumped - I'd like to solve this problem using XPath if possible, as it will simplify some supporting Java code. I have a document structured as follows:

<table>
   <row id="a">
      <data-point value="0">5</data-point>
      <data-point value="15">4</data-point>
      <data-point value="30">2</data-point>
      <data-point value="45">0</data-point>
   </row>
   <row id="b">
      <data-point value="0">8</data-point>
      <data-point value="10">6</data-point>
      <data-point value="20">4</data-point>
      <data-point value="30">0</data-point>
   </row>
</table>

The selection is based on an input value. For example, if the inputValue = 17, I need to select the data-points that bracket that value - in this case, select data-points where value="15" and "30". Similarly, if the value is 32, select data-points "30" and "45".

In the event that the number exactly matches one of the data points, it is OK to return either just the matched data-point, or that data-point and the next or previous one (it doesn't really matter, so long as the matching data-point is returned). So, if the inputValue is 15, it is OK to select data-points with a value of "15" and "30", "0" and "15", or just "15".

The XPath must also include the selector for the row element.

I've tried lots of combinations using following-sibling, last(), etc., but I can't seem to home in on a suitable XPath. Any gurus out there who can come up with a good XPath?

The following attempt gets close but still fails in several cases. It assumes that the input selection is for row id="a" and an inputValue of 20:

/table/row[@id='a']/data-point[(20 >= number(@value) or 20 <= number(@value)) and following-sibling::*[1]/@value > 20]
  • I don't quite understand the row requirement. You'll get an input value and a row? If not, how is the row selected? For input = 17, should both 15-30 (from row a) and 10-20 (from row b) be returned? Commented Apr 22, 2014 at 21:16
  • Is it possible to have input value of, say, 120? Commented Apr 22, 2014 at 21:26
  • Your example input seems to be sorted on the value attribute. Is that true for the real data? Commented Apr 22, 2014 at 21:27
  • Slanec - no, I put the extra "row" in there because that will be selected in addition. So, the XPath would have to start with that /table/row[@id='a']/data-point[???]. I did come up with a solution that seems to work, but it feels a bit "brute force" so maybe someone can propose a simpler solution. Commented Apr 22, 2014 at 21:55
  • Please do not post your answer by editing your question; it blurs the difference between question and answer. It is perfectly fine to answer your own question. However, your query is incorrect for the more general case. Try 2 or 40 or some other values instead of 20. Commented Apr 22, 2014 at 22:03

4 Answers

2

You mention in a comment that you have XPath 2.0, so you can use the min and max functions. Assuming $row is the target row ID and $val is the target value:

/table/row[@id = $row]/data-point[@value = (
   min(../data-point/@value[. ge $val]), max(../data-point/@value[. le $val])
)]

This will find the lowest upper bound and greatest lower bound @values (which may be the same one if there happens to be an exact match) and extract their corresponding data-point elements.

This will work even if the values are not in ascending order in the input.


0

You did not specify the XPath version, so I assume it is 1.0. It will be tricky to solve with pure 1.0 capabilities since there is no concept of sorting a node set (but there are certainly N XPath gurus out there that can bend the language to fix this). However, I see two options:

1) Write a custom function in a suitable language, e.g. Java, that takes a node set and returns another node set that brackets the input value, or null.

2) If you can alter the input data, apply an XSLT that inserts a bogus <data-point> entry with the target value, sorts the <row> children and then selects the preceding and following siblings.
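Option 1 could look roughly like the sketch below. It is only an illustration under a few assumptions: the class and method names (BracketInJava, bracket, dataPoints, values) are made up for this example, the @value attributes are assumed to be numeric, and an empty list is returned instead of null when the row has no data-points. It does not require the data-points to be sorted.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BracketInJava {
    // Sample document from the question (single quotes keep the Java string readable).
    static final String XML =
        "<table>"
      + "<row id='a'><data-point value='0'>5</data-point><data-point value='15'>4</data-point>"
      + "<data-point value='30'>2</data-point><data-point value='45'>0</data-point></row>"
      + "<row id='b'><data-point value='0'>8</data-point><data-point value='10'>6</data-point>"
      + "<data-point value='20'>4</data-point><data-point value='30'>0</data-point></row>"
      + "</table>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Returns the data-point elements of the row with the given id.
    static NodeList dataPoints(Document doc, String rowId) {
        NodeList rows = doc.getElementsByTagName("row");
        for (int i = 0; i < rows.getLength(); i++) {
            Element row = (Element) rows.item(i);
            if (rowId.equals(row.getAttribute("id"))) {
                return row.getElementsByTagName("data-point");
            }
        }
        throw new IllegalArgumentException("no row with id " + rowId);
    }

    static double value(Element dataPoint) {
        return Double.parseDouble(dataPoint.getAttribute("value"));
    }

    // Picks the data-point with the greatest value <= input and the one with
    // the smallest value >= input. On an exact match both are the same
    // element, which is then returned only once.
    static List<Element> bracket(NodeList dataPoints, double input) {
        Element lower = null;
        Element upper = null;
        for (int i = 0; i < dataPoints.getLength(); i++) {
            Element e = (Element) dataPoints.item(i);
            double v = value(e);
            if (v <= input && (lower == null || v > value(lower))) lower = e;
            if (v >= input && (upper == null || v < value(upper))) upper = e;
        }
        List<Element> result = new ArrayList<>();
        if (lower != null) result.add(lower);
        if (upper != null && upper != lower) result.add(upper);
        return result;
    }

    // Convenience wrapper for the sample document: the @value of each hit.
    static List<String> values(String rowId, double input) {
        List<String> out = new ArrayList<>();
        for (Element e : bracket(dataPoints(parse(XML), rowId), input)) {
            out.add(e.getAttribute("value"));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(values("a", 17));
    }
}
```

If the input lies outside the range of the row (e.g. 120), only the nearest data-point on the one side that exists is returned.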


0

Assuming $row being your row item (a or b in this case) and $input being your input value (17 in the example given), the following XPath will fetch the required nodes:

(/table/row[@id = $row]/data-point[@value < $input])[last()] | (/table/row[@id = $row]/data-point[@value > $input])[1] | /table/row[@id = $row]/data-point[@value = $input]

It selects the last data-point whose value is lower than the input and the first data-point whose value is higher. It also adds the node itself if the value is equal. Because [last()] and [1] are applied in document order, this relies on the data-points being sorted by @value, as they are in the sample.
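Since the question mentions supporting Java code, here is a minimal sketch of how the expression above might be evaluated with the standard javax.xml.xpath API (which implements XPath 1.0). The class name BracketByUnion and the select helper are made up for this example; $row and $input are bound through an XPathVariableResolver, and $input is passed as a Double so the comparisons are numeric.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BracketByUnion {
    // Sample document from the question.
    static final String XML =
        "<table>"
      + "<row id='a'><data-point value='0'>5</data-point><data-point value='15'>4</data-point>"
      + "<data-point value='30'>2</data-point><data-point value='45'>0</data-point></row>"
      + "<row id='b'><data-point value='0'>8</data-point><data-point value='10'>6</data-point>"
      + "<data-point value='20'>4</data-point><data-point value='30'>0</data-point></row>"
      + "</table>";

    // The union expression from this answer, unchanged.
    static final String EXPR =
        "(/table/row[@id = $row]/data-point[@value < $input])[last()]"
      + " | (/table/row[@id = $row]/data-point[@value > $input])[1]"
      + " | /table/row[@id = $row]/data-point[@value = $input]";

    // Returns the @value of every selected data-point.
    static List<String> select(String row, double input) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $row (string) and $input (number) for the expression.
            Map<String, Object> vars = Map.of("row", row, "input", Double.valueOf(input));
            xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));

            NodeList nodes = (NodeList) xpath.evaluate(
                    EXPR, new InputSource(new StringReader(XML)), XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(((Element) nodes.item(i)).getAttribute("value"));
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("a", 17));
    }
}
```

Note that on an exact match (e.g. 15 in row a) the union contains the matching data-point plus both of its neighbours.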

1 Comment

Thank you. This seems to work. I was originally trying to avoid the union operator, but this does return the right values.
0

Assuming that your search value is stored in $para, the following XPath expression, evaluated relative to the selected row, should work in XPath 1.0:

data-point[  ( (preceding-sibling::data-point[1]/@value < $para
                or
                not(preceding-sibling::data-point[1])
               )
              and @value >= $para
             )
           or
             ( (following-sibling::data-point[1]/@value > $para
                or
                not(following-sibling::data-point[1])
               )
               and @value <= $para
             )
          ]
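As a rough sketch of how a relative expression like this could be run from Java: first select the row as the context node, then evaluate the predicate against it. The names BracketBySiblings and select are invented for this example, and $row is an extra variable introduced here to pick the row. The expression string in the sketch uses inclusive comparisons (>= and <=) on the node's own @value, so that an input exactly equal to a data-point still returns that data-point.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BracketBySiblings {
    // Sample document from the question.
    static final String XML =
        "<table>"
      + "<row id='a'><data-point value='0'>5</data-point><data-point value='15'>4</data-point>"
      + "<data-point value='30'>2</data-point><data-point value='45'>0</data-point></row>"
      + "<row id='b'><data-point value='0'>8</data-point><data-point value='10'>6</data-point>"
      + "<data-point value='20'>4</data-point><data-point value='30'>0</data-point></row>"
      + "</table>";

    // Relative to a <row>: a data-point qualifies if it is at or above $para
    // and its previous sibling is below $para (or missing), or if it is at or
    // below $para and its next sibling is above $para (or missing).
    static final String EXPR =
        "data-point[((preceding-sibling::data-point[1]/@value < $para"
      + " or not(preceding-sibling::data-point[1])) and @value >= $para)"
      + " or ((following-sibling::data-point[1]/@value > $para"
      + " or not(following-sibling::data-point[1])) and @value <= $para)]";

    // Returns the @value of every selected data-point in the given row.
    static List<String> select(String rowId, double para) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<String, Object> vars = Map.of("row", rowId, "para", Double.valueOf(para));
            xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));

            // Step 1: pick the row that serves as the context node.
            Node row = (Node) xpath.evaluate("/table/row[@id = $row]",
                    new InputSource(new StringReader(XML)), XPathConstants.NODE);
            if (row == null) {
                throw new IllegalArgumentException("no row with id " + rowId);
            }

            // Step 2: evaluate the relative expression against that row.
            NodeList nodes = (NodeList) xpath.evaluate(EXPR, row, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(((Element) nodes.item(i)).getAttribute("value"));
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("a", 17));
    }
}
```

Like the sibling axes it is built on, this assumes the data-points appear in ascending @value order within the row.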
