
Is it possible to perform a full-text search in an XML file with XPath and select the matches using Java? I want to select all nodes (no matter whether they are comments, elements, or attributes) that contain a particular word. For example, searching for "bob", I would like to get tag1, bob1 and tag3 as a result:

<tag1 name="bob">
    <tag2/>
    <bob1/>
    <tag3 bob="true"/>
</tag1>

If possible, I would prefer a solution that does not use an external package. I would be very happy if somebody could help me; I haven't been able to find anything like this so far. Thank you very much! Kind regards

EDIT: I am looking for a way to find every occurrence of the word "Bob", no matter what role it plays in the document!

  • So, you also want to extract all subnodes under the matching node? It smells like a bad pattern. I think that by studying your XML further you could arrive at a more specific solution. Commented Jun 7, 2013 at 22:21
  • I hoped that maybe there is a general solution so that I don't have to make any assumptions... Thank you for your comment! Commented Jun 7, 2013 at 22:29
  • General solutions are never effective. It really pays off to study your problem domain; ignoring some basic fact will occasionally lead to mistakes later. Commented Jun 7, 2013 at 22:33
  • Something like //*[@name='bob']? You might want to keep an XPath tutorial on hand ;) Commented Jun 7, 2013 at 22:57

2 Answers

1

You can write a simple SAX parser to process your XML. The Oracle SAX tutorial is a good place to start.

You can go through all nodes, mark (and save) those nodes that are interesting for you, and return the resulting nodes as a new XML document, or String, or whatever form you want.
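As a rough illustration of that idea, here is a minimal sketch using only the JDK's built-in SAX parser. The class and method names (SaxWordSearch, findElements) are made up for this example. It only checks element names, attribute names, and attribute values. Text content could be handled by overriding characters(), and comments would need an org.xml.sax.ext.LexicalHandler, neither of which is shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; the names are illustrative only.
class SaxWordSearch {

    // Returns the names of all elements whose own name, attribute names,
    // or attribute values contain the given word (case-sensitive).
    static List<String> findElements(String xml, String word) throws Exception {
        List<String> hits = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                boolean match = qName.contains(word);
                // Check every attribute name and value until one matches.
                for (int i = 0; i < attrs.getLength() && !match; i++) {
                    match = attrs.getQName(i).contains(word)
                         || attrs.getValue(i).contains(word);
                }
                if (match) {
                    hits.add(qName);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return hits;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<tag1 name=\"bob\"><tag2/><bob1/><tag3 bob=\"true\"/></tag1>";
        System.out.println(findElements(xml, "bob")); // prints [tag1, bob1, tag3]
    }
}
```

This reads the document once from start to finish, so it does iterate over everything, but it never builds the whole tree in memory.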


1 Comment

I didn't want to iterate through the whole document and had hoped to do it with XPath, but apparently that's not possible. Thank you for your answer.
1

XPath defines the contains() function (see http://www.w3.org/TR/xpath/#function-contains) which you could use in an expression like:

//*[contains(., "wordtosearchfor")]

which would find all elements whose string value (the concatenated text of the element and its descendants) contains the word.

To search attribute values for the word, you could use:

//*[@*[contains(., "wordtosearchfor")]]

which would find all elements having an attribute whose value contains the word.
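For what it's worth, both expressions can be evaluated from Java with the javax.xml.xpath package that ships with the JDK, so no external library is needed. The following is only a sketch: the class and method names (XPathContainsDemo, select) are invented for illustration, and the XML is passed in as a string for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class XPathContainsDemo {

    // Evaluates an XPath 1.0 expression against the XML string and
    // returns the names of the nodes it selects.
    static List<String> select(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<tag1 name=\"bob\"><tag2/><bob1/><tag3 bob=\"true\"/></tag1>";
        // Elements whose text content contains the word: none in this sample.
        System.out.println(select(xml, "//*[contains(., 'bob')]"));
        // Elements with an attribute value containing the word: only tag1.
        System.out.println(select(xml, "//*[@*[contains(., 'bob')]]"));
    }
}
```

Note that contains() is case-sensitive, and that the first expression also matches every ancestor of an element whose text contains the word, because an element's string value includes the text of its descendants.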

However, the problem as you stated it is not well-posed. The XML sample you gave is not well-formed, and it is not true that bob1 or tag3 contains "bob": bob1 has an empty string value, and in tag3 "bob" is an attribute name, not content. So it's hard to tell exactly what you are trying to do.
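If the intent is to match the word in names as well as in content, one possible reading is to also test name() for elements and attributes and to add comment() and text() nodes to the search. The sketch below is an assumption about what is wanted, not a definitive answer. The names XPathNameSearch, buildExpression and search are hypothetical, and the search word must not contain a single quote.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class XPathNameSearch {

    // Builds an XPath 1.0 expression that selects:
    //  - elements whose name, attribute names, or attribute values contain the word
    //  - comments and text nodes containing the word
    // Assumption: the word contains no single quote.
    static String buildExpression(String word) {
        String w = "'" + word + "'";
        return "//*[contains(name(), " + w + ")"
             + " or @*[contains(name(), " + w + ") or contains(., " + w + ")]]"
             + " | //comment()[contains(., " + w + ")]"
             + " | //text()[contains(., " + w + ")]";
    }

    // Returns the node names of everything the expression selects
    // (comments appear as "#comment", text nodes as "#text").
    static List<String> search(String xml, String word) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                buildExpression(word),
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<tag1 name=\"bob\"><tag2/><bob1/><tag3 bob=\"true\"/></tag1>";
        System.out.println(search(xml, "bob"));
    }
}
```

For the sample in the question this should select tag1, bob1 and tag3. Matching is still case-sensitive; in XPath 1.0 the usual workaround is to fold case with translate() before calling contains().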

1 Comment

Hello, thank you for your help. I know that it is not true that "bob1" and "tag3" contain "bob", but I'm looking for a way to do exactly that: I want something like a full-text search within a string (as if I had parsed the whole XML into a string), with the possibility of selecting the element that holds the string I'm searching for. PS: sorry that the XML was not well-formed (I didn't know how to use the editor at Stack Overflow...)
