I have the following problem: one XML document is embedded inside another. See this example:

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE ..."><AIDEM><OSERVER>xxx</OSERVER>
<OBJECT SystemID = "111" ObjectID = "00000004009e8bc1" Docu = "some value" DirectoryID = "111" InternalType = "1" TemplateID = "1234" TemplateType = "6" TemplateName = "String">
<OHEADER><OFIELD FieldID = "1" FieldType = "3" FieldName = "string" IsEmpty = "no"><ODATETIME>11111</ODATETIME></OFIELD>
<OFIELD FieldID = "123" FieldType = "3" FieldName = "string" IsEmpty = "no"><ODATETIME>11111</ODATETIME></OFIELD>
<OFIELD FieldID = "124" FieldType = "1" FieldName = "string" IsEmpty =  "no"><TEST_STRING>&lt;mos&gt;
&lt;ID&gt;some.some.some&lt;/ID&gt;
&lt;sID&gt;some.some&lt;/sID&gt;
&lt;mID&gt;53570320&lt;/mID&gt;
&lt;mObj&gt;
&lt;oID&gt;cl178317481&lt;/oID&gt;
....
</TEST_STRING></OFIELD>

In this example, the inner XML is in the OFIELD with FieldID 124. That is the case 99% of the time, but it could also be in another field. I would like to extract the inner XML from the outer XML, save it as a new XML file, and save the original without the inner XML. In the end I want two new XML files: one without the inner XML and one containing only the inner XML. Which libraries do I need to solve this problem? I am grateful for any tips.

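import java.io.File;
import java.io.IOException;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// The class declaration was cut off in the original post; "XmlToString" is a placeholder name.
public class XmlToString {

    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException, TransformerException {
        // Parse the outer XML file into a DOM tree.
        File file = new File("example.xml");
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(file);

        // Serialize the DOM back into a single string.
        DOMSource domSource = new DOMSource(document);
        StringWriter writer = new StringWriter();
        StreamResult result = new StreamResult(writer);
        TransformerFactory tf = TransformerFactory.newInstance();
        Transformer transformer = tf.newTransformer();
        transformer.transform(domSource, result);
        String mystring = writer.toString();
        System.out.println("String XML: \n" + mystring);
    }
}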

With my example I get everything into one string, but I have no idea how to process the inner XML correctly. To be honest, the formatting confuses me a bit. What is a good way to do this and put the inner XML into a new XML document? Thanks in advance.

  • Did you try parsing the XML using JAXB or any other framework? Commented Dec 10, 2019 at 10:45
  • Thanks for your comment @Smile, I edited my question. Commented Dec 16, 2019 at 21:36
  • Minor nit: you don't have "XML in XML", you have an encoded string as the text of one of your nodes. So take that node's contents, un-escape it, and process the resulting valid XML string. Commented Dec 16, 2019 at 21:39
  • Thanks @DaveNewton, your answer helped me a lot. Problem solved with StringEscapeUtils. Commented Dec 16, 2019 at 23:00

1 Answer

You can use any parser to parse your outer document into a DOM (as you do already).

Then traverse the DOM, read each node's value, and try to parse it as XML. If parsing succeeds, save the parsed document (the inner XML), remove that node from the DOM, and save what remains (the outer XML without the nested XML).

Finally, serialize each document using the Transformer, as you already do.

To make this more efficient, you could try parsing only text nodes, but I'm not sure whether the 99% case covers only text nodes or attributes as well.
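The steps above could be sketched roughly as follows, using only the JDK's built-in DOM and Transformer APIs. This is a minimal sketch, not a definitive implementation: the class name NestedXmlSplitter and its methods are hypothetical, it only inspects the text of leaf elements (attributes are not checked), and it clears the text of the matching element rather than removing the element itself. Note that the DOM parser already decodes entities such as &lt; and &gt;, so getTextContent() returns the nested XML in un-escaped form and no separate un-escaping step is needed here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; class and method names are illustrative, not from the question.
final class NestedXmlSplitter {

    /** Returns [outer XML without nested XML, inner XML 1, inner XML 2, ...]. */
    static List<String> split(String outerXml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            db.setErrorHandler(new DefaultHandler()); // keep failed parse attempts off stderr
            Document outer = db.parse(new InputSource(new StringReader(outerXml)));

            List<String> inners = new ArrayList<>();
            NodeList all = outer.getElementsByTagName("*");
            for (int i = 0; i < all.getLength(); i++) {
                Element el = (Element) all.item(i);
                if (el.getElementsByTagName("*").getLength() > 0) {
                    continue; // only leaf elements are checked for an escaped XML string
                }
                Document inner = tryParse(db, el.getTextContent());
                if (inner != null) {
                    inners.add(serialize(inner));
                    el.setTextContent(""); // drop the nested XML from the outer document
                }
            }

            List<String> result = new ArrayList<>();
            result.add(serialize(outer));
            result.addAll(inners);
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not split XML", e);
        }
    }

    /** Parses text as XML; returns null when it is not a well-formed document. */
    private static Document tryParse(DocumentBuilder db, String text) throws java.io.IOException {
        String trimmed = text.trim();
        if (!trimmed.startsWith("<")) {
            return null; // cheap pre-check before invoking the parser
        }
        try {
            return db.parse(new InputSource(new StringReader(trimmed)));
        } catch (SAXException notXml) {
            return null;
        }
    }

    /** Serializes a DOM document to a string without the XML declaration. */
    private static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the document in the question.
        String xml = "<OBJECT><OFIELD FieldID=\"124\"><TEST_STRING>"
                + "&lt;mos&gt;&lt;mID&gt;53570320&lt;/mID&gt;&lt;/mos&gt;"
                + "</TEST_STRING></OFIELD></OBJECT>";
        List<String> parts = split(xml);
        System.out.println("Outer: " + parts.get(0));
        System.out.println("Inner: " + parts.get(1));
    }
}
```

The first list entry is the outer document with the nested XML stripped, and the remaining entries are the nested documents in document order. Each string could then be written to its own file. If the nested XML can also appear in attribute values, the loop would need to inspect el.getAttributes() as well.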
