How can I compare two XML files, ignoring certain elements using XPath?

For example, I need to compare the two XML files below while ignoring the 'Date' element. I want to pass the XPath of the element to ignore (//Set[1]/Product[1]/Date) at run time, because the element to ignore can vary from run to run.

XML file 1:

<?xml version="1.0" encoding="utf-8"?>
<Set
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="urn:abc:product:v3" xsi:schemaLocation="urn:abc:product:v3 abc.xsd">
    <Product>
        <id>1</id>
        <ref>1</ref>
        <Date>2021-09-19</Date>
        <company>JJ</company>
        <lastModified>2021-09-20T21:00:30</lastModified>
        <productOne>
            <partProduct>
                <Level>3.0</Level>
                <Flag>0</Flag>
                <Code>EN</Code>
            </partProduct>
        </productOne>
    </Product>
</Set>

XML file 2:

<?xml version="1.0" encoding="utf-8"?>
<Set
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="urn:abc:product:v3" xsi:schemaLocation="urn:abc:product:v3 abc.xsd">
    <Product>
        <id>2</id>
        <ref>2</ref>
        <Date>2021-09-20</Date>
        <company>JJ</company>
        <lastModified>2021-09-20T21:00:30</lastModified>
        <productOne>
            <partProduct>
                <Level>3.0</Level>
                <Flag>0</Flag>
                <Code>EN</Code>
            </partProduct>
        </productOne>
    </Product>
</Set>

2 Answers

1

You need to transform both files into a form where they compare equal, by removing the elements you want to ignore. You would typically do this using XSLT. After the transformation you could either compare the results using the XPath 2.0 function deep-equal(), or serialise both documents as canonical XML and compare the files at the character or binary level.

I would do this by running XQuery Update to delete the nodes selected by the path expression, and then comparing the resulting documents either using fn:deep-equal(), or by doing canonical serialization and comparing the resulting lexical forms.
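The delete-then-compare idea can also be sketched with nothing but the JDK's DOM and XPath APIs. This is an illustration under stated assumptions, not the XQuery Update route itself: the class and method names (XmlIgnoreCompare, equalIgnoring, removeAll) are made up for this example, the parser is left non-namespace-aware so that an unprefixed path such as //Set[1]/Product[1]/Date still matches elements in a default namespace, and Node.isEqualNode stands in for deep-equal().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not from the answer): strips the nodes selected by
// each XPath from both documents, then compares what is left.
final class XmlIgnoreCompare {

    // Assumption: the factory is left non-namespace-aware (the JDK default),
    // so unprefixed XPath steps match elements in a default namespace.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    // Removes every node selected by any of the given XPath expressions.
    // Nodes without a parent (attributes, the document itself) are skipped.
    static void removeAll(Document doc, String... xpaths) {
        XPath xp = XPathFactory.newInstance().newXPath();
        try {
            for (String path : xpaths) {
                NodeList hits =
                    (NodeList) xp.evaluate(path, doc, XPathConstants.NODESET);
                for (int i = 0; i < hits.getLength(); i++) {
                    Node n = hits.item(i);
                    if (n.getParentNode() != null) {
                        n.getParentNode().removeChild(n);
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot evaluate XPath", e);
        }
        // Merge the whitespace text nodes left next to each other.
        doc.normalize();
    }

    // Yes/no answer: are the documents equal once the selected nodes are gone?
    static boolean equalIgnoring(String xml1, String xml2, String... xpaths) {
        Document a = parse(xml1);
        Document b = parse(xml2);
        removeAll(a, xpaths);
        removeAll(b, xpaths);
        return a.getDocumentElement().isEqualNode(b.getDocumentElement());
    }
}
```

Because the paths are plain strings, they can come from the command line or a config file and change on every run. For the two files in the question, ignoring only Date would still report a mismatch, since id and ref differ as well; passing those paths too would make the comparison succeed. Note that isEqualNode is sensitive to whitespace and attribute differences, so canonical serialization may be the better choice if the inputs are formatted differently.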

As an alternative to XQuery Update you could use xmlstarlet or Saxon's Gizmo tool.

But it might depend on what you want from the comparison. The above is fine if you want a yes/no answer, but getting details of the differences is more difficult. You could write your own query to find the differences, or use a tool such as DeltaXML.
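As a rough idea of what "finding the differences yourself" can look like, here is a minimal sketch in plain Java rather than XQuery. It is deliberately naive and rests on several assumptions: the names (XmlDiffReport, differences, walk) are invented for this example, child elements are paired by position, and attributes, comments, and mixed content are ignored. A tool such as DeltaXML handles far more cases.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: walks two element trees in parallel and records an
// XPath-like location for every leaf whose text differs.
final class XmlDiffReport {

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    // Child elements only; whitespace text and comments are skipped.
    static List<Element> childElements(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                out.add((Element) n);
            }
        }
        return out;
    }

    static void walk(Element a, Element b, String path, List<String> out) {
        if (!a.getNodeName().equals(b.getNodeName())) {
            out.add(path + ": element " + a.getNodeName() + " vs " + b.getNodeName());
            return;
        }
        List<Element> ka = childElements(a);
        List<Element> kb = childElements(b);
        if (ka.size() != kb.size()) {
            out.add(path + ": " + ka.size() + " vs " + kb.size() + " child elements");
            return;
        }
        if (ka.isEmpty()) {
            // Leaf element: compare the trimmed text.
            String ta = a.getTextContent().trim();
            String tb = b.getTextContent().trim();
            if (!ta.equals(tb)) {
                out.add(path + ": '" + ta + "' vs '" + tb + "'");
            }
            return;
        }
        // Track the position of each child among same-named siblings so the
        // reported path reads like Product[1]/Date[1].
        Map<String, Integer> seen = new HashMap<>();
        for (int i = 0; i < ka.size(); i++) {
            String name = ka.get(i).getNodeName();
            int pos = seen.merge(name, 1, Integer::sum);
            walk(ka.get(i), kb.get(i), path + "/" + name + "[" + pos + "]", out);
        }
    }

    static List<String> differences(String xml1, String xml2) {
        Element a = root(xml1);
        Element b = root(xml2);
        List<String> out = new ArrayList<>();
        walk(a, b, "/" + a.getNodeName() + "[1]", out);
        return out;
    }
}
```

Each entry in the returned list starts with the location of the difference, so entries whose location matches a path you want to ignore can be filtered out before reporting.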

NOTE: This answer has been subsequently edited by a third party in a way that makes nonsense of the comment thread. Please ignore the comments.

5 Comments

The problem is that the XPath I supply is not always the same. It could vary each time.
You might get more informative answers if you asked more informative questions.
Sorry, I should've. Done now.
Is there any way I can just find the element, remove it, and then save the updated XML as a new document, or even save it back to the original? From there, I can do the comparison.
I suggested three ways of doing that: XQuery update, xmlstarlet, or Gizmo. If you are having trouble using any of these tools, please raise a new question -- SO doesn't work well if you try to ask supplementary questions on the original thread.
0

If you are using XMLUnit, you can define a filter for nodes:

import org.xmlunit.builder.DiffBuilder;
import org.xmlunit.diff.Diff;

Diff myDiff = DiffBuilder.compare(controlXml)
    .withTest(testXml)
    // Ignore every node named 'Date', wherever it appears
    .withNodeFilter(node -> !"Date".equals(node.getNodeName()))
    .build();
