
I have to parse an XML document that looks like this:

 <?xml version="1.0" encoding="UTF-8" ?> 
 <m:OASISReport xmlns:m="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd" 
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd">
  <m:MessagePayload>
   <m:RTO>
    <m:name>CAISO</m:name> 
    <m:REPORT_ITEM>
     <m:REPORT_HEADER>
      <m:SYSTEM>OASIS</m:SYSTEM> 
      <m:TZ>PPT</m:TZ> 
      <m:REPORT>AS_RESULTS</m:REPORT> 
      <m:MKT_TYPE>HASP</m:MKT_TYPE> 
      <m:UOM>MW</m:UOM> 
      <m:INTERVAL>ENDING</m:INTERVAL> 
      <m:SEC_PER_INTERVAL>3600</m:SEC_PER_INTERVAL> 
     </m:REPORT_HEADER>
     <m:REPORT_DATA>
      <m:DATA_ITEM>NS_PROC_MW</m:DATA_ITEM> 
      <m:RESOURCE_NAME>AS_SP26_EXP</m:RESOURCE_NAME> 
      <m:OPR_DATE>2010-11-17</m:OPR_DATE> 
      <m:INTERVAL_NUM>1</m:INTERVAL_NUM> 
      <m:VALUE>0</m:VALUE> 
     </m:REPORT_DATA>

The problem is that the namespace "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd" can sometimes be different. I want to ignore it completely and just get my data from the MessagePayload element downward.

The code I am using so far is:

String[] namespaces = new String[1];
  String[] namespaceAliases = new String[1];

  namespaceAliases[0] = "ns0";
  namespaces[0] = "http://oasissta.caiso.com/mrtu-oasis/xsd/OASISReport.xsd";

  File inputFile = new File(inputFileName);

  Map namespaceURIs = new HashMap();

  // This query will return all of the ASR records.
  String xPathExpression = "/ns0:OASISReport"
                         + "/ns0:MessagePayload"
                         + "/ns0:RTO"
                         + "/ns0:REPORT_ITEM"
                         + "/ns0:REPORT_DATA";
  xPathExpression += "|/ns0:OASISReport"
                   + "/ns0:MessagePayload"
                   + "/ns0:RTO"
                   + "/ns0:REPORT_ITEM"
                   + "/ns0:REPORT_HEADER";

  // Load up the raw XML file. The parameters ignore whitespace and other
  // nonsense,
  // reduces DOM tree size.
  SAXReader reader = new SAXReader();
  reader.setStripWhitespaceText(true);
  reader.setMergeAdjacentText(true);
  Document inputDocument = reader.read(inputFile);

  // Relate the aliases with the namespaces
  if (namespaceAliases != null && namespaces != null)
  {
   for (int i = 0; i < namespaceAliases.length; i++)
   {
    namespaceURIs.put(namespaceAliases[i], namespaces[i]);
   }
  }

  // Cache the expression using the supplied namespaces.
  XPath xPath = DocumentHelper.createXPath(xPathExpression);
  xPath.setNamespaceURIs(namespaceURIs);

  List asResultsNodes = xPath.selectNodes(inputDocument.getRootElement());

It works fine if the namespace never changes, but that is obviously not the case. What do I need to do to make it ignore the namespace? Or, if I know the set of all possible namespace values, how can I pass them all to the XPath instance?

  • 2
    @user452103: XPath is XML Names compliant, so it will never ignore namespaces. You can use an expression that selects nodes regardless of namespace. If the namespace URI changes that often, then it is the wrong URI. A namespace URI is supposed to indicate that an element belongs to a specific XML vocabulary. Commented Dec 9, 2010 at 19:49
  • @user452103: Keep this formatting, it's more clear. Commented Dec 9, 2010 at 19:54
  • 1
    @Alejandro: thanks for the formatting, it does look better now. What expression can I use to select nodes regardless of namespace? Commented Dec 9, 2010 at 20:15
  • 1
    stackoverflow.com/questions/4440451/… Commented Sep 21, 2015 at 12:47
  • 2
    You could use Namespace = false on a XmlTextReader see: stackoverflow.com/a/49361232/9516092 Commented Mar 19, 2018 at 11:04

2 Answers

135

This is a FAQ (but I'm too lazy to search for duplicates today).

In XPath 1.0

//*[local-name()='name']

Selects any element with "name" as local-name.

In XPath 2.0 you can use:

//*:name
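
As a rough illustration of the XPath 1.0 form, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath API rather than the dom4j classes from the question. The class name, the helper method, and the trimmed sample document (including its placeholder namespace URI) are assumptions made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LocalNameDemo {
    // Hypothetical sample: same shape as the document in the question, trimmed,
    // with a placeholder namespace URI that the query never has to mention.
    static final String SAMPLE =
        "<m:OASISReport xmlns:m=\"http://example.com/any-namespace\">"
      + "<m:MessagePayload><m:RTO><m:name>CAISO</m:name><m:REPORT_ITEM>"
      + "<m:REPORT_HEADER><m:SYSTEM>OASIS</m:SYSTEM></m:REPORT_HEADER>"
      + "<m:REPORT_DATA><m:VALUE>0</m:VALUE></m:REPORT_DATA>"
      + "</m:REPORT_ITEM></m:RTO></m:MessagePayload></m:OASISReport>";

    // Selects every element whose local name matches, whatever its namespace.
    // Assumes localName contains no quote characters.
    static NodeList selectByLocalName(String xml, String localName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // so that local-name() sees "REPORT_DATA", not "m:REPORT_DATA"
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        String expression = "//*[local-name()='" + localName + "']";
        return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        NodeList hits = selectByLocalName(SAMPLE, "REPORT_DATA");
        System.out.println(hits.getLength() + " element(s) matched");
    }
}
```

No prefix-to-URI mapping is registered anywhere, which is the point: the predicate compares only the local part of each element name. The same expression string should also be usable with dom4j's DocumentHelper.createXPath, without the setNamespaceURIs call.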

2 Comments

Does the FAQ explain why namespaces are not ignored by default, which is what 95% of users want? How often does one really need namespaces for disambiguation?
@frenchone Have never, not once, needed the namespace.
45

Use:

/*/*/*/*/*
        [local-name()='REPORT_DATA' 
       or 
         local-name()='REPORT_HEADER'
        ]
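
To show that this expression keeps working when the namespace URI changes, here is a small sketch using the JDK's javax.xml.xpath API (swapped in for dom4j so it runs without extra libraries). The class, the `sample` helper, and the two namespace URIs are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PositionalLocalNameDemo {
    // The expression from this answer, on a single line.
    static final String EXPRESSION =
        "/*/*/*/*/*[local-name()='REPORT_DATA' or local-name()='REPORT_HEADER']";

    // Hypothetical helper: builds a trimmed version of the question's document
    // bound to whichever namespace URI is passed in.
    static String sample(String namespaceUri) {
        return "<m:OASISReport xmlns:m=\"" + namespaceUri + "\">"
             + "<m:MessagePayload><m:RTO><m:name>CAISO</m:name><m:REPORT_ITEM>"
             + "<m:REPORT_HEADER><m:SYSTEM>OASIS</m:SYSTEM></m:REPORT_HEADER>"
             + "<m:REPORT_DATA><m:VALUE>0</m:VALUE></m:REPORT_DATA>"
             + "</m:REPORT_ITEM></m:RTO></m:MessagePayload></m:OASISReport>";
    }

    // Evaluates EXPRESSION and returns the local names of the selected elements.
    static List<String> selectLocalNames(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for local-name() to drop the prefix
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(EXPRESSION, doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getLocalName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Two different namespace URIs, one unchanged query.
        System.out.println(selectLocalNames(sample("http://example.com/ns-one")));
        System.out.println(selectLocalNames(sample("http://example.com/ns-two")));
    }
}
```

Both calls select the REPORT_HEADER and REPORT_DATA elements, because the five `*` steps fix only the depth and the predicate checks only local names. The trade-off is that the query no longer distinguishes same-named elements from unrelated vocabularies at that depth.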

5 Comments

do you mean to use that as the value of xPathExpression in the code above?
@user452103: Yes, exactly. This is the XPath expression to use.
so, just to clarify, should it be like this now: String xPathExpression = "/*/*/*/*/*[local-name()='REPORT_DATA' or local-name()='REPORT_HEADER']";
@user452103: Yes. Why don't you just try it? This expression selects the two wanted nodes in the provided XML document.
@ClaraOnager, This selects any element on the 4th level below the top, whose local-name() is either 'REPORT_DATA' or 'REPORT_HEADER'
