I am trying to read OSIS formatted documents. I have cut the document down to a simple fragment:

<?xml version="1.0" encoding="utf-8"?>
<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace">
  <osisText osisRefWork="Bible" osisIDWork="kjv" xml:lang="en">
  </osisText>
</osis>

I try to read it with this sample code from the MSDN documentation:

XPathDocument document = new XPathDocument("osis.xml");
XPathNavigator navigator = document.CreateNavigator();
XPathNodeIterator nodes = navigator.Select("/osis/osisText");

while (nodes.MoveNext())
{
    Console.WriteLine(nodes.Current.Name);
}

The problem is that the selection contains no nodes and no exception is thrown. Since the query never matches the root tag, I can't read the document. If I remove the xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace" from the root osis tag, it works just fine. The offending URL returns a 404 code, but otherwise I see nothing wrong with this XML. Can someone explain why this code won't read the document? What options do I have besides hand-editing every document before trying to load it?

  • That XML looks pretty well formed to me. What makes you think it is malformed? Commented Oct 25, 2011 at 5:09

2 Answers

Your XPath expression is missing a namespace prefix.

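The element that you're trying to select has a namespace URI of http://www.bibletechnologies.net/2003/OSIS/namespace. In XPath 1.0 an unprefixed name such as osis always means "an element in no namespace", so /osis/osisText cannot match it; the default namespace declared in the document does not carry over to the XPath expression. The URI is only an identifier and is never downloaded, so the 404 is irrelevant.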

I tested this revision in .NET 2.0 and it found the node as expected.

XPathDocument document = new XPathDocument("osis.xml");
XPathNavigator navigator = document.CreateNavigator();

XmlNamespaceManager xmlns = new XmlNamespaceManager(navigator.NameTable);
xmlns.AddNamespace("osis", "http://www.bibletechnologies.net/2003/OSIS/namespace");

XPathNodeIterator nodes = navigator.Select("/osis:osis/osis:osisText", xmlns);
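For completeness, here is a minimal self-contained sketch of the same approach that loads the fragment from a string instead of osis.xml and collects the matched element names. The class name OsisDemo, the Sample constant, the SelectNames helper, and the prefix "o" are hypothetical names chosen for illustration; only the namespace URI comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class OsisDemo
{
    // Namespace URI copied from the root element in the question.
    public const string OsisNs = "http://www.bibletechnologies.net/2003/OSIS/namespace";

    // The fragment from the question, inlined so the sketch needs no file.
    public const string Sample =
        "<osis xmlns=\"" + OsisNs + "\">" +
        "<osisText osisRefWork=\"Bible\" osisIDWork=\"kjv\" xml:lang=\"en\"></osisText>" +
        "</osis>";

    // Returns the names of all elements matched by a namespace-aware XPath query.
    public static List<string> SelectNames(string xml, string xpath)
    {
        XPathDocument document = new XPathDocument(new StringReader(xml));
        XPathNavigator navigator = document.CreateNavigator();

        // "o" is an arbitrary prefix for use in the query only;
        // what matters is the URI it is bound to.
        XmlNamespaceManager xmlns = new XmlNamespaceManager(navigator.NameTable);
        xmlns.AddNamespace("o", OsisNs);

        List<string> names = new List<string>();
        XPathNodeIterator nodes = navigator.Select(xpath, xmlns);
        while (nodes.MoveNext())
        {
            names.Add(nodes.Current.Name);
        }
        return names;
    }

    public static void Main()
    {
        // The prefixed query matches the element in the OSIS namespace.
        foreach (string name in SelectNames(Sample, "/o:osis/o:osisText"))
        {
            Console.WriteLine(name);
        }
        // The unprefixed query from the question matches nothing.
        Console.WriteLine(SelectNames(Sample, "/osis/osisText").Count);
    }
}
```

The prefix used in the query does not have to match anything in the document; it only has to be bound to the same namespace URI.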

4 Comments

XML is pretty simple until you add namespaces. The actual osis node is: <osis xmlns="bibletechnologies.net/2003/OSIS/namespace" xmlns:xsi="w3.org/2001/XMLSchema-instance" xsi:schemaLocation="bibletechnologies.net/2003/OSIS/namespace bibletechnologies.net/osisCore.2.1.1.xsd">. Do I need to add something else to the namespace manager?
@JEdwardEllis, no... and I overlooked that you already had created a navigator. Yeah, you have to get into the zen of namespaces to minimize frustration. It can be ugly sometimes, but I learned from dealing with EPUBs (which use multiple namespaces) that you can't fight it. Good luck.
Is there a good resource for learning about XML namespaces? I hate stumbling around in the dark.
If you're using .NET, the framework documentation is actually pretty good, especially since it's usually geared towards usage in applications. I read XML in a Nutshell in 2004 or so, and much of it still holds true (certainly the parts about namespaces, anyway). The zvon tutorial is still a good starting reference for XPath.

You can read the file to a string, replace the namespace in memory, and then load it using a string stream:

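// Requires: using System.IO; using System.Text; using System.Xml.XPath;
string s;
using (var reader = File.OpenText("osis.xml"))
{
    s = reader.ReadToEnd();
}
// Strip the default-namespace declaration so unprefixed XPath names match.
s = s.Replace("xmlns=\"http://www.bibletechnologies.net/2003/OSIS/namespace\"", "");
// Encode as UTF-8 to match the XML declaration; ASCII would mangle non-ASCII text.
Stream stream = new MemoryStream(Encoding.UTF8.GetBytes(s));
// Pass the stream itself; the string literal "stream" would be treated as a file path.
XPathDocument document = new XPathDocument(stream);
// Rest of the code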

2 Comments

-1: as with most cases of pretending that XML is just a string format, this fails. What if, instead, the document said xmlns:x="http://www.bibletechnologies.net/2003/OSIS/namespace"? What if it used "y" as a prefix? What if there were multiple namespace declarations, for the same namespace, but with different prefixes?
@JohnSaunders did you find an answer?
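The prefix variations raised in the first comment can be checked with a small sketch. It assumes the namespace-manager approach from the accepted answer; PrefixDemo, CountOsisText, and the prefix "q" are hypothetical names, and the three inline documents are made-up variants of the question's fragment.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class PrefixDemo
{
    // Namespace URI copied from the question.
    public const string OsisNs = "http://www.bibletechnologies.net/2003/OSIS/namespace";

    // Counts /osis/osisText elements in the OSIS namespace. The query prefix "q"
    // is bound here and is independent of whatever prefix the document uses.
    public static int CountOsisText(string xml)
    {
        XPathNavigator navigator = new XPathDocument(new StringReader(xml)).CreateNavigator();
        XmlNamespaceManager xmlns = new XmlNamespaceManager(navigator.NameTable);
        xmlns.AddNamespace("q", OsisNs);
        return navigator.Select("/q:osis/q:osisText", xmlns).Count;
    }

    public static void Main()
    {
        // Default namespace, as in the question.
        string defaultNs = "<osis xmlns=\"" + OsisNs + "\"><osisText/></osis>";
        // Same namespace declared with prefix "x".
        string prefixed = "<x:osis xmlns:x=\"" + OsisNs + "\"><x:osisText/></x:osis>";
        // Same namespace declared twice, under two different prefixes.
        string mixed = "<x:osis xmlns:x=\"" + OsisNs + "\" xmlns:y=\"" + OsisNs + "\">" +
                       "<y:osisText/></x:osis>";

        Console.WriteLine(CountOsisText(defaultNs));
        Console.WriteLine(CountOsisText(prefixed));
        Console.WriteLine(CountOsisText(mixed));
    }
}
```

Each document should yield one match, because XPath compares namespace URIs rather than prefixes. The string replacement in the answer above would leave the second and third documents unchanged.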
