I'm writing an XSD schema to validate a document that stores, among other things, a number of Result elements. Two of this element's attributes, clickIndex and timeViewed, are optional and use the built-in datatypes nonNegativeInteger and decimal respectively. Instead of omitting these attributes when their values aren't set, I'd like to set them to an empty string, like so:
<Page resultCount="10" visited="true">
  <Result index="1" clickIndex="1" timeViewed="45.21" pid="56118" title="alpha"/>
  <Result index="2" clickIndex="" timeViewed="" pid="75841" title="beta"/>
  <Result index="3" clickIndex="2" timeViewed="21.45" pid="4563" title="gamma"/>
  <Result index="4" clickIndex="" timeViewed="" pid="44546" title="delta"/>
  <Result index="5" clickIndex="" timeViewed="" pid="1651" title="epsilon"/>
  <Result index="6" clickIndex="" timeViewed="" pid="551651" title="zeta"/>
  <Result index="7" clickIndex="" timeViewed="" pid="20358" title="eta"/>
  <Result index="8" clickIndex="" timeViewed="" pid="61621" title="theta"/>
  <Result index="9" clickIndex="" timeViewed="" pid="135154" title="iota"/>
  <Result index="10" clickIndex="" timeViewed="" pid="95821" title="kappa"/>
</Page>
Unfortunately, this won't validate against the following XSD:
<xs:element name="Result">
  <xs:complexType>
    <xs:attribute name="index" type="xs:nonNegativeInteger" use="required"/>
    <xs:attribute name="clickIndex" type="xs:nonNegativeInteger" use="optional"/>
    <xs:attribute name="timeViewed" type="xs:decimal" use="optional"/>
    <xs:attribute name="pid" type="xs:positiveInteger" use="required"/>
    <xs:attribute name="title" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>
Is it impossible to validate an empty string on an attribute that has a built-in type? Is there a way to allow it without defining a custom type, or am I simply missing an option in the XSD?
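For reference, the custom-type route I'm hoping to avoid would, as far as I understand it, look roughly like the sketch below. It is untested and assumes the schema has no target namespace and binds the xs prefix as in the snippet above. The type names emptyString, nonNegativeIntegerOrEmpty and decimalOrEmpty are hypothetical names I made up for illustration.

```xml
<!-- Hypothetical helper type: accepts only the zero-length string -->
<xs:simpleType name="emptyString">
  <xs:restriction base="xs:string">
    <xs:length value="0"/>
  </xs:restriction>
</xs:simpleType>

<!-- Hypothetical union: a nonNegativeInteger or the empty string -->
<xs:simpleType name="nonNegativeIntegerOrEmpty">
  <xs:union memberTypes="xs:nonNegativeInteger emptyString"/>
</xs:simpleType>

<!-- Hypothetical union: a decimal or the empty string -->
<xs:simpleType name="decimalOrEmpty">
  <xs:union memberTypes="xs:decimal emptyString"/>
</xs:simpleType>
```

These would sit as top-level children of xs:schema, and the clickIndex and timeViewed attributes would then use type="nonNegativeIntegerOrEmpty" and type="decimalOrEmpty" instead of the built-in types.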
I'll be processing these XML files to generate statistical information, and attributes that are sometimes missing will likely introduce unneeded complexity. If there's no workaround for this, I'll just replace the empty values with zeroes. I'd prefer to keep the empty strings because they look cleaner, but perhaps zeroes would also be more efficient.