My input XML looks like this:

<AGROVOC xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<CONCEPT>
    <Language>EN</Language>
    <termcode>331000</termcode>
    <termspell>site</termspell>
    <NT>12861</NT>
    <NT>13893</NT>
    <NT>15988</NT>
    <NT>24183</NT>
    <NT>28623</NT>
    <NT>35171</NT>
    <NT>4781</NT>
    <NT>5973</NT>
    <NT>8872</NT>
    <NT>9000162</NT>
</CONCEPT>

<CONCEPT>
    <termcode>12861</termcode>
    <termspell>child nurseries</termspell>
    <BT>331000</BT>
</CONCEPT>

<CONCEPT>
    <termcode>13893</termcode>
    <termspell>restaurants</termspell>
    <BT>331000</BT>
</CONCEPT>

<CONCEPT>
    <termcode>15988</termcode>
    <termspell>laboratories</termspell>
    <BT>331000</BT>
    <NT>24298</NT>
</CONCEPT>

<CONCEPT>
    <termcode>24298</termcode>
    <termspell>Veterinary laboratories</termspell>
    <BT>15988</BT>
</CONCEPT>

<CONCEPT>
    <termcode>24183</termcode>
    <termspell>hospitals</termspell>
    <BT>331000</BT>
    <NT>16384</NT>
</CONCEPT>

<CONCEPT>
    <termcode>16384</termcode>
    <termspell>animal hospitals</termspell>
    <BT>24183</BT>
</CONCEPT>

<CONCEPT>
    <termcode>35171</termcode>
    <termspell>Landfills</termspell>
    <BT>331000</BT>
    <NT>35165</NT>
</CONCEPT>

<CONCEPT>
    <termcode>35165</termcode>
    <termspell>waste landfills</termspell>
    <BT>35171</BT>
</CONCEPT>

<CONCEPT>
    <termcode>4781</termcode>
    <termspell>meteorological stations</termspell>
    <BT>331000</BT>
    <NT>8342</NT>
</CONCEPT>

<CONCEPT>
    <termcode>8342</termcode>
    <termspell>Weather ships</termspell>
    <BT>4781</BT>
</CONCEPT>

<CONCEPT>
    <termcode>5973</termcode>
    <termspell>plant nurseries</termspell>
    <BT>331000</BT>
    <NT>34832</NT>
    <NT>34830</NT>
    <NT>14969</NT>
</CONCEPT>

<CONCEPT>
    <termcode>34832</termcode>
    <termspell>Fruit tree nurseries</termspell>
    <BT>5973</BT>
</CONCEPT>

<CONCEPT>
    <termcode>34830</termcode>
    <termspell>Forest nurseries</termspell>
    <BT>5973</BT>
</CONCEPT>

<CONCEPT>
    <termcode>14969</termcode>
    <termspell>Ornamental tree nurseries</termspell>
    <BT>5973</BT>
</CONCEPT>

<CONCEPT>
    <termcode>8872</termcode>
    <termspell>Apiaries</termspell>
    <BT>331000</BT>
</CONCEPT>

<CONCEPT>
    <termcode>9000162</termcode>
    <termspell>Telecentre</termspell>
    <BT>331000</BT>
</CONCEPT>

The desired output should look like this:

<node id="331000" label="site">
    <isComposedBy>
        <node id="12861" label="child nurseries"/>
        <node id="13893" label="restaurants"/>
        <node id="15988" label="laboratories">
            <isComposedBy>
                <node id="24298" label="Veterinary laboratories"/>
            </isComposedBy>
        </node>
        <node id="24183" label="hospitals">
            <isComposedBy>
                <node id="16384" label="animal hospitals"/>
            </isComposedBy>
        </node>
        <node id="35171" label="Landfills">
            <isComposedBy>
                <node id="35165" label="waste landfills"/>
            </isComposedBy>
        </node>
        <node id="4781" label="meteorological stations">
            <isComposedBy>
                <node id="8342" label="Weather ships"/>
            </isComposedBy>
        </node>
        <node id="5973" label="plant nurseries">
            <isComposedBy>
                <node id="34832" label="Fruit tree nurseries"/>
                <node id="34830" label="Forest nurseries"/>
                <node id="14969" label="Ornamental tree nurseries"/>
            </isComposedBy>
        </node>
        <node id="8872" label="Apiaries"/>
        <node id="9000162" label="Telecentre"/>
    </isComposedBy>
</node>

In the XML above, a concept can have at most one BT (broader term), which holds the termcode of its parent. A concept without a BT is at the top of the hierarchy (e.g. site). A concept can have multiple NTs (narrower terms), which hold the termcodes of its children.

My XSL looks like this:

<xsl:key name="kChildren" match="CONCEPT" use="BT"/>
    <xsl:template match="CONCEPT">
    <xsl:element name="node">
        <xsl:attribute name="id">
            <xsl:value-of select="/AGROVOC/CONCEPT/termcode"/>
        </xsl:attribute>
        <xsl:apply-templates select="key('kChildren', '0')"/>
    </xsl:element>
</xsl:template>

<xsl:template match="CONCEPT">
    <xsl:element name="node">
        <xsl:attribute name="id">
            <xsl:value-of select="termcode"/>
        </xsl:attribute>
        <xsl:attribute name="label">
            <xsl:value-of select="termspell"/>
        </xsl:attribute>
        <xsl:if test="key('kChildren', termcode)">
            <isComposedBy>
                <xsl:apply-templates select="key('kChildren', termcode)"/>
            </isComposedBy>
        </xsl:if>
    </xsl:element>
</xsl:template>

The output contains the desired output, followed by these extra nodes:

<node id="12861" label="child nurseries"/>
<node id="13893" label="restaurants"/>
<node id="15988" label="laboratories">
    <isComposedBy>
        <node id="24298" label="Veterinary laboratories"/>
    </isComposedBy>
</node>
<node id="24298" label="Veterinary laboratories"/>
<node id="24183" label="hospitals">
    <isComposedBy>
        <node id="16384" label="animal hospitals"/>
    </isComposedBy>
</node>
<node id="16384" label="animal hospitals"/>
<node id="35171" label="Landfills">
    <isComposedBy>
        <node id="35165" label="waste landfills"/>
    </isComposedBy>
</node>
<node id="35165" label="waste landfills"/>
<node id="4781" label="meteorological stations">
    <isComposedBy>
        <node id="8342" label="Weather ships"/>
    </isComposedBy>
</node>
<node id="8342" label="Weather ships"/>
<node id="5973" label="plant nurseries">
    <isComposedBy>
        <node id="34832" label="Fruit tree nurseries"/>
        <node id="34830" label="Forest nurseries"/>
        <node id="14969" label="Ornamental tree nurseries"/>
    </isComposedBy>
</node>
<node id="34832" label="Fruit tree nurseries"/>
<node id="34830" label="Forest nurseries"/>
<node id="14969" label="Ornamental tree nurseries"/>
<node id="8872" label="Apiaries"/>
<node id="9000162" label="Telecentre"/>

Any ideas on how I can remove the extra, repeated nodes? By the way, I patterned my XSL on this post: xslt recursive template on parent-child data. Thanks in advance.

  • @euler, in the provided output "plant nurseries" has three children, but in the provided source XML document "plant nurseries" doesn't have any children. Please edit and correct. Commented Dec 21, 2012 at 16:59
  • @DimitreNovatchev, thank you so much, that was quick! Thumbs up for you!;-) Commented Dec 21, 2012 at 17:40

1 Answer

This transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

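 <!-- Index every CONCEPT by its own termcode -->
 <xsl:key name="kConcByCode" match="CONCEPT" use="termcode"/>

 <!-- Start only from top-level concepts (those without a BT); this overrides
      the built-in rule that would otherwise visit every CONCEPT -->
 <xsl:template match="/">
     <xsl:apply-templates select="/*/CONCEPT[not(BT)]"/>
 </xsl:template>

 <!-- Emit a node, then recurse into the concepts named by its NT children -->
 <xsl:template match="CONCEPT">
  <node id="{termcode}" label="{termspell}">
   <xsl:if test="key('kConcByCode', NT)">
     <isComposedBy>
       <xsl:apply-templates select="key('kConcByCode', NT)"/>
     </isComposedBy>
   </xsl:if>
  </node>
 </xsl:template>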
</xsl:stylesheet>

when applied to the provided XML document:

<AGROVOC xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <CONCEPT>
        <Language>EN</Language>
        <termcode>331000</termcode>
        <termspell>site</termspell>
        <NT>12861</NT>
        <NT>13893</NT>
        <NT>15988</NT>
        <NT>24183</NT>
        <NT>28623</NT>
        <NT>35171</NT>
        <NT>4781</NT>
        <NT>5973</NT>
        <NT>8872</NT>
        <NT>9000162</NT>
    </CONCEPT>
    <CONCEPT>
        <termcode>12861</termcode>
        <termspell>child nurseries</termspell>
        <BT>331000</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>13893</termcode>
        <termspell>restaurants</termspell>
        <BT>331000</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>15988</termcode>
        <termspell>laboratories</termspell>
        <BT>331000</BT>
        <NT>24298</NT>
    </CONCEPT>
    <CONCEPT>
        <termcode>24298</termcode>
        <termspell>Veterinary laboratories</termspell>
        <BT>15988</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>24183</termcode>
        <termspell>hospitals</termspell>
        <BT>331000</BT>
        <NT>16384</NT>
    </CONCEPT>
    <CONCEPT>
        <termcode>16384</termcode>
        <termspell>animal hospitals</termspell>
        <BT>24183</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>35171</termcode>
        <termspell>Landfills</termspell>
        <BT>331000</BT>
        <NT>35165</NT>
    </CONCEPT>
    <CONCEPT>
        <termcode>35165</termcode>
        <termspell>waste landfills</termspell>
        <BT>35171</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>4781</termcode>
        <termspell>meteorological stations</termspell>
        <BT>331000</BT>
        <NT>8342</NT>
    </CONCEPT>
    <CONCEPT>
        <termcode>8342</termcode>
        <termspell>Weather ships</termspell>
        <BT>4781</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>5973</termcode>
        <termspell>plant nurseries</termspell>
        <BT>331000</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>34832</termcode>
        <termspell>Fruit tree nurseries</termspell>
        <BT>5973</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>34830</termcode>
        <termspell>Forest nurseries</termspell>
        <BT>5973</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>14969</termcode>
        <termspell>Ornamental tree nurseries</termspell>
        <BT>5973</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>8872</termcode>
        <termspell>Apiaries</termspell>
        <BT>331000</BT>
    </CONCEPT>
    <CONCEPT>
        <termcode>9000162</termcode>
        <termspell>Telecentre</termspell>
        <BT>331000</BT>
    </CONCEPT>
</AGROVOC>

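produces the wanted result ("plant nurseries" has no children here because its CONCEPT in this copy of the source lists no NT elements):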

<node id="331000" label="site">
   <isComposedBy>
      <node id="12861" label="child nurseries"/>
      <node id="13893" label="restaurants"/>
      <node id="15988" label="laboratories">
         <isComposedBy>
            <node id="24298" label="Veterinary laboratories"/>
         </isComposedBy>
      </node>
      <node id="24183" label="hospitals">
         <isComposedBy>
            <node id="16384" label="animal hospitals"/>
         </isComposedBy>
      </node>
      <node id="35171" label="Landfills">
         <isComposedBy>
            <node id="35165" label="waste landfills"/>
         </isComposedBy>
      </node>
      <node id="4781" label="meteorological stations">
         <isComposedBy>
            <node id="8342" label="Weather ships"/>
         </isComposedBy>
      </node>
      <node id="5973" label="plant nurseries"/>
      <node id="8872" label="Apiaries"/>
      <node id="9000162" label="Telecentre"/>
   </isComposedBy>
</node>
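
A note on why the extra nodes appear, assuming the stylesheet in the question contains no templates other than the ones shown: it has no template matching /, so XSLT's built-in rules walk down from the root and apply the CONCEPT template to every CONCEPT child of AGROVOC. Each concept is therefore output once at the top level and again inside its parent. Restricting the starting point to the concepts without a BT removes the repeats.

The same fix can also be sketched with the BT-based key from the question instead of following NT. This is only a variant under the question's stated assumption that every non-top concept carries exactly one BT; the key name kChildren is taken from the question, and the rest mirrors the transformation above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Index every CONCEPT by the termcode of its parent (its BT) -->
 <xsl:key name="kChildren" match="CONCEPT" use="BT"/>

 <!-- Start only from the top of the hierarchy: concepts without a BT -->
 <xsl:template match="/">
  <xsl:apply-templates select="/*/CONCEPT[not(BT)]"/>
 </xsl:template>

 <!-- Emit a node, then recurse into the concepts whose BT is this termcode -->
 <xsl:template match="CONCEPT">
  <node id="{termcode}" label="{termspell}">
   <xsl:if test="key('kChildren', termcode)">
    <isComposedBy>
     <xsl:apply-templates select="key('kChildren', termcode)"/>
    </isComposedBy>
   </xsl:if>
  </node>
 </xsl:template>
</xsl:stylesheet>
```

The two variants differ only in which side of the relationship they trust. Following NT skips any NT value that has no matching CONCEPT (such as 28623 in this document) and ignores children that are not listed in their parent's NT elements. Following BT finds children from their own BT element, so with the XML above it also nests the three nursery concepts under "plant nurseries".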