I am trying to parse an XML file that contains a collection of nested tags. I am using Perl's XML::Simple API. Individual tag values are parsed correctly, but I am not able to parse the nested tag values.

<archetype>
    <original_language></original_language>
    <description></description>
    <archetype_id></archetype_id>
    <definition></definition>
    <ontology></ontology>
</archetype>

The definition part contains the item details, for example:

<definition>
.
.
<node_id>at0004</node_id>
<attributes xsi:type="C_SINGLE_ATTRIBUTE">
<rm_attribute_name>value</rm_attribute_name>
+<existence> </existence>
<children xsi:type="C_DV_QUANTITY">
    <rm_type_name>DV_QUANTITY</rm_type_name>
    +<occurrences></occurrences>
    <node_id/>
    +<property></property>
    <list>
    <magnitude>
        <lower_included>true</lower_included>
        <upper_included>false</upper_included>
        <lower_unbounded>false</lower_unbounded>
        <upper_unbounded>false</upper_unbounded>
        <lower>0.0</lower>
        <upper>1000.0</upper>
</magnitude>
<units>mm[Hg]</units>
</list>
</children>
</attributes>
.
.
</definition>

From the file format above, I would like to extract content like this:

node_id - > at0004
    magnitude -> lower -> 0.0
    magnitude -> higher -> 1000.0

Please guide me on how to extract this content.

  • It might be useful if you included your current code. That way we can point out where you're going wrong rather than just giving you the complete answer. Commented Apr 24, 2012 at 10:36

2 Answers

You need to learn about references: perlreftut, perlref, perldsc.

use strictures;
use XML::Simple qw(:strict);

my $root = XMLin(<<'XML', ForceArray => 0, KeyAttr => undef);
<definition>
.
.
<node_id>at0004</node_id>
<attributes xsi:type="C_SINGLE_ATTRIBUTE">
<rm_attribute_name>value</rm_attribute_name>
+<existence> </existence>
<children xsi:type="C_DV_QUANTITY">
    <rm_type_name>DV_QUANTITY</rm_type_name>
    +<occurrences></occurrences>
    <node_id/>
    +<property></property>
    <list>
    <magnitude>
        <lower_included>true</lower_included>
        <upper_included>false</upper_included>
        <lower_unbounded>false</lower_unbounded>
        <upper_unbounded>false</upper_unbounded>
        <lower>0.0</lower>
        <upper>1000.0</upper>
</magnitude>
<units>mm[Hg]</units>
</list>
</children>
</attributes>
.
.
</definition>
XML

my $m = $root->{attributes}{children}{list}{magnitude};
printf <<'TEMPLATE', $root->{node_id}, $m->{lower}, $m->{upper};
node_id -> %s
    magnitude -> lower -> %.1f
    magnitude -> higher -> %.1f
TEMPLATE

use Data::Dump::Streamer qw(Dump); Dump $root;

Output:

node_id -> at0004
    magnitude -> lower -> 0.0
    magnitude -> higher -> 1000.0

$HASH1 = {
    attributes => {
        children => {
            content => [("\n    +") x 2],
            list    => {
                magnitude => {
                    lower           => '0.0',
                    lower_included  => 'true',
                    lower_unbounded => 'false',
                    upper           => '1000.0',
                    upper_included  => 'false',
                    upper_unbounded => 'false'
                },
                units => 'mm[Hg]'
            },
            node_id      => {},
            occurrences  => {},
            property     => {},
            rm_type_name => 'DV_QUANTITY',
            "xsi:type"   => 'C_DV_QUANTITY'
        },
        content           => "\n+",
        existence         => {},
        rm_attribute_name => 'value',
        "xsi:type"        => 'C_SINGLE_ATTRIBUTE'
    },
    content => [("\n.\n.\n") x 2],
    node_id => 'at0004'
};
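
The dump shows that the parsed document is a hash reference whose values are further hash references, which is why the chained `->{...}` lookups work. As a minimal sketch of walking such a structure defensively, the code below uses a hand-written stand-in for the parsed data and a hypothetical helper named `dig` (not part of XML::Simple) that stops as soon as a level is missing instead of autovivifying it:

```perl
use strict;
use warnings;

# Hand-written stand-in for the structure XMLin returned above (trimmed).
my $root = {
    node_id    => 'at0004',
    attributes => {
        children => {
            list => {
                magnitude => { lower => '0.0', upper => '1000.0' },
                units     => 'mm[Hg]',
            },
        },
    },
};

# dig() is a hypothetical helper: it follows a list of hash keys and
# returns undef as soon as a level is missing or is not a hash reference.
sub dig {
    my ($data, @path) = @_;
    for my $key (@path) {
        return unless ref $data eq 'HASH' && exists $data->{$key};
        $data = $data->{$key};
    }
    return $data;
}

my $lower   = dig($root, qw(attributes children list magnitude lower));
my $missing = dig($root, qw(attributes children list magnitude median));

print "lower: $lower\n";
print "median: ", (defined $missing ? $missing : '(not present)'), "\n";
```

A plain chained lookup such as `$root->{a}{b}{c}` silently creates the intermediate hashes when they do not exist, so a check like this can be useful when some archetypes lack a `magnitude` block.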

Here's an XML::Twig program that can do it, although I've made some assumptions that you might have to adjust. I don't know whether <definition> can contain more than one node_id/attributes pair, so I wrote this to handle multiple pairs:

#!/Users/brian/bin/perls/perl5.14.2

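use strict;
use warnings;
use XML::Twig;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <magnitude> element has been parsed.
        # <units> comes after <magnitude>, so it is not available yet here.
        magnitude => sub {
            my $m = $_;
            my $hash = $m->simplify;
            # The matching <node_id> is the sibling just before the enclosing <attributes>.
            my $node_id = $m->parent( 'attributes' )->prev_sibling( 'node_id' )->text;
            print "node -> $node_id\n",
                "\tmagnitude -> lower -> $hash->{lower}\n",
                "\tmagnitude -> higher -> $hash->{upper}\n";
        },
    },
);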

$twig->parse(*DATA);


__END__
<definition>

<node_id>at0004</node_id>
<attributes xsi:type="C_SINGLE_ATTRIBUTE">
    <rm_attribute_name>value</rm_attribute_name>
    <existence> </existence>
    <children xsi:type="C_DV_QUANTITY">
        <rm_type_name>DV_QUANTITY</rm_type_name>
        <occurrences></occurrences>
        <node_id/>
        <property></property>
        <list>
            <magnitude>
                <lower_included>true</lower_included>
                <upper_included>false</upper_included>
                <lower_unbounded>false</lower_unbounded>
                <upper_unbounded>false</upper_unbounded>
                <lower>0.0</lower>
                <upper>1000.0</upper>
            </magnitude>
            <units>mm[Hg]</units>
        </list>
    </children>
</attributes>

<node_id>at0005</node_id>
<attributes xsi:type="C_SINGLE_ATTRIBUTE">
    <rm_attribute_name>value</rm_attribute_name>
    <existence> </existence>
    <children xsi:type="C_DV_QUANTITY">
        <rm_type_name>DV_QUANTITY</rm_type_name>
        <occurrences></occurrences>
        <node_id/>
        <property></property>
        <list>
            <magnitude>
                <lower_included>true</lower_included>
                <upper_included>false</upper_included>
                <lower_unbounded>false</lower_unbounded>
                <upper_unbounded>false</upper_unbounded>
                <lower>100.9</lower>
                <upper>998.7</upper>
            </magnitude>
            <units>mm[Hg]</units>
        </list>
    </children>
</attributes>

</definition>
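
For comparison, here is a rough sketch of how the multiple-pair case might look with a hash-based approach such as XML::Simple. It assumes that repeated sibling elements come back as array references in document order, and that every `node_id` is followed by exactly one `attributes` element so the two arrays can be paired by index. The `$root` structure is written by hand for illustration and is not actual parser output:

```perl
use strict;
use warnings;

# Assumed shape for two node_id/attributes pairs (hand-written, trimmed).
my $root = {
    node_id    => [ 'at0004', 'at0005' ],
    attributes => [
        { children => { list => { magnitude => { lower => '0.0',   upper => '1000.0' } } } },
        { children => { list => { magnitude => { lower => '100.9', upper => '998.7'  } } } },
    ],
};

my @lines;
for my $i (0 .. $#{ $root->{node_id} }) {
    # Pair the i-th node_id with the i-th attributes block.
    my $m = $root->{attributes}[$i]{children}{list}{magnitude};
    push @lines, sprintf(
        "node_id -> %s\n    magnitude -> lower -> %s\n    magnitude -> higher -> %s\n",
        $root->{node_id}[$i], $m->{lower}, $m->{upper},
    );
}

print @lines;
```

Pairing by index breaks down if a `node_id` has no `attributes` sibling, which is one reason the handler-based approach above, which navigates from each `magnitude` back to its own `node_id`, is more robust.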
