3

My XML file looks like this:

<eLinkResult>
  <LinkSet>
    <DbFrom>nuccore</DbFrom>
    <IdList>
      <Id>133909243</Id>
    </IdList>
    <LinkSetDb>
      <DbTo>taxonomy</DbTo>
      <LinkName>nuccore_taxonomy</LinkName>
      <Link>
        <Id>417290</Id>
      </Link>
      <Link>
        <Id>417289</Id>
      </Link>
      <Link>
        <Id>405948</Id>
      </Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>

I want to get the contents of all the <Id> elements. I know how to extract the value when there is only one <Id>, like so:

my $test = "Some URL";
my $Result = get($test);
my $Data = $Parser->XMLin($Result);
my $x = 0;
if (exists($Data->{LinkSet}{LinkSetDb}->[0]->{Link}{Id})) {
    $TaxId = $Data->{LinkSet}{LinkSetDb}{Link}->[0]->{Id};
}

or just

if (exists($Data->{LinkSet}{LinkSetDb}{Link}{Id})) {
    $TaxId = $Data->{LinkSet}{LinkSetDb}{Link}{Id};
}

However, when I try this with the XML file above, I get "Not a HASH reference".

I also tried

foreach (@{$Data->{LinkSet}{LinkSetDb}{Link}{Id}}) {
    print $_;
}

But I still get an error. Is there a way to get all the <Id> values without specifying which one I want?

1
  • You need to iterate over all the Link elements, then grab the Id in the loop body. Commented Jun 5, 2013 at 22:12
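With several <Link> elements, XML::Simple stores Link as an array reference, so {Link}{Id} fails with "Not a HASH reference". The sketch below shows one way to loop over the Link elements as the comment suggests. It assumes the response contains exactly one LinkSet and one LinkSetDb, and link_ids is a hypothetical helper name, not something from the question.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical helper: return the text of every <Id> found under
# LinkSet/LinkSetDb/Link. Assumes a single LinkSet and a single LinkSetDb.
sub link_ids {
    my ($xml) = @_;

    # ForceArray => ['Link'] makes Link an array ref even when there is
    # only one <Link>; KeyAttr => [] switches off XML::Simple's key folding.
    my $data = XMLin($xml, ForceArray => ['Link'], KeyAttr => []);

    return map { $_->{Id} } @{ $data->{LinkSet}{LinkSetDb}{Link} };
}

# Trimmed copy of the XML from the question.
my $xml = <<'XML';
<eLinkResult>
  <LinkSet>
    <LinkSetDb>
      <Link><Id>417290</Id></Link>
      <Link><Id>417289</Id></Link>
      <Link><Id>405948</Id></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>
XML

print "$_\n" for link_ids($xml);
```

Forcing Link to an array means the same loop works whether the server returns one link or many, so no exists check on the shape of the data is needed.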

2 Answers

2

Try the XML::Twig parser.

Content of script.pl:

#!/usr/bin/env perl

use warnings;
use strict;
use XML::Twig;

my $twig = XML::Twig->new(
    twig_handlers => {
        'LinkSet/LinkSetDb/Link/Id' => sub {
            printf qq|%s\n|, $_->text_only;
        },  
    },  
)->parsefile( shift );

Run it with the xml file as input argument, like:

perl script.pl xmlfile

That yields:

417290
417289
405948

2

XML::Simple is rarely a good choice for processing XML. It does not accurately represent XML data structures, and in my experience it is far from simple to use, because the Perl data structure it creates is difficult to predict and awkward to navigate.

XML::LibXML and XML::Twig are good candidates. Although XML::Twig can process large XML files piece by piece, you are not obliged to use it that way.

This short program uses XML::Twig to read the complete data structure and print the text values of all the Id elements.

use strict;
use warnings;

use XML::Twig;

my $twig = XML::Twig->new;
$twig->parsefile('xml.xml');
print $_->text, "\n" for $twig->findnodes('//Id');

output

133909243
417290
417289
405948

Update

If you want only the Id elements from within the LinkSetDb part of the data, and not those inside IdList, change the findnodes call to $twig->findnodes('//Link/Id').
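For comparison, here is a minimal sketch of the same query using XML::LibXML, the other candidate mentioned above. It parses the XML from a string rather than a file, and link_ids is a hypothetical helper name introduced only for this illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return the text of every <Id> that sits directly
# inside a <Link> element, using the same XPath as the Update above.
sub link_ids {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    return map { $_->textContent } $doc->findnodes('//Link/Id');
}

# Trimmed copy of the XML from the question, including the IdList entry
# that the XPath is meant to skip.
my $xml = <<'XML';
<eLinkResult>
  <LinkSet>
    <IdList>
      <Id>133909243</Id>
    </IdList>
    <LinkSetDb>
      <Link><Id>417290</Id></Link>
      <Link><Id>417289</Id></Link>
      <Link><Id>405948</Id></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>
XML

print "$_\n" for link_ids($xml);
```

To read from a file instead, load_xml also accepts location => 'xml.xml' in place of the string argument.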
