Can someone show me how to parse specific information out of an XML file? Am I supposed to use a regex?

I am using XML::Simple to view my test.xml file.

For example, I want to search for the string test-out-00000 and, if it exists, print the corresponding size (135860644).

Data:

$VAR1 = {
          'recursive' => 'no',
          'version' => '0.20.202.1.1101050227',
          'time' => '2011-09-30T02:49:39+0000',
          'filter' => '.*',
          'file' => {
                    'owner' => 'test_act',
                    'replication' => '3',
                    'blocksize' => '134217728',
                    'permission' => '-rw-------',
                    'path' => '/source/feeds/customer/test/test-out-00000',
                    'modified' => '2011-09-30T02:48:41+0000',
                    'size' => '135860644',
                    'group' => '',
                    'accesstime' => '2011-09-30T02:48:41+0000'
                  },
          'path' => '/source/customer/test',
          'directory' => {
                         'owner' => 'test_act',
                         'group' => '',
                         'permission' => 'drwx------',
                         'path' => '/source/feeds/customer/test',
                         'accesstime' => '1970-01-01T00:00:00+0000',
                         'modified' => '2011-09-30T02:48:41+0000'
                       },
          'exclude' => ''
        };
recursive:no
version:0.20.202.1.1101050227
time:2011-09-30T02:49:39+0000
filter:.*
file:HASH(0x84c841c)
path:/source/customer/test
directory:HASH(0x84c7648)
exclude:

Working Perl script:

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

my $xml = $ARGV[0];
my $data = XMLin($xml);
print Dumper( $data );

foreach my $attributes (keys %{$data}){
  print "$attributes:$data->{$attributes}\n";
}

XML file test.xml:

<?xml version="1.0" encoding="UTF-8"?>
<listing time="2011-09-30T02:49:39+0000" recursive="no" path="/source/customer/test" exclude="" filter=".*" version="0.20.202.1.1101050227">
<directory path="/source/feeds/customer/test" modified="2011-09-30T02:48:41+0000" accesstime="1970-01-01T00:00:00+0000" permission="drwx------" owner="test_act" group=""/>
<file path="/source/feeds/customer/test/test-out-00000" modified="2011-09-30T02:48:41+0000" accesstime="2011-09-30T02:48:41+0000" size="135860644" replication="3" blocksize="134217728" permission="-rw-------" owner="test_act" group=""/>
</listing>

1 Answer

I'm assuming you're always looking for the text string in the filename? If so, this is one way of doing it:

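use strict;
use warnings;
use XML::Simple;

# Path to the XML file, passed as the first command-line argument
my $xml = $ARGV[0];
my $data = XMLin($xml);

my $size = 0;

# With a single <file> element, XML::Simple returns it as a plain hash
if (exists $data->{file}->{path} and $data->{file}->{path} =~ /test-out-00000/) {
    $size = $data->{file}->{size};
    print "$size\n";
}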
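
The check above relies on the listing containing exactly one <file> element. If a listing could contain several, XML::Simple would no longer return a plain hash for 'file'. The sketch below is one possible way to handle that case, using the ForceArray and KeyAttr options so that 'file' is always an array reference. The helper name size_for and the second <file> entry (test-out-00001, size 42) are made up for illustration and do not come from the original data.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Trimmed-down sample; the second <file> entry is hypothetical.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<listing path="/source/customer/test">
<file path="/source/feeds/customer/test/test-out-00000" size="135860644"/>
<file path="/source/feeds/customer/test/test-out-00001" size="42"/>
</listing>
XML

# Hypothetical helper: returns the size of the first <file> whose path
# contains $name, or undef if nothing matches.
sub size_for {
    my ($xml_string, $name) = @_;

    # ForceArray keeps 'file' an array ref even when there is only one;
    # KeyAttr => [] switches off folding on name/key/id attributes.
    my $data = XMLin($xml_string, ForceArray => ['file'], KeyAttr => []);

    for my $file (@{ $data->{file} || [] }) {
        next unless defined $file->{path};
        return $file->{size} if index($file->{path}, $name) >= 0;
    }
    return;
}

my $size = size_for($xml, 'test-out-00000');
print defined $size ? "$size\n" : "not found\n";
```

XMLin accepts either a filename or a string of XML, so passing $ARGV[0] instead of the embedded sample should work the same way.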

If your data always follows this format, you could also use XML::LibXML and grab the values directly with an XPath expression.
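
A minimal sketch of the XML::LibXML approach, assuming the document root is <listing> and the file name is matched as a substring of the path attribute. The helper name size_by_xpath is hypothetical, and the sample XML is a trimmed copy of the question's file. The name is interpolated straight into the XPath string, so this assumes it contains no double quotes.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed copy of test.xml from the question.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<listing path="/source/customer/test">
<file path="/source/feeds/customer/test/test-out-00000" size="135860644"/>
</listing>
XML

# Hypothetical helper: returns the size attribute of the first <file>
# whose path contains $name, or undef if there is no such element.
sub size_by_xpath {
    my ($xml_string, $name) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);

    # contains() does a substring match on the path attribute.
    my ($file) = $doc->findnodes(qq{/listing/file[contains(\@path, "$name")]});

    return $file ? $file->getAttribute('size') : undef;
}

my $size = size_by_xpath($xml, 'test-out-00000');
print defined $size ? "$size\n" : "not found\n";
```

To read from a file instead of a string, load_xml also takes location => $ARGV[0].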

1 Comment

Thank you. Yes, I am always looking for a particular file name within the path. This works for me.