I have a very long XML file and want to update an attribute value on a tag that is deeply nested, so I don't want to walk the tree node by node. The structure around the target node is also not always the same, as the sample below shows. The input XML is:

<Re>
<Co Class="Parameter" ID="CSCP001" Status="Available">
<FileSpec URL="c://mine/testfiles/wln/c.txt"/>
<CoOp Operation="Tag" SourceCS="RGB" SourceObjects="All">
<FileSpec Resource="SourceProfile" URL="c://mine/testfiles/wln/d.txt"/>
</CoOp>
</Co>
<Ru Class="Parameter" ID="IDR002" PartIDKeys="Run" Status="Available">
<Ru EndOfDocument="true" Pages="0" Run="1" RunTag="First">
<La>
<FileSpec URL="c://mine/testfiles/wln/e.txt"/>
</La>
</Ru>
</Ru>
</Re>

I want the output XML to be:

<Re>
<Co Class="Parameter" ID="CSCP001" Status="Available">
<FileSpec URL="d://yours/wln/c.txt"/>
<CoOp Operation="Tag" SourceCS="RGB" SourceObjects="All">
<FileSpec Resource="SourceProfile" URL="d://yours/wln/d.txt"/>
</CoOp>
</Co>
<Ru Class="Parameter" ID="IDR002" PartIDKeys="Run" Status="Available">
<Ru EndOfDocument="true" Pages="0" Run="1" RunTag="First">
<La>
<FileSpec URL="d://yours/wln/e.txt"/>
</La>
</Ru>
</Ru>
</Re>

I tried XML::Simple and XML::LibXML but could not get the required result. I am new to Perl programming.

use XML::LibXML qw( );
use XML::LibXML;
use Data::Dumper;  

my $xml = "a.txt";
my $xpath_expression = 'FileSpec';

my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($xml) or warn "Could not";

my $parser1 = XML::LibXML::Element->new($xml);


for my $FileSpec1 ($doc->getElementsByTagName('FileSpec')) 
{
print $FileSpec1;
my $xpath = '$FileSpec1/@URL';
my ($attr) = $doc->findnodes($xpath);    
$attr->setValue('dfdsa'); 
my ($URL1) = $FileSpec1->findvalue('@URL');
print $URL1;
}

I also tried $node->setAttribute( $aname, $avalue ); but it throws exceptions. Please advise.

2 Answers

Your code is more complicated than it needs to be. You don't need a separate parser object or an element constructor; just find the URL attributes with XPath and change them:

#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

my $xml = 'XML::LibXML'->load_xml(location => 'a.xml') ;

for my $url ($xml->findnodes('//FileSpec/@URL')) {
    my $value = $url->getValue;
    $value =~ s{c://mine/testfiles}{d://yours};
    $url->setValue($value);
}

$xml->toFile('new.xml');
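
For comparison, here is a minimal sketch of the element-based route the question was attempting. The XPath string '$FileSpec1/@URL' in the question is single-quoted, so the XPath engine receives that literal text rather than the node, and the lookup fails before setValue is reached. Calling setAttribute directly on each FileSpec element avoids the XPath step. The sub name rewrite_urls and the inline sample are illustrative assumptions, not part of the original code.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

# Rewrite the URL attribute on every FileSpec element, at any depth.
# rewrite_urls is a hypothetical helper; $old and $new are the prefixes
# taken from the question's sample data.
sub rewrite_urls {
    my ($doc, $old, $new) = @_;
    for my $spec ($doc->getElementsByTagName('FileSpec')) {
        next unless $spec->hasAttribute('URL');   # skip FileSpec without URL
        my $url = $spec->getAttribute('URL');
        $url =~ s{^\Q$old\E}{$new};               # replace the leading prefix only
        $spec->setAttribute(URL => $url);
    }
    return $doc;
}

# Shortened, in-memory version of the question's input.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<Re>
  <Co>
    <FileSpec URL="c://mine/testfiles/wln/c.txt"/>
    <CoOp>
      <FileSpec Resource="SourceProfile" URL="c://mine/testfiles/wln/d.txt"/>
    </CoOp>
  </Co>
</Re>
XML

rewrite_urls($doc, 'c://mine/testfiles', 'd://yours');
print $doc->toString;
```

To work on a file instead, load with load_xml(location => ...) and write with toFile, as in the code above.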

6 Comments

This is not working for me. I even tried printing $value after my $value = $url->getValue; but it prints nothing. new.xml is created with the old details only.
@user2786324: it's definitely working with the sample XML you provided in the question. So are you using a different XML for testing?
Yes, my bad... this works for the sample XML. I am using another XML file that is quite long but has a similar structure. Let me find out.
@user2786324: Wild guess: namespaces involved?
@user2786324: You must use namespace prefixes in XPath expressions. You might need to use XML::LibXML::XPathContext
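
If namespaces do turn out to be the cause, the following is a rough sketch of the XML::LibXML::XPathContext approach mentioned in the last comment. The namespace URI http://example.com/ns and the prefix j are made up for illustration; substitute the xmlns value that the real document declares.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical input with a default namespace on the root element.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<Re xmlns="http://example.com/ns">
  <Co>
    <FileSpec URL="c://mine/testfiles/wln/c.txt"/>
  </Co>
</Re>
XML

# An unprefixed name in XPath matches only elements in no namespace,
# so this query finds nothing in a document like the one above.
my @plain = $doc->findnodes('//FileSpec/@URL');

# Bind a prefix of our choosing (an assumption) to the document's namespace.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(j => 'http://example.com/ns');

# Unprefixed attributes are in no namespace, so @URL needs no prefix.
for my $url ($xpc->findnodes('//j:FileSpec/@URL')) {
    my $value = $url->getValue;
    $value =~ s{c://mine/testfiles}{d://yours};
    $url->setValue($value);
}

print scalar(@plain), " match(es) without a prefix\n";
print $doc->toString;
```

The prefix used in the XPath expression does not have to match any prefix in the file; only the namespace URI has to match.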

You can try the XML::Twig module. Its twig_handlers option selects the tags you want and triggers a handler for each one. Inside the handler, the default variable $_ holds the element, and its set_att() method lets you change an attribute value easily:

#!/usr/bin/env perl

use warnings;
use strict;
use XML::Twig;

my $new_url = q{d://yours/wln/d.txt};

my $twig = XML::Twig->new(
        twig_handlers => {
                'FileSpec' => sub { $_->set_att( 'URL', $new_url ) }
         },
        pretty_print => 'indented',
)->parsefile( shift )->print();

Run it like:

perl script.pl xmlfile

That yields:

<Re>
  <Co Class="Parameter" ID="CSCP001" Status="Available">
    <FileSpec URL="d://yours/wln/d.txt"/>
    <CoOp Operation="Tag" SourceCS="RGB" SourceObjects="All">
      <FileSpec Resource="SourceProfile" URL="d://yours/wln/d.txt"/>
    </CoOp>
  </Co>
  <Ru Class="Parameter" ID="IDR002" PartIDKeys="Run" Status="Available">
    <Ru EndOfDocument="true" Pages="0" Run="1" RunTag="First">
      <La>
        <FileSpec URL="d://yours/wln/d.txt"/>
      </La>
    </Ru>
  </Ru>
</Re>
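
Note that the handler above assigns the same $new_url to every FileSpec, which is why all three URLs in the output end in d.txt. To keep each file name and change only the leading path, as the question's expected output does, the handler can read the current value and substitute the prefix. This is a sketch under that assumption; the sub name rewrite_xml and the inline sample are illustrative only.

```perl
#!/usr/bin/env perl

use warnings;
use strict;
use XML::Twig;

# Hypothetical helper: takes an XML string plus old and new path prefixes
# and returns the rewritten XML as a string.
sub rewrite_xml {
    my ($xml, $old, $new) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            'FileSpec' => sub {
                my $url = $_->att('URL');
                return unless defined $url;       # FileSpec without a URL
                $url =~ s{^\Q$old\E}{$new};       # swap the leading prefix only
                $_->set_att( 'URL', $url );
            },
        },
        pretty_print => 'indented',
    );

    $twig->parse($xml);
    return $twig->sprint;
}

# Shortened version of the question's input.
my $input = <<'XML';
<Re>
<Co Class="Parameter" ID="CSCP001" Status="Available">
<FileSpec URL="c://mine/testfiles/wln/c.txt"/>
</Co>
<Ru Class="Parameter" ID="IDR002" PartIDKeys="Run" Status="Available">
<La>
<FileSpec URL="c://mine/testfiles/wln/e.txt"/>
</La>
</Ru>
</Re>
XML

print rewrite_xml($input, 'c://mine/testfiles', 'd://yours');
```

For a file on disk, parsefile( shift )->print() can be used in place of parse and sprint, as in the script above.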

EDIT: mirod pointed out in the comments a more memory-efficient version that uses twig_roots:

#!/usr/bin/env perl

use warnings;
use strict;
use XML::Twig;

my $new_url = q{d://yours/wln/d.txt};

my $twig = XML::Twig->new(
        twig_roots => {
                'FileSpec' => sub { $_->set_att( 'URL', $new_url ); $_->flush }
        },
        twig_print_outside_roots => 1,
        pretty_print => 'indented',
)->parsefile( shift );

3 Comments

If you replace twig_handlers with twig_roots, add $_->flush at the end of the handler, and add the twig_print_outside_roots => 1 option to the constructor, which becomes my $twig = XML::Twig->new( twig_roots => { 'FileSpec' => sub { $_->set_att( 'URL', $new_url ); $_->flush; } }, twig_print_outside_roots => 1, pretty_print => 'indented', ), then the file will be parsed but not loaded entirely into memory, so the memory footprint should be minimal.
@mirod: Thanks. I'm not used to using twig_roots, but it's an option worth taking into account. I've added it to the answer so it's easier to read, but you should post your own answer with it. Let me know if you do, and I'll remove it from mine.
Using twig_roots is not required unless the file is really big and doesn't fit in memory. In most cases your answer is fine. I only commented because of the "very long XML" in the question.
