1

I have multiple XML files in a folder "c:\srini\perl\in\". All of these files have the same structure. In each file I need to check the values of two tags, SHORT_DESC and XXX_NAME, and if either value contains "@@@", it has to be replaced with "&". Below is the XML file:

<TOPHEADER>
<HEADER>
<NAME>ABC LTD</NAME>
<SHORT_DESC>ABC COMPY @@@ LTD</SHORT_DESC> 
<XXX_NAME>ABC COMPANY FOR XXX AND YYY </XXX_NAME> 
</HEADER>
<HEADER>
<NAME>XYZ LTD</NAME>
<SHORT_DESC>XYZ COMPY @@@ LTD</SHORT_DESC> 
<XXX_NAME>XYZ COMPANY FOR @@@</XXX_NAME> 
</HEADER>
<HEADER>
<NAME>DEF LTD</NAME>
<SHORT_DESC>DEF COMPY AND LTD</SHORT_DESC> 
<XXX_NAME>DEF COMPANY FOR @@@</XXX_NAME> 
</HEADER>
</TOPHEADER>

I'm using the code below to replace the tag value in a single file, but I'd like to know if there is a better way to handle multiple files.

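# Keep the paths in variables so the error messages can name the file;
# the file handle is not yet defined when open fails.
my $in_path  = 'c:\srini\perl\in\test1.xml';
my $out_path = 'c:\srini\perl\in\test1_out.xml';
open (my $input_file, '<', $in_path) or die "unable to open $in_path: $!\n";
open (my $output_file, '>', $out_path) or die "unable to open $out_path: $!\n";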

my $input;
{
local $/;               #Set record separator to undefined.
$input = <$input_file>; #This allows the whole input file to be read at once.
}
$input =~ s/@@@/&/g;

print {$output_file} $input;

close $input_file or die $!;
close $output_file or die $!;
1
  • Also, is there a way to edit the same file in place and replace the value? I don't want new files to be created with the _out extension. Commented May 9, 2013 at 7:10

2 Answers

2

You realize that your output will not be well-formed XML, right? The & needs to be escaped in XML. Hopefully it was just an example and not the real value.

That said, if you want to do this "The XML way"™, for example using XML::Twig, that's pretty simple:

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

my $dir= shift @ARGV or die "usage: $0 <dir>\n";

foreach my $file ( glob( "$dir/*.xml"))
  { XML::Twig->new( twig_roots => { SHORT_DESC => \&replace, # only those elements will be checked
                                    XXX_NAME   => \&replace,
                                  },
                    twig_print_outside_roots => 1,           # the rest will be output as-is
                    keep_spaces => 1,
                  )
             ->parsefile_inplace( $file);                    # the original file will be updated
  }

exit;

sub replace
  { my( $t, $elt)= @_;
    $elt->subs_text( qr/@@@/, '&')->print;
  }

The output will be well-formed XML (i.e. it will look like <SHORT_DESC>ABC COMPY &amp; LTD</SHORT_DESC>). If you do need the & not to be escaped, the line in the sub should be $elt->subs_text( qr/@@@/, '&')->set_asis( 1)->print; the call to set_asis prevents the text of the element from being escaped.

Make sure your original XML is well-formed though, or it will not be processed (you won't lose the data though).
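
As noted at the top of this answer, a bare & is not allowed in XML text. If a file has already been written with unescaped ampersands, something like the following core-Perl sketch could repair it before parsing. The helper name escape_bare_amp is made up for illustration, and the regex assumes that anything shaped like &name; or a numeric reference is already a valid reference, which may not hold for every input.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Twig): escape every "&" that does
# not already start something shaped like an entity or character reference.
sub escape_bare_amp {
    my ($text) = @_;
    $text =~ s/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $text;
}

print escape_bare_amp('ABC COMPY & LTD'), "\n";      # bare & gets escaped
print escape_bare_amp('ABC COMPY &amp; LTD'), "\n";  # already escaped, left alone
```

This only patches text after the fact; letting the XML library do the escaping, as in the code above, is the safer route.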


1 Comment

Thanks for the update and the code, mirod. I ran the code with the actual XML, and "&" comes out as "&amp;". Is there a way to get "&" and not "&amp;"?
0

The opendir/readdir/closedir functions let you iterate over the file system objects of a directory:

use strict;
use warnings;

my $dir = shift @ARGV or die "usage: $0 <dir>\n";   # directory holding the XML files
opendir(my $d, $dir) or die "unable to open directory $dir: $!\n";
for my $name (readdir $d) {
    my $path = "$dir/$name";
    next unless -f $path && $name =~ /\.xml$/;      # plain files ending in .xml only

    open (my $input_file, '<', $path) or die "unable to open $path: $!\n";
    my $input;
    {
        local $/;               #Set record separator to undefined.
        $input = <$input_file>; #This allows the whole input file to be read at once.
    }
    close $input_file;

    $input =~ s/@@@/&/g;

    # Reopen the same path for writing so the file is updated in place.
    open (my $output_file, '>', $path) or die "unable to open $path: $!\n";
    print {$output_file} $input;
    close $output_file or die $!;
}
closedir($d);
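
The substitution above touches every "@@@" in the file, while the question asks for only SHORT_DESC and XXX_NAME. A rough sketch of a tag-restricted variant follows. The helper name replace_in_tags is hypothetical, it assumes the two elements have no attributes and are not nested, and it writes the escaped form &amp; so the result stays well-formed. For anything less regular than the sample file, an XML parser is the more robust choice.

```perl
use strict;
use warnings;

# Hypothetical helper: replace @@@ only inside the two elements named in
# the question. Assumes no attributes on these tags and no nesting.
sub replace_in_tags {
    my ($xml) = @_;
    for my $tag (qw(SHORT_DESC XXX_NAME)) {
        # Copy the captures first, because the inner s/// resets $1..$3.
        $xml =~ s{(<\Q$tag\E>)(.*?)(</\Q$tag\E>)}{
            my ($open, $text, $close) = ($1, $2, $3);
            $text =~ s/@@@/&amp;/g;
            $open . $text . $close;
        }gse;
    }
    return $xml;
}

print replace_in_tags(
    '<NAME>A @@@ B</NAME><SHORT_DESC>ABC COMPY @@@ LTD</SHORT_DESC>'
), "\n";
```

In the loop above, the line $input =~ s/@@@/&/g; could be swapped for $input = replace_in_tags($input); to get this behaviour.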

1 Comment

Hi, thanks for the code, but I'm getting the errors below when running it: Not enough arguments for opendir at replace2.pl line 2, near "$dir or" Final $ should be \$ or $name at replace2.pl line 6, within string syntax error at replace2.pl line 6, near "=~ "\*.xml$"" syntax error at replace2.pl line 20, near "}" Execution of replace2.pl aborted due to compilation errors.
