2

My input file is

TBLA      COLA      A    B    
TBLA      COLB      D    E    
TBLB      COLX      M    N     
TBLB      COLD      A    B   
TBLC      COLD      A    B 

The output should be created in XML format as follows:

<Data>    
    <TBLA>    
        <COLA>
            <oldvalue>A</oldvalue>
            <newvalue>B</newvalue>    
        </COLA>         
        <COLB>    
            <oldvalue>D</oldvalue>    
            <newvalue>E</newvalue>     
        </COLB>       
    </TBLA>    
    <TBLB>     
        <COLX>    
            <oldvalue>M</oldvalue>    
            <newvalue>N</newvalue>    
        </COLX>       
        <COLD>    
            <oldvalue>A</oldvalue>   
            <newvalue>B</newvalue>     
        </COLD>       
    </TBLB>     
    <TBLC>    
        <COLD>    
            <oldvalue>A</oldvalue>    
            <newvalue>B</newvalue>     
        </COLD>   
    </TBLC>  
</Data>     

Can anyone suggest the best way to do this? Should I convert this text file to a hash of hashes first and then try using pltoxml()? Does that make sense? Would XML::Simple or XML::Writer be sufficient for this?

This is the first time I am working with XML, and I am not sure which approach would solve this most efficiently.
A small example matching my requirements would be appreciated.

*The input file will always be sorted on the first field.

4
  • Your output is not XML. What is the root element? Commented Apr 13, 2013 at 8:53
  • The TBLB element contains a COLX node, whereas the input has COLA. Is this a feature or a bug? Please also define the behaviour when the input is not sorted. E.g., how would the additional lines TBLA COLA A B\nTBLC COLA 1 2 change the output? Commented Apr 13, 2013 at 9:01
  • I have corrected the question and added that the input file will always be sorted. One more thing I wanted to clarify: this file is created at run time by a Perl program. Is there any need to create the file first and then create the XML file? Rows are always written in table order, so the first field will always be sorted. Can the XML file be created directly? I am writing to the file with print FILE "TBLA\tCOLA\tA\tB" Commented Apr 13, 2013 at 9:08
  • I have updated the requirement details. Can someone suggest whether I should build the XML with a small program that loops over each file, reads the required values, and writes the XML, or whether this can be achieved with an existing Perl module such as XML::Simple? Commented Apr 16, 2013 at 9:33

2 Answers

2

Given the very simple data structure, it seems a bit unnecessary to use a whole XML writer. However, I'll assume that the table and column names are valid XML tag names.

Here is a simple script that reads through the data without storing it in an intermediary data structure. It works with Perl 5.10 and later.

use strict; use warnings; use feature 'say';

my $last_table;
say '<Data>';
while(<>) {
  chomp;
  my ($table, $col, $old, $new) = split /\t/;
  s/&/&amp;/g, s/</&lt;/g for $old, $new;
  # I'll assume $table and $col have sane names
  if (not defined $last_table) {
    say "  <$table>";
  } elsif ($last_table ne $table) {
    say "  </$last_table>";
    say "  <$table>";
  }
  $last_table = $table;
  say "    <$col>";
  say "      <oldvalue>$old</oldvalue>";
  say "      <newvalue>$new</newvalue>";
  say "    </$col>";
}
say "  </$last_table>" if defined $last_table;
say '</Data>';
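
The script relies on the assumption that table and column names are valid XML tag names, and it escapes only & and < in the values. A minimal sketch of how that assumption could be checked before printing is shown here. The helper names is_xml_name and escape_text are hypothetical (they are not part of the script above), and the pattern accepts only a conservative ASCII subset of what the XML specification allows in names.

```perl
use strict; use warnings;

# Hypothetical helper: accept only a conservative ASCII subset of XML names.
# The XML specification permits many more characters than this.
sub is_xml_name {
  my ($name) = @_;
  return 0 unless defined $name;
  return 0 if $name =~ /^xml/i;   # names starting with "xml" are reserved
  return $name =~ /\A[A-Za-z_][A-Za-z0-9_.-]*\z/ ? 1 : 0;
}

# Hypothetical helper: escape characters that are special in element text.
sub escape_text {
  my ($text) = @_;
  $text =~ s/&/&amp;/g;   # must run first so later entities are not re-escaped
  $text =~ s/</&lt;/g;
  $text =~ s/>/&gt;/g;
  return $text;
}
```

Inside the loop, one option would be to die (or skip the row) when is_xml_name($table) or is_xml_name($col) returns false, and to pass $old and $new through escape_text instead of the inline substitutions.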

1

I recommend using XML::Simple instead of writing your own XML generator. You just need to call:

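use XML::Simple;
# NoAttr emits nested elements instead of attributes; KeyAttr => [] disables key folding
my $xml = XMLout($hashref, RootName => 'Data', NoAttr => 1, KeyAttr => []);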
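
The call above needs a populated $hashref. A minimal sketch of how it might be built from the tab-separated rows follows. The sample rows and the %data variable are illustrative assumptions. The NoAttr and KeyAttr options are included on the assumption that nested elements are wanted, as in the question, rather than attributes.

```perl
use strict; use warnings;
use XML::Simple;

# Sample rows standing in for the tab-separated input file (illustrative only).
my @lines = ("TBLA\tCOLA\tA\tB", "TBLA\tCOLB\tD\tE", "TBLB\tCOLX\tM\tN");

# Build the hash of hashes: table => column => { oldvalue, newvalue }.
my %data;
for my $line (@lines) {
  my ($table, $col, $old, $new) = split /\t/, $line;
  $data{$table}{$col} = { oldvalue => $old, newvalue => $new };
}

# NoAttr => 1 writes scalar values as child elements rather than attributes;
# KeyAttr => [] stops XMLout from folding the inner hash keys into a
# "name" attribute.
my $xml = XMLout(\%data, RootName => 'Data', NoAttr => 1, KeyAttr => []);
print $xml;
```

Two things to keep in mind with this approach: XMLout sorts hash keys alphabetically by default, so newvalue is written before oldvalue and the input order of columns is not preserved, and a repeated table/column pair in the input would overwrite the earlier entry in the hash.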
