
I am having difficulty getting my head around variable declarations in the following scenario. I have a file with ten words, one per line. First I want to loop through the file and create new files based on the data. Example:

banana
apple
coconut
strawberry

-->

banana.txt
apple.txt
coconut.txt
strawberry.txt

The first problem that I'm having is: how do I assign a unique variable for the file handle for each file in the loop? I would write something like this but I don't know if that's the way to go:

open(my $tokensfh, '<', $tokensfile)
  or die "cannot open file $tokensfile";
chomp(my @tokenslines = <$tokensfh>);
close $tokensfh;

foreach my $token(@tokenslines) {
  open(my $token.'fh', '>>', $token."data.txt");
} 

A bit further down the line I match other data against the $token, but I'm unsure how to deal with the variables:

foreach my $somedata(@data) {
    my $datatoken = $somedata=~  /<fruit>(.+)<\/fruit>/;

    # Do I need a new variable name here?
    foreach my $tokensline(@tokenslines) {
        if ($datalinetoken eq $datatoken ) {
          # print $somedata to specific file
          print $tokensline.'fh' "average run time\n";
        }
    }
}

Do I need a new variable name? If not, how can I re-use the earlier variable without getting variable assignment issues? Is there a better way to do this? (Please answer all questions.)

  • my %fh; foreach my $token (@tokenslines) { open($fh{$token}, '>>', $token."data.txt") or die "A horrible death"; }? Commented Mar 19, 2016 at 19:23
  • A clean solution might come with a good OO flavouring thrown into the kettle ;-) Commented Mar 19, 2016 at 19:31
  • Oh, and some sample data would be good - this looks like XML, so an XML parser is recommended. Commented Mar 19, 2016 at 20:34
  • Please always use strict and use warnings 'all'. Your code would throw several warnings telling you that your $token variable is uninitialized, with the result that you are reopening the file data.txt for output multiple times on the file handle fh Commented Mar 19, 2016 at 20:39
  • @Borodin I did use those two, but seems to clutter the core to the problem here so I didn't include it in the question. I typed the example from scratch and didn't test it because I was just throwing out the general idea. Commented Mar 19, 2016 at 21:07

2 Answers


You can use the same lexical (my) variable name repeatedly as long as each one is declared in a different scope. Perl will warn you if you declare the same variable twice in the same scope. I have used the same name $fh for a file handle in several places in my code below without any consequences
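A minimal sketch of that scoping point, assuming in-memory strings as stand-ins for real files (the data and the @seen array are made up for illustration). Each bare block declares its own $fh, and the two never interfere:

```perl
use strict;
use warnings;

my @seen;

{
    # First scope: $fh reads from an in-memory string (a stand-in for a file)
    open my $fh, '<', \"banana\napple\n" or die "cannot open: $!";
    chomp( my @lines = <$fh> );
    push @seen, @lines;
}   # this $fh goes out of scope here and its handle is closed

{
    # Second scope: a brand-new $fh, unrelated to the first one
    open my $fh, '<', \"coconut\n" or die "cannot open: $!";
    chomp( my @lines = <$fh> );
    push @seen, @lines;
}

print "@seen\n";
```

Declaring my $fh twice inside the same block would instead produce a '"my" variable $fh masks earlier declaration' warning.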

In this case you need the file handles to be opened for most of the program, so you need a whole set of them, and it looks like it's easiest to use a hash, so that you can just pick the correct file handle by indexing the hash with the token string

It would look something like this. Note that I've used use autodie to avoid having to check the status of every IO operation explicitly. You may also want to consider whether you will need to handle the difference between apple, APPLE and Apple, which at the moment will create three file handles (and would confuse Windows dreadfully!)

Oh, and by the way, it's far nicer just to process each file line by line with a while instead of reading it all into an array and processing the data from there

use strict;
use warnings 'all';
use v5.10.1; # autodie is bundled with Perl from v5.10.1
use autodie;

use constant TOKENS_FILE => 'tokens.txt';
use constant XML_FILE    => 'data.xml';

my %token_fh;

{
    open my $fh, '<', TOKENS_FILE;

    while ( <$fh> ) {
        chomp;
        open $token_fh{$_}, '>', "${_}data.txt";
    }
}

{
    open my $fh, '<', XML_FILE;

    while ( <$fh> ) {

        next unless my ($token) = m|<fruit>(.+)</fruit>|;
        next unless my $fh = $token_fh{$token};

        print $fh "average run time\n";
    }
}

close $_ for values %token_fh;
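On the apple / APPLE / Apple point, one possible approach is to lower-case the token before using it as a hash key and file name. This is only a sketch: fh_for is a hypothetical helper, and the temporary directory is there purely so the example does not write into the current directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir( CLEANUP => 1 );    # scratch directory, removed at exit
my %token_fh;

# Hypothetical helper: return the handle for a token, opening it on first use.
# Keys are lower-cased, so apple, APPLE and Apple all share one handle.
sub fh_for {
    my ($token) = @_;
    my $key = lc $token;
    unless ( exists $token_fh{$key} ) {
        open $token_fh{$key}, '>', "$dir/${key}data.txt"
            or die "cannot open $dir/${key}data.txt: $!";
    }
    return $token_fh{$key};
}

print { fh_for($_) } "average run time\n" for qw(apple APPLE Apple banana);

close $_ for values %token_fh;

print join( ',', sort keys %token_fh ), "\n";
```

Whether folding case is the right behaviour depends on the data; if Apple and apple really are different tokens, the keys should be left alone.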



An alternative way would be to forget about the tokens file altogether, and just open files as they are encountered in the XML. That would look like this

use strict;
use warnings 'all';
use v5.10.1; # autodie is bundled with Perl from v5.10.1
use autodie;

use constant XML_FILE    => 'data.xml';

my %token_fh;

open my $fh, '<', XML_FILE;

while ( <$fh> ) {

    next unless my ($token) = m|<fruit>(.+)</fruit>|;

    unless ( exists $token_fh{$token} ) {
        open $token_fh{$token}, '>', "${token}data.txt";
    }
    my $fh = $token_fh{$token};

    print $fh "average run time\n";
}

close $_ for values %token_fh;


Don't do this. It's very nasty to use a variable variable name. See this link for a more detailed explanation of why: http://perl.plover.com/varvarname.html
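As a small illustration of what goes wrong (a sketch, with $name and $err as made-up variable names): building a name such as 'bananafh' only gives you a string, and under use strict Perl refuses to treat that string as a variable.

```perl
use strict;
use warnings;

my $token = 'banana';
my $name  = $token . 'fh';    # just the string 'bananafh', not a variable

# Dereferencing a plain string is a symbolic reference,
# which "use strict" rejects at run time.
my $ok  = eval { my $value = ${$name}; 1 };
my $err = $ok ? '' : $@;

print "strict refused it: $err" unless $ok;
```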

If you need named filehandles, you're much better off using a hash of filehandles. A hash is a portable namespace, which is exactly what you need here.

So:

my %fh_for;
foreach my $token ( @tokenslines ) {
   open( $fh_for{$token}, '>', "$token.txt" ) or die "cannot open $token.txt: $!";
}

foreach my $datalinetoken (@tokenslines) {
    if ($datalinetoken eq $datatoken ) {
      # print $somedata to specific file
      print {$fh_for{$datalinetoken}} "average run time\n";
    }
}

Then you can write to a filehandle keyed by your token name, without needing the icky messiness of dynamic variable naming. Note, I've wrapped your fh in {} - it's necessary to tell perl to evaluate the expression and use the result as the filehandle.
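A short sketch of the two ways to print to a handle stored in a hash, assuming in-memory buffers (the %buffer hash is made up here so that no real files are touched). The print calls also check their return value, which is cheap insurance:

```perl
use strict;
use warnings;

my %fh_for;
my %buffer;    # in-memory targets, so the sketch creates no files

for my $token (qw(banana coconut)) {
    $buffer{$token} = '';
    open $fh_for{$token}, '>', \$buffer{$token} or die "cannot open: $!";
}

# Form 1: a block that returns the handle
print { $fh_for{banana} } "average run time\n"
    or warn "print failed: $!";

# Form 2: copy the handle into a plain scalar first
my $fh = $fh_for{coconut};
print $fh "average run time\n"
    or warn "print failed: $!";

close $_ for values %fh_for;
```

Writing print $fh_for{banana} "..." without the braces does not parse as intended, because print only recognises a bareword or a simple scalar variable in the filehandle slot.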

5 Comments

From print, it says: If you're storing handles in an array or hash, or in general whenever you're using any expression more complex than a bareword handle or a plain, unsubscripted scalar variable to retrieve it, you will have to use a block returning the filehandle value instead, in which case the LIST may not be omitted:. So, I think this says the block is mandatory.
So, even though $token and $datatoken would be the same thing, we cannot use the same variable name? (Note that you forgot the embedded for loop.)
Yeah, good point. I wasn't sure - I know I've hit places where it was, I just couldn't remember if this was one.
You can use the same variable name if you wish. I just used what you were using. As long as the variable contains 'coconut' it will print to the coconut filehandle. You should probably include an 'or warn' after the print, to make sure it does work.
"It's very nasty to use a variable variable name" It's actually wrong here because open my $token.'fh', '>>', $token."data.txt" will declare a new $token set to undef, and then do open 'fh', '>>', 'data.txt'. It will warn about the uninitialised value in each case of course.
