
I'm a Perl newbie. I have code in which a variable is assigned a series of values in a foreach loop. I want to perform an operation on that variable only if its value is in a given array. What is the most efficient way to do this in Perl, given that the data I am working with is very large?

A simple example of my question is, say I have an array of fruits I want

@fruits_i_like = qw (mango banana apple);

But I have a $fruit variable in a foreach loop that gets fruit names from a data file containing all kinds of fruits. How would I pick only those values of $fruit that are in my @fruits_i_like array?

  • well the file I need to read is about 50MB. Commented Oct 29, 2010 at 11:43
  • @sfactor : That's not too bad then. Commented Oct 29, 2010 at 11:54
  • If the data file is stable, put the data in an SQLite database, create indexes, then use DBI to issue the appropriate SELECT statement. Commented Oct 29, 2010 at 12:23
  • @Sinan : That might be overkill for a newbie, especially seeing that the file isn't too large. Commented Oct 29, 2010 at 13:29
  • When you iterate over your file, make sure you use while ( my $line = <$fh> ) { blah } instead of for my $line ( <$fh> ) { blah }. The difference between these is context: while evaluates its condition in scalar (boolean) context and reads one line per iteration, whereas for evaluates its list in list context and loads the entire file into memory before it begins processing. Commented Oct 30, 2010 at 2:45

3 Answers


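Perl 5.10 or higher? (Note that the smartmatch operator ~~ has been flagged experimental since Perl 5.18 and emits a warning there.)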

use strict;
use warnings;
use 5.10.0;
my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
if ( $this_fruit ~~ \@fruits_i_like ) {
     say "yummy, I like $this_fruit!";
}

Before 5.10:

use strict;
use warnings;
my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
if ( scalar grep $this_fruit eq $_, @fruits_i_like ) {
     print "yummy, I like $this_fruit!\n";
}

The downside is that grep scans the whole array even after it has found a match. If that is too slow, you can use any() from List::MoreUtils, which returns true as soon as one element matches and does not look at the rest of the array.

use strict;
use warnings;
use List::MoreUtils qw/any/;
my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
if ( any { $this_fruit eq $_ } @fruits_i_like ) {
     print "yummy, I like $this_fruit!\n";
}
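
On newer Perls the same short-circuiting check is available without installing anything from CPAN: List::Util, which ships with Perl, has exported any since version 1.33. Below is a minimal sketch, assuming such a List::Util is present. likes_fruit is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;
use List::Util qw/any/;    # any() requires List::Util 1.33 or later

my @fruits_i_like = qw/mango banana apple/;

# Hypothetical helper: true if the given fruit is in @fruits_i_like.
# any() stops at the first match instead of scanning the whole array.
sub likes_fruit {
    my ($fruit) = @_;
    return any { $fruit eq $_ } @fruits_i_like;
}

for my $this_fruit (qw/banana kiwi/) {
    if ( likes_fruit($this_fruit) ) {
        print "yummy, I like $this_fruit!\n";
    }
}
```

This is still a linear scan per lookup, so it suits short lists or occasional checks rather than millions of lookups.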

Happy hacking!


You can use a hash like this:

my %h = map {$_ => 1 } @fruits_i_like;
if (exists $h{$this_fruit}) {
    # do stuff
}

Here is a benchmark that compares this approach with the grep solution from mfontani's answer:

#!/usr/bin/perl 
use warnings;
use strict;
use Benchmark qw(:all);

my @fruits_i_like = qw/mango banana apple/;
my $this_fruit = 'banana';
my %h = map {$_ => 1 } @fruits_i_like;
my $count = -3;
my $r = cmpthese($count, {
    'grep' => sub {
         if ( scalar grep $this_fruit eq $_, @fruits_i_like ) {
             # do stuff
         }
    },
    'hash' => sub {
        if (exists $h{$this_fruit}) {
             # do stuff
        }
    },
});

Output:

          Rate grep hash
grep 1074911/s   -- -76%
hash 4392945/s 309%   --

3 Comments

  • Change the sub{} to q{} and run that benchmark again. The subroutine call overhead can change the numbers too much.
  • If you create %h just for this purpose, shouldn't it be part of the benchmark?
  • @Øyvind Skaar: I don't think so, because the OP wants to match fruits many times. %h is created only once and used many times. That's different from the grep solution, where the grep is done for every fruit.

This is effectively a lookup problem. It would be faster to look up each value in a hash such as %wantedFruits than to search @fruits_i_like: a hash lookup is O(1) on average, whereas searching an array is O(n).

Convert the array to a hash once, then test each line of the data file against it:

open my $data, '<', 'someBigDataFile.dat' or die "Unable to open file: $!";

my %wantedFruits;
@wantedFruits{@fruits_i_like} = ();  # All fruits_i_like entries are now keys

while (my $fruit = <$data>) {        # Iterates over data file line-by-line

     chomp $fruit;                   # Strip the trailing newline so the key can match

     next unless exists $wantedFruits{$fruit};  # Go to next entry unless wanted

     # ... code will reach this point only if you have your wanted fruit
}
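
As a self-contained sketch of the hash lookup: lines read from a filehandle keep their trailing newline, so each one is chomped before the test. The in-memory filehandle and its sample contents are assumptions that stand in for the real data file, and @picked is a hypothetical name for the collected results.

```perl
use strict;
use warnings;

my @fruits_i_like = qw/mango banana apple/;

# Build the lookup hash once; only the keys matter, so the values stay undef.
my %wantedFruits;
@wantedFruits{@fruits_i_like} = ();

# Assumption: sample data in an in-memory filehandle stands in for the real file.
my $sample = "kiwi\nbanana\ncherry\napple\n";
open my $data, '<', \$sample or die "Unable to open in-memory data: $!";

my @picked;
while ( my $fruit = <$data> ) {    # reads one line at a time
    chomp $fruit;                  # drop the trailing newline before the lookup
    next unless exists $wantedFruits{$fruit};
    push @picked, $fruit;          # reached only for wanted fruits
}
close $data;

print "$_\n" for @picked;
```

Because the hash is built once and each line costs a single lookup, the total work grows with the size of the data file, not with the size of the file times the length of the list.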
