
I am reading an ordered file for which I must count occurrences by hour, by minute, or by second. If requested, I must print times with 0 occurrences (normalized output) or skip them (non-normalized output). The output must obviously be ordered.

I first thought of using an array. When the output is non-normalized, I do roughly the equivalent of:

$array[10] = 100;
$array[10000] = 10000;

And to print the result:

foreach (@array) {
  print "$_\n" if defined;   # still visits every slot, defined or not
}

Is there a way to reduce the iterations to only the elements defined in the array? In the previous example, that would mean doing only two iterations instead of the 10,001 that looping from 0 to $#array implies. I would also need a way to know the current array index inside the loop. Does such a thing exist?
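For reference, a minimal sketch (not part of the original question; the sample values are invented) of two ways to see the current index while looping. Note that both still visit every slot; they only avoid printing the undefined ones, which is why the hash idea below sidesteps the problem entirely:

use strict;
use warnings;

my @array;
$array[10]    = 100;      # sparse sample data, as in the question
$array[10000] = 10000;

# Index loop: the loop variable is the current index.
for my $i (0 .. $#array) {
    next unless defined $array[$i];
    print "$i: $array[$i]\n";
}

# Perl 5.12+: each() also works on arrays and returns (index, value) pairs.
while (my ($i, $val) = each @array) {
    print "$i: $val\n" if defined $val;
}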

I am thinking more and more of using a hash instead. A hash solves my problem and also eliminates the need to convert hh:mm:ss times to an index and back.
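For illustration, a minimal sketch of that hash idea, assuming hh:mm:ss timestamps can be pulled from each input line (the input format used here is hypothetical):

use strict;
use warnings;

my %count;
while (my $line = <STDIN>) {
    # Hypothetical format: extract an hh:mm:ss timestamp from the line.
    my ($time) = $line =~ /(\d\d:\d\d:\d\d)/ or next;
    $count{$time}++;    # or count by its hh:mm / hh prefix for coarser bins
}

# Non-normalized output: only the times that occurred, in chronological
# order (zero-padded hh:mm:ss strings sort correctly as plain strings).
print "$_ $count{$_}\n" for sort keys %count;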

Or do you have a better solution to suggest for this simple problem?

  • A hash is indeed what you need in this case. Commented Oct 30, 2012 at 13:55
  • When the "key" or "index" range is relatively large compared to the number of meaningful elements (i.e., a sparse structure), a hash is better suited. If the number of meaningful elements is high relative to the range of indices (a dense structure), and the cost of computing indices is low, an array can be more time-efficient since it avoids the overhead of the hashing algorithm. Commented Oct 30, 2012 at 16:34
  • The question is, I think, why you use an array in the first place? Are the indexes part of your data? If not, why bother with them? Commented Oct 30, 2012 at 17:27
  • @TLP: the index is time. It is part of the data. Commented Oct 30, 2012 at 18:03
  • Use a hash, or a two-dimensional array, is my advice, e.g. push @array, [ 10, 100 ]. No sense keeping empty array elements. Commented Oct 30, 2012 at 18:34

2 Answers


Yes, use a hash. You can then iterate over a sorted list of the hash's keys, provided your keys sort correctly.
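As a brief illustration (the per-minute keys and counts below are invented), both output styles fall out naturally:

use strict;
use warnings;

# Hypothetical per-minute counts for hour 13.
my %count = ( '13:05' => 3, '13:07' => 1, '13:59' => 12 );

# Non-normalized: only the minutes that occurred, in order.
print "$_ $count{$_}\n" for sort keys %count;

# Normalized: every minute of the hour, printing 0 for the gaps
# (// is the defined-or operator, Perl 5.10+).
for my $m (0 .. 59) {
    my $key = sprintf '13:%02d', $m;
    print "$key ", $count{$key} // 0, "\n";
}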


2 Comments

I settled on a solution using an array. The advantage of the hash is not obvious because my data is generally well distributed over the day, especially when producing per-hour stats. With a hash I wouldn't need to sort the keys, because they represent time values: I simply have to iterate over time and print the corresponding hash values. A hash would save some memory depending on the situation. Maybe it would also be faster (for this problem), but I'd have to test.
Is your data regularly spaced or irregular? If it is irregular, you will end up with gaps which waste space; the more granularity you need in the ability to say '0 events in this period', the more gaps you'll have. Perhaps if you give a couple of real examples it will be clearer.

You can also remember just the pairs of numbers in an array:

#!/usr/bin/perl
use warnings;
use strict;

my @ar = ( [  10, 100 ],
           [ 100,  99 ],
           [  12,   1 ],
           [  13,   2 ],
           [  15,   1 ],
         );

# Non-normalized output: just the recorded pairs, sorted by index.
sub non_normalized {
    my @pairs = sort { $a->[0] <=> $b->[0] } @_;
    return map "@$_", @pairs;
}

# Normalized output: fill the gaps between recorded indices with zero counts.
sub normalized {
    my @pairs = sort { $a->[0] <=> $b->[0] } @_;
    unshift @pairs, [0, 0] unless $pairs[0][0] == 0;
    my @return;
    for my $i (0 .. $#pairs) {
        push @return, "@{ $pairs[$i] }";
        last if $i == $#pairs;    # no gap to fill after the last recorded index
        push @return, $_ . $" . 0 for 1 + $pairs[$i][0] .. $pairs[$i + 1][0] - 1;
    }
    return @return;
}

print join "\n", non_normalized(@ar), q();
print "\n";
print join "\n", normalized(@ar), q();
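With the sample data above, non_normalized() yields just the five sorted pairs (10 100, 12 1, 13 2, 15 1, 100 99), while normalized() yields one line for every index from 0 through 100, filling the gaps with 0.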

