Correct way to compare array elements

Question

I'm writing a piece of code that extracts some numbers from an input file, which holds information for two conditions. The code therefore extracts two numbers from each line and compares them against each other. The snippet below works fine, but I'm having trouble understanding which of the approaches below is 'correct', and why:

Input:

gi|63100484|gb|BC094950.1|_Xenopus_tropicalis_cDNA_clone_IMAGE:7022272  C1:XLOC_017431_0.110169:4.99086,_Change:5.5015,_p:0.00265,_q:0.847141 [95.08]   C2:XLOC_020690_0.050681:9.12527,_Change:7.49228,_p:0.0196,_q:0.967194 [95.08]
gi|6572468|emb|AJ251750.1|_Xenopus_laevis_mRNA_for_frizzled_4_protein_(fz4_gene)        C1:XLOC_027664_1.61212:4.37413,_Change:1.44003,_p:0.00515,_q:0.999592 [99.40]   C2:XLOC_032999_2.94775:14.2322,_Change:2.27147,_p:5e-05,_q:0.0438548 [99.40]
gi|68533737|gb|BC098974.1|_Xenopus_laevis_RDC1_like_protein,_mRNA_(cDNA_clone_MGC:114801_IMAGE:4632706),_complete_cds   C1:XLOC_036220_0.565861:6.52476,_Change:3.52741,_p:0.00015,_q:0.21728 [99.95]   C2:XLOC_043165_0.157752:2.52129,_Change:3.99843,_p:0.02115,_q:0.99976 [99.95]
gi|70672087|gb|DQ096846.1|_Xenopus_laevis_degr03_mRNA,_complete_sequence        C1:XLOC_031048_0.998437:4.20942,_Change:2.07588,_p:0.01365,_q:0.999592 [99.87]  C2:XLOC_037051_1.1335:4.36819,_Change:1.94624,_p:0.01905,_q:0.9452 [99.87]
gi|70672102|gb|DQ096861.1|_Xenopus_laevis_rexp44_mRNA,_complete_sequence        C1:XLOC_049520_12.3353:6.30193,_Change:-0.968926,_p:0.04935,_q:0.999592 [92.90] C2:XLOC_058958_13.0419:5.10275,_Change:-1.35381,_p:0.0373,_q:0.99976 [92.90]
gi|7110523|gb|AF231711.1|_Xenopus_laevis_7-transmembrane_receptor_frizzled-1_mRNA,_complete_cds C1:XLOC_038309_0.784476:2.37536,_Change:1.59835,_p:0.0079,_q:0.999592 [99.94]   C2:XLOC_045678_0.692883:3.52599,_Change:2.34735,_p:0.00125,_q:0.341583 [99.94]


#!/usr/bin/perl 
use strict;
use warnings;
use File::Slurp;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;

my @intersect = read_file('text.txt');

my (@q1, @q2, @change_q, @q_values, @q_value1, @q_value2);
foreach (@intersect) {
    chomp;
    @q_value1 = ($_ =~ /C1:.*?q:(\d+\.\d+)/);
    @q_value2 = ($_ =~ /C2:.*?q:(\d+\.\d+)/);
    push @q_values, "C1:@q_value1\tC2:@q_value2";
        if (abs $q_value1[@_] < abs $q_value2[@_]) {
            push @change_q, $q_value1[@_];
        }
        elsif (abs $q_value2[@_] < abs $q_value1[@_]) {
            push @change_q, $q_value2[@_];
        }
}

print Dumper (\@q_values);
print Dumper (\@change_q);

Output:

$VAR1 = [
          'C1:0.847141  C2:0.967194',
          'C1:0.999592  C2:0.0438548',
          'C1:0.21728   C2:0.99976',
          'C1:0.999592  C2:0.9452',
          'C1:0.999592  C2:0.99976',
          'C1:0.999592  C2:0.341583'
        ];
$VAR1 = [
          '0.847141',
          '0.0438548',
          '0.21728',
          '0.9452',
          '0.999592',
          '0.341583'
        ];

This works perfectly, outputting the smaller of the two 'q-values' on each line. However, replacing @_ with $#_ also works.

As does this approach:

foreach (@intersect) {
    chomp;
    @q_value1 = ($_ =~ /C1:.*?q:(\d+\.\d+)/);
    @q_value2 = ($_ =~ /C2:.*?q:(\d+\.\d+)/);
    push @q_values, "C1:@q_value1\tC2:@q_value2";
        my $q_value1 = $q_value1[0] // $q_value1[1];
        my $q_value2 = $q_value2[0] // $q_value2[1];
        if (abs $q_value1 < abs $q_value2) {
            push @change_q, $q_value1;
        } 
        elsif (abs $q_value2 < abs $q_value1) {
            push @change_q, $q_value2;
        }
}
print Dumper (\@q_values);
print Dumper (\@change_q);

Output:

$VAR1 = [
          'C1:0.847141  C2:0.967194',
          'C1:0.999592  C2:0.0438548',
          'C1:0.21728   C2:0.99976',
          'C1:0.999592  C2:0.9452',
          'C1:0.999592  C2:0.99976',
          'C1:0.999592  C2:0.341583'
        ];
$VAR1 = [
          '0.847141',
          '0.0438548',
          '0.21728',
          '0.9452',
          '0.999592',
          '0.341583'
        ];

TLP · Accepted Answer · 2013-08-27 14:12:29Z

5

"This works perfectly" is putting it a bit strong; "it works by coincidence" would be a better description. You are using the @_ array, its highest index $#_, and the number zero, and getting the same result every time. What you are not realizing is that @_ is empty here: it holds the arguments passed to a subroutine, and this code is not inside one. So when you say

$foo[@_]

You are really saying

$foo[0]

And when you are saying

$foo[$#_]

You are really saying

$foo[-1]

For extra fun, -1 is also a valid array index, meaning the last element of the array. For a one-element array, which is what each of your single-capture matches returns, $foo[-1] and $foo[0] are the same element, so it seems to work fine.

This happens because an array subscript is evaluated in scalar context, where an array such as @_ returns its size, which in this case is 0. $#_ returns -1 when @_ is empty, because there is no highest index.
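The behaviour described above can be checked with a small script. This is only an illustrative sketch: the arrays @one and @two and their contents are made up for the demonstration and do not come from the question.

```perl
use strict;
use warnings;

# At file scope (not inside a subroutine call) @_ holds no elements.
my $size = @_;     # scalar assignment: the array yields its element count
my $last = $#_;    # highest index of @_, which is -1 for an empty array

my @one = ('only');               # hypothetical one-element array
my @two = ('first', 'second');    # hypothetical two-element array

my $one_by_count = $one[@_];     # same as $one[0]
my $one_by_last  = $one[$#_];    # same as $one[-1]
my $two_by_count = $two[@_];     # same as $two[0]
my $two_by_last  = $two[$#_];    # same as $two[-1]

print "size=$size last=$last\n";
print "one: $one_by_count / $one_by_last\n";
print "two: $two_by_count / $two_by_last\n";
```

With one element both subscripts pick the same value ("only"), which is why the two variants look interchangeable. With two elements they diverge ("first" versus "second").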

So, to answer your question: because using @_ is wrong and only works by accident, using the fixed indices 0 and 1 is the better solution.
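One way to avoid array subscripts altogether is to capture each value straight into a scalar. The following is a minimal sketch, not the asker's code: the helper name smaller_q is hypothetical, the two sample lines are abbreviated from the question's input (the long gene names are replaced by gene1 and gene2), and the regexes are the ones from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the smaller of the C1 and C2 q-values
# found on one line, or nothing when either value is missing.
sub smaller_q {
    my ($line) = @_;    # inside a sub, @_ really does hold the arguments

    # A list-context match with one capture group returns a one-element
    # list, so assigning to ($q1) takes the capture directly.
    my ($q1) = $line =~ /C1:.*?q:(\d+\.\d+)/;
    my ($q2) = $line =~ /C2:.*?q:(\d+\.\d+)/;
    return unless defined $q1 && defined $q2;

    return $q1 < $q2 ? $q1 : $q2;
}

# Sample lines abbreviated from the question's input file.
my @lines = (
    "gene1\tC1:XLOC_017431_0.110169:4.99086,_Change:5.5015,_p:0.00265,_q:0.847141 [95.08]\tC2:XLOC_020690_0.050681:9.12527,_Change:7.49228,_p:0.0196,_q:0.967194 [95.08]",
    "gene2\tC1:XLOC_027664_1.61212:4.37413,_Change:1.44003,_p:0.00515,_q:0.999592 [99.40]\tC2:XLOC_032999_2.94775:14.2322,_Change:2.27147,_p:5e-05,_q:0.0438548 [99.40]",
);

my @change_q;
for my $line (@lines) {
    my $q = smaller_q($line);
    push @change_q, $q if defined $q;
}

print "$_\n" for @change_q;
```

Two design choices differ from the question's loop and are assumptions of this sketch: abs is dropped because the regex cannot match a minus sign, and a tie returns the C2 value rather than pushing nothing.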

edited Aug 27, 2013 at 14:12

answered Aug 27, 2013 at 13:59

TLP
