2

I am a beginner in Perl. I have a text file with content similar to the sample below. I need to extract the value from VALUE="<NEEDED VALUE>". For SPINACH, say, I should get only SALAD.

How can I use a Perl regex to get the value? I need to parse multiple lines to get it, i.e. the lines between each #ifonly and #endifonly.

$ cat check.txt
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly

while (<$file>)
{
   if (m/#ifonly .+ SPINACH .+ VALUE=(")([\w]*)(") .+ #endifonly/g)
   {
      my $chosen = $2;
   }
}

5 Answers

5
use strict;
use warnings;
use 5.010;

while (<DATA>) {
   my $rc = /#ifonly .+ SPINACH/ .. (my ($value) = /VALUE="([^"]*)"/);
   next unless $rc =~ /E0$/;
   say $value;
}

__DATA__
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes" 
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes" 
as df fg
#endifonly

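This uses a small trick described by brian d foy (linked in the original answer). As that post describes, it relies on the range operator in scalar context, the flip-flop: it returns a sequence number for each line in the range and appends E0 to the number on the range's last line, which is what the /E0$/ test looks for.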
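To make the E0 behaviour easier to see, here is a minimal sketch that records what the flip-flop returns for each line. The helper name flipflop_trace and the sample lines are assumptions made for illustration and are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: return the flip-flop's value
# for each line that is passed in.
sub flipflop_trace {
    my @trace;
    for (@_) {
        # Scalar assignment puts '..' in scalar context, which makes it
        # a flip-flop instead of a list range.
        my $rc = /^#ifonly/ .. /^#endifonly/;
        push @trace, $rc;
    }
    return @trace;
}

# Sample input, modelled on the question's file.
my @lines = (
    "before\n",
    "#ifonly APPLE CARROT SPINACH\n",
    qq{VALUE="SALAD" REQUIRED="yes"\n},
    "#endifonly\n",
    "after\n",
);

my @trace = flipflop_trace(@lines);
for my $i (0 .. $#lines) {
    printf "%-6s %s", "[$trace[$i]]", $lines[$i];
}
```

Outside the range the operator returns the empty string; inside it returns 1, 2, and so on, and the last line of the range gets "3E0" here. The answer's code ends its range on the VALUE line instead of on #endifonly, so the E0 line is exactly the one where $value was captured.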


3 Comments

Also, somewhat shorter: next unless (/#ifonly .+ SPINACH/ .. (my ($value) = /VALUE="([^"]*)"/)) =~ /E0$/; But frankly, it breaks my indent, so I wouldn't use it. : ) There's also quite a bit going on there, which may not be the best for maintainability.
Pretty cool approach, and (once more) the link you posted has taught me something, so thanks for that!
@canavanin I have the links. All of them! You are welcome - the Effective Perler is my favorite Perl blog, so it's always a pleasure to direct people there.
1

In case your file is very big (or you want to read it line by line for some other reason) you could do it as follows:

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Long;

my ($file, $keyword);

# now get command line options (see Usage note below)
GetOptions(
            "f=s" => \$file,
            "k=s" => \$keyword,
          );

# if either the file or the keyword has not been provided, display a
# help text and exit
if (! $file || ! $keyword) {
   print STDERR<<EOF;

   Usage: script.pl -f filename -k keyword

EOF
   exit(1);
}

my $found;         # true while inside a block whose #ifonly line has the keyword
my $returned_word; # will store the word you want to retrieve

open my $fh, '<', $file or die "Cannot open file '$file': $!";
while (<$fh>) {
   # re-check the keyword on every block header, so that a match in an
   # earlier block does not carry over into later blocks
   if (/^#ifonly/) {
      $found = /\b\Q$keyword\E\b/ ? 1 : 0;
   }

   # the flip-flop is true from a line starting with '#ifonly' through
   # the next line starting with '#endifonly'; the parentheses are
   # needed because '&&' binds more tightly than '..'
   if ((/^#ifonly/ .. /^#endifonly/) && $found) {
      if (/VALUE="(\w+)"/) {
         $returned_word = $1;
         print "looking for $keyword --> found $returned_word\n";

         last; # if you want the values of ALL blocks that contain the
               # keyword, remove the 'last' statement, as it makes the
               # script exit the while loop
      }
   }
}
close $fh;


0

You can read the file contents into a string and then search for the pattern in the string:

my $file;    
$file.=$_ while(<>);    
if($file =~ /#ifonly.+?\bSPINACH\b.+?VALUE="(\w*)".+?#endifonly/s) {
        print $1;
}

Your original regex needs some tweaking:

  • You need to make your quantifiers non-greedy.
  • Use the s modifier to make . match newlines as well.
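The effect of the two tweaks can be checked with a small sketch that runs a greedy and a non-greedy version of the pattern over the question's sample data. The helper first_value and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# The question's sample file as a single string.
my $text = <<'END';
#ifonly APPLE CARROT SPINACH
VALUE="SALAD" REQUIRED="yes"
QW RETEWRT OIOUR
#endifonly
#ifonly APPLE MANGO ORANGE CARROT
VALUE="JUICE" REQUIRED="yes"
as df fg
#endifonly
END

# Hypothetical helper: return the first capture group, or undef if the
# pattern does not match.
sub first_value {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? $1 : undef;
}

# Both patterns use /s so that '.' also matches newlines.
my $greedy = qr/#ifonly.+\bSPINACH\b.+VALUE="(\w*)".+#endifonly/s;
my $lazy   = qr/#ifonly.+?\bSPINACH\b.+?VALUE="(\w*)".+?#endifonly/s;

print "greedy: ", first_value($text, $greedy), "\n";
print "lazy:   ", first_value($text, $lazy),   "\n";
```

With greedy quantifiers, the .+ after SPINACH runs to the end of the string and backtracks only as far as the last VALUE=", so the capture is JUICE from the second block. The non-greedy version stops at the first VALUE=" after SPINACH and captures SALAD.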


0

Here's another answer based on the flip-flop operator:

use strict;
use warnings;
use 5.010;

while (<$file>)
{
  if ( (/^#ifonly.*\bSPINACH\b/ .. /^#endifonly/) &&
       (my ($chosen) = /^VALUE="(\w+)"/) )
  {
    say $chosen;
  }
}

This solution applies the second test to all of the lines in the range. The trick @Hugmeir used to exclude the start and end lines isn't needed because the "inner" regex, /^VALUE="(\w+)"/, can never match them anyway (I added the ^ anchor to all regexes to make doubly sure of that).
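For anyone who wants to try this without a file, here is a self-contained sketch of the same two-test condition wrapped in a function. The name values_for, the keyword parameter, and the in-memory @lines are assumptions added for illustration; the condition itself is the one shown above with the keyword interpolated.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the VALUE of every block whose #ifonly
# line mentions $keyword as a whole word.
sub values_for {
    my ($keyword, @lines) = @_;
    my @values;
    for (@lines) {
        # First test: are we inside a block that names the keyword?
        # Second test: does this line carry a VALUE="..." entry?
        if ( (/^#ifonly.*\b\Q$keyword\E\b/ .. /^#endifonly/) &&
             (my ($chosen) = /^VALUE="(\w+)"/) )
        {
            push @values, $chosen;
        }
    }
    return @values;
}

# The question's sample data, one line per element.
my @lines = (
    "#ifonly APPLE CARROT SPINACH\n",
    qq{VALUE="SALAD" REQUIRED="yes"\n},
    "QW RETEWRT OIOUR\n",
    "#endifonly\n",
    "#ifonly APPLE MANGO ORANGE CARROT\n",
    qq{VALUE="JUICE" REQUIRED="yes"\n},
    "as df fg\n",
    "#endifonly\n",
);

my @spinach = values_for('SPINACH', @lines);
my @apple   = values_for('APPLE',   @lines);
print "SPINACH: @spinach\n";
print "APPLE:   @apple\n";
```

One caveat: the flip-flop keeps its state between calls, so this sketch assumes every #ifonly in the input is closed by an #endifonly before the function returns.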


0

These two lines from one of the earlier answers

my $file;
$file.=$_ while(<>);

are not very efficient. Perl will likely read the file in big chunks, split those chunks into lines of text for <>, and then .= will join those lines back together into one big string. It is more efficient to slurp the file. The basic idiom is to undefine $/, the input record separator.

undef $/;
$file = <>;

The module File::Slurp (see perldoc File::Slurp) may be even better.
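A common variation, sketched here as an assumption and not as part of the answer, is to use local $/ inside a small function so that the separator is undefined only for the duration of the read. The helper name slurp and the in-memory filehandle are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: read everything from an open filehandle at once.
sub slurp {
    my ($fh) = @_;
    local $/;              # undef inside this sub only; restored on return
    return scalar <$fh>;   # with $/ undefined, one read returns the whole file
}

# An in-memory filehandle stands in for a real file here.
my $data = "line one\nline two\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $contents = slurp($fh);
close $fh;

print "read ", length($contents), " characters\n";
```

Unlike a plain undef $/, the local form leaves line-by-line reads elsewhere in the program unaffected, because $/ goes back to its previous value when the function returns.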
