
I'm trying to create a regex that behaves as follows:

print $time . "\n"; --> match only print, because time is a variable here (it has a $ before it)

$epoc = time(); --> match only time

My regex for the moment is /(?-xism:\b(print|time)\b)/g, but it also matches the time in $time in the first example.

Check here.

I tried things like [^\$], but then it doesn't match print anymore.

(I will have more keywords, like print|time|...|...)

Thanks

  • I'm not sure that what you're doing is the right approach, but as far as I can tell you only need a negative lookbehind: (?<!\$). See demo. Commented Apr 13, 2014 at 14:20
  • Thank you, that's exactly it. Post it as an answer and I will accept it. Commented Apr 13, 2014 at 14:24

2 Answers


Parsing Perl code is a common and useful teaching tool, since the student must understand both the parsing techniques and the code they're trying to parse.

However, to do this properly, the best advice is to use PPI.

The following script parses itself and outputs all of the barewords. If you wanted to, you could compare that list of barewords against the keywords you're trying to match. Note that this skips anything inside strings, comments, etc.

use strict;
use warnings;

use PPI;

#my $src = do {local $/; <DATA>};  # Could analyze the smaller code in __DATA__ instead
my $src = do {
    local @ARGV = $0;
    local $/;
    <>;
};

# Load a document
my $doc = PPI::Document->new( \$src );

# Find all the barewords within the doc
my $barewords = $doc->find( 'PPI::Token::Word' );
for (@$barewords) {
    print $_->content, "\n";
}

__DATA__
use strict;
use warnings;

my $time = time;

print $time . "\n";

Outputs:

use
strict
use
warnings
use
PPI
my
do
local
local
my
PPI::Document
new
my
find
for
print
content
__DATA__

5 Comments

+1 for not using regex
That looks great, but I can't use it (Can't locate PPI.pm in @INC), and I'm only allowed to use modules that are already installed.
I've been noticing your fine regex style, but this solution is not regex! Upvoting for original and instructive solution... Perl is mysterious to me. :)
@zx81 Thank you for the compliment. I'm glad you were able to learn something. I actually picked up PPI from another thread, but discovered that it is quite powerful and the preferred technique for this type of problem.
picked it up from another thread... the magic of stackoverflow. We're all learning from somewhere. Nice meeting you and thanks for your message. :)

What you need is a negative lookbehind, (?<!\$). It's zero-width, so it doesn't "consume" characters.

(?<!\$)a means: match a if it is not preceded by a literal $. Note that the $ is escaped, since unescaped it means end of string (or end of line, depending on the m modifier).

Your regex will look like (?-xism:\b(?<!\$)(print|time)\b).

I'm wondering why you are turning off the xism modifiers; they are off by default.
So just use /\b(?<!\$)(?:print|time)\b/g as the pattern.
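As a minimal sketch, here is that pattern applied to the two lines from the question. The helper name find_keywords is hypothetical and only there for illustration; the pattern itself is the one given above.

```perl
use strict;
use warnings;

# The two sample lines from the question (single-quoted, so nothing interpolates).
my @lines = (
    'print $time . "\n";',
    '$epoc = time();',
);

# Match the keywords only when they are not preceded by a literal "$".
my $re = qr/\b(?<!\$)(?:print|time)\b/;

# Hypothetical helper: return every keyword match found in one line of text.
sub find_keywords {
    my ($line) = @_;
    my @found = $line =~ /($re)/g;
    return @found;
}

for my $line (@lines) {
    my @found = find_keywords($line);
    print "$line => @found\n";
}
```

Keep in mind that a plain regex like this knows nothing about strings or comments, so a keyword inside a quoted string would still match.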

Online demo SO regex reference

1 Comment

I'm using xism because in my Perl code I'm doing $var = qr/\b(?<!\$)($words)\b/
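The qr// construction mentioned in this comment could be sketched as below, assuming $words is an alternation joined from a keyword list. The names @keywords and build_keyword_re are hypothetical, and the keyword list is made up for the example. Stringifying the compiled regex shows where the flags wrapper comes from: Perl adds it itself, as (?-xism:...) on older versions and as (?^:...) from Perl 5.14 on.

```perl
use strict;
use warnings;

# Hypothetical keyword list; extend as needed.
my @keywords = qw(print time localtime);

# Hypothetical helper: build a compiled regex from a list of keywords.
# quotemeta guards against keywords that contain regex metacharacters.
sub build_keyword_re {
    my @words = @_;
    my $alternation = join '|', map { quotemeta($_) } @words;
    return qr/\b(?<!\$)(?:$alternation)\b/;
}

my $re = build_keyword_re(@keywords);

# Stringifying a qr// object reveals the flags wrapper Perl puts around it.
print "$re\n";

# Collect every keyword that is not part of a variable name.
my $code  = '$time = localtime(time);';
my @found = $code =~ /($re)/g;
print "@found\n";
```

Writing the pattern directly as /.../g, as suggested in the answer, and compiling it with qr// should behave the same here; the wrapper only records which modifiers are in effect.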
