1

Input

[security] [client 198.66.91.7] [domain testphp.example.com] [200] [/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD] (null)

Desired output

/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD

Here is what I have so far:

'.*?\[.*?\].*?\[.*?\].*?\[.*?\].*?\[.*?\].*?\[(.*?)\]'

My Perl code.

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

my $txt = '[modsecurity] [client 199.66.91.7] [domain testphp.vulnweb.com] [200] [/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD] (null)';

# Skip the first four bracketed fields and capture the contents of the fifth.
my $re = qr/.*?\[.*?\].*?\[.*?\].*?\[.*?\].*?\[.*?\].*?\[(.*?)\]/is;

if ($txt =~ $re)
{
    my $sbraces1 = $1;
    say $sbraces1;
}

Output

/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD

I think my regex is messy. Is there a cleaner way to do this?

Thanks

  • You should use a split approach. Commented May 3, 2016 at 11:28
  • Your comment should be an answer. Commented May 3, 2016 at 11:35
  • @Deano I added the answer. Commented May 3, 2016 at 11:38

3 Answers

3

I would use split too, or a more general regex than the one you are using:

#!/usr/bin/env perl

use strict;
use warnings;
use Data::Dumper;

my $data = '[security] [client 198.66.91.7] [domain testphp.example.com] [200] [/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD] (null)';

# Capture the contents of every [...] group; /g in list context returns all captures.
my @fields = $data =~ /\[(.*?)\]/g;

print Dumper(\@fields);

The output you get is:

$VAR1 = [
          'security',
          'client 198.66.91.7',
          'domain testphp.example.com',
          '200',
          '/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD'
        ];

So the fifth element of the returned array ($fields[4]) is what you want.
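The split approach mentioned at the top of this answer could look roughly like the sketch below. The helper name bracket_fields is made up for illustration, and the code assumes the fields are separated by "] [" with optional whitespace in between, as in the sample line.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $data = '[security] [client 198.66.91.7] [domain testphp.example.com] [200] [/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD] (null)';

# Hypothetical helper (not from the question): returns the bracketed
# fields of one log line, assuming "] [" separates consecutive fields.
sub bracket_fields {
    my ($line) = @_;
    my @parts = split /\]\s*\[/, $line;   # cut at every "] [" boundary
    return () unless @parts;
    $parts[0]  =~ s/^\s*\[//;             # drop the leading "[" of the first field
    $parts[-1] =~ s/\].*$//s;             # drop the final "]" and anything after it
    return @parts;
}

my @fields = bracket_fields($data);
print "$fields[4]\n";                     # fifth field (index 4)
```

Compared with the regex above, this relies more heavily on the exact layout of the line, so it is probably only worth using when the log format is known to be stable.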

1

Use a negated character class. It generally performs better than a non-greedy quantifier, because the engine does not have to check after every character whether the rest of the pattern matches.

my $txt = '[security] [client 198.66.91.7] [domain testphp.example.com] [200] [/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD] (null)';

my @array = $txt =~ /\[([^\]]+)\]/g;

print "@array\n";

Here is a demo for the negated character class.

Here is a demo for the non-greedy quantifier.
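To check the performance claim on your own data, something like the following sketch could be used. It relies on the core Benchmark module; the sub names fields_lazy and fields_negated are made up for illustration, and the timings will vary by machine and Perl version, so treat the result as indicative only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $txt = '[security] [client 198.66.91.7] [domain testphp.example.com] [200] [/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD] (null)';

# Hypothetical helpers wrapping the two patterns under comparison.
sub fields_lazy {
    my ($s) = @_;
    my @f = $s =~ /\[(.*?)\]/g;        # non-greedy quantifier
    return @f;
}

sub fields_negated {
    my ($s) = @_;
    my @f = $s =~ /\[([^\]]+)\]/g;     # negated character class
    return @f;
}

# Note: the two patterns are not strictly equivalent. "[^\]]+" skips an
# empty "[]" and can span newlines, while ".*?" matches an empty "[]" and
# stops at a newline unless /s is given. On the sample line they agree.

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    lazy    => sub { my @f = fields_lazy($txt) },
    negated => sub { my @f = fields_negated($txt) },
});
```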

1 Comment

I like your solution. You should include the capture groups: /\[([^\]]+)\]/g

0

I created this regex demo:

\[\d{3}\]\s+\[(\S+)\]

My answer is based on the assumption that the path you want to match will always be immediately preceded by an HTTP status code in its own pair of brackets.

Since it is an HTTP status code, we could also write (as in this SO post):

\[[1-5][0-9]{2}\]\s+\[(\S+)\]
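In Perl, the second pattern might be used as in the sketch below. The helper name path_after_status is made up for illustration. Because the capture uses \S+, this assumes the path never contains whitespace.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $txt = '[security] [client 198.66.91.7] [domain testphp.example.com] [200] [/apache/20160503/20160503-0636/20160503-063628-Vyh-LH8AAAEAAE6zC@AAAAAD] (null)';

# Hypothetical helper: returns the bracketed value that directly follows
# a bracketed three-digit status code (100-599), or undef if none is found.
sub path_after_status {
    my ($line) = @_;
    return $line =~ /\[[1-5][0-9]{2}\]\s+\[(\S+)\]/ ? $1 : undef;
}

my $path = path_after_status($txt);
print defined $path ? "$path\n" : "no match\n";
```

Anchoring on the status code means the match does not depend on how many bracketed fields come before it, which may help if the number of leading fields varies.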
