2

I have a string that looks like this:

abc[1,2,3].something.here,foo[10,6,34].somethingelse.here,def[1,2].another

I want to split this string into an array that consists of:

abc[1,2,3].something.here
foo[10,6,34].somethingelse.here
def[1,2].another

But splitting on the comma won't work, because the bracketed lists contain commas too. My next idea is to first replace the commas between the square brackets with something else, split on the remaining commas, and then restore the original commas afterwards.

I've tried a few approaches with little success. Any suggestions?

5 Answers

4

You can use a look-ahead assertion in the pattern, so that the split happens only on a comma that is followed by a name and an opening bracket:

my $s = "abc[1,2,3].something.here,foo[10,6,34].somethingelse.here,def[1,2].another";
my @a = split /,(?=\w+\[)/, $s;

2 Comments

Excellent, I will research look-ahead! Thanks!
Adding an element without brackets, such as host1.something.here, will break this regex. Where would be a good reference for this look-ahead assertion?
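
To illustrate the point raised in the comment above, here is a minimal sketch comparing the original look-ahead split with a negative look-ahead that splits on any comma not inside brackets. The sample string with host1.something.here and the variable names are made up for illustration, and the second pattern assumes the brackets are balanced and never nested. Look-around assertions are described in the perlre documentation.

```perl
use strict;
use warnings;

# Hypothetical input: the middle element has no bracketed index list.
my $s = "abc[1,2,3].something.here,host1.something.here,def[1,2].another";

# Original pattern: split only where a comma is followed by a name and '['.
# The comma before host1 is not followed by name + '[', so it is not split.
my @lookahead = split /,(?=\w+\[)/, $s;

# Alternative: split on a comma unless a ']' follows it before any other
# bracket, which (for non-nested brackets) means the comma is inside [...].
my @outside = split /,(?![^\[\]]*\])/, $s;

print "look-ahead:    ", scalar(@lookahead), " elements\n";
print "outside-only:  ", scalar(@outside), " elements\n";
print "$_\n" for @outside;
```

With this input the first split leaves the first two elements joined, while the second yields all three.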
1

When things get that complex, I like the parser approach.

#!/usr/bin/perl
use strict;
use warnings;

my $statement  =  "abc[1,2,3].something.here,foo[10,6,34].somethingelse.here,def[1,2].another";

my $index      = qr/\[(?:\d+)(?:,\d+)*\]/;
my $variable   = qr/\w+$index?/;
my $expression = qr/$variable(?:\.$variable)*/;

my @expressions = ($statement =~ /($expression)/g);

print "$_\n" for @expressions;

0

Iterate through the characters in the string like this (pseudocode):

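found_closing_bracket = 0
buffer = ''
array = []

foreach c in str:

   if c == ']'
      found_closing_bracket = 1

   if c == ',' && found_closing_bracket == 1
     # comma after a closing bracket: the current element is complete
     push(array, buffer)
     buffer = ''
     found_closing_bracket = 0

   else
     buffer = buffer + c

# the last element has no trailing comma, so flush it after the loop
push(array, buffer)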

Sure, you could use regular expressions, but personally I'd rather aim for a simpler solution, even if it's more hackish. Regular expressions can be a pain to read.
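
A runnable Perl variation of this character scan might look like the sketch below. It is not a line-by-line port: it tracks bracket depth instead of a single flag, so that elements without brackets also split correctly, and it pushes the final buffer after the loop. The helper name split_outside_brackets and the sample input are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas that sit outside square brackets.
sub split_outside_brackets {
    my ($str) = @_;
    my @parts;
    my $buffer = '';
    my $depth  = 0;    # number of '[' currently open

    for my $c (split //, $str) {
        if ($c eq ',' && $depth == 0) {
            # Top-level comma: the current element is complete.
            push @parts, $buffer;
            $buffer = '';
            next;
        }
        $depth++ if $c eq '[';
        $depth-- if $c eq ']' && $depth > 0;
        $buffer .= $c;
    }

    # The last element has no trailing comma, so flush it here.
    push @parts, $buffer if length $buffer;
    return @parts;
}

my @parts = split_outside_brackets(
    "abc[1,2,3].something.here,host1.something.here,def[1,2].another");
print "$_\n" for @parts;
```

Counting depth rather than remembering the last closing bracket also copes with nested brackets, at the cost of a few more lines.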

0

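An alternative to eugene y's answer is to match each element instead of splitting:

my $s = "abc[1,2,3].something.here,foo[10,6,34].somethingelse.here,def[1,2].another";
# name, bracketed index list, then everything up to the next comma
my @a = ($s =~ /[^,]+\[[\d,]*\][^,]*/g);
print join("\n", @a, "");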

0

This question gave me an excuse to finally take a look at Regexp::Grammars, which I had wanted to try for some time. The following snippet works for your input:

use Regexp::Grammars;
use Data::Dump qw(dd);

my $input
    = 'abc[1,2,3].something.here,foo[10,6,34].somethingelse.here,def[1,2].another';

my $re = qr{
    <[tokens]> ** (,)  # comma separated tokens

    <rule: tokens>     <.token>*
    <rule: token>      \w+ | [.] | <bracketed>
    <rule: bracketed>  \[ <.token> ** (,) \]
}x;

dd $/{tokens}
    if $input =~ $re;

# prints
# [
#   "abc[1,2,3].something.here",
#   "foo[10,6,34].somethingelse.here",
#   "def[1,2].another",
# ]
