I'm brand new to Perl. Based on the documentation I have read, the split function asks for a regex pattern rather than a string delimiter as its first parameter, yet I found that something like print +(split(' ', $string))[0] still splits the string correctly.

Based on that, I tried a variable delimiter (e.g. print +(split($var, $string))[0] where $var = ' ') and found that it did not work. What am I doing wrong?

Thanks!

EDIT: Sorry for the terrible question. I was running this against a string with leading spaces and found that the split function didn't like the leading spaces. For example:

my $var = ' '; print +(split($var, ' abc ddddd'))[0] gives a blank output. Is $var being interpreted as /$var/ inside the split function?

versus

print +(split(' ', ' abc ddddd'))[0] which gives an output of abc

So when I read the docs I was assuming my variable would be considered a literal string, when in reality it was not, and therefore the leading whitespace was not stripped.

  • Works fine ... eval.in/240865 Commented Jan 9, 2015 at 16:01
  • Any time you have a question about one of Perl's built-in functions, I highly recommend you check perldoc. If it's installed on your system, you can run perldoc -f <function>, or in this case, perldoc -f split (I linked to the online version for convenience). The documentation is excellent. Your uncertainty about split using regex vs. string is explained in detail, including the special case of split ' '. Commented Jan 9, 2015 at 16:04
  • What does "doesn't work" mean? "Doesn't work" is an inadequate description for us to understand the problem. What happened when you tried it? Did you get an error message? Did you get incorrect results? Did you get no results? If the results were incorrect, what made them incorrect? What were you expecting instead? Did you get any correct results? If so, what were they? Don't make us guess. Commented Jan 9, 2015 at 16:08
  • Because split ' ' uses a literal string containing one space as the pattern, which invokes the special case described in the perldoc. $var = ' '; split $var will be equivalent to split / /, which is a regex split and not the same thing. Commented Jan 9, 2015 at 16:09

2 Answers

Explanation

When you split on a literal space

split ' '

you invoke the special case described in the documentation. When you use a variable instead

my $var = ' ';
split $var;

it is the same as putting that variable inside a regex (at least on Perls before 5.18.0; see the last sentence of the documentation quoted further down):

split /$var/;

This splits on each single space character, which is not the same thing. If for example you have this code:

my $string = "foo bar   baz";
my @literal = split ' ', $string;
my @space = split / /, $string;

Then @literal will contain "foo", "bar", "baz", and @space will contain "foo", "bar", "", "", "baz" -- empty fields where it has split on the single spaces.
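The same difference explains the leading-space case from the question. Here is a minimal sketch using the question's own string; the variable names @awk and @single are just mine for illustration:

```perl
use strict;
use warnings;

my $string = ' abc ddddd';

# Literal ' ' triggers the awk-style special case: leading whitespace
# is dropped first, then the string is split on runs of whitespace.
my @awk = split ' ', $string;

# A real regex matching one space: the leading space is treated as a
# separator, so the first field is the empty string.
my @single = split / /, $string;

print scalar(@awk),    " fields, first is '$awk[0]'\n";     # 2 fields, first is 'abc'
print scalar(@single), " fields, first is '$single[0]'\n";  # 3 fields, first is ''
```

That empty first field is the "blank output" seen when printing element [0].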


Documentation

This is how the documentation describes it:

As another special case, split emulates the default behavior of the command line tool awk when the PATTERN is either omitted or a literal string composed of a single space character (such as ' ' or "\x20", but not e.g. / /). In this case, any leading whitespace in EXPR is removed before splitting occurs, and the PATTERN is instead treated as if it were /\s+/; in particular, this means that any contiguous whitespace (not just a single space character) is used as a separator. However, this special treatment can be avoided by specifying the pattern / / instead of the string " ", thereby allowing only a single space character to be a separator. In earlier Perls this special case was restricted to the use of a plain " " as the pattern argument to split, in Perl 5.18.0 and later this special case is triggered by any expression which evaluates as the simple string " ".
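Because of that last sentence, what split $var does with $var = ' ' depends on the Perl version. A rough sketch for checking it on your own installation; the variable names are mine, and I deliberately give no expected output for the version-dependent line:

```perl
use strict;
use warnings;

my $string = ' abc ddddd';
my $var    = ' ';     # plain string holding one space
my $re     = qr/ /;   # compiled regex matching exactly one space

# Version dependent: treated as the special case on 5.18.0 and later,
# and like / / on older Perls.
my @from_var = split $var, $string;

# A qr// object is used as a regex, so the leading space produces
# an empty first field.
my @from_re = split $re, $string;

printf "Perl %vd\n", $^V;
printf "split \$var: %d fields\n", scalar @from_var;
printf "split \$re:  %d fields\n", scalar @from_re;
```

If you need the single-space behaviour from a variable regardless of version, holding the pattern in a qr// object as above should be the safer choice.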

Workaround

Note that if you are looking for a way to dynamically emulate the ' ' splitting by using a variable, you might use /\s+/ instead. It is not quite the same: it does not strip leading whitespace, so a string that starts with whitespace yields an empty first field. Otherwise it should work as expected.
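If the leading empty field matters, one option is to trim the string yourself before splitting. This is only a sketch: awk_split is a hypothetical helper name, not anything built in.

```perl
use strict;
use warnings;

# Hypothetical helper: emulate split ' ' without relying on the special case.
sub awk_split {
    my ($string) = @_;
    my $sep = qr/\s+/;                      # separator held in a variable
    (my $trimmed = $string) =~ s/^\s+//;    # drop leading whitespace first
    return split $sep, $trimmed;
}

my @fields = awk_split("  foo bar   baz ");
print join('|', @fields), "\n";   # foo|bar|baz
```

Trailing whitespace needs no extra handling here, since split drops trailing empty fields by default.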

2 Comments

Thank you! And way to understand my issue exactly :)
This seems to have changed in newer Perl versions. Currently, split ' ' behaves exactly the same way as split $var where $var=' '. If you want the behavior similar to split / /, you should define $var=qr/ /.
Your code works fine, I think:

my $text = "botolo";
my $separator = "o";
print +(split($separator, $text))[0];
# the leading + stops Perl from parsing print (...)[0] as a call to print(...)

Although, at the cost of one extra line, I would rather write that last line as:

my @parts = split($separator, $text);
print $parts[0];
