I was trying to split a string into groups of three digits. Can someone explain why the resulting array contains the extra empty strings shown in the code below?

"1234567890".split(/(\d{3})/)
# => ["", "123", "", "456", "", "789", "0"] 

I know it's better to use scan to get the groups; I'm just curious about this specific behavior.

1
  • LOL, how can this be flagged as "unclear what you are asking" after the answer has already been clearly given? Commented Jul 19, 2013 at 13:33

1 Answer

It has to do with your capture group. Compare a simpler version:

"12:34:56:78:90".split(/(:)/)
=> ["12", ":", "34", ":", "56", ":", "78", ":", "90"]

"12:34:56:78:90".split(/:/)
=> ["12", "34", "56", "78", "90"]

Usually, split leaves the delimiter out of the result. The capturing parentheses cause it to keep the delimiter in the result. Without the group, you would have:

"1234567890".split(/\d{3}/)
=> ["", "", "", "0"]

That makes sense: there is nothing between the delimiters until the final 0. When you add the capture group, split intersperses the matched delimiters with the "in between" pieces that are its usual result. So the empty strings aren't the rubbish; they are the normal output of split. The groups of digits are the extras, because they are the delimiters.
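If the goal is just the groups of three, here is a minimal sketch of two ways to get them. The variable names are only for illustration, and the scan pattern uses \d{1,3} on the assumption that you want to keep the leftover "0" as its own group:

```ruby
digits = "1234567890"

# scan returns the matches themselves, so no empty "in between" strings appear.
groups_via_scan = digits.scan(/\d{1,3}/)
# => ["123", "456", "789", "0"]

# Alternatively, keep the capture group and drop the empty strings
# that split places between adjacent delimiters.
groups_via_split = digits.split(/(\d{3})/).reject(&:empty?)
# => ["123", "456", "789", "0"]
```

Both give the same result here; scan is usually the more direct choice, since it asks for the matches rather than for what lies between them.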

Finally, having actually looked at the documentation for split, it says:

If pattern is a Regexp, str is divided where the pattern matches. Whenever the pattern matches a zero-length string, str is split into individual characters. If pattern contains groups, the respective matches will be returned in the array as well.
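To see each sentence of that description in action, here is a small sketch. The sample strings and variable names are made up for illustration, and the last case assumes you want grouping without keeping the delimiter, which is what a non-capturing group (?:...) gives you:

```ruby
# A pattern that matches a zero-length string splits into individual characters.
chars = "abc".split(//)
# => ["a", "b", "c"]

# A capturing group returns the matched delimiters in the array as well.
with_capture = "a1b2c".split(/(\d)/)
# => ["a", "1", "b", "2", "c"]

# A non-capturing group still groups, but the delimiters are left out.
without_capture = "a1b2c".split(/(?:\d)/)
# => ["a", "b", "c"]
```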
