I'm a Java developer who tried Kotlin and found a counterintuitive case between these two languages.

Consider this Kotlin code:

"???".split("")  // gives ["", "?", "?", "?", ""]

and the same in Java:

"???".split("")  // gives ["?", "?", "?"]

Why does Kotlin produce a leading and a trailing empty string in the resulting array? Or does Java always remove these empty strings, and I just wasn't aware of that?

I know that every Kotlin String has a toCharArray() method, but the difference is still not very intuitive (maybe Java developers should give up the old Java habits they were hoping to reuse in a new language?).

  • "Let's discuss" No, Stack Overflow is not a place for discussions. Commented Feb 1, 2022 at 13:46
  • @Sweeper While this is technically true, it's still an interesting question that will probably have a very concrete answer, and therefore it suits the Q&A format very well. I once had a similar "why is the standard library the way it is" question and posted it on softwareengineering.stackexchange.com. I'm still not sure whether that was the right place for it, but at least no one complained. Commented Feb 1, 2022 at 13:51

2 Answers

This is because the Java split(String regex) method explicitly removes them:

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

split(String regex, int limit) mentions:

When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring.

"" is a zero-width match, so Java produces no leading empty string, and the limit of zero removes the trailing one. I'm not sure why you consider toCharArray() unintuitive here: splitting by an empty string to iterate over all characters is a roundabout way of doing things. split() is intended for pattern matching and getting the substrings between the matches.

PS: I checked JDK 8, 11 and 17; the behavior seems to have been consistent for a while now.
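
To see both rules in action, here is a minimal sketch. The class and method names (SplitDemo, splitDefault, splitKeepTrailing) are made up for illustration, and the outputs in the comments assume Java 8 or later:

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative, not from any library.
class SplitDemo {

    // split(regex) behaves like split(regex, 0): trailing empty strings are removed.
    static String[] splitDefault(String s, String regex) {
        return s.split(regex);
    }

    // A negative limit keeps trailing empty strings.
    static String[] splitKeepTrailing(String s, String regex) {
        return s.split(regex, -1);
    }

    public static void main(String[] args) {
        // Zero-width match at the start: no leading empty string; limit 0 drops the trailing one.
        System.out.println(Arrays.toString(splitDefault("???", "")));        // [?, ?, ?]

        // Same regex with a negative limit: the trailing empty string survives.
        System.out.println(Arrays.toString(splitKeepTrailing("???", "")));   // [?, ?, ?, ]

        // Positive-width match at the start: the leading empty string is kept.
        System.out.println(Arrays.toString(splitDefault(",a,b,,", ",")));    // [, a, b]

        // The direct way to get the individual characters.
        System.out.println(Arrays.toString("???".toCharArray()));            // [?, ?, ?]
    }
}
```

Note that even with a negative limit Java never returns the leading empty string for "", so it still does not match what Kotlin returns.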

1 Comment

My concern about toCharArray() came from my assumption (incorrect, as you can see) that the Kotlin API behaves the same as the Java one. Thank you for the clarification; from now on I will be more careful with such assumptions.

In Kotlin, you need to drop the first and last elements:

"???".split("").drop(1).dropLast(1)

Check out this example:

"".split("")   // [, ]

The documentation describes split as follows: "Splits this char sequence to a list of strings around occurrences of the specified delimiters."

See https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/split.html
