
I need to convert a list of ids to an array of ids. I can do it in several ways, but I'm not sure which one should be used.

Say,

1. ids.stream().toArray(Id[]::new)
2. ids.toArray(new Id[ids.size()])

Which one is more efficient and why?

  • Unless you actually want to use the stream API, I can't think of a good reason to use the first approach. I can't imagine the performance difference would justify the decreased readability of the code. Commented Mar 29, 2019 at 9:05
  • Do you really have performance issues? How many lists are we talking about? Can't that conversion be avoided in the first place? Commented Mar 29, 2019 at 9:07
  • First: what do you mean by efficient? Less memory use, less CPU use, fastest? Second: is your array big enough to really make a difference? If not, use the more readable option, not the more efficient one. Commented Mar 29, 2019 at 9:10
  • Efficient means execution speed is faster. The array may contain 6 million entries. Commented Mar 29, 2019 at 9:20
  • @Eugene that's not what I said, as I agree with you: the more readable option is not the fastest. What I mean is that you should prioritize what you are coding. If we're talking about an array of 20 entries, such optimization should not be considered, and we should use the more readable option. If (as the OP does) you have millions of entries, then optimization becomes interesting and may take precedence over readability (which is where commenting the code becomes useful). Commented Mar 29, 2019 at 9:31

1 Answer


Java 11 introduced an overload of Collection::toArray that has this implementation:

default <T> T[] toArray(IntFunction<T[]> generator) {
    return toArray(generator.apply(0));
}

Put simply, in your case it is effectively doing ids.toArray(new Id[0]); that is, it is not specifying the total expected size.
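For illustration, here is a minimal sketch (Java 11+, with a hypothetical Id class standing in for yours) showing that the generator overload and the zero-sized-array call produce the same result:

    import java.util.List;

    public class ToArrayOverloadDemo {

        // Hypothetical stand-in for the Id type from the question.
        static class Id {
            final long value;
            Id(long value) { this.value = value; }
        }

        public static void main(String[] args) {
            List<Id> ids = List.of(new Id(1), new Id(2), new Id(3));

            Id[] viaGenerator = ids.toArray(Id[]::new);  // Java 11 overload: calls generator.apply(0)
            Id[] viaZeroSized = ids.toArray(new Id[0]);  // what the default method delegates to

            System.out.println(viaGenerator.length + " " + viaZeroSized.length); // prints: 3 3
        }
    }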

This is faster than specifying the size, which is counter-intuitive; it has to do with the fact that if the JVM can prove that the array you are allocating will be entirely overwritten by the copy that immediately follows, it does not have to do the initial zeroing of that array, and that turns out to be faster than specifying the initial size (where the zeroing has to happen).

The stream approach, on the other hand, will use an initial size that the stream internals compute (or try to estimate), because:

 ids.stream().toArray(Id[]::new)

is actually:

 ids.stream().toArray(size -> new Id[size]);

and that size is either known or estimated, based on the characteristics that the underlying Spliterator reports. If the stream reports the SIZED characteristic (as in your simple case), it's easy: the size is always known. If SIZED is not present, the stream internals only have an estimate of how many elements will be produced, and in that case a hidden intermediate collection called SpinedBuffer is used to capture the elements.
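If you want to see that characteristic for yourself, here is a small sketch (the collections are illustrative, not from the question) that inspects the SIZED flag on the spliterators of an ArrayList and of a concurrent queue:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Spliterator;
    import java.util.concurrent.ConcurrentLinkedQueue;

    public class SizedCharacteristicDemo {
        public static void main(String[] args) {
            List<String> list = new ArrayList<>(List.of("a", "b", "c"));
            ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>(list);

            // ArrayList knows its size up front, so its spliterator reports SIZED...
            System.out.println(list.spliterator().hasCharacteristics(Spliterator.SIZED));   // true

            // ...while ConcurrentLinkedQueue's spliterator does not, so the stream
            // internals would buffer into a SpinedBuffer before building the array.
            System.out.println(queue.spliterator().hasCharacteristics(Spliterator.SIZED));  // false
        }
    }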

You can read more here, but the approach ids.toArray(new Id[0]) will be the fastest.
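If you want to verify this on your own data, a minimal JMH sketch along these lines could compare the three variants (the Id class, list size, and setup here are assumptions, not from the question; it needs the org.openjdk.jmh dependency and the usual JMH runner):

    import org.openjdk.jmh.annotations.*;

    import java.util.ArrayList;
    import java.util.List;

    @State(Scope.Benchmark)
    public class ToArrayBenchmark {

        // Hypothetical stand-in for the Id type from the question.
        public static class Id {
            final long value;
            Id(long value) { this.value = value; }
        }

        @Param({"6000000"})   // roughly the 6 million entries mentioned in the comments
        int size;

        List<Id> ids;

        @Setup
        public void setup() {
            ids = new ArrayList<>(size);
            for (long i = 0; i < size; i++) {
                ids.add(new Id(i));
            }
        }

        @Benchmark
        public Id[] zeroSized() {
            // lets toArray allocate the right-sized array internally (no pre-zeroing of a larger array)
            return ids.toArray(new Id[0]);
        }

        @Benchmark
        public Id[] preSized() {
            // pre-allocates the array, which has to be zeroed before it is filled
            return ids.toArray(new Id[ids.size()]);
        }

        @Benchmark
        public Id[] viaStream() {
            // the stream variant from the question; the source is SIZED, so the size is known
            return ids.stream().toArray(Id[]::new);
        }
    }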


1 Comment

Not every Collection produces a SIZED stream, e.g. concurrent collections won't. But for concurrent collections, ids.toArray(new Id[ids.size()]) would even be broken unless the application can preclude concurrent modifications during the operation. So it boils down to ids.toArray(new Id[0]) being the simplest, least error-prone, and most efficient solution.
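To illustrate that point about pre-sized arrays and concurrent collections, here is a small sketch (the explicit remove() stands in for a modification another thread could make between size() and toArray()); the surplus slot ends up as null rather than the array shrinking:

    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ConcurrentLinkedQueue;

    public class PreSizedPitfallDemo {
        public static void main(String[] args) {
            ConcurrentLinkedQueue<String> ids = new ConcurrentLinkedQueue<>(List.of("a", "b", "c"));

            String[] target = new String[ids.size()]; // size captured here...
            ids.remove("c");                          // ...another thread could shrink the queue now

            // Per the Collection#toArray(T[]) contract, the slot right after the
            // last element is set to null when the array is larger than needed.
            String[] result = ids.toArray(target);
            System.out.println(Arrays.toString(result)); // [a, b, null]
        }
    }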
