9

I've been trying to find a good way to set the initial capacity of the list a collector builds in the Java Stream API. The simplest example is this:

data.stream()
        .collect(Collectors.toList());

I just want to pass an int with the size of the list into the collector so that the internal array never has to be resized. My first instinct was to write it like this:

data.stream()
        .collect(Collectors.toList(data.size()));

Unfortunately, toList isn't overloaded to take a parameter. I found one solution, but it smells:

data.stream()
        .collect(Collectors.toCollection(() -> new ArrayList<>(data.size())));

Is there a simpler way to express this?

6
  • Why do you think this is a smelly solution? Commented Dec 9, 2016 at 21:30
  • Because I definitely know that I want to collect the data into a list, yet I have to use the toCollection method and then specify a more concrete type. There's also an unnecessary lambda expression... Commented Dec 9, 2016 at 21:33
  • toList() doesn't specify the list it returns. Why do you assume it is an ArrayList, or that the list returned understands the concept of "initial capacity"? (Yes, it currently is an ArrayList, but this is an implementation detail, and not all lists have an initial capacity.) Commented Dec 9, 2016 at 21:33
  • Good point. I also thought about it. I just think that ArrayList is used overwhelmingly often, so there should be a simpler way to express the same thing. Do you know of any? Commented Dec 9, 2016 at 21:38
  • What is complicated in the given solution? It's "just" one lambda, whose body invokes a constructor. You can't use the convenient method reference ArrayList::new since you need to pass the initial capacity... Make your own method returning the Supplier<List<T>> if you find the number of closing parentheses too much. Commented Dec 9, 2016 at 21:42

3 Answers 3

4

I'd take your inelegant

Collectors.toCollection(() -> new ArrayList<>(data.size()))

and wrap it in a static method

// Collects into an ArrayList created with the given initial capacity.
public static <T> Collector<T, ?, List<T>> toList(int size) {
    return Collectors.toCollection(() -> new ArrayList<T>(size));
}

then call it (with a static import)

stream.collect(toList(size))

!inelegant?

Edit: this does make the result an ArrayList. Is that bad?


1 Comment

I don't think it's bad. You are stating the initial capacity when creating the ArrayList, so the behavior is predictable.
1

I do not know of any straightforward way in the API to set the capacity of the mutable container used under the hood to collect the data. I would guess that at least one of the many reasons is the support for parallelism via parallelStream().

So, if your data is processed in parallel, there is not much sense in giving an initial capacity, even if you know that the underlying container (e.g. ArrayList) supports one. Multiple containers will be created by different threads and later combined, so a capacity sized for the whole input will, if anything, harm overall performance.
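One rough way to observe this is to count how often the supplier passed to Collectors.toCollection is invoked. The sketch below is only an illustration: the class name SupplierCallCount and the helper countContainers are made up here, and the number of containers created for a parallel stream depends on the machine and the JDK, so no particular count is claimed for that case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class SupplierCallCount {

    // Hypothetical helper: collects the stream into presized ArrayLists and
    // returns how many times the collector asked the supplier for a container.
    static int countContainers(Stream<Integer> stream, int capacity) {
        AtomicInteger calls = new AtomicInteger();
        stream.collect(Collectors.toCollection(() -> {
            calls.incrementAndGet();
            // Every container is allocated with the full capacity,
            // even if it only ever receives a fraction of the elements.
            return new ArrayList<Integer>(capacity);
        }));
        return calls.get();
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.range(0, 100_000)
                .boxed()
                .collect(Collectors.toList());

        // A sequential stream uses a single container.
        System.out.println("sequential: " + countContainers(data.stream(), data.size()));
        // A parallel stream may create one container per chunk of work.
        System.out.println("parallel:   " + countContainers(data.parallelStream(), data.size()));
    }
}
```

If the parallel run reports more than one container, each of them was allocated with room for the whole input, which is the waste described above.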

If you want to be truly specific and elegant, you may also try implementing your own collector. It is not difficult.
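As a minimal sketch of what such a collector might look like, the following uses the standard Collector.of factory. The names PresizedCollectors and toPresizedList are hypothetical, chosen only for this example, and the result is deliberately an ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class PresizedCollectors {

    // Hypothetical helper: a collector whose result container is an
    // ArrayList created with the given initial capacity.
    static <T> Collector<T, ?, List<T>> toPresizedList(int capacity) {
        return Collector.<T, List<T>>of(
                () -> new ArrayList<>(capacity),   // supplier: new presized container
                List::add,                         // accumulator: add one element
                (left, right) -> {                 // combiner: merge partial results
                    left.addAll(right);
                    return left;
                });
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        List<String> copy = data.stream().collect(toPresizedList(data.size()));
        System.out.println(copy); // prints [a, b, c]
    }
}
```

The combiner is only used when the stream runs in parallel, in which case every partial container is still created with the full capacity.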


1
data.stream().collect(Collectors.toCollection(() -> new HashSet<>(100)))
data.stream().collect(Collectors.collectingAndThen(
                        Collectors.toCollection(() -> new HashSet<>(100)), Collections::unmodifiableSet))
