Is there any way in Java 8 to group the elements in a java.util.stream.Stream without collecting them? I want the result to be a Stream again. Because I have to work with a lot of data or even infinite streams, I cannot collect the data first and stream the result again.

All elements that need to be grouped are consecutive in the first stream, so I would like to keep the stream evaluation lazy.

4 Comments

  • If your data is already "pre-grouped" (by being consecutive), why do you need it in the grouped form? Providing some context might help provide a better answer to this question. Commented Aug 18, 2016 at 8:35
  • You mean using groupingBy from Collectors without collecting? Commented Aug 18, 2016 at 8:39
  • Sounds like a job for a queue, not a stream. Consume the consecutive elements from the queue until you detect the beginning of the next group, then add the group to the next queue containing the groups. Commented Aug 18, 2016 at 8:50
  • I have to start with a stream because that is how I get the data (most of the time, but not always, out of jOOQ *grin* after doing a JOIN in the database), and the contract with the consumer object is a stream as well. I have to group the data to recreate the 1:n relation of the two entities represented by the joined tables. So what I am trying to achieve is to group the database records based on what becomes the same object on the 1-side of the relation. In a further step I will map the grouped records to this object and the list of objects on the n-side of the relation. Commented Aug 18, 2016 at 9:08

1 Answer

There is no way to do this with the operations of the standard Stream API. In the general case it cannot be done lazily at all: a later item may always turn out to belong to a group that has already been started, so no group can be passed downstream until the whole input has been processed.
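To illustrate why general grouping has to see the whole input, here is a small sketch using the standard Collectors.groupingBy. The class name GroupingNeedsAllInput and the helper byFirstLetter are hypothetical names chosen for this example. The "a" group is only complete once the last element has been read, which is why the result is a Map built by a terminal operation rather than a lazy Stream.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical example class, not taken from the question or from any library.
final class GroupingNeedsAllInput {

    // Groups words by first letter. collect() is a terminal operation,
    // so the whole stream is consumed before any group is available.
    static Map<Character, List<String>> byFirstLetter(Stream<String> words) {
        return words.collect(
                Collectors.groupingBy(w -> w.charAt(0), TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        // "avocado" arrives last but belongs to the group opened by "apple".
        System.out.println(byFirstLetter(Stream.of("apple", "banana", "avocado")));
        // prints {a=[apple, avocado], b=[banana]}
    }
}
```

This approach cannot work on an infinite stream, because the collector never reaches the end of the input.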

However, if you know in advance that the items to be grouped are always adjacent in the input stream, you can solve the problem with a third-party library that extends the Stream API. One such library is StreamEx, which is free and written by me. It contains a number of "partial reduction" operators that collapse adjacent items into a single item based on some predicate. Usually you supply a BiPredicate that tests two adjacent items and returns true if they should be grouped together. Some of the partial reduction operations are listed below:

  • collapse(BiPredicate): replaces each group with the first element of the group. For example, collapse(Objects::equals) is useful for removing adjacent duplicates from the stream.
  • groupRuns(BiPredicate): replaces each group with a List of the group's elements (so StreamEx<T> is converted to StreamEx<List<T>>). For example, stringStream.groupRuns((a, b) -> a.charAt(0) == b.charAt(0)) creates a stream of Lists of strings where each list contains adjacent strings starting with the same letter.

Other partial reduction operations include intervalMap, runLengths() and so on.

All partial reduction operations are lazy, parallel-friendly and quite efficient.

Note that you can easily construct a StreamEx object from a regular Java 8 stream using StreamEx.of(stream). There are also methods to construct it from an array, Collection, Reader, etc. The StreamEx class implements the Stream interface and is fully compatible with the standard Stream API.
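To make the idea of lazily grouping adjacent items concrete, here is a minimal plain-JDK sketch. It is not how StreamEx implements groupRuns. The class name AdjacentGrouping and its groupRuns helper are hypothetical, written only for this illustration. No built-in Stream operation does this, so the sketch wraps the source stream's Iterator by hand. It is sequential only, and it assumes that a single group fits in memory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper class, not part of the JDK or of StreamEx.
final class AdjacentGrouping {

    // Lazily turns a stream into a stream of lists of adjacent elements.
    // sameGroup tests two neighbouring elements and returns true if the
    // second one belongs to the same group as the first.
    static <T> Stream<List<T>> groupRuns(Stream<T> source,
                                         BiPredicate<? super T, ? super T> sameGroup) {
        Iterator<T> it = source.iterator();
        Iterator<List<T>> groups = new Iterator<List<T>>() {
            private T pending;          // first element of the next group, if already read
            private boolean hasPending; // separate flag so that null elements also work

            @Override
            public boolean hasNext() {
                return hasPending || it.hasNext();
            }

            @Override
            public List<T> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                List<T> group = new ArrayList<>();
                T last = hasPending ? pending : it.next();
                hasPending = false;
                pending = null;
                group.add(last);
                // Read ahead only until the first element that starts a new group.
                while (it.hasNext()) {
                    T candidate = it.next();
                    if (sameGroup.test(last, candidate)) {
                        group.add(candidate);
                        last = candidate;
                    } else {
                        pending = candidate;
                        hasPending = true;
                        break;
                    }
                }
                return group;
            }
        };
        return StreamSupport.stream(
                        Spliterators.spliteratorUnknownSize(groups, Spliterator.ORDERED), false)
                .onClose(source::close);
    }

    public static void main(String[] args) {
        List<List<String>> groups = groupRuns(
                        Stream.of("apple", "avocado", "banana", "blueberry", "cherry"),
                        (a, b) -> a.charAt(0) == b.charAt(0))
                .collect(Collectors.toList());
        System.out.println(groups);
        // prints [[apple, avocado], [banana, blueberry], [cherry]]
    }
}
```

Because each group is produced on demand, the same helper also works on an infinite source as long as a short-circuiting operation such as limit follows it, for example groupRuns(Stream.iterate(0, i -> i + 1), (a, b) -> a / 3 == b / 3).limit(3).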

2 Comments

I'll check out your library. This seems to be exactly what I need. Thank you for the suggestion.
Hmm, interesting. It would be very nice to see this applied to the OP's actual code, @MatthiasWimmer
