17

I need to generate a unique friend list from a list that can have duplicates by merging all duplicate entries into one object
Example - Friends are fetched from different social feeds and put into 1 big list
1. Friend - [name: "Johnny Depp", dob: "1970-11-10", source: "FB", fbAttribute: ".."]
2. Friend - [name: "Christian Bale", dob: "1970-01-01", source: "LI", liAttribute: ".."]
3. Friend - [name: "Johnny Depp", dob: "1970-11-10", source: "Twitter", twitterAttribute: ".."]
4. Friend - [name: "Johnny Depp", dob: "1970-11-10", source: "LinkedIn", liAttribute: ".."]
5. Friend - [name: "Christian Bale", dob: "1970-01-01", source: "LI", liAttribute: ".."]

Expected output
1. Friend - [name: "Christian Bale", dob: "1970-01-01", liAttribute: "..", fbAttribute: "..", twitterAttribute: ".."]
2. Friend - [name: "Johnny Depp", dob: "1970-11-10", liAttribute: "..", fbAttribute: "..", twitterAttribute: ".."]

Question - How can i merge without using any intermediate container? I can easily use an intermediate map and apply reduce on each value of the entry.

List<Friend> friends;
Map<String, List<Friend>> uniqueFriendMap
    = friends.stream().groupingBy(Friend::uniqueFunction);
List<Friend> mergedFriends = uniqueFriendMap.entrySet()
    .stream()
    .map(entry -> {
           return entry.getValue()
                .stream()
                .reduce((a,b) -> friendMergeFunction(a,b));
    })
    .filter(mergedPlace -> mergedPlace.isPresent())
    .collect(Collectors.toList());

I like to do this without using the intermediate Map uniqueFriendMap. Any suggestions?

1 Answer 1

19

The groupingBy operation (or something similar) is unavoidable, the Map created by the operation is also used during the operation for looking up the grouping keys and finding the duplicates. But you can combine it with the reduction of the group elements:

Map<String, Friend> uniqueFriendMap = friends.stream()
    .collect(Collectors.groupingBy(Friend::uniqueFunction,
        Collectors.collectingAndThen(
            Collectors.reducing((a,b) -> friendMergeFunction(a,b)), Optional::get)));

The values of the map are already the resulting distinct friends. If you really need a List, you can create it with a plain Collection operation:

List<Friend> mergedFriends = new ArrayList<>(uniqueFriendMap.values());

If this second operation still annoys you, you can hide it within the collect operation:

List<Friend> mergedFriends = friends.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.groupingBy(Friend::uniqueFunction, Collectors.collectingAndThen(
            Collectors.reducing((a,b) -> friendMergeFunction(a,b)), Optional::get)),
        m -> new ArrayList<>(m.values())));

Since the nested collector represents a Reduction (see also this answer), we can use toMap instead:

List<Friend> mergedFriends = friends.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.toMap(Friend::uniqueFunction, Function.identity(),
            (a,b) -> friendMergeFunction(a,b)),
        m -> new ArrayList<>(m.values())));

Depending on whether friendMergeFunction is a static method or instance method, you may replace (a,b) -> friendMergeFunction(a,b) with DeclaringClass::friendMergeFunction or this::friendMergeFunction.


But note that even within your original approach, several simplifications are possible. When you only process the values of a Map, you don’t need to use the entrySet(), which requires you to call getValue() on each entry. You can process the values() in the first place. Then, you don’t need the verbose input -> { return expression; } syntax, as input -> expression is sufficient. Since the groups of the preceding grouping operation can not be empty, the filter step is obsolete. So your original approach would look like:

Map<String, List<Friend>> uniqueFriendMap
    = friends.stream().collect(Collectors.groupingBy(Friend::uniqueFunction));
List<Friend> mergedFriends = uniqueFriendMap.values().stream()
    .map(group -> group.stream().reduce((a,b) -> friendMergeFunction(a,b)).get())
    .collect(Collectors.toList());

which is not so bad. As said, the fused operation doesn’t skip the Map creation as that’s unavoidable. It only skips the creations of the Lists representing each group, as it will reduce them to a single Friend in-place.

Sign up to request clarification or add additional context in comments.

6 Comments

Hi Holger, maybe you could replace Collectors.groupingBy with a Collectors.toMap, something like this Map<String, Friend> uniqueFriendMap = friends.stream().collect(Collectors.toMap(Friend::uniqueFunction, Function.identity(), this::friendMergeFunction));, it is much simpler don't you think ?
In regards to Nicolas's comment I believe that the uniqueFunction could produce collisions on the hash which would force you to write an equals method that does not really represent the full object.
@Dave I don’t know what you mean with “uniqueFunction could produce collisions on the hash”. That function returns strings and merging those Friend instances for which the uniqueFunction returns the same string, as the task of this question.
@Holger If the Friend instances that you are merging have different sources but you still want to merge them than you need to write hashcode and equals methods that do not represent the full object. Functional in this case, but not a standard thing to do.
@Dave It seems, you are confusing things here. For a HashMap, only the hash code and equals methods of the keys are relevant and, as already said in my previous comment, here, the keys are the String instances returned by the method uniqueFunction. There is no “full object” used as key, just a single property. What will be merged, are the values of the map, whose equals or hash code implementation is entirely irrelevant. And after you haven’t explained your term “produce collisions on the hash”, you introduced another one, to “have different sources”…
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.