Comparing and removing object from ArrayLists using Java 8

Question

My apologies if this is a simple basic info that I should be knowing. This is the first time I am trying to use Java 8 streams and other features.

I have two ArrayLists containing same type of objects. Let's say list1 and list2. Let's say the lists has Person objects with a property "employeeId".

The scenario is that I need to merge these lists. However, list2 may have some objects that are same as in list1. So I am trying to remove the objects from list2 that are same as in list1 and get a result list that then I can merge in list1.

I am trying to do this with Java 8 removeIf() and stream() features. Following is my code:

public List<PersonDto> removeDuplicates(List<PersonDto> list1, List<PersonDto> list2) {
        List<PersonDto> filteredList = list2.removeIf(list2Obj -> {
            list1.stream()
                .anyMatch( list1Obj -> (list1Obj.getEmployeeId() == list2Obj.getEmployeeId()) );
        } );
    }

The above code is giving compile error as below:

The method removeIf(Predicate) in the type Collection is not applicable for the arguments (( list2Obj) -> {})

So I changed the list2Obj at the start of "removeIf()" to (<PersonDto> list2Obj) as below:

public List<PersonDto> removeDuplicates(List<PersonDto> list1, List<PersonDto> list2) {
            List<PersonDto> filteredList = list2.removeIf((<PersonDto> list2Obj) -> {
                list1.stream()
                    .anyMatch( list1Obj -> (list1Obj.getEmployeeId() == list2Obj.getEmployeeId()) );
            } );
        }

This gives me an error as below:

Syntax error on token "<", delete this token for the '<' in (<PersonDto> list2Obj) and Syntax error on token(s), misplaced construct(s) for the part from '-> {'

I am at loss on what I really need to do to make it work.

Would appreciate if somebody can please help me resolve this issue.

Eugene · Accepted Answer · 2017-06-09 16:33:36Z

2

I've simplified your function just a little bit to make it more readable:

public static List<PersonDto> removeDuplicates(List<PersonDto> left, List<PersonDto> right) {
    left.removeIf(p -> {
        return right.stream().anyMatch(x -> (p.getEmployeeId() == x.getEmployeeId()));
    });
    return left;
}

Also notice that you are modifying the left parameter, you are not creating a new List.

You could also use: left.removeAll(right), but you need equals and hashcode for that and it seems you don't have them; or they are based on something else than employeeId.

Another option would be to collect those lists to a TreeSet and use removeAll:

TreeSet<PersonDto> leftTree = left.stream()
      .collect(Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(PersonDto::getEmployeeId))));

TreeSet<PersonDto> rightTree = right.stream()
      .collect(Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(PersonDto::getEmployeeId))));

leftTree.removeAll(rightTree);

edited Jun 9, 2017 at 16:33

answered Jun 9, 2017 at 16:25

Eugene

122k17 gold badges219 silver badges335 bronze badges

Sign up to request clarification or add additional context in comments.

7 Comments

Gauzy Over a year ago

thanks for the quick reply, appreciated! I wasn't aware of removeAll() but you are right the equals and hashcode are based on something else. In you code, what is the first return returning? I mean, I know it's to return the filtered list but the placement of return keyword is throwing me off. I would think that first return before left.removeIf(.... The second return is returning the original list when there are no duplicates.

Eugene Over a year ago

@Gauzy removeIf takes a Predicate as input; thus that returns a boolean.

fps Over a year ago

Eugene, Hi! In your second snippet, I think you don't need to stream the lists in order to create the TreeSets. Just create the TreeSets with the comparator, and then call addAll with each list.

Eugene Over a year ago

@FedericoPeraltaSchaffner hey! that would indeed merge them, but the OP wanted So I am trying to remove the objects from list2 that are same as in list1, so I guess it really depends on what the OP actually needs

Eugene Over a year ago

@FedericoPeraltaSchaffner first its really hard to type your name from the phone :) second, yes, could be done like this too

|

fps · Accepted Answer · 2017-06-12 13:21:41Z

2

I understand you are trying to merge both lists without duplicating the elements that belong to the intersection. There are many ways to do this. One is the way you've tried, i.e. remove elements from one list that belong to the other, then merge. And this, in turn, can be done in several ways.

One of these ways would be to keep the employee ids of one list in a HashSet and then use removeIf on the other list, with a predicate that checks whether each element has an employee id that is contained in the set. This is better than using anyMatch on the second list for each element of the first list, because HashSet.contains runs in O(1) amortized time. Here's a sketch of the solution:

// Determine larger and smaller lists
boolean list1Smaller = list1.size() < list2.size();
List<PersonDto> smallerList = list1Smaller ? list1 : list2;
List<PersonDto> largerList = list1Smaller ? list2 : list1;

// Create a Set with the employee ids of the larger list
// Assuming employee ids are long
Set<Long> largerSet = largerList.stream()
    .map(PersonDto::getEmployeeId)
    .collect(Collectors.toSet());

// Now remove elements from the smaller list
smallerList.removeIf(dto -> largerSet.contains(dto.getEmployeeId()));

The logic behind this is that HashSet.contains will take the same time for both a large and a small set, because it runs in O(1) amortized time. However, traversing a list and removing elements from it will be faster on smaller lists.

Then, you are ready to merge both lists:

largerList.addAll(smallerList);

edited Jun 12, 2017 at 13:21

answered Jun 9, 2017 at 18:47

fps

34.5k8 gold badges63 silver badges116 bronze badges

4 Comments

Holger Over a year ago

The set of employee ids surely hasn’t the type Set<PersonDto>. And I would assume Collectors.toSet() sufficient. There is no need for a guaranteed HashSet.

fps Over a year ago

@Holger Thanks, I've fixed that. But can we be sure that the set returned by Collectors.toSet() supports hashing?

Holger Over a year ago

We can be sure that the JRE developers will not change it to something worse just for fun or in other words, we can expect that toSet() will return a reasonably efficient Set implementation. It doesn’t have to use hashing, e.g. very small sets may perform better than HashSet. Collectors.toSet() allows the JRE to pick the best, i.e. future versions may return immutable sets being better prepared for even faster lookup than the mutable HashSet…

fps Over a year ago

@Holger Agreed, let's trust the JDK guys, then. Thank you for all your feedback not only in this answer but also in the others.

Collectives™ on Stack Overflow

Comparing and removing object from ArrayLists using Java 8

2 Answers 2

7 Comments

4 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

7 Comments

4 Comments

Your Answer

Sign up or log in

Post as a guest

Related