37

I have a list/collection of objects that may or may not have the same property values. What's the easiest way to get a distinct list of the objects with equal properties? Is one collection type best suited for this purpose? For example, in C# I could do something like the following with LINQ.

var recipients = (from recipient in recipientList
                 select recipient).Distinct();

My initial thought was to use lambdaj (link text), but it doesn't appear to support this.

1
  • 1
    Use this LINQ-like library (github.com/nicholas22/jpropel-light) and do List<Recipient> recipientList; recipientList.distinct(); This is exactly what the code you posted does. Commented Oct 8, 2011 at 10:26

9 Answers 9

50
return new ArrayList(new HashSet(recipients));
Sign up to request clarification or add additional context in comments.

2 Comments

You actually answered the question (instead of just listing the tools to answer the question).
Ha, didn't crossed my mind to use this quick "trick" to get a unique list of items :)
33

Use an implementation of the interface Set<T> (class T may need a custom .equals() method, and you may have to implement that .equals() yourself). Typically a HashSet does it out of the box : it uses Object.hashCode() and Object.equals() method to compare objects. That should be unique enough for simple objects. If not, you'll have to implement T.equals() and T.hashCode() accordingly.

See Gaurav Saini's comment below for libraries helping to implement equals and hashcode.

7 Comments

HashSet also uses equals if there's a hash collision.
This is incorrect - Object.hashCode() checks for identity , not meaningful equality. For 2 objects - different references - which are meaningfully equal Object.hashCode() will return false. Always implement hashCode() and equals() for objects which will be used in Sets or as keys in Maps.
Not quite correct. Let me give you an example: If hashCode() returns 1 (perfectly legal), then this results in a hash collision and thus equals will then be called by HashSet (actually it's backed by a HashMap).
@Robert : No need to downvote me for wrong reasons, read the javacode from Object.hashCode() : If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Alternatively, use standard way of overriding hashCode() and equals() method using HashCodeBuilder and EqualsBuilder provided by Apache's Commons. HashCodeBuilder: commons.apache.org/lang/api-2.3/org/apache/commons/lang/builder/… EqualsBuilder: commons.apache.org/lang//api-2.4/org/apache/commons/lang/…
|
25

Place them in a TreeSet which holds a custom Comparator, which checks the properties you need:

SortedSet<MyObject> set = new TreeSet<MyObject>(new Comparator<MyObject>(){

    public int compare(MyObject o1, MyObject o2) {
         // return 0 if objects are equal in terms of your properties
    }
});

set.addAll(myList); // eliminate duplicates

4 Comments

Only works if the comparator is consistent with equals(), at which point you're better off with a HashSet anyway.
Why should it be consistent with equals() ? That it the usually the case but at the moment we need to remove some duplicates based on a custom condition. A Comparator is the most unobstrusive way I can think of.
Interestingly, the TreeSet docs indicate that there will be problems if the Comparator is inconsistent with equals(), while the docs for TreeMap (which TreeSet is based on) merely say that equals() will be ignored in that case. I now think this is the best answer. +1
The problem with inconsistent comparator is when you use a TreeSet, pass it around as a Set - which the consumer expects to obey hashCode() and equals(). Then all hell breaks loose :-)
17

Java 8:

recipients = recipients.stream()
    .distinct()
    .collect(Collectors.toList());

See java.util.stream.Stream#distinct.

Comments

11

order preserving version of the above response

return new ArrayList(new LinkedHashSet(recipients));

Comments

7

If you're using Eclipse Collections, you can use the method distinct().

ListIterable<Integer> integers = Lists.mutable.with(1, 3, 1, 2, 2, 1);
Assert.assertEquals(
    Lists.mutable.with(1, 3, 2),
    integers.distinct());

The advantage of using distinct() instead of converting to a Set and then back to a List is that distinct() preserves the order of the original List, retaining the first occurrence of each element. It's implemented by using both a Set and a List.

MutableSet<T> seenSoFar = Sets.mutable.with();
int size = list.size();
for (int i = 0; i < size; i++)
{
    T item = list.get(i);
    if (seenSoFar.add(item))
    {
        targetCollection.add(item);
    }
}
return targetCollection;

If you cannot convert your original List into an Eclipse Collections type, you can use ListAdapter to get the same API.

MutableList<Integer> distinct = ListAdapter.adapt(integers).distinct();

Note: I am a committer for Eclipse Collections.

Comments

4

You can use a Set. There's couple of implementations:

  • HashSet uses an object's hashCode and equals.
  • TreeSet uses compareTo (defined by Comparable) or compare (defined by Comparator). Keep in mind that the comparison must be consistent with equals. See TreeSet JavaDocs for more info.

Also keep in mind that if you override equals you must override hashCode such that two equals objects has the same hash code.

Comments

3

The ordinary way of doing this would be to convert to a Set, then back to a List. But you can get fancy with Functional Java. If you liked Lamdaj, you'll love FJ.

recipients = recipients
             .sort(recipientOrd)
             .group(recipientOrd.equal())
             .map(List.<Recipient>head_());

You'll need to have defined an ordering for recipients, recipientOrd. Something like:

Ord<Recipient> recipientOrd = ord(new F2<Recipient, Recipient, Ordering>() {
  public Ordering f(Recipient r1, Recipient r2) {
    return stringOrd.compare(r1.getEmailAddress(), r2.getEmailAddress());
  }
});

Works even if you don't have control of equals() and hashCode() on the Recipient class.

2 Comments

Why is it necessary to add the ordering for the objects?
You need the ordering so that the sort method knows how to order them.
2

Actually lambdaj implements this feature through the selectDistinctArgument method

http://lambdaj.googlecode.com/svn/trunk/html/apidocs/ch/lambdaj/Lambda.html#selectDistinctArgument(java.lang.Object,%20A)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.