Getting unique items from a list [duplicate]

Question

What is the fastest / most efficient way of getting all the distinct items from a list?

I have a List<string> that may contain repeated items, and I only want the unique values from the list.

The title of this question is misleading. Selecting unique items means selecting the items that occur just once in the list, as opposed to selecting each distinct element once. Given ["A", "B", "C", "C", "D", "D"], unique items would return ["A", "B"], whereas distinct items would return ["A", "B", "C", "D"].
– Eduardo Pignatelli, Commented Jun 28, 2018 at 11:12
@EduardoPignatelli Quite picky, but the question could be reworded unambiguously. The intent of this question, as normally encountered, is: "Given a list of values, how do I get a list of those values without duplicating any?"
– Suncat2000, Commented Sep 5, 2018 at 17:28
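
To make the distinction in the comments above concrete, here is a minimal sketch using the same sample data. The class and method names (DistinctVersusUnique, DistinctItems, ItemsOccurringOnce) are made up for illustration, and using GroupBy for the "occurs exactly once" case is just one possible approach.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
static class DistinctVersusUnique
{
    // Each value once. In current implementations the values come back
    // in order of first appearance.
    public static List<string> DistinctItems(IEnumerable<string> source)
    {
        return source.Distinct().ToList();
    }

    // Only the values that occur exactly once in the source.
    public static List<string> ItemsOccurringOnce(IEnumerable<string> source)
    {
        return source.GroupBy(s => s)
                     .Where(g => g.Count() == 1)
                     .Select(g => g.Key)
                     .ToList();
    }

    static void Main()
    {
        var items = new List<string> { "A", "B", "C", "C", "D", "D" };
        Console.WriteLine(string.Join(", ", DistinctItems(items)));      // A, B, C, D
        Console.WriteLine(string.Join(", ", ItemsOccurringOnce(items))); // A, B
    }
}
```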

LukeH · Accepted Answer · 2009-09-07 09:15:17Z

208

You can use the Distinct method to return an IEnumerable<T> of distinct items:

var uniqueItems = yourList.Distinct();

And if you need the sequence of unique items returned as a List<T>, you can add a call to ToList:

var uniqueItemsList = yourList.Distinct().ToList();

edited Sep 7, 2009 at 9:15

answered Sep 7, 2009 at 9:10

LukeH


7 Comments

Noldorin Over a year ago

The OP was looking for a fast/efficient method. This is not it. Calling yourList.Distinct().ToList() requires two full iterations over the enumerable, and additionally is based off IEqualityComparer, which is slower than GetHashCode.

Vinay Sajip Over a year ago

Is this faster/more efficient than a HashSet<T>? I don't think so. Not bothering to downvote, though :-)

reavowed Over a year ago

@Noldorin: I know this is old, but it shows up easily on Google and you're wrong (at least, as of .NET 4 - I haven't checked in older versions). yourList.Distinct().ToList() performs one enumeration, new HashSet<T>(yourList).ToList() performs two. And the implementations of HashSet and Distinct's internal Set class are almost identical. They both use GetHashCode, and they both use IEqualityComparers (which they have to, as equal hashcodes don't (in general) guarantee equal objects).

reavowed Over a year ago

@Noldorin: How would a performance benchmark make any argument for or against what I said? You can verify what I said by pulling up System.Linq.Enumerable.DistinctIterator<T> and System.Linq.Set<T> in Reflector (or other .NET decompiler), independent of relative performance.

Noldorin Over a year ago

@IainM: Sorry, you're right. I was reading into your post and taking the implication that they are similar in speed. I am still very interested if they actually are. I suspect the difference is still there, though it has possibly gone down since .NET 4.0.
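
As a small illustration of the point made in the comments that both approaches rely on an equality comparer, the sketch below passes the same comparer to Distinct and to the HashSet<T> constructor. The class and method names are hypothetical, and StringComparer.OrdinalIgnoreCase is only an example comparer. It shows the two APIs agreeing on what counts as a duplicate and makes no claim about relative speed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
static class ComparerDemo
{
    // Distinct has an overload that takes an IEqualityComparer<T>.
    public static List<string> DistinctIgnoringCase(IEnumerable<string> source)
    {
        return source.Distinct(StringComparer.OrdinalIgnoreCase).ToList();
    }

    // HashSet<T> has a constructor that takes the same kind of comparer.
    public static HashSet<string> SetIgnoringCase(IEnumerable<string> source)
    {
        return new HashSet<string>(source, StringComparer.OrdinalIgnoreCase);
    }

    static void Main()
    {
        var items = new[] { "a", "A", "b", "B", "a" };
        Console.WriteLine(DistinctIgnoringCase(items).Count); // 2
        Console.WriteLine(SetIgnoringCase(items).Count);      // 2
        Console.WriteLine(items.Distinct().Count());          // 4 (default comparer is case-sensitive)
    }
}
```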


Noldorin · Accepted Answer · 2009-09-07 09:20:44Z

168

Use a HashSet<T>. For example:

var items = "A B A D A C".Split(' ');
var unique_items = new HashSet<string>(items);
foreach (string s in unique_items)
    Console.WriteLine(s);

prints

A
B
D
C

edited Sep 7, 2009 at 9:20

Noldorin


answered Sep 7, 2009 at 9:10

Vinay Sajip


3 Comments

Noon Silk Over a year ago

Must agree; others solve the problem, yours solves the cause :)

LukeH Over a year ago

A HashSet won't maintain any ordering, which may or may not be an issue for the OP.

domgreen Over a year ago

thanks guys, I don't require the items to be ordered. This works great.

aku · Accepted Answer · 2009-09-07 09:10:24Z

7

You can use the Distinct extension method from LINQ.

answered Sep 7, 2009 at 9:10

aku


Vinko Vrsalovic · Accepted Answer · 2009-09-07 09:49:18Z

5

Apart from the Distinct extension method of LINQ, you could use a HashSet<T> object that you initialise with your collection. This is most likely more efficient than the LINQ way, since it uses hash codes (GetHashCode) rather than an IEqualityComparer.

In fact, if it's appropriate for your situation, I would just use a HashSet for storing the items in the first place.
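
A minimal sketch of what "storing the items in a HashSet in the first place" might look like. The TagCollector class and its Collect method are hypothetical names. The only library behaviour relied on is that HashSet<T>.Add returns false when the value is already present.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
static class TagCollector
{
    // Collects values straight into a set as they arrive, counting how many
    // were rejected because they were already present.
    public static HashSet<string> Collect(IEnumerable<string> incoming, out int duplicatesSkipped)
    {
        var seen = new HashSet<string>();
        duplicatesSkipped = 0;
        foreach (string value in incoming)
        {
            // Add returns false if the set already contains the value.
            if (!seen.Add(value)) duplicatesSkipped++;
        }
        return seen;
    }

    static void Main()
    {
        int skipped;
        HashSet<string> tags = Collect("A B A D A C".Split(' '), out skipped);
        Console.WriteLine(tags.Count); // 4
        Console.WriteLine(skipped);    // 2
    }
}
```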

edited Sep 7, 2009 at 9:49

Vinko Vrsalovic


answered Sep 7, 2009 at 9:12

Noldorin


5 Comments

LukeH Over a year ago

A HashSet won't maintain any ordering, which may or may not be an issue for the OP.

Noldorin Over a year ago

@Luke: Even so, ordering would have no meaning after calling Distinct...

Vinay Sajip Over a year ago

@Luke: The question asks about fastest/most efficient, and doesn't require ordering to be maintained.

LukeH Over a year ago

@Noldorin: Why not? Distinct should/does iterate the list in order (although I'm not sure if that's actually guaranteed in any spec).

Noldorin Over a year ago

@Luke: Oh, I was thinking of indexing really. And anyway, efficiency was mentioned in the OP, while order wasn't (though that's an open question) - HashSet is the way to go if you want good performance.

Murilo Beltrame · Accepted Answer · 2010-12-11 13:41:01Z

5

In .NET 2.0, I'm pretty sure about this solution:

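// Works without LINQ; keeps the first occurrence of each item.
public IEnumerable<T> Distinct<T>(IEnumerable<T> source)
{
    List<T> uniques = new List<T>();
    foreach (T item in source)
    {
        // List<T>.Contains is a linear scan, so the method as a whole is O(n^2).
        if (!uniques.Contains(item)) uniques.Add(item);
    }
    return uniques;
}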

answered Dec 11, 2010 at 13:41

Murilo Beltrame


1 Comment

Timo Over a year ago

Please use a collection with faster random access than List, such as a Dictionary or HashSet. Because currently, if source contains 100,000 items with many duplicates, then in every one of the 100,000 iterations you will be scanning a list on the order of 100,000 items, meaning you are scanning on the order of 100,000 * 100,000 items. Quadratic time complexity can become quite slow.
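
One way to act on this comment, sketched under a few assumptions: HashSet<T> is available (it was added in .NET 3.5, so this does not apply to plain .NET 2.0), and the names LinearDistinct and DistinctInOrder are invented for the example. The set handles the membership check, while a separate list keeps the items in order of first appearance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
static class LinearDistinct
{
    public static List<T> DistinctInOrder<T>(IEnumerable<T> source)
    {
        var seen = new HashSet<T>();   // hash-based membership checks
        var uniques = new List<T>();   // preserves first-appearance order
        foreach (T item in source)
        {
            // Add returns true only the first time an item is seen.
            if (seen.Add(item)) uniques.Add(item);
        }
        return uniques;
    }

    static void Main()
    {
        var items = "A B A D A C".Split(' ');
        Console.WriteLine(string.Join(", ", DistinctInOrder(items))); // A, B, D, C
    }
}
```

Each item is hashed once instead of being compared against every item already kept, which avoids the quadratic scan described above.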
