Way to detect duplicate key values in a dict array in Python?

Question

Given the code below:

a = [{"name": "Sport"}, {"name": "Games"}, {"name": "Videos"}, {"name": "Sport"}]

How can I find out whether more than one dict in the list a has the same "name" value? In the example above, the result should be "Sport".

Thanks in advance.

Adam Smith · Accepted Answer · 2015-05-26 23:39:36Z

9

There are lots of good ways to do this. The canonical one in Python is probably to use a collections.Counter:

from collections import Counter

c = Counter([d['name'] for d in a])
for value,count in c.items():
    if count > 1:
        print(value)

If all you need to know is whether or not something is a duplicate (not how many times it has been duplicated), you can simplify by just using a seen set.

seen = set()
for d in a:
    val = d['name']
    if val in seen:
        print(val)
    seen.add(val)
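Note that the loop above prints a name once for every repeat, so a name that appears three times is printed twice. If each duplicate should be reported only once, a second set can track what has already been reported. This is a minimal sketch; the names reported and duplicates are illustrative, and the sample list is altered to contain "Sport" three times.

```python
a = [{"name": "Sport"}, {"name": "Games"}, {"name": "Sport"}, {"name": "Sport"}]

seen = set()       # every name encountered so far
reported = set()   # names already recorded as duplicates
duplicates = []    # duplicates, in the order they first repeat

for d in a:
    val = d["name"]
    if val in seen and val not in reported:
        duplicates.append(val)
        reported.add(val)
    seen.add(val)

print(duplicates)  # ['Sport']
```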

Probably the most over-engineered way to do it would be to sort the dicts by their "name" value, then run a groupby and check each group's length.

from itertools import groupby
from operator import itemgetter

namegetter = itemgetter('name')

new_a = sorted(a, key=namegetter)

groups = groupby(new_a, namegetter)

for groupname, dicts in groups:
    if len(list(dicts)) > 1:
        print(groupname)
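The sort matters because groupby only merges adjacent items with equal keys. The sketch below compares group sizes with and without sorting; it counts with sum(1 for _ in group) instead of building a list, which is a small variation on the code above and not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter

a = [{"name": "Sport"}, {"name": "Games"}, {"name": "Videos"}, {"name": "Sport"}]
namegetter = itemgetter("name")

# Unsorted: the two "Sport" dicts are not adjacent, so each forms its own group.
unsorted_sizes = [
    (name, sum(1 for _ in group)) for name, group in groupby(a, namegetter)
]

# Sorted: equal names sit next to each other and fall into one group.
sorted_sizes = [
    (name, sum(1 for _ in group))
    for name, group in groupby(sorted(a, key=namegetter), namegetter)
]

print(unsorted_sizes)  # [('Sport', 1), ('Games', 1), ('Videos', 1), ('Sport', 1)]
print(sorted_sizes)    # [('Games', 1), ('Sport', 2), ('Videos', 1)]
```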

edited May 26, 2015 at 23:39

answered May 26, 2015 at 23:30


Óscar López · Accepted Answer · 2015-05-26 23:40:48Z

2

This builds on @AdamSmith's answer, but is a bit shorter thanks to the use of list comprehensions:

from collections import Counter

a = [{"name": "Sport"}, {"name": "Games"}, {"name": "Videos"}, {"name": "Sport"}]
[name for name, count in Counter(x['name'] for x in a).items() if count > 1]

As a result we'll get a list of the duplicates:

['Sport']
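If the same check is needed for other keys, the one-liner can be wrapped in a small helper. This is only a sketch: the function name duplicate_values is made up for illustration, and skipping dicts that lack the key is an assumption (the original code would raise KeyError instead). The values must be hashable for Counter to work.

```python
from collections import Counter


def duplicate_values(dicts, key):
    """Return the values stored under `key` that occur in more than one dict."""
    # Assumption: dicts without `key` are skipped rather than raising KeyError.
    counts = Counter(d[key] for d in dicts if key in d)
    return [value for value, count in counts.items() if count > 1]


a = [{"name": "Sport"}, {"name": "Games"}, {"name": "Videos"}, {"name": "Sport"}]
print(duplicate_values(a, "name"))  # ['Sport']
```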

edited May 26, 2015 at 23:40

answered May 26, 2015 at 23:38


2 Comments

Malik Brahimi Over a year ago

Again, not just the name key. Test anything that may be contained.

Adam Smith Over a year ago

@MalikBrahimi OP explicitly asks to test for duplicate names.

Stefan Pochmann · Accepted Answer · 2015-05-27 00:36:34Z

2

A couple more ways:

>>> seen = set()
>>> {n for n in (d['name'] for d in a) if n in seen or seen.add(n)}
{'Sport'}

>>> seen = set()
>>> {n for d in a for n in [d['name']] if n in seen or seen.add(n)}
{'Sport'}

>>> k, seen = 'name', set()
>>> {d[k] for d in a if d[k] in seen or seen.add(d[k])}
{'Sport'}

>>> seen = {}
>>> {d['name'] for i, d in enumerate(a) if seen.setdefault(d['name'], i) != i}
{'Sport'}

>>> seen = {}
>>> {d['name'] for d in a if seen.setdefault(d['name'], id(d)) != id(d)}
{'Sport'}

>>> x = set(), set()
>>> for n in (d['name'] for d in a): x[n in x[0]].add(n)
>>> x[1]
{'Sport'}
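Most of these rely on set.add returning None, which is falsy. In n in seen or seen.add(n), a name already in seen makes the condition true without calling add; otherwise add runs, records the name, and the whole condition evaluates to None, so the name is filtered out. A plain-loop equivalent of the first variant, written out as a sketch (the names seen2 and compact are just for the comparison):

```python
a = [{"name": "Sport"}, {"name": "Games"}, {"name": "Videos"}, {"name": "Sport"}]

seen = set()
duplicates = set()
for d in a:
    n = d["name"]
    if n in seen:
        duplicates.add(n)  # left side of `or` is True: the name is kept
    else:
        seen.add(n)        # add() returns None (falsy): the name is dropped but remembered

# The compact form from the answer produces the same set.
seen2 = set()
compact = {n for n in (d["name"] for d in a) if n in seen2 or seen2.add(n)}

print(duplicates)  # {'Sport'}
print(compact)     # {'Sport'}
```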

edited May 27, 2015 at 0:36

answered May 26, 2015 at 23:41


1 Comment

Adam Smith Over a year ago

A little difficult to follow, but nice use of the "or" short-circuit.

Shashank · Accepted Answer · 2015-05-26 23:44:43Z

0

Sets are a useful data structure here because insertion and membership tests take constant time on average, whereas a membership test on a list takes linear time.

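def first_duplicate_name(dictlist):
    """Return the first "name" value that repeats, or None if none does."""
    seen = set()
    seen_add = seen.add  # bind the method once to avoid repeated attribute lookups
    for dct in dictlist:
        k = dct['name']
        if k in seen:  # membership test: O(1) on average
            return k
        seen_add(k)  # insertion: also O(1) on average
    return None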
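A quick usage sketch, with the function repeated in condensed form so the block runs on its own. It returns as soon as the first repeat is found, and returns None when every name is unique.

```python
def first_duplicate_name(dictlist):
    """Return the first "name" value that repeats, or None if none does."""
    seen = set()
    for dct in dictlist:
        k = dct["name"]
        if k in seen:
            return k
        seen.add(k)
    return None


a = [{"name": "Sport"}, {"name": "Games"}, {"name": "Videos"}, {"name": "Sport"}]

print(first_duplicate_name(a))      # Sport
print(first_duplicate_name(a[:3]))  # None (the first three names are all unique)
```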

edited May 26, 2015 at 23:44

answered May 26, 2015 at 23:34
