I have a list of dictionaries, where some "term" values are repeated:

terms_dict = [{'term': 'potato', 'cui': '123AB'}, {'term': 'carrot', 'cui': '222AB'}, {'term': 'potato', 'cui': '456AB'}]

As you can see, the term 'potato' appears more than once. I would like to store this term in a variable for future reference, then remove every dictionary with a repeated term from terms_dict, leaving only the 'carrot' dictionary in the list.

Desired output:

repeated_terms = ['potato'] ## terms that appear more than once in terms_dict

new_terms_dict = [{'term': 'carrot', 'cui': '222AB'}] ## new list holding only the dictionaries whose term is unique

Idea:

I can certainly create a new list containing only the unique terms; however, I am stuck on identifying the "term" that is repeated and storing it in a list.

Is there a Pythonic way of finding, printing, and storing the repeated values?

2 Answers

You can use collections.Counter for the task:

from collections import Counter

terms_dict = [
    {"term": "potato", "cui": "123AB"},
    {"term": "carrot", "cui": "222AB"},
    {"term": "potato", "cui": "456AB"},
]

c = Counter(d["term"] for d in terms_dict)

repeated_terms = [k for k, v in c.items() if v > 1]
new_terms_dict = [d for d in terms_dict if c[d["term"]] == 1]

print(repeated_terms)
print(new_terms_dict)

Prints:

['potato']
[{'term': 'carrot', 'cui': '222AB'}]
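
If this is needed in more than one place, the same Counter approach can be wrapped in a small helper. This is only a sketch: the function name split_repeated and its key parameter are hypothetical names chosen for illustration, and the third 'potato' entry (with the made-up cui '789AB') is added just to show that a term occurring three times is still reported once.

```python
from collections import Counter


def split_repeated(records, key="term"):
    """Return (repeated_values, records whose key value occurs exactly once).

    Hypothetical helper, not part of any library.
    """
    # Count how many times each value of `key` occurs across all records.
    counts = Counter(record[key] for record in records)
    # Each repeated value is listed once, in order of first appearance.
    repeated = [value for value, n in counts.items() if n > 1]
    # Keep only the records whose value occurs exactly once.
    unique_records = [record for record in records if counts[record[key]] == 1]
    return repeated, unique_records


terms_dict = [
    {"term": "potato", "cui": "123AB"},
    {"term": "carrot", "cui": "222AB"},
    {"term": "potato", "cui": "456AB"},
    {"term": "potato", "cui": "789AB"},  # illustrative extra entry
]

repeated_terms, new_terms_dict = split_repeated(terms_dict)
print(repeated_terms)
print(new_terms_dict)
```

Both passes over the list are linear, so this should scale reasonably to long lists, provided the values under the chosen key are hashable.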

You can use drop_duplicates and duplicated from pandas:

>>> import pandas as pd
>>> df = pd.DataFrame(terms_dict)
>>> df.term[df.term.duplicated()].unique().tolist() # repeats, each listed once even if a term occurs 3+ times
['potato']
>>> df.drop_duplicates('term', keep=False).to_dict('records') # without repeats
[{'term': 'carrot', 'cui': '222AB'}]
