Creating multiple dictionaries based on other dictionary values in Python

Question

I have a list that contains many dictionaries. Each dictionary represents a change that has occurred within my application. The "change" dictionary has the following entries:

userid: The user ID for a user
ctype: A reference to a change type in my application
score: A score

The ctype can be one of about 12 different strings, including "deletion", "new", and "edit". Here is an example of one of the "change" dictionaries:

{'userid':2, 'score':10, 'ctype':'edit'}

My question is, how can I create a dictionary that will aggregate all of the change types for each user within this large list of dictionaries? I would like to add the score from each change dictionary to create a total score and add each ctype instance together to get a count of each instance. The goal is to have a list of dictionaries with each dictionary looking like this:

{'userid':2, 'score':325, 'deletion':2, 'new':4, 'edit':9}

I have been trying to work this out, but I am pretty new to Python and I am not sure how to count the individual change types. The other part that gets me is how to look up a dictionary by 'userid'. If someone can present an answer, I am sure that all of this will become apparent to me. I appreciate any and all help.

jsbueno · Accepted Answer · 2011-03-04 12:34:21Z

1

The key to aggregating the data here is to have a dictionary where each key is a userid and each value is the data relevant to that userid.

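final_data = {}
for entry in data:  # data is the list of "change" dictionaries
    userid = entry["userid"]
    # Create the per-user record the first time this userid is seen.
    if userid not in final_data:
        final_data[userid] = {"userid": userid, "score": 0}
    final_data[userid]["score"] += entry["score"]
    # Count how many times each change type occurs for this user.
    if entry["ctype"] not in final_data[userid]:
        final_data[userid][entry["ctype"]] = 1
    else:
        final_data[userid][entry["ctype"]] += 1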

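If you want the result as a list of dictionaries, use list(final_data.values()). In Python 2, final_data.values() already returns a list; in Python 3 it returns a view, so wrap it in list().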
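
As a rough sketch of the same idea wrapped in a function, dict.setdefault and dict.get can stand in for the explicit membership checks. The function name aggregate_changes and the sample data below are assumptions made up for illustration, not part of the question:

```python
def aggregate_changes(changes):
    """Return one summary dict per user: total score plus a count per ctype."""
    totals = {}
    for entry in changes:
        # Fetch this user's summary, creating it the first time the user appears.
        summary = totals.setdefault(
            entry["userid"], {"userid": entry["userid"], "score": 0}
        )
        summary["score"] += entry["score"]
        # Count this change type, starting from 0 if it is new for the user.
        summary[entry["ctype"]] = summary.get(entry["ctype"], 0) + 1
    return list(totals.values())


# Hypothetical sample data in the shape described in the question.
data = [
    {"userid": 2, "score": 10, "ctype": "edit"},
    {"userid": 2, "score": 5, "ctype": "new"},
    {"userid": 1, "score": 3, "ctype": "edit"},
    {"userid": 2, "score": 7, "ctype": "edit"},
]

result = aggregate_changes(data)
```

One caveat with this flat layout: a ctype literally named "score" or "userid" would collide with the bookkeeping keys, so it only works if the change types never use those names.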

answered Mar 4, 2011 at 12:34

jsbueno


Jakob Bowyer · Accepted Answer · 2011-03-04 12:20:06Z

0

Could you use a nested structure like this?

(Mock-up, not real Python.)

{userid : {score : 1, ctype : ''}}

You can nest dicts as values in Python dictionaries.
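
To make the mock-up concrete, here is a minimal sketch of a nested dictionary in real Python. The outer key is the userid and the inner dictionary holds that user's data; all values are invented for illustration:

```python
# Outer dict is keyed by userid; each value is a dict of that user's data.
# The values below are invented for illustration.
users = {
    2: {"score": 325, "edit": 9, "new": 4, "deletion": 2},
}

# Look up a user's record by userid, then a field inside it.
edits_for_user_2 = users[2]["edit"]

# Update nested values in place.
users[2]["score"] += 10
users[2]["edit"] += 1

# Add a record for a user that is not present yet.
users.setdefault(7, {"score": 0})
```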

answered Mar 4, 2011 at 12:20

Jakob Bowyer


Eric Fortin · Accepted Answer · 2011-03-04 12:42:08Z

0

It could look like so:

change_types = ['deletion', 'new', 'edit', ...]  # replace the ... with the remaining change types
user_changes = {}
for change in change_list:
    userid = change['userid']
    if userid not in user_changes:
        # First change for this user: start the score and every counter at zero.
        aggregate = {'userid': userid, 'score': 0}
        for c in change_types:
            aggregate[c] = 0
        user_changes[userid] = aggregate
    else:
        aggregate = user_changes[userid]

    aggregate[change['ctype']] += 1
    aggregate['score'] += change['score']

Actually, making a class for the aggregates would be a good idea.
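
One possible shape for such a class is sketched below. The class name UserAggregate, its methods, and the sample change_list are all hypothetical, and collections.Counter is swapped in for the pre-filled zero counters, so unseen change types are simply absent from the flattened result:

```python
from collections import Counter


class UserAggregate:
    """Running totals for one user (hypothetical helper, not from the answer)."""

    def __init__(self, userid):
        self.userid = userid
        self.score = 0
        self.counts = Counter()  # ctype -> number of occurrences

    def add(self, change):
        """Fold one change dictionary into the totals."""
        self.score += change["score"]
        self.counts[change["ctype"]] += 1

    def as_dict(self):
        """Flatten to the dictionary layout requested in the question."""
        flat = {"userid": self.userid, "score": self.score}
        flat.update(self.counts)
        return flat


# Invented sample input in the shape described in the question.
change_list = [
    {"userid": 1, "score": 10, "ctype": "edit"},
    {"userid": 1, "score": 4, "ctype": "deletion"},
    {"userid": 1, "score": 1, "ctype": "edit"},
]

user_changes = {}
for change in change_list:
    userid = change["userid"]
    if userid not in user_changes:
        user_changes[userid] = UserAggregate(userid)
    user_changes[userid].add(change)

summary = user_changes[1].as_dict()
```

Keeping the counts in a separate Counter means a change type can never overwrite the userid or score fields until the record is flattened with as_dict().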

edited Mar 4, 2011 at 12:42

answered Mar 4, 2011 at 12:29

Eric Fortin


pberkes · Accepted Answer · 2011-03-04 13:02:09Z

0

To index the dictionaries with respect to userid, you can use a dictionary of dictionaries:

from collections import defaultdict

dict1 = {'userid': 1, 'score': 10, 'ctype': 'edit'}
dict2 = {'userid': 2, 'score': 13, 'ctype': 'other'}
dict3 = {'userid': 1, 'score': 1, 'ctype': 'edit'}
list_of_dicts = [dict1, dict2, dict3]

user_dict = defaultdict(lambda: defaultdict(int))
for d in list_of_dicts:
    userid = d['userid']
    user_dict[userid]['score'] += d['score']
    user_dict[userid][d['ctype']] += 1


# user_dict is now
# defaultdict(<function <lambda> at 0x02A7DF30>,
#  {1: defaultdict(<type 'int'>, {'edit': 2, 'score': 11}),
#   2: defaultdict(<type 'int'>, {'score': 13, 'other': 1})})

In the example, I used a defaultdict to avoid checking on every iteration whether the keys 'score' and d['ctype'] already exist.
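
The nested defaultdict is keyed by userid and does not store the userid inside each inner dictionary. A possible way to turn it into the list of plain dictionaries asked for in the question is sketched here; the sample input repeats the three dictionaries from this answer, and the name result is an assumption:

```python
from collections import defaultdict

list_of_dicts = [
    {"userid": 1, "score": 10, "ctype": "edit"},
    {"userid": 2, "score": 13, "ctype": "other"},
    {"userid": 1, "score": 1, "ctype": "edit"},
]

user_dict = defaultdict(lambda: defaultdict(int))
for d in list_of_dicts:
    user_dict[d["userid"]]["score"] += d["score"]
    user_dict[d["userid"]][d["ctype"]] += 1

# Copy each inner defaultdict into a plain dict and add the userid field.
result = [dict(counts, userid=userid) for userid, counts in user_dict.items()]
```

Converting to plain dictionaries also avoids a common surprise with defaultdict: merely reading a missing key on the original structure would silently create it.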

edited Mar 4, 2011 at 13:02

answered Mar 4, 2011 at 12:28

pberkes


2 Comments

Björn Pollex Over a year ago

Prefer userid in user_dict over has_key.

Mark Longair Over a year ago

You could also do user_dict = defaultdict(lambda: defaultdict(int))
