
I have a list that contains many dictionaries. Each dictionary represents a change that has occurred within my application. The "change" dictionary has the following entries:

userid: The user ID for a user
ctype: A reference to a change type in my application
score: A score

The ctype can be one of about 12 different strings, including "deletion", "new", "edit" and others. Here is an example of one of the "change" dictionaries:

{'userid':2, 'score':10, 'ctype':'edit'}

My question is, how can I create a dictionary that will aggregate all of the change types for each user within this large list of dictionaries? I would like to add the score from each change dictionary to create a total score and add each ctype instance together to get a count of each instance. The goal is to have a list of dictionaries with each dictionary looking like this:

{'userid':2, 'score':325, 'deletion':2, 'new':4, 'edit':9}

I have been trying to work this out, but I am pretty new to Python and I wasn't sure how to count the actual change types. The other part that gets me is how to refer to a dictionary based on 'userid'. If someone can present an answer, I am sure all of this will become apparent to me. I appreciate any and all help.

4 Answers

1

The key to aggregating the data here is a dictionary in which each key is a userid and each value holds the data relevant to that userid.

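final_data = {}
for entry in data:  # data is the list of change dictionaries
    userid = entry["userid"]
    # Create the per-user record the first time this userid appears
    if userid not in final_data:
        final_data[userid] = {"userid": userid, "score": 0}
    final_data[userid]["score"] += entry["score"]
    # Count this change type, starting at 1 on its first occurrence
    if entry["ctype"] not in final_data[userid]:
        final_data[userid][entry["ctype"]] = 1
    else:
        final_data[userid][entry["ctype"]] += 1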

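If you want the result as a list of dictionaries, use list(final_data.values()). In Python 3, values() returns a view rather than a list, so wrap it in list() when you need an actual list.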
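For anyone who wants to try this end to end, here is a minimal sketch of the same idea wrapped in a function. The name aggregate_changes and the sample data are made up for illustration; setdefault and get are used only to shorten the "is the key there yet?" checks, and list() is applied because values() gives a view in Python 3.

```python
def aggregate_changes(changes):
    """Sum scores and count change types per user (hypothetical helper)."""
    totals = {}
    for entry in changes:
        userid = entry["userid"]
        # setdefault returns the existing record, or stores and returns a new one
        record = totals.setdefault(userid, {"userid": userid, "score": 0})
        record["score"] += entry["score"]
        # get() supplies 0 the first time a ctype is seen for this user
        record[entry["ctype"]] = record.get(entry["ctype"], 0) + 1
    return list(totals.values())


# Made-up sample data in the shape described in the question
sample = [
    {"userid": 2, "score": 10, "ctype": "edit"},
    {"userid": 2, "score": 5, "ctype": "new"},
    {"userid": 3, "score": 7, "ctype": "edit"},
    {"userid": 2, "score": 1, "ctype": "edit"},
]
result = aggregate_changes(sample)
print(result)
```

Each element of result has the flat shape asked for in the question: 'userid', 'score', and one key per change type that the user actually has.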


0

Could you use a nested structure like this?

(This is a mock-up, not real Python.)

{userid : {score : 1, ctype : ''}}

You can nest dicts as values in Python dictionaries.
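One way the mock-up might translate into real Python is sketched below. This is an interpretation, not the only reading: it assumes the inner 'ctype' slot becomes a nested dict of counts (called 'ctypes' here), and the sample data is invented.

```python
# Made-up sample data in the shape described in the question
changes = [
    {"userid": 2, "score": 10, "ctype": "edit"},
    {"userid": 2, "score": 5, "ctype": "new"},
    {"userid": 3, "score": 7, "ctype": "edit"},
    {"userid": 2, "score": 1, "ctype": "edit"},
]

by_user = {}
for change in changes:
    # Outer dict is keyed by userid; each value is itself a dict
    entry = by_user.setdefault(change["userid"], {"score": 0, "ctypes": {}})
    entry["score"] += change["score"]
    # Innermost dict maps each change type to how often it occurred
    counts = entry["ctypes"]
    counts[change["ctype"]] = counts.get(change["ctype"], 0) + 1

print(by_user[2]["score"])            # total score for user 2
print(by_user[2]["ctypes"]["edit"])   # number of edits by user 2
```

Keeping the counts in their own nested dict means a change type can never clash with the 'score' key, at the cost of one more level of lookup.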


0

It could look like so:

change_types = ['deletion', 'new', 'edit', ...]
user_changes = {}
for change in change_list:
    userid = change['userid']
    if userid not in user_changes:
        # First change for this user: start every counter at zero
        aggregate = {}
        aggregate['score'] = 0
        for c in change_types:
            aggregate[c] = 0
        aggregate['userid'] = userid
        user_changes[userid] = aggregate
    else:
        aggregate = user_changes[userid]

    # Count this change and add its score to the user's total
    change_type = change['ctype']
    aggregate[change_type] += 1
    aggregate['score'] += change['score']

Actually making a class for the aggregates would be a good idea.
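A rough sketch of what such a class might look like follows. The class name UserAggregate, its methods, and the sample data are all hypothetical; it uses collections.Counter for the per-type counts instead of pre-filling every change type with zero.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UserAggregate:
    """Hypothetical container for one user's running totals."""
    userid: int
    score: int = 0
    counts: Counter = field(default_factory=Counter)

    def add(self, change):
        # Fold one change dictionary into the totals
        self.score += change["score"]
        self.counts[change["ctype"]] += 1

    def as_dict(self):
        # Flatten into the shape asked for in the question
        return {"userid": self.userid, "score": self.score, **self.counts}


# Made-up sample data
change_list = [
    {"userid": 2, "score": 10, "ctype": "edit"},
    {"userid": 2, "score": 5, "ctype": "new"},
    {"userid": 3, "score": 7, "ctype": "edit"},
    {"userid": 2, "score": 1, "ctype": "edit"},
]

user_changes = {}
for change in change_list:
    userid = change["userid"]
    if userid not in user_changes:
        user_changes[userid] = UserAggregate(userid)
    user_changes[userid].add(change)

flat = [agg.as_dict() for agg in user_changes.values()]
print(flat)
```

With a Counter, a change type that never occurred for a user is simply absent from as_dict(), although agg.counts['deletion'] still returns 0 when asked directly.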


0

To index the dictionaries with respect to userid, you can use a dictionary of dictionaries:

from collections import defaultdict

dict1 = {'userid': 1, 'score': 10, 'ctype': 'edit'}
dict2 = {'userid': 2, 'score': 13, 'ctype': 'other'}
dict3 = {'userid': 1, 'score': 1, 'ctype': 'edit'}
list_of_dicts = [dict1, dict2, dict3]

user_dict = defaultdict(lambda: defaultdict(int))
for d in list_of_dicts:
    userid = d['userid']
    user_dict[userid]['score'] += d['score']
    user_dict[userid][d['ctype']] += 1


# user_dict is now
# defaultdict(<function <lambda> at 0x02A7DF30>,
#  {1: defaultdict(<type 'int'>, {'edit': 2, 'score': 11}),
#   2: defaultdict(<type 'int'>, {'score': 13, 'other': 1})})

In the example, I used a defaultdict to avoid checking at every iteration if the key d['ctype'] exists.
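If you then need the list of flat dictionaries from the question, one possible way to get there from the nested defaultdict is sketched below. The variable name flat is an assumption, and the loop is repeated so the snippet runs on its own.

```python
from collections import defaultdict

list_of_dicts = [
    {'userid': 1, 'score': 10, 'ctype': 'edit'},
    {'userid': 2, 'score': 13, 'ctype': 'other'},
    {'userid': 1, 'score': 1, 'ctype': 'edit'},
]

# Same aggregation as above: missing keys start at int() == 0
user_dict = defaultdict(lambda: defaultdict(int))
for d in list_of_dicts:
    user_dict[d['userid']]['score'] += d['score']
    user_dict[d['userid']][d['ctype']] += 1

# Turn each inner defaultdict into a plain dict and put the userid back in
flat = [{'userid': userid, **counts} for userid, counts in user_dict.items()]
print(flat)
```

This relies on no change type being literally named 'score' or 'userid'; if that could happen, the counts would need their own nested dictionary.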

2 Comments

Prefer userid in user_dict over has_key.
You could also do user_dict = defaultdict(lambda: defaultdict(int))
