1

Let's say that I have a Dictionary like this

dict1 = [{
           'Name': 'Team1',
           'id': '1',
           'Members': [
                      {
                          'type': 'user',
                          'id': '11'
                      },
                      {
                          'type': 'user',
                          'id': '12' 
                      }
                      ]
        },
        {
            'Name': 'Team2',
            'id': '2',
            'Members': [
                         {
                            'type': 'group'
                            'id': '1'
                         },
                         {
                          'type': 'user',
                          'id': '21' 
                      }
                       ]
        },
        {
            'Name': 'Team3',
            'id': '3',
            'Members': [
                         {
                            'type': 'group'
                            'id': '2'
                         }
                       ]
        }]

and I want to get an output that can replace all the groups and nested groups with all distinct users. In this case the output should look like this:

dict2 = [{
           'Name': 'Team1',
           'id': '1',
           'Members': [
                      {
                          'type': 'user',
                          'id': '11'
                      },
                      {
                          'type': 'user',
                          'id': '12' 
                      }
                      ]
        },
        {
            'Name': 'Team2',
            'id': '2',
            'Members': [
                         {
                           'type': 'user',
                           'id': '11'
                         },
                         {
                          'type': 'user',
                          'id': '12' 
                         }
                         {
                          'type': 'user',
                          'id': '21' 
                         }
                       ]
        },
        {
            'Name': 'Team3',
            'id': '3',
            'Members': [
                         {
                           'type': 'user',
                           'id: '11'
                         },
                         {
                          'type': 'user',
                          'id': '12' 
                         }
                         {
                          'type': 'user',
                          'id': '21' 
                         }
                       ]
        }]

Now let's assume that I have a large dataset to perform these actions on. (approx 20k individual groups)

What would be the best way to code this? I am attempting recursion, but I am not sure about how to search through the dictionary and lists in this manner such that it doesn't end up using too much memory

10
  • 1
    Is this json data by any chance? Commented Feb 4, 2021 at 5:40
  • Recursion/iteration would be fine. Even with 20000 groups, ram usage will remain under 100mb according to my calculations. Commented Feb 4, 2021 at 5:50
  • @pavel Definitely not JSON format. " and variables. But Python dictionary and JSON are usually inter-changeable. Commented Feb 4, 2021 at 6:25
  • @TanishqBanyal being a bit unfamiliar with python, I keep having issues when trying to do a recursion loop using this data. The main issue I'm having is how can I search through the dictionary and list, and how can I create the final dictionary.. I wonder if there is a link/ another qs (was not able to find one similar to my qs) asked that I could refer to? Commented Feb 4, 2021 at 6:31
  • 1
    @RohitJainani Your data descriptions are syntactically wrong. Are they wrapped as a list, e.g., dict1 = [...]? Commented Feb 4, 2021 at 6:53

1 Answer 1

1

I do not think you need recursion. Looping is enough.

I think you can simply evaluate each Memberss, fetch users if group type, and make them unique. Then you can simply replace Members's value with distinct_users.

You might have a dictionary for groups like:

group_dict = { 
    '1': [
        {'type': 'user', 'id': '11'},
        {'type': 'user', 'id': '12'}
    ],  
    '2': [
        {'type': 'user', 'id': '11'},
        {'type': 'user', 'id': '12'},
        {'type': 'user', 'id': '21'}
    ],
    '3': [
        {'type': 'group', 'id': '1'},
        {'type': 'group', 'id': '2'},
        {'type': 'group', 'id': '3'}  # recursive
    ]
    ...   
}

You can try:

def users_in_group(group_id):
    users = []
    groups_to_fetch = []
    for user_or_group in group_dict[group_id]:
        if user_or_group['type'] == 'group':
            groups_to_fetch.append(user_or_group)
        else:  # 'user' type
            users.append(user_or_group)
    
    groups_fetched = set()  # not to loop forever
    while groups_to_fetch:
        group = groups_to_fetch.pop()
        if group['id'] not in groups_fetched:
            groups_fetched.add(group['id'])
            for user_or_group in group_dict[group['id']]:
                if user_or_group['type'] == 'group' and user_or_group['id'] not in groups_fetched:
                    groups_to_fetch.append(user_or_group)
                else:  # 'user' type
                    users.append(user_or_group)

    return users


def distinct_users_in(members):
    distinct_users = []

    def add(user):
        if user['id'] not in user_id_set:
            distinct_users.append(user)
        user_id_set.add(user['id'])

    user_id_set = set()
    for member in members:
        if member['type'] == 'group':
            for user in users_in_group(member['id']):
                add(user)
        else:  # 'user'
            user = member
            add(user)

    return distinct_users


dict2 = dict1  # or `copy.deepcopy`
for element in dict2:
    element['Members'] = distinct_users_in(element['Members'])

Each Members is re-assigned by distinct_users returned by the corresponding function. The function takes Members and fetches users from each if member type. If user type, member itself is a user. While (fetched) users are appended to distinct_user, you can use their ids for uniquity.

When you fetch users_in_group, you can use two lists; groups_to_fetch and groups_fetched. The former is a stack to recursively fetch all groups in a group. The latter is not to fetch an already fetched group again. Or, it could loop forever.

Finally, if your data are already in memory, this approach may not exhaust memory and work.

Sign up to request clarification or add additional context in comments.

3 Comments

Would this implementation still work on nested groups? For eg, in my above example group 3 needs group 2, which contains group1, so group 3 should get all members of group 1 and 2.
@RohitJainani Ah, that's why you considered recursion?
yes, since the dataset is quite large, around 20k groups, some of which have their own subgroups, which have their own etc. Think of it like a mimic of an active directory

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.