
I'm having trouble with itertools.groupby. Given a list of strings,

my_list = ["AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD", "AD02", "AD02AA", "AD02AB", "AD02AC"]

From this list, I want to create a list of dictionaries, where each key is a short name and the values are the longer names that start with it.

For example:

[
{"Legacy" : "AD01", "rphy" : ["AD01AA", "AD01AB", "AD01AC", "AD01AD"]},
{"Legacy" : "AD02", "rphy" : ["AD02AA", "AD02AB", "AD02AC"]},
]

Could you help me, please?

  • Is 2 in the title a typo? Commented May 2, 2022 at 18:08

2 Answers


You can use itertools.groupby with a couple of next calls:

from itertools import groupby

my_list = ["AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD", "AD02", "AD02AA", "AD02AB", "AD02AC"]

groups = groupby(my_list, len)
output = [{'Legacy': next(g), 'rphy': list(next(groups)[1])} for _, g in groups]

print(output)
# [{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB', 'AD01AC', 'AD01AD']},
#  {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]

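This relies on the input order: each legacy name must be immediately followed by its rphy entries, so it is not robust to reordering of the input list.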
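
If the order cannot be trusted, one possible workaround is to sort first and group on a prefix instead of on the length. The following is only a sketch: the function name group_sorted and the key_len parameter are made up for illustration, and it assumes that every legacy name is exactly 4 characters long and is a prefix of its rphy entries.

```python
from itertools import groupby


def group_sorted(names, key_len=4):
    """Group names by their first key_len characters, regardless of input order.

    Assumption: a legacy name is exactly key_len characters long and every
    rphy name starts with its legacy name.
    """
    output = []
    # groupby only merges adjacent items, so the input is sorted first.
    for prefix, group in groupby(sorted(names), key=lambda s: s[:key_len]):
        # Everything in the group except the bare prefix is an rphy entry.
        rphy = [name for name in group if name != prefix]
        # Note: the prefix is reported as "Legacy" even if the bare legacy
        # name itself is missing from the input.
        output.append({"Legacy": prefix, "rphy": rphy})
    return output


shuffled = ["AD02AB", "AD01", "AD01AB", "AD02", "AD01AA", "AD02AA"]
print(group_sorted(shuffled))
# [{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB']},
#  {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB']}]
```

Sorting costs O(n log n), but it removes the dependence on each legacy name appearing directly before its entries.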

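Also, if there is some "gap" in the input, e.g., if "AD01" does not have corresponding 'rphy' entries, the pairing goes wrong: adjacent legacy names merge into one group, so the code either raises a StopIteration error (as you have found out) or silently attaches entries to the wrong legacy name. In that case you can use a more conventional approach: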

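my_list = ["AD01", "AD02", "AD02AA", "AD02AB", "AD02AC"]

output = []
for item in my_list:
    if len(item) == 4:
        # A 4-character name starts a new legacy entry.
        dct = {'Legacy': item, 'rphy': []}
        output.append(dct)
    else:
        # Longer names belong to the most recent legacy entry.
        # (Assumes the list starts with a legacy name; otherwise dct is undefined.)
        dct['rphy'].append(item)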

print(output)
# [{'Legacy': 'AD01', 'rphy': []}, {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]
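
If some entries have no matching legacy name at all (for example "AD03AC" when "AD03" is not in the list), a prefix lookup can collect them in a separate list. This is a rough sketch, not a drop-in solution: split_by_legacy and key_len are hypothetical names, and it assumes a legacy name is exactly 4 characters long and is the prefix of its rphy entries.

```python
def split_by_legacy(names, key_len=4):
    """Return (grouped, orphans).

    Assumption: a legacy name has exactly key_len characters, and an rphy
    name belongs to the legacy name equal to its first key_len characters.
    """
    # dicts keep insertion order, so legacy names stay in input order.
    legacy = {name: [] for name in names if len(name) == key_len}
    orphans = []
    for name in names:
        if len(name) == key_len:
            continue  # already registered as a legacy name
        parent = name[:key_len]
        if parent in legacy:
            legacy[parent].append(name)
        else:
            # No legacy name with this prefix exists: keep it aside.
            orphans.append(name)
    grouped = [{"Legacy": key, "rphy": rphy} for key, rphy in legacy.items()]
    return grouped, orphans


my_list = ["AD01", "AD01AA", "AD01AB", "AD02", "AD02AA", "AD02AB",
           "AD03AC", "AD03AB", "AD03AC"]
output, orphans = split_by_legacy(my_list)
print(output)
# [{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB']},
#  {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB']}]
print(orphans)
# ['AD03AC', 'AD03AB', 'AD03AC']
```

Because it looks names up by prefix, this version does not depend on the order of the input.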

3 Comments

My list is much larger, and when I run this on it I get the following error:
Traceback (most recent call last):
  File "c:\Users\bgamboa\Documents\Proyectos\listnodos\main.py", line 37267, in <module>
    output = [{'Legacy': next(g), 'rphy': list(next(groups)[1])} for _, g in groups]
  File "c:\Users\bgamboa\Documents\Proyectos\listnodos\main.py", line 37267, in <listcomp>
    output = [{'Legacy': next(g), 'rphy': list(next(groups)[1])} for _, g in groups]
StopIteration

If I wanted to omit all those that do not have a legacy, how would I do it?

But the example does not match what I need. It would be something like this: my_list = ["AD01", "AD01AA", "AD01AB", "AD02", "AD02AA", "AD02AB", "AD03AC", "AD03AB", "AD03AC"]. I would take all the ones that don't have a legacy, such as "AD03AC", and save them in another list.

One approach would be the following (see the note at the end of the answer):

from itertools import groupby
from pprint import pprint

my_list = [
    "AD01",
    "AD01AA",
    "AD01AB",
    "AD01AC",
    "AD01AD",
    "AD02",
    "AD02AA",
    "AD02AB",
    "AD02AC",
]

res = []
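for _, g in groupby(my_list, len):
    lst = list(g)
    if len(lst) == 1:
        # A run containing a single name is treated as a key.
        res.append({"Legacy": lst[0], "rphy": []})
    else:
        # A longer run holds the values for the latest key;
        # extend (rather than append) keeps "rphy" a flat list.
        res[-1]["rphy"].extend(lst)

pprint(res)

output:

[{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB', 'AD01AC', 'AD01AD']},
 {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]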

This assumes that your data always starts with a desired key (the name that is shorter than the values that follow it).

In every iteration you check the length of the list that groupby produced. If it is 1, that item is a key; otherwise its items are added to the most recent dictionary.

Note: This code breaks unless at least two longer names sit between every pair of consecutive keys. A single longer name forms a run of length 1 and is mistaken for a key, and two adjacent keys form a run of length 2 and are treated as values.
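
One way around that limitation might be to decide membership with str.startswith rather than by run length. The sketch below is an alternative technique, not the groupby approach above; group_by_startswith is a made-up name, and it assumes every key appears in the list before the names that start with it.

```python
def group_by_startswith(names):
    """Attach each name to the latest key it starts with.

    Assumption: every key appears before the names that start with it.
    A name that does not start with the latest key becomes a new key.
    """
    result = []
    for name in names:
        current = result[-1] if result else None
        if (
            current is not None
            and name != current["Legacy"]
            and name.startswith(current["Legacy"])
        ):
            current["rphy"].append(name)
        else:
            result.append({"Legacy": name, "rphy": []})
    return result


# One value after "AD01" and none after "AD02": both cases are handled.
print(group_by_startswith(["AD01", "AD01AA", "AD02", "AD03", "AD03AA", "AD03AB"]))
# [{'Legacy': 'AD01', 'rphy': ['AD01AA']},
#  {'Legacy': 'AD02', 'rphy': []},
#  {'Legacy': 'AD03', 'rphy': ['AD03AA', 'AD03AB']}]
```

Be aware that a longer name whose key is missing would itself be reported as a key here, so filter those out separately if that matters.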
