python: flatten list while preserving nested structure for certain indexes

Question

I found several posts about flattening/collapsing lists in Python, but none that cover this case:

Input:

[a_key_1, a_key_2, a_value_1, a_value_2]
[b_key_1, b_key_2, b_value_1, b_value_2]
[a_key_1, a_key_2, a_value_3, a_value_4]
[a_key_1, a_key_3, a_value_5, a_value_6]

Output:

[a_key_1, a_key_2, [a_value_1, a_value_3], [a_value_2, a_value_4]]
[b_key_1, b_key_2, [b_value_1], [b_value_2]]
[a_key_1, a_key_3, [a_value_5], [a_value_6]]

I want to flatten the lists so there is only one entry per unique set of keys and the remaining values are combined into nested lists next to those unique keys.

EDIT: The first two elements in the input will always be the keys; the last two elements will always be the values.

Is this possible?

It's simply based on position in the original input: positions 0 and 1 are always keys, and positions 2 and 3 are always values.
– okoboko, commented May 22, 2015 at 1:47

Tyson · Accepted Answer · 2015-05-22 02:21:57Z

Yes, it's possible. Here's a function (with doctest from your input/output) that performs the task:

#!/usr/bin/env python
"""Flatten lists as per http://stackoverflow.com/q/30387083/253599."""

from collections import OrderedDict


def flatten(key_length, *args):
    """
    Take lists having key elements and collect remainder into result.

    >>> flatten(1,
    ...         ['A', 'a1', 'a2'],
    ...         ['B', 'b1', 'b2'],
    ...         ['A', 'a3', 'a4'])
    [['A', ['a1', 'a2'], ['a3', 'a4']], ['B', ['b1', 'b2']]]

    >>> flatten(2,
    ...         ['A1', 'A2', 'a1', 'a2'],
    ...         ['B1', 'B2', 'b1', 'b2'],
    ...         ['A1', 'A2', 'a3', 'a4'],
    ...         ['A1', 'A3', 'a5', 'a6'])
    [['A1', 'A2', ['a1', 'a2'], ['a3', 'a4']], ['B1', 'B2', ['b1', 'b2']], ['A1', 'A3', ['a5', 'a6']]]
    """
    result = OrderedDict()
    for vals in args:
        result.setdefault(
            tuple(vals[:key_length]), [],
        ).append(vals[key_length:])
    return [
        list(key) + list(vals)
        for key, vals
        in result.items()
    ]


if __name__ == '__main__':
    import doctest
    doctest.testmod()

(Edited to work with both your original question and the edited question)
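
Note that the function above keeps each input row's values together (for example ['a1', 'a2'], ['a3', 'a4']), whereas the output in the question collects the values by position. Below is a minimal sketch of a column-wise variant, assuming that layout is what is wanted. The name flatten_columns is hypothetical and not part of the original answer, and the sketch relies on plain dicts preserving insertion order (Python 3.7+).

```python
def flatten_columns(key_length, *rows):
    """Group rows by their first key_length items; collect the rest by position."""
    grouped = {}  # assumption: Python 3.7+, where dicts keep insertion order
    for row in rows:
        grouped.setdefault(tuple(row[:key_length]), []).append(row[key_length:])
    # zip(*value_rows) transposes the collected rows into per-position columns.
    return [
        list(key) + [list(column) for column in zip(*value_rows)]
        for key, value_rows in grouped.items()
    ]


rows = [
    ["a_key_1", "a_key_2", "a_value_1", "a_value_2"],
    ["b_key_1", "b_key_2", "b_value_1", "b_value_2"],
    ["a_key_1", "a_key_2", "a_value_3", "a_value_4"],
    ["a_key_1", "a_key_3", "a_value_5", "a_value_6"],
]
result = flatten_columns(2, *rows)
for entry in result:
    print(entry)
```

Because grouping goes through a dict rather than a sort, the entries come out in the order their keys first appear in the input.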

ssundarraj · Accepted Answer · 2015-05-22 07:15:59Z


data = [
    ["a_key_1", "a_key_2", "a_value_1", "a_value_2"],
    ["b_key_1", "b_key_2", "b_value_1", "b_value_2"],
    ["a_key_1", "a_key_2", "a_value_3", "a_value_4"],
    ["a_key_1", "a_key_3", "a_value_5", "a_value_6"],
]

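from itertools import groupby

# Rows are grouped on their first two positions (the keys).
keyfunc = lambda row: (row[0], row[1])

# Python 3: print is a function, and zip() returns an iterator,
# so it is wrapped in list() before slicing off the key columns.
print([
    list(key) + [list(zipped) for zipped in list(zip(*group))[2:]]
    for key, group
    in groupby(sorted(data, key=keyfunc), keyfunc)
])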


# => [['a_key_1', 'a_key_2', ['a_value_1', 'a_value_3'], ['a_value_2', 'a_value_4']],
#     ['a_key_1', 'a_key_3', ['a_value_5'], ['a_value_6']],
#     ['b_key_1', 'b_key_2', ['b_value_1'], ['b_value_2']]]

For more information, see the Python documentation for itertools.groupby.
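
The sorted() call in the snippet above matters because itertools.groupby only merges consecutive items with equal keys. A small sketch with made-up keys (not taken from the question) shows the difference:

```python
from itertools import groupby

keys = ["a", "b", "a"]  # illustrative data only

# Without sorting, the two "a" items are not adjacent and form separate groups.
unsorted_groups = [(k, len(list(g))) for k, g in groupby(keys)]

# After sorting, equal keys are adjacent and end up in a single group.
sorted_groups = [(k, len(list(g))) for k, g in groupby(sorted(keys))]

print(unsorted_groups)
print(sorted_groups)
```

This is also why the result above comes out in sorted key order rather than in the order of the input rows.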

edited May 22, 2015 at 7:15 by ssundarraj
answered May 22, 2015 at 2:07 by Amadan

2 Comments

okoboko Over a year ago

Is it not possible to group where position 0 is the only key and the collapsed values are positions 1, 2, and 3? It seems you can't just change lambda row: (row[0], row[1]) to lambda row: (row[0]), or else it breaks.

Amadan Over a year ago

(row[0]) is equal to row[0]. (row[0],) (with comma) is a tuple just like (row[0], row[1]); this way keeps the multikey and singlekey cases similar. Or, you could change to lambda row: row[0], and change list(key) into [key]; this is simpler, but is structurally different than the multikey case. In either case you will have to change [2:] into [1:].
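
The single-key variant described in this comment could look roughly like the sketch below. It assumes Python 3 (hence list(zip(...)) before slicing), uses made-up sample rows, and writes the key function as a def instead of a lambda; the names single_key, data, and result are illustrative only.

```python
from itertools import groupby

# Illustrative rows: position 0 is the key, positions 1-3 are values.
data = [
    ["A", "a1", "a2", "a3"],
    ["B", "b1", "b2", "b3"],
    ["A", "a4", "a5", "a6"],
]


def single_key(row):
    # The trailing comma makes this a one-element tuple, so list(key) still works.
    return (row[0],)


result = [
    # [1:] skips only the single key column; the rest are value columns.
    list(key) + [list(column) for column in list(zip(*group))[1:]]
    for key, group in groupby(sorted(data, key=single_key), single_key)
]
print(result)
```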
