0

I have two lists as follows:

list_1
['A-1','A-1','A-1','A-2','A-2','A-3']

list_2
['iPad','iPod','iPhone','Windows','X-box','Kindle']

I would like to split list_2 based on the values in list_1. For instance:

list_a1
['iPad','iPod','iPhone']

list_a2
['Windows','X-box']

list_a3
['Kindle']

I know about the index method, but it requires passing the value to match. Here I would like to find dynamically, for each distinct value in list_1, the indexes at which it occurs. Is this possible? Any tips or hints would be greatly appreciated.

Thanks.

3
  • Is list_1 guaranteed to be sorted (such that all the occurrences of each index value appear in a row)? Commented Nov 18, 2013 at 20:42
  • Yes, the list_1 values are ordered: all the A-1's first, then A-2, and so on. The values in list_2 follow the respective order in list_1. Commented Nov 18, 2013 at 20:44
  • Thank you for the answers...one of the few moments when I want to select both the answers...:-) Commented Nov 18, 2013 at 20:55

5 Answers

4

There are a few ways to do this.

I'd do it by using zip and groupby.

First:

>>> list(zip(list_1, list_2))
[('A-1', 'iPad'),
 ('A-1', 'iPod'),
 ('A-1', 'iPhone'),
 ('A-2', 'Windows'),
 ('A-2', 'X-box'),
 ('A-3', 'Kindle')]

Now:

>>> import itertools, operator
>>> [(key, list(group)) for key, group in 
...  itertools.groupby(zip(list_1, list_2), operator.itemgetter(0))]
[('A-1', [('A-1', 'iPad'), ('A-1', 'iPod'), ('A-1', 'iPhone')]),
 ('A-2', [('A-2', 'Windows'), ('A-2', 'X-box')]),
 ('A-3', [('A-3', 'Kindle')])]

So, you just want each group, ignoring the key, and you only want the second element of each element in the group. You can get the second element of each group with another comprehension, or just by unzipping:

>>> [list(zip(*group))[1] for key, group in
...  itertools.groupby(zip(list_1, list_2), operator.itemgetter(0))]
[('iPad', 'iPod', 'iPhone'), ('Windows', 'X-box'), ('Kindle',)]
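
The "another comprehension" alternative mentioned above might look like the minimal sketch below. It assumes list_1 is ordered so that equal keys are adjacent (as confirmed in the comments on the question), and it produces lists rather than tuples.

```python
import itertools
import operator

list_1 = ['A-1', 'A-1', 'A-1', 'A-2', 'A-2', 'A-3']
list_2 = ['iPad', 'iPod', 'iPhone', 'Windows', 'X-box', 'Kindle']

# Group the (key, value) pairs by key, then keep only the value of each pair.
split = [
    [value for _key, value in group]
    for _key, group in itertools.groupby(zip(list_1, list_2), operator.itemgetter(0))
]
print(split)
```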

I would personally find this more readable as a sequence of separate iterator transformations than as one long expression. Taken to the extreme:

>>> ziplists = zip(list_1, list_2)
>>> pairs = itertools.groupby(ziplists, operator.itemgetter(0))
>>> groups = (group for key, group in pairs)
>>> values = (list(zip(*group))[1] for group in groups)
>>> [list(value) for value in values]

… but a happy medium of maybe 2 or 3 lines is usually better than either extreme.
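
One possible "happy medium" is a small generator that also keeps the key next to its values. This is only a sketch: iter_groups is a hypothetical helper name, not something from itertools, and it again assumes equal keys are adjacent.

```python
from itertools import groupby
from operator import itemgetter


def iter_groups(keys, values):
    """Yield (key, [values...]) for each run of equal adjacent keys.

    Hypothetical helper for illustration; not part of itertools.
    """
    pairs = zip(keys, values)
    for key, group in groupby(pairs, key=itemgetter(0)):
        yield key, [value for _, value in group]


list_1 = ['A-1', 'A-1', 'A-1', 'A-2', 'A-2', 'A-3']
list_2 = ['iPad', 'iPod', 'iPhone', 'Windows', 'X-box', 'Kindle']

grouped = list(iter_groups(list_1, list_2))
print(grouped)
```

Because groupby only merges adjacent items, a key that reappears later in an unsorted list_1 would show up as a second, separate group.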

2

Usually I'm the one rushing to a groupby solution ;^) but here I'll go the other way and manually insert into an OrderedDict:

list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']

from collections import OrderedDict

d = OrderedDict()
for code, product in zip(list_1, list_2):
    d.setdefault(code, []).append(product)

produces a d looking like

>>> d
OrderedDict([('A-1', ['iPad', 'iPod', 'iPhone']), 
             ('A-2', ['Windows', 'X-box']), ('A-3', ['Kindle'])])

with easy access:

>>> d["A-2"]
['Windows', 'X-box']

and we can get the list-of-lists in list_1 order using .values():

>>> list(d.values())
[['iPad', 'iPod', 'iPhone'], ['Windows', 'X-box'], ['Kindle']]

If you've noticed that no one is telling you how to make a bunch of independent lists with names like list_a1 and so on, that's because it's a bad idea. You want to keep the data together in something you can (at a minimum) iterate over easily, and both dictionaries and lists of lists qualify.
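
As a rough illustration of why keeping the groups together helps, the sketch below processes every group in one loop, however many distinct codes appear. The summary variable and its output format are just assumptions made for the example.

```python
from collections import OrderedDict

list_1 = ['A-1', 'A-1', 'A-1', 'A-2', 'A-2', 'A-3']
list_2 = ['iPad', 'iPod', 'iPhone', 'Windows', 'X-box', 'Kindle']

d = OrderedDict()
for code, product in zip(list_1, list_2):
    d.setdefault(code, []).append(product)

# One loop covers every group; no separately named variable per code is needed.
summary = ["{}: {}".format(code, ", ".join(products)) for code, products in d.items()]
for line in summary:
    print(line)
```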

1 Comment

Great point in the last paragraph. Everyone else just assumed that instead of explaining it.

2

Maybe something like this?

#!/usr/local/cpython-3.3/bin/python

import pprint
import collections

def main():
    list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
    list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']

    result = collections.defaultdict(list)
    for list_1_element, list_2_element in zip(list_1, list_2):
        result[list_1_element].append(list_2_element)

    pprint.pprint(result)


main()
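
If a plain dict or a list of lists is wanted afterwards, something like the sketch below may help. It assumes Python 3.7 or later, where dicts (including defaultdict) preserve insertion order; on the Python 3.3 named in the shebang above, the order of the values would not be guaranteed.

```python
import collections

list_1 = ['A-1', 'A-1', 'A-1', 'A-2', 'A-2', 'A-3']
list_2 = ['iPad', 'iPod', 'iPhone', 'Windows', 'X-box', 'Kindle']

result = collections.defaultdict(list)
for key, value in zip(list_1, list_2):
    result[key].append(value)

# Convert to a plain dict so that looking up a missing key raises KeyError
# instead of silently creating an empty list.
plain = dict(result)

# Assumes Python 3.7+: values come back in first-seen key order.
groups = list(plain.values())
print(groups)
```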

2

Using itertools.groupby and itertools.zip_longest (named izip_longest in Python 2).

First, group the items of list_1 and find the starting index of each group:

>>> from itertools import groupby, zip_longest
>>> inds = [next(g)[0] for k, g in groupby(enumerate(list_1), key=lambda x:x[1])]
>>> inds
[0, 3, 5]

Now use slicing and zip_longest, since we need the slices list_2[0:3], list_2[3:5], and list_2[5:]:

>>> [list_2[x:y] for x, y in zip_longest(inds, inds[1:])]
[['iPad', 'iPod', 'iPhone'], ['Windows', 'X-box'], ['Kindle']]

To get a dict of lists instead, you can do something like:

>>> inds = [next(g) for k, g in groupby(enumerate(list_1), key=lambda x:x[1])]
>>> {k: list_2[ind1: ind2[0]] for (ind1, k), ind2 in
...  zip_longest(inds, inds[1:], fillvalue=[None])}
{'A-1': ['iPad', 'iPod', 'iPhone'], 'A-2': ['Windows', 'X-box'], 'A-3': ['Kindle']}
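
A variation on the slicing approach above avoids the None fill value by appending len(list_2) as the final stop index. This is a sketch under the same assumption that equal values in list_1 are adjacent; the names starts, stops, and chunks are made up for the example.

```python
from itertools import groupby

list_1 = ['A-1', 'A-1', 'A-1', 'A-2', 'A-2', 'A-3']
list_2 = ['iPad', 'iPod', 'iPhone', 'Windows', 'X-box', 'Kindle']

# Starting index of each run of equal values in list_1.
starts = [next(group)[0]
          for _, group in groupby(enumerate(list_1), key=lambda pair: pair[1])]

# Each run stops where the next one starts; the last run stops at the end.
stops = starts[1:] + [len(list_2)]

chunks = [list_2[start:stop] for start, stop in zip(starts, stops)]
print(chunks)
```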

0

If you want simple code, you could do this. It's not pretty, but it gets the job done.

list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']
list_1a = []
list_1b = []
list_1c = []
# Walk both lists in parallel instead of tracking an index by hand.
for code, product in zip(list_1, list_2):
    if code == 'A-1':
        list_1a.append(product)
    elif code == 'A-2':
        list_1b.append(product)
    else:
        list_1c.append(product)
