Nested list comprehension in Python

Question

I've got a list comprehension I'm trying to get my head around, but I just can't seem to get what I'm after, so I thought I'd see if anybody else knew how!

My basic data structure is this:

structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]

So I've got an overall list containing sublists of numpy arrays and my desired output is some sort of grouping (don't care if it's a list or an array) with the following elements paired:

[1, 13]
[4, 16]
[2, 14]
[5, 17]
[3, 15]
[6, 18]

I thought I'd got it with the following style construct:

output = [structure[i][0][j] for j in range(9) for i in range(len(structure))]

but alas, no joy.

I don't really mind if it needs more than one stage - just want to get those elements grouped together!

As a bit of background: I've got lists of probabilities output by various models, and within each model I've got a training list, a validation list and a test list:

[[model_1], [model_2], ..., [model_n]]

where [model_1] is [[training_set], [validation_set], [test_set]]

and [training_set] is np.array([[p_1, p_2, ..., p_n], [p_1, p_2, ..., p_n], ...])

I'd like to group together the prediction for item 1 for each of the models and create a training vector out of it of length equal to the number of models I've got. I'd then like to do the same but for the second row of [training_set].

If that doesn't make sense let me know!

Don't get me wrong - it's not an ideal data structure! It's born of the fact that it allows me to easily add an arbitrary number of models and then build a linear model based on the output of those models. I think it'd probably make more sense to have the outer groupings as: training_set = [[model_1], [model_2], ..., [model_n]]. I'd happily split out my lists into separate variables as it'll likely make things easier!
– Kali_89, Commented Apr 17, 2015 at 22:17
Heh sorry I deleted my comment because I realized I was reading it wrong..
– Brendan Long, Commented Apr 17, 2015 at 22:18

hpaulj · Accepted Answer · 2015-04-18 04:11:43Z

3

Since all the arrays (and sublists) in structure are the same size you can turn it into one higher dimensional array:

In [189]: A=np.array(structure)
Out[189]: 
array([[[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]]],


       [[[13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24]]]])

In [190]: A.shape
Out[190]: (2, 2, 2, 3)

Reshaping and swapaxes can give you all kinds of combinations.

For example, the values in your sample sublist can be selected with:

In [194]: A[:,0,:,:]
Out[194]: 
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[13, 14, 15],
        [16, 17, 18]]])

and reshape to get

In [197]: A[:,0,:,:].reshape(2,6)
Out[197]: 
array([[ 1,  2,  3,  4,  5,  6],
       [13, 14, 15, 16, 17, 18]])

and transpose to get the 6 rows of pairs:

In [198]: A[:,0,:,:].reshape(2,6).T
Out[198]: 
array([[ 1, 13],
       [ 2, 14],
       [ 3, 15],
       [ 4, 16],
       [ 5, 17],
       [ 6, 18]])

To get them in the 1, 4, 2, 5, ... order, I can transpose first:

In [208]: A[:,0,:,:].T.reshape(6,2)
Out[208]: 
array([[ 1, 13],
       [ 4, 16],
       [ 2, 14],
       [ 5, 17],
       [ 3, 15],
       [ 6, 18]])
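
The slicing, transposing and reshaping steps above can be collected into one small function. This is only a sketch: `pair_across_models` and its parameters are hypothetical names, not part of NumPy, and it assumes every array in `structure` has the same shape so that `np.array(structure)` yields a regular 4-D array.

```python
import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]

# Axes are (model, set, row, column); here the shape is (2, 2, 2, 3).
A = np.array(structure)


def pair_across_models(A, set_index, column_major=True):
    """Hypothetical helper: one row per data point, one column per model."""
    sub = A[:, set_index]            # shape (n_models, n_rows, n_cols)
    n_models = sub.shape[0]
    if column_major:
        # Transpose first so points are visited column by column: 1, 4, 2, 5, ...
        return sub.T.reshape(-1, n_models)
    # Flatten each model's array, then transpose: 1, 2, 3, 4, ...
    return sub.reshape(n_models, -1).T


pairs = pair_across_models(A, 0)
print(pairs.tolist())
# [[1, 13], [4, 16], [2, 14], [5, 17], [3, 15], [6, 18]]
```

Passing `set_index=1` selects the second array of each model instead, and the number of columns in the result follows the number of models.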

edited Apr 18, 2015 at 4:11

answered Apr 17, 2015 at 23:58

hpaulj


1 Comment

emvee Over a year ago

Nice, keep it all in numpy!

Padraic Cunningham · Accepted Answer · 2015-04-17 23:16:58Z

2

Not sure exactly what full output you want but this may help:

import numpy as np

structure = [[np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
             [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])]]

from itertools import chain

zipped = (zip(*ele) for ele in zip(*next(zip(*structure))))

print (list(chain.from_iterable(zip(*zipped))))
[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

Ok a breakdown of the witchcraft:

# transpose sub arrays so column 0 is the first two sub elements from 
# each sub array
In [4]: start = zip(*structure)

In [5]: start
Out[5]: 
[(array([[1, 2, 3],
         [4, 5, 6]]), array([[13, 14, 15],
         [16, 17, 18]])), (array([[ 7,  8,  9],
         [10, 11, 12]]), array([[19, 20, 21],
         [22, 23, 24]]))]

# our interesting sub arrays, i.e. column[0]
In [6]: first_col = next(start)

In [7]: first_col
Out[7]: 
(array([[1, 2, 3],
        [4, 5, 6]]), array([[13, 14, 15],
        [16, 17, 18]]))

# pair up corresponding sub arrays
In [8]: interesting_pairs = zip(*first_col)

In [9]: interesting_pairs
Out[9]: 
[(array([1, 2, 3]), array([13, 14, 15])),
 (array([4, 5, 6]), array([16, 17, 18]))]

# pair them up (1, 13), (2, 14) ...
In [10]: create_final_pairings = [zip(*ele) for ele in interesting_pairs]

In [11]: create_final_pairings
Out[11]: [[(1, 13), (2, 14), (3, 15)], [(4, 16), (5, 17), (6, 18)]]

Finally chain all into a single flat list and get the order correct:

In [13]: from itertools import chain
# create flat list 
In [14]: flat_list = list(chain.from_iterable(zip(*create_final_pairings)))

In [15]: flat_list
Out[15]: [(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]
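
For reference, the breakdown above can be run as one script on Python 3. This is a sketch rather than the original session: `zip` returns a lazy iterator there, so the intermediate results are wrapped in `list()` where they need to be reused, and the NumPy scalars are converted to plain `int` only so the output prints cleanly.

```python
from itertools import chain

import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]

# Transpose the outer list: each item groups the same position from every model.
start = zip(*structure)

# First group: the first array of each model.
first_col = next(start)

# Pair up corresponding rows: ([1, 2, 3], [13, 14, 15]), ([4, 5, 6], [16, 17, 18]).
interesting_pairs = zip(*first_col)

# Pair up corresponding elements within each pair of rows.
create_final_pairings = [list(zip(*ele)) for ele in interesting_pairs]

# Interleave the two rows (1, 4, 2, 5, ...) and flatten into one list.
flat_list = [
    tuple(int(x) for x in pair)
    for pair in chain.from_iterable(zip(*create_final_pairings))
]

print(flat_list)
# [(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]
```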

A simple example of transposing with zip may help:

In [17]: l = [[1,2,3],[4,5,6]]

In [18]: zip(*l)
Out[18]: [(1, 4), (2, 5), (3, 6)]

In [19]: zip(*l)[0]
Out[19]: (1, 4)

In [20]: zip(*l)[1]
Out[20]: (2, 5)

In [21]: zip(*l)[2]
Out[21]: (3, 6)
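
The indexing shown above relies on `zip` returning a list, which is Python 2 behaviour. A minimal Python 3 sketch of the same transposition, with the result materialised once so it can be indexed:

```python
l = [[1, 2, 3], [4, 5, 6]]

# zip(*l) unpacks the rows as separate arguments and yields the columns.
# On Python 3 it is an iterator, so convert it to a list before indexing.
transposed = list(zip(*l))

print(transposed)     # [(1, 4), (2, 5), (3, 6)]
print(transposed[0])  # (1, 4)
```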

In Python 2, zip returns a list rather than an iterator, so next() will not work on it; use itertools.izip instead:

from itertools import chain, izip


zipped = (izip(*ele) for ele in izip(*next(izip(*structure))))
print (list(chain.from_iterable(izip(*zipped))))

[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

edited Apr 17, 2015 at 23:16

answered Apr 17, 2015 at 22:43

Padraic Cunningham


4 Comments

Kali_89 Over a year ago

That's awesome! Would it be too cheeky to ask for a bit of an explanation as at the moment it largely seems to be witchcraft!

Padraic Cunningham Over a year ago

@Kali_89. lol it seems like witchcraft to me also. I have never zipped and transposed as much in six months! I will try to break it down

Padraic Cunningham Over a year ago

@Kali_89, tried to break down each step as best I could. If the order does not matter you can remove the last zip*

rafaelc Over a year ago

@PadraicCunningham Bravo!

Brendan Long · Accepted Answer · 2015-04-17 23:08:35Z

I had to write the non-list-comprehension version first to get my head around this:

new_training_vector = []
for m1, m2 in zip(structure[0], structure[1]):
    for t1, t2 in zip(m1, m2):
        for d1, d2 in zip(t1, t2):
            new_training_vector.append([d1, d2])

The way it works is by creating two parallel iterators (using zip), one for each model, then creating two parallel iterators for each of the training sets and so on until we get to the actual data and can just stick it together.

Once we have that, it's not hard to fold it into a list comprehension:

new_training_vector = [[d1, d2]
                       for m1, m2 in zip(structure[0], structure[1])
                       for t1, t2 in zip(m1, m2)
                       for d1, d2 in zip(t1, t2)]
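
The versions above hard-code two models. One possible generalisation, offered as a sketch and assuming every model has the same number of sets and every array has the same shape, is to unpack with `zip(*...)` at each level so that each output entry has one value per model:

```python
import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]

new_training_vector = [
    [int(p) for p in points]          # one value per model, as plain ints
    for sets in zip(*structure)       # the same set from every model
    for rows in zip(*sets)            # the same row from every model's set
    for points in zip(*rows)          # the same element from every model's row
]

print(new_training_vector[:6])
# [[1, 13], [2, 14], [3, 15], [4, 16], [5, 17], [6, 18]]
```

As with the two-model version, this walks every set, so the entries for the second array of each model ([7, 19] onwards) follow the first six.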

You can also do this with a dictionary, if that works better for some reason. You would lose the order though (on Python versions before 3.7, where dicts do not preserve insertion order):

import collections
d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                d[(i, j, k)].append(point)

The trick to this one is that we just keep track of where we saw each point (except for at the model level), so they automatically go into the same dict item.
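
To make the key layout concrete, here is a self-contained sketch of the dictionary approach with a couple of lookups. The keys are (set index, row index, column index); converting each point to `int` is an addition here, purely so the values print as plain numbers.

```python
import collections

import numpy as np

structure = [
    [np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
    [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])],
]

d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                # The model is deliberately left out of the key, so the same
                # position from every model lands in the same list.
                d[(i, j, k)].append(int(point))

print(d[(0, 0, 0)])  # [1, 13]
print(d[(0, 1, 2)])  # [6, 18]
```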

Kelvin17 · Accepted Answer · 2015-04-17 23:21:24Z

0

I think this is what you want, in the format you asked for; it uses a generator expression:



import numpy as np
structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]
struc = structure

my_gen = ([struc[i][j][k][l], struc[i+1][j][k][l]] for i in range(len(struc)-1)
                                     for j in range(len(struc[i]))
                                     for k in range(len(struc[i][j]))
                                     for l in range(len(struc[i][j][k])))

# Iterate over the generator directly; the loop stops when it is exhausted
# and this form works on both Python 2 and Python 3.
for val in my_gen:
    print(val)

answered Apr 17, 2015 at 23:21

Kelvin17

