
I've got a list comprehension I'm trying to get my head around and I just can't seem to get what I'm after and thought I'd see if anybody else knew how!

My basic data structure is this:

structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]

So I've got an overall list containing sublists of numpy arrays and my desired output is some sort of grouping (don't care if it's a list or an array) with the following elements paired:

[1, 13]
[4, 16]
[2, 14]
[5, 17]
[3, 15]
[6, 18]

I thought I'd got it with the following style construct:

output = [structure[i][0][j] for j in range(9) for i in range(len(structure))]

but alas, no joy.

I don't really mind if it needs more than one stage - just want to get those elements grouped together!

As a bit of background: I've got lists of probabilities output by various models, and for each model I've got a training list, a validation list and a test list:

[[model_1], [model_2], ..., [model_n]]

where [model_1] is [[training_set], [validation_set], [test_set]]

and [training_set] is np.array([[p_1, p_2, ..., p_n], [p_1, p_2, ..., p_n], ...])

I'd like to group together the prediction for item 1 for each of the models and create a training vector out of it of length equal to the number of models I've got. I'd then like to do the same but for the second row of [training_set].

If that doesn't make sense let me know!

2
  • Don't get me wrong - it's not an ideal data structure! It's born of the fact that it allows me to easily add an arbitrary number of models and then build a linear model based on the output of those models. I think it'd probably make more sense to have the outer groupings as: training_set = [[model_1], [model_2], ..., [model_n]]. I'd happily split out my lists into separate variables as it'll likely make things easier! Commented Apr 17, 2015 at 22:17
  • Heh sorry I deleted my comment because I realized I was reading it wrong.. Commented Apr 17, 2015 at 22:18

4 Answers 4

3

Since all the arrays (and sublists) in structure are the same size you can turn it into one higher dimensional array:

In [188]: A = np.array(structure)

In [189]: A
Out[189]: 
array([[[[ 1,  2,  3],
         [ 4,  5,  6]],

        [[ 7,  8,  9],
         [10, 11, 12]]],


       [[[13, 14, 15],
         [16, 17, 18]],

        [[19, 20, 21],
         [22, 23, 24]]]])

In [190]: A.shape
Out[190]: (2, 2, 2, 3)

Reshaping and swapaxes can give you all kinds of combinations.

For example, the values in your sample sublist can be selected with:

In [194]: A[:,0,:,:]
Out[194]: 
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[13, 14, 15],
        [16, 17, 18]]])

and reshape to get

In [197]: A[:,0,:,:].reshape(2,6)
Out[197]: 
array([[ 1,  2,  3,  4,  5,  6],
       [13, 14, 15, 16, 17, 18]])

and transpose to get the 6 rows of pairs:

In [198]: A[:,0,:,:].reshape(2,6).T
Out[198]: 
array([[ 1, 13],
       [ 2, 14],
       [ 3, 15],
       [ 4, 16],
       [ 5, 17],
       [ 6, 18]])

To get them in the 1, 4, 2, 5, ... order, transpose first (this reverses the axes, so the reshape walks down each column before moving to the next):

In [208]: A[:,0,:,:].T.reshape(6,2)
Out[208]: 
array([[ 1, 13],
       [ 4, 16],
       [ 2, 14],
       [ 5, 17],
       [ 3, 15],
       [ 6, 18]])
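
The same indexing can be wrapped up so it works for any of the sets and any number of models. This is only a minimal sketch: the helper name stack_models and its set_index argument are made up for illustration, and it assumes every array in structure has the same shape, as in the question.

```python
import numpy as np

structure = [[np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
             [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])]]

# Shape is (n_models, n_sets, n_rows, n_cols); here (2, 2, 2, 3).
A = np.array(structure)


def stack_models(A, set_index):
    """Hypothetical helper: one row per item, one column per model."""
    subset = A[:, set_index]              # (n_models, n_rows, n_cols)
    # .T reverses the axes to (n_cols, n_rows, n_models); the reshape then
    # yields the items in the 1, 4, 2, 5, ... order asked for in the question.
    return subset.T.reshape(-1, A.shape[0])


training_pairs = stack_models(A, 0)
validation_pairs = stack_models(A, 1)
print(training_pairs)
```

Each row of the result is then a vector with one prediction per model, which seems to be what the background in the question is after.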

1 Comment

Nice, keep it all in numpy!
2

Not sure exactly what full output you want but this may help:

import numpy as np

structure = [[np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
             [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])]]

from itertools import chain

zipped = (zip(*ele) for ele in zip(*next(zip(*structure))))

print (list(chain.from_iterable(zip(*zipped))))
[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

Ok a breakdown of the witchcraft:

# transpose the outer lists so that item 0 holds the first sub-array
# of every model (list() so it can be displayed and indexed on Python 3)
In [4]: start = list(zip(*structure))

In [5]: start
Out[5]: 
[(array([[1, 2, 3],
         [4, 5, 6]]), array([[13, 14, 15],
         [16, 17, 18]])), (array([[ 7,  8,  9],
         [10, 11, 12]]), array([[19, 20, 21],
         [22, 23, 24]]))]

# our interesting sub-arrays, i.e. column 0
# (the same thing next(zip(*structure)) picks out in the one-liner)
In [6]: first_col = start[0]

In [7]: first_col
Out[7]: 
(array([[1, 2, 3],
        [4, 5, 6]]), array([[13, 14, 15],
        [16, 17, 18]]))

# pair up corresponding rows of the sub-arrays
In [8]: interesting_pairs = list(zip(*first_col))

In [9]: interesting_pairs
Out[9]: 
[(array([1, 2, 3]), array([13, 14, 15])),
 (array([4, 5, 6]), array([16, 17, 18]))]

# pair them up (1, 13), (2, 14) ...
In [10]: create_final_pairings = [list(zip(*ele)) for ele in interesting_pairs]

In [11]: create_final_pairings
Out[11]: [[(1, 13), (2, 14), (3, 15)], [(4, 16), (5, 17), (6, 18)]]

Finally chain everything into a single flat list; the extra zip(*...) interleaves the two rows to get the order right:

In [13]: from itertools import chain
# create flat list 
In [14]: flat_list = list(chain.from_iterable(zip(*create_final_pairings)))

In [15]: flat_list
Out[15]: [(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

A simple example of transposing with zip may help:

In [17]: l = [[1,2,3],[4,5,6]]

In [18]: list(zip(*l))
Out[18]: [(1, 4), (2, 5), (3, 6)]

In [19]: list(zip(*l))[0]
Out[19]: (1, 4)

In [20]: list(zip(*l))[1]
Out[20]: (2, 5)

In [21]: list(zip(*l))[2]
Out[21]: (3, 6)
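
As a standalone script, the transposing idea might look like the sketch below. It assumes Python 3, where zip returns a lazy iterator that has to be turned into a list before it can be indexed; the variable names are only for illustration.

```python
rows = [[1, 2, 3], [4, 5, 6]]

# zip(*rows) groups the i-th element of every row together, i.e. it transposes.
# list() materialises the lazy zip object so it can be indexed.
columns = list(zip(*rows))

first_column = columns[0]

# Transposing a second time gives the original rows back (converted from tuples).
round_trip = [list(r) for r in zip(*columns)]

print(columns)
print(first_column)
print(round_trip)
```

Every zip(*...) in the one-liner above is one of these transposes, applied at a different nesting level.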

On Python 2, where zip builds a full list (so next() cannot be called on it directly), use itertools.izip instead:

from itertools import chain, izip


zipped = (izip(*ele) for ele in izip(*next(izip(*structure))))
print (list(chain.from_iterable(izip(*zipped))))

[(1, 13), (4, 16), (2, 14), (5, 17), (3, 15), (6, 18)]

4 Comments

That's awesome! Would it be too cheeky to ask for a bit of an explanation as at the moment it largely seems to be witchcraft!
@Kali_89. lol it seems like witchcraft to me also. I have never zipped and transposed as much in six months! I will try to break it down
@Kali_89, tried to break down each step as best I could. If the order does not matter you can remove the last zip(*...)
@PadraicCunningham Bravo!
2

I had to write the non-list-comprehension version first to get my head around this:

new_training_vector = []
for m1, m2 in zip(structure[0], structure[1]):
    for t1, t2 in zip(m1, m2):
        for d1, d2 in zip(t1, t2):
            new_training_vector.append([d1, d2])

The way it works is by creating two parallel iterators (using zip), one for each model, then creating two parallel iterators for each of the training sets and so on until we get to the actual data and can just stick it together.

Once we have that, it's not hard to go fold it into a list comprehension:

new_training_vector = [[d1, d2]
                       for m1, m2 in zip(structure[0], structure[1])
                       for t1, t2 in zip(m1, m2)
                       for d1, d2 in zip(t1, t2)]
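
The version above is written for exactly two models. If more get added, one possible generalisation is to unpack with zip(*...) at every level instead of naming structure[0] and structure[1]. This is only a sketch of that idea; the loop variable names are invented for illustration.

```python
import numpy as np

structure = [[np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
             [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])]]

# zip(*structure) walks all models in parallel, however many there are.
new_training_vector = [list(points)
                       for sets in zip(*structure)   # same data set from every model
                       for rows in zip(*sets)        # same row from every model
                       for points in zip(*rows)]     # same item from every model

# Convert NumPy scalars to plain ints for display.
as_ints = [[int(p) for p in pair] for pair in new_training_vector]
print(as_ints[:6])
```

With two models this should produce the same pairs, in the same row-by-row order, as the explicit two-model comprehension.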

You can also do this with a dictionary, if that works better for some reason. On Python versions before 3.7 you would lose the order though, since dicts only guarantee insertion order from 3.7 onwards:

import collections
d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                d[(i, j, k)].append(point)

The trick to this one is that we just keep track of where we saw each point (except for at the model level), so they automatically go into the same dict item.
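
To make the key layout concrete, here is a small sketch of how the dict could be queried afterwards. It rebuilds the dict from the question's sample data; the name training_pairs and the int() conversion are additions for illustration only.

```python
import collections

import numpy as np

structure = [[np.array([[1, 2, 3], [4, 5, 6]]), np.array([[7, 8, 9], [10, 11, 12]])],
             [np.array([[13, 14, 15], [16, 17, 18]]), np.array([[19, 20, 21], [22, 23, 24]])]]

d = collections.defaultdict(list)
for model in structure:
    for i, training_set in enumerate(model):
        for j, row in enumerate(training_set):
            for k, point in enumerate(row):
                # Key is (set index, row index, column index); the model is
                # deliberately left out so all models land in the same list.
                d[(i, j, k)].append(int(point))

# Pick out only the first set (i == 0), sorted by key for a stable order.
training_pairs = [d[key] for key in sorted(d) if key[0] == 0]
print(d[(0, 0, 0)])
print(training_pairs)
```

Sorting the keys is one way to get a predictable order back without relying on the dict's own ordering.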


0

I think this gives what you want, in the format you have. It uses a generator expression that pairs each model with the next one:

import numpy as np
structure = [[np.array([[1,2,3],[4,5,6]]), np.array([[7,8,9],[10,11,12]])], [np.array([[13,14,15],[16,17,18]]), np.array([[19,20,21],[22,23,24]])]]
struc = structure

my_gen = ([struc[i][j][k][l], struc[i+1][j][k][l]] for i in range(len(struc)-1)
                                     for j in range(len(struc[i]))
                                     for k in range(len(struc[i][j]))
                                     for l in range(len(struc[i][j][k])))

# a for loop consumes the generator and stops cleanly when it is exhausted
for val in my_gen:
    print(val)

