3

I'm trying to combine an m×n array called data with a list of m elements called cluster_data, so that each element of cluster_data is appended as the last element of the corresponding row of data.

For example, I want to combine

data = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16],[17,18,19,20]]

cluster_data = [1,2,3,4,5]

to get

final_data = [[1,2,3,4,1],[5,6,7,8,2],[9,10,11,12,3],[13,14,15,16,4],[17,18,19,20,5]]

I have written some code that does this, but I was hoping for a more Pythonic way.

data_with_clusters = []
for i, row in enumerate(data):
    row.append(cluster_data[i])
    data_with_clusters.append(row)

My best guess so far, which doesn't work, is:

data_with_clusters = [row.append(cluster_data[i]) for i, row in enumerate(data)]

4 Answers

5

I think this is the most Pythonic way:

final_data = [i+[j] for i,j in zip(data, cluster_data)]
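
For illustration, here is a small runnable sketch of the same idea using the question's data. The names row and label are only chosen for readability. It also shows that, unlike the loop in the question, this builds new rows and leaves data unchanged:

```python
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16], [17, 18, 19, 20]]
cluster_data = [1, 2, 3, 4, 5]

# zip pairs each row with its cluster label; row + [label] creates a new list
final_data = [row + [label] for row, label in zip(data, cluster_data)]

print(final_data[0])  # [1, 2, 3, 4, 1]
print(data[0])        # [1, 2, 3, 4]  (the original rows are not modified)
```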

2

The problem with your approach is that append modifies the list in place and returns None, so the comprehension collects a list of None values.

Instead of calling append, concatenate lists:

[row + [cluster_data[i]] for i, row in enumerate(data)]

or

[e[0] + [e[1]] for e in zip(data, cluster_data)]
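
To see the difference for yourself, a short sketch (the variable names result and broken, and the small two-row data, are just for this demonstration):

```python
row = [1, 2, 3, 4]
result = row.append(1)  # append mutates row in place
print(result)           # None
print(row)              # [1, 2, 3, 4, 1]

data = [[1, 2], [3, 4]]
cluster_data = [7, 8]

# The attempt from the question: every element is the return value of append
broken = [r.append(cluster_data[i]) for i, r in enumerate(data)]
print(broken)  # [None, None]
print(data)    # [[1, 2, 7], [3, 4, 8]]  (the rows were still modified)
```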

3 Comments

Works perfectly. I used the first one. Can you explain why you need brackets around cluster_data[i]?
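The + operator here concatenates two lists, so cluster_data[i] has to be wrapped in a list first; adding a bare int to a list raises a TypeError.
Please accept Yash's solution; it really is the most Pythonic one.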
2

A Pythonic approach starts with using a real array. Lists are commonly misused as arrays because the two share some similarities.

If you often work with numbers, an even more Pythonic way is to use NumPy, which makes operations like this a piece of cake.

The list-comprehension answers given previously are fine too, but they can be considerably slower than NumPy for large arrays.

Here is your intro to NumPy:

In [1]: import numpy as np
In [2]: array = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16],[17,18,19,20]])

In [3]: array
Out[3]: 
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16],
       [17, 18, 19, 20]])

In [4]: col = np.array([[1],[2],[3],[4],[5]])
In [5]: col
Out[5]: 
array([[1],
       [2],
       [3],
       [4],
       [5]])

In [6]: np.append(array, col, axis=1)
Out[6]: 
array([[ 1,  2,  3,  4,  1],
       [ 5,  6,  7,  8,  2],
       [ 9, 10, 11, 12,  3],
       [13, 14, 15, 16,  4],
       [17, 18, 19, 20,  5]])
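
If cluster_data is still a flat list as in the question, one possible variation (not part of the session above) is np.column_stack, which treats a 1-D input as a column, so the labels do not have to be reshaped by hand. A sketch, assuming NumPy is installed:

```python
import numpy as np

data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16], [17, 18, 19, 20]]
cluster_data = [1, 2, 3, 4, 5]

# column_stack turns the 1-D cluster_data into a column and appends it to data
final_data = np.column_stack((data, cluster_data))

print(final_data.shape)        # (5, 5)
print(final_data[0].tolist())  # [1, 2, 3, 4, 1]
```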

3 Comments

Also, if you are crunching a lot of numbers with Python, you should get familiar with IPython, which is where the code snippets in my answer come from.
Thanks for your suggestions, about IPython and NumPy in particular. I was not aware that NumPy was so much more convenient!
NumPy is the de facto standard for working with arrays and matrices. If you do a lot of number crunching, take a look at SciPy and pandas too.
1

row.append(cluster_data[i]) returns None, so that doesn't work.

Try instead: row + [cluster_data[i]]
