28

I want to load lists into the columns of a pandas DataFrame, but I cannot find a simple way to do it. Here is an example that produces what I want using transpose(), which I suspect is unnecessary:

In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: x = np.linspace(0,np.pi,10)
In [4]: y = np.sin(x)
In [5]: data = pd.DataFrame(data=[x,y]).transpose()
In [6]: data.columns = ['x', 'sin(x)']
In [7]: data
Out[7]: 
          x        sin(x)
0  0.000000  0.000000e+00
1  0.349066  3.420201e-01
2  0.698132  6.427876e-01
3  1.047198  8.660254e-01
4  1.396263  9.848078e-01
5  1.745329  9.848078e-01
6  2.094395  8.660254e-01
7  2.443461  6.427876e-01
8  2.792527  3.420201e-01
9  3.141593  1.224647e-16

[10 rows x 2 columns]

Is there a way to directly load each list into a column to eliminate the transpose and insert the column labels when creating the DataFrame?

3 Answers

37

Someone just recommended creating a dictionary from the data then loading that into the DataFrame like this:

In [8]: data = pd.DataFrame({'x': x, 'sin(x)': y})
In [9]: data
Out[9]: 
          x        sin(x)
0  0.000000  0.000000e+00
1  0.349066  3.420201e-01
2  0.698132  6.427876e-01
3  1.047198  8.660254e-01
4  1.396263  9.848078e-01
5  1.745329  9.848078e-01
6  2.094395  8.660254e-01
7  2.443461  6.427876e-01
8  2.792527  3.420201e-01
9  3.141593  1.224647e-16

[10 rows x 2 columns]

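Note that a dictionary was historically an unordered set of key-value pairs (Python only guarantees insertion order from version 3.7 onward). If you care about the column order, or want to stay independent of that behavior, pass a list of the keys in the desired order via columns= (you can also use this list to include only some of the dict entries):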

data = pd.DataFrame({'x': x, 'sin(x)': y}, columns=['x', 'sin(x)'])
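As a minimal sketch of the two uses of columns= mentioned above (fixing the order and keeping only some entries), the following adds a hypothetical third entry, 'cos(x)', which is not part of the original question and is there only for illustration:

```python
import numpy as np
import pandas as pd

x = np.linspace(0, np.pi, 10)
y = np.sin(x)

# 'cos(x)' is a hypothetical extra entry, added only to show subsetting.
source = {'x': x, 'sin(x)': y, 'cos(x)': np.cos(x)}

# columns= sets the column order and keeps only the listed keys,
# so 'cos(x)' is dropped and 'sin(x)' comes first.
data = pd.DataFrame(source, columns=['sin(x)', 'x'])
print(list(data.columns))
```

Keys that are left out of the list are simply not loaded into the DataFrame.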

3 Comments

You can specify column order this way: data = pd.DataFrame({'x': x, 'sin(x)': y}, columns=['x', 'sin(x)'])
You're missing quotes in the dictionary initialization.
@StevenC.Howell As of pandas version 0.25.0: If data is a dict, column order follows insertion-order (docs)
8

Here's another 1-line solution preserving the specified order, without typing x and sin(x) twice:

data = pd.concat([pd.Series(x, name='x'), pd.Series(y, name='sin(x)')], axis=1)
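A self-contained sketch of the same idea, using the x and y arrays from the question. One thing to keep in mind is that pd.concat with axis=1 aligns the Series on their index; here both Series get the default 0..9 index, so the rows line up one to one:

```python
import numpy as np
import pandas as pd

x = np.linspace(0, np.pi, 10)
y = np.sin(x)

# Each Series carries its own name, which becomes the column label.
# Column order follows the order of the list passed to concat.
data = pd.concat(
    [pd.Series(x, name='x'), pd.Series(y, name='sin(x)')],
    axis=1,
)
print(list(data.columns))
```

If the Series had different indexes, concat would produce the union of the index labels and fill the gaps with NaN, so this approach fits best when the inputs share the same length and default index.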

5

If you don't care about the column names, you can use this:

pd.DataFrame(list(zip(x, y)))

Runtime-wise, it is as fast as the dict option, and both are much faster than using transpose.
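To check the timing claim on your own machine, something like the sketch below could be used. The helper names (via_zip, via_dict, via_transpose) are made up for this example, and the results depend on the machine, the array size, and the pandas version, so no numbers are quoted here:

```python
import timeit

import numpy as np
import pandas as pd

x = np.linspace(0, np.pi, 10)
y = np.sin(x)

def via_zip():
    # Row tuples built from the two arrays; columns are named 0 and 1.
    return pd.DataFrame(list(zip(x, y)))

def via_dict():
    # One dict entry per column.
    return pd.DataFrame({'x': x, 'sin(x)': y})

def via_transpose():
    # The approach from the question: two rows, then transpose.
    return pd.DataFrame(data=[x, y]).transpose()

# Time 200 calls of each construction and report the total per approach.
timings = {
    func.__name__: timeit.timeit(func, number=200)
    for func in (via_zip, via_dict, via_transpose)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 200 calls")
```

With only 10 rows the constructor overhead dominates, so it is worth repeating the measurement with arrays of a realistic size before drawing conclusions.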
