I have a dataframe and would like to convert some of its data to a list of lists. The columns I'm interested in are the index, Names, and Births. My code works, but it seems inefficient, and for some reason the letter L is appended to each index value.

My code:

import pandas as pd


data = [['Bob', 968, 'Male'], ['Jessica', 341, 'Female'], ['Mary', 77, 'Female'], ['John', 578, 'Male'], ['Mel', 434, 'Female']]
headers = ['Names', 'Births', 'Gender']
df = pd.DataFrame(data = data, columns=headers)
indexes = df.index.values.tolist()
mylist =  [[x] for x in indexes]

for x in mylist:
    x.extend([df.ix[x[0],'Names'], df.ix[x[0],'Births']])

print mylist

Desired Output:

[[0, 'Bob', 968], [1, 'Jessica', 341], [2, 'Mary', 77], [3, 'John', 578], [4, 'Mel', 434]]
  • I just ran your code as is using python 2.7.9 and pandas 0.16.2 and the output was exactly what you want. Commented Jul 7, 2015 at 14:09
  • @JulienGrenier. Yes, I am looking for efficiency improvements to my code. Also, tolist() seems to add an L to the end of things, so the actual output is: [[0L, 'Bob', 968], [1L, 'Jessica', 341], [2L, 'Mary', 77], [3L, 'John', 578], [4L, 'Mel', 434]] Commented Jul 7, 2015 at 14:21
  • I would also love to see a solution to this... Commented Jul 3, 2016 at 13:54

2 Answers

Why not just use .values.tolist() as you mentioned?

import pandas as pd

# your data
# =================================================
data = [['Bob', 968, 'Male'], ['Jessica', 341, 'Female'], ['Mary', 77, 'Female'], ['John', 578, 'Male'], ['Mel', 434, 'Female']]
headers = ['Names', 'Births', 'Gender']
df = pd.DataFrame(data = data, columns=headers)

# nested list
# ============================
df.reset_index()[['index', 'Names', 'Births']].values.tolist()

Out[46]: 
[[0, 'Bob', 968],
 [1, 'Jessica', 341],
 [2, 'Mary', 77],
 [3, 'John', 578],
 [4, 'Mel', 434]]
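As an alternative sketch (not part of the original answer), the same nested list can probably be built with itertuples(), which yields the index followed by the selected columns for each row. The variable name rows is illustrative only, and this assumes a reasonably recent pandas.

```python
import pandas as pd

data = [['Bob', 968, 'Male'], ['Jessica', 341, 'Female'], ['Mary', 77, 'Female'],
        ['John', 578, 'Male'], ['Mel', 434, 'Female']]
df = pd.DataFrame(data, columns=['Names', 'Births', 'Gender'])

# itertuples() yields one tuple per row: (index, Names, Births).
# list() turns each tuple into a plain list.
rows = [list(row) for row in df[['Names', 'Births']].itertuples()]
print(rows)
```

This avoids reset_index() and the intermediate NumPy array, at the cost of a Python-level loop over the rows.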

7 Comments

This outputs: [[0L, 'Bob', 968L], [1L, 'Jessica', 341L], [2L, 'Mary', 77L], [3L, 'John', 578L], [4L, 'Mel', 434L]]. Why are the L's being added?
@user2242044 L means 'Long' integer. I didn't see that 'L' appended to my output. Let me think about possible causes.
Also, is there an opposite of drop(), such as an include function? For example, say my dataframe had 40 columns and I wanted two of them.
@user2242044: First of all, I don't see the L in my shell. Also the L only means that those are longs instead of integers.
@user2242044 can you try this df.reset_index()[['index', 'Names', 'Births']].values.astype(str).tolist()? Does it still produce something with L appended?
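On the question above about an "opposite of drop()": a minimal sketch, assuming the goal is simply to keep a few named columns. Passing a list of labels to the indexing operator does this, and filter(items=...) and loc are equivalent spellings. The names subset, subset_filter, and subset_loc are illustrative only.

```python
import pandas as pd

data = [['Bob', 968, 'Male'], ['Jessica', 341, 'Female']]
df = pd.DataFrame(data, columns=['Names', 'Births', 'Gender'])

# Keep only the listed columns by passing a list of labels.
subset = df[['Names', 'Births']]

# Equivalent selections using filter() and label-based loc.
subset_filter = df.filter(items=['Names', 'Births'])
subset_loc = df.loc[:, ['Names', 'Births']]

print(subset)
```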

Ok, this works (based on Jianxun Li's answer and comments):

import pandas as pd

# Data
data = [['Bob', 968, 'Male'], ['Jessica', 341, 'Female'], ['Mary', 77, 'Female'], ['John', 578, 'Male'], ['Mel', 434, 'Female']]
headers = ['Names', 'Births', 'Gender']
df = pd.DataFrame(data = data, columns=headers)

# Output
print df.reset_index()[['index', 'Names', 'Births']].values.astype(str).tolist()

Thank you Jianxun Li, this also helped me :-)

In general, one can use the following to transform the complete dataframe into a list of lists (which is what I needed):

df.values.astype(str).tolist()
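One caveat worth illustrating: astype(str) converts every value to a string, so the numbers come back as '968' rather than 968. That is presumably why the L suffix disappears, since under Python 2 the suffix is just how a long integer is printed. The sketch below compares the two forms; as_strings and as_objects are illustrative names.

```python
import pandas as pd

data = [['Bob', 968, 'Male'], ['Jessica', 341, 'Female']]
df = pd.DataFrame(data, columns=['Names', 'Births', 'Gender'])

# astype(str) turns every cell into a string, including the Births column.
as_strings = df.values.astype(str).tolist()
print(as_strings)

# Without astype(str), the integers stay integers.
as_objects = df.values.tolist()
print(as_objects)
```

If the numeric types matter downstream, the version without astype(str) is likely the safer choice. Under Python 3 there is no separate long type, so the L suffix should not appear in either case.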
