Nested List to Pandas Dataframe with headers

Question

Basically, I am trying to do the opposite of "How to generate a list from a pandas DataFrame with the column name and column values?"

To borrow that example, I want to go from the form:

data = [
    ['Name','Rank','Complete'],
    ['one', 1, 1],
    ['two', 2, 1],
    ['three', 3, 1],
    ['four', 4, 1],
    ['five', 5, 1]
]

which should output:

       Rank  Complete
Name
one       1         1
two       2         1
three     3         1
four      4         1
five      5         1

However when I do something like:

pd.DataFrame(data)

I get a dataframe in which the header list is treated as an ordinary row. Instead, the first list should supply the column labels, and the first element of each remaining list should become the index.
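To make the problem concrete, here is a minimal sketch of what the plain constructor call yields on a shortened copy of the data. The variable name `raw` is only for illustration.

```python
import pandas as pd

# Shortened copy of the question's data; the first sublist is the header.
data = [
    ['Name', 'Rank', 'Complete'],
    ['one', 1, 1],
    ['two', 2, 1],
]

# Passing the whole nested list treats the header list as ordinary data.
raw = pd.DataFrame(data)

print(list(raw.columns))  # default integer column labels
print(list(raw.iloc[0]))  # the intended header, stored as the first row
```

Both axes get default integer labels, so the header has to be split off before (or after) construction.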

Anand S Kumar · Accepted Answer · 2015-09-30 04:26:27Z

55

One way to do this is to take the column names from the first sublist and pass only the remaining rows (from index 1 onward) to pd.DataFrame:

In [8]: data = [['Name','Rank','Complete'],
   ...:                ['one', 1, 1],
   ...:                ['two', 2, 1],
   ...:                ['three', 3, 1],
   ...:                ['four', 4, 1],
   ...:                ['five', 5, 1]]

In [10]: df = pd.DataFrame(data[1:],columns=data[0])

In [11]: df
Out[11]:
    Name  Rank  Complete
0    one     1         1
1    two     2         1
2  three     3         1
3   four     4         1
4   five     5         1

If you want to set the first column, Name, as the index, use the .set_index() method and pass it the column to use as the index. Example:

In [16]: df = pd.DataFrame(data[1:],columns=data[0]).set_index('Name')

In [17]: df
Out[17]:
       Rank  Complete
Name
one       1         1
two       2         1
three     3         1
four      4         1
five      5         1
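The two steps above could be wrapped in a small helper. This is only a sketch: `nested_list_to_frame` and its `index_col` parameter are hypothetical names (not pandas API), and it assumes the list always contains at least the header row.

```python
import pandas as pd

def nested_list_to_frame(rows, index_col=None):
    """Build a DataFrame from a nested list whose first sublist is the header.

    Hypothetical helper; assumes `rows` has at least a header row.
    """
    header, *body = rows                      # split header from data rows
    df = pd.DataFrame(body, columns=header)
    if index_col is not None:
        df = df.set_index(index_col)          # promote one column to the index
    return df

data = [
    ['Name', 'Rank', 'Complete'],
    ['one', 1, 1],
    ['two', 2, 1],
    ['three', 3, 1],
    ['four', 4, 1],
    ['five', 5, 1],
]

df = nested_list_to_frame(data, index_col='Name')
print(df.loc['three', 'Rank'])  # look up a value by row name
```

With the index set this way, rows can be addressed by name through `.loc`, which is what "row names" amounts to in pandas.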

edited Sep 30, 2015 at 4:26

answered Sep 30, 2015 at 4:08


1 Comment

qwertylpc Over a year ago

what about the row names?

cottontail · Answer · 2023-03-27 00:45:57Z

To build the desired dataframe directly at construction time, the list can be converted into a NumPy array and sliced accordingly.

import numpy as np
import pandas as pd

arr = np.array(data, dtype=object)
df = pd.DataFrame(arr[1:, 1:], index=pd.Index(arr[1:, 0], name=arr[0, 0]), columns=arr[0, 1:], dtype=int)
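To see which slice feeds which part of the constructor, here is the same idea with each piece named. The intermediate names (`corner`, `col_labels`, `row_labels`, `values`) are only for illustration.

```python
import numpy as np
import pandas as pd

data = [
    ['Name', 'Rank', 'Complete'],
    ['one', 1, 1],
    ['two', 2, 1],
    ['three', 3, 1],
    ['four', 4, 1],
    ['five', 5, 1],
]

# dtype=object keeps the strings and integers as they are.
arr = np.array(data, dtype=object)

corner = arr[0, 0]        # top-left cell: the name of the index ('Name')
col_labels = arr[0, 1:]   # rest of the first row: the column labels
row_labels = arr[1:, 0]   # rest of the first column: the row labels
values = arr[1:, 1:]      # everything else: the numeric body

df = pd.DataFrame(
    values,
    index=pd.Index(row_labels, name=corner),
    columns=col_labels,
    dtype=int,
)
print(df)
```

Note that `dtype=int` applies to every column, so this form only suits data whose body is entirely integer.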

Alternatively, since the data looks like a CSV file read into a Python list, it can be written to an in-memory text buffer and passed to pd.read_csv. A nice thing about read_csv is that it can set MultiIndex columns and indices, and it infers dtypes.

from io import StringIO
import pandas as pd
text = '\n'.join('|'.join(map(str, row)) for row in data)
df = pd.read_csv(StringIO(text), sep='|', index_col=[0])

A convenience function for the latter method:

from io import StringIO
def read_list(data, index_col=None, header=0):
    sio = StringIO('\n'.join(['|'.join(map(str, row)) for row in data]))
    return pd.read_csv(sio, sep='|', index_col=index_col, header=header)

df = read_list(data, index_col=[0])
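One caveat with joining on '|' by hand is that a value which itself contains '|' would be split into extra fields. A possible variant, sketched here under the assumption that standard CSV quoting is acceptable, lets the csv module write the buffer so such values are quoted. The name `read_list_csv` is hypothetical.

```python
import csv
from io import StringIO

import pandas as pd

def read_list_csv(rows, **read_csv_kwargs):
    """Write rows to an in-memory CSV buffer, then parse it with pd.read_csv.

    Hypothetical helper; extra keyword arguments go straight to pd.read_csv.
    """
    buf = StringIO(newline='')
    csv.writer(buf).writerows(rows)  # quotes values containing the delimiter
    buf.seek(0)                      # rewind so read_csv starts at the top
    return pd.read_csv(buf, **read_csv_kwargs)

# Sample rows whose labels contain characters that would break a naive join.
data = [
    ['Name', 'Rank', 'Complete'],
    ['one|uno', 1, 1],
    ['two, too', 2, 1],
]

df = read_list_csv(data, index_col=0)
print(df)
```

The trade-off is the same as in the original function: everything passes through text, so dtypes are inferred by read_csv rather than taken from the Python objects.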

Thangarajtest · Answer · 2024-06-09 11:57:08Z

0

To convert a doubly nested list (a list of groups, each holding several rows) to a pandas dataframe, flatten it into rows first:

import pandas as pd

# Sample data (replace with your `Final_data` if obtained from scraping)
data = [
    [
        ['1', 'Walmart', 'https://www.walmart.com/'],
        ['2', 'Amazon', 'https://www.amazon.com/'],
        ['3', 'Exxon Mobil', 'https://corporate.exxonmobil.com/'],
        ['4', 'Apple', 'https://www.apple.com/'],
        ['5', 'UnitedHealth Group', 'https://www.unitedhealthgroup.com/'],
        ['6', 'CVS Health', 'https://www.cvshealth.com/'],
        ['7', 'Berkshire Hathaway', 'https://www.berkshirehathaway.com/'],
        ['8', 'Alphabet', 'https://abc.xyz/'],
        ['9', 'McKesson', 'https://www.mckesson.com/'],
        ['10', 'Chevron', 'https://www.chevron.com/'],
    ],
    [
        ['11', 'AmerisourceBergen', 'https://www.amerisourcebergen.com/'],
        ['12', 'Costco Wholesale', 'https://www.costco.com/'],
        ['13', 'Microsoft', 'https://www.microsoft.com/'],
        ['14', 'Cardinal Health', 'https://www.cardinalhealth.com/'],
        ['15', 'Cigna', 'https://www.cigna.com/'],
        ['16', 'Marathon Petroleum', 'https://www.marathonpetroleum.com/'],
        ['17', 'Phillips 66', 'https://www.phillips66.com/'],
        ['18', 'Valero Energy', 'https://www.valero.com/'],
        ['19', 'Ford Motor', 'https://www.ford.com/'],
        ['20', 'Home Depot', 'https://www.homedepot.com/'],
    ],
]

# Create a DataFrame from the list, flattening each sublist into rows
df = pd.DataFrame([item for sublist in data for item in sublist])

# Rename columns (assuming the first element in each sublist is the S.No)
df.columns = ['S. No', 'Name', 'URL']

print(df)
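A possible variation on the flattening step, sketched on a shortened sample: itertools.chain.from_iterable does the same one-level flattening as the nested comprehension, and the serial number can be converted to an integer and used as the index. The name `pages` and the choice to index by 'S. No' are assumptions for illustration, not part of the answer above.

```python
from itertools import chain

import pandas as pd

# Shortened sample with the same shape: a list of groups, each a list of rows.
pages = [
    [['1', 'Walmart', 'https://www.walmart.com/'],
     ['2', 'Amazon', 'https://www.amazon.com/']],
    [['11', 'AmerisourceBergen', 'https://www.amerisourcebergen.com/']],
]

# Flatten one level so every inner list becomes one row.
rows = list(chain.from_iterable(pages))
df = pd.DataFrame(rows, columns=['S. No', 'Name', 'URL'])

# The serial numbers arrive as strings; convert before using them as the index.
df['S. No'] = df['S. No'].astype(int)
df = df.set_index('S. No')

print(df)
```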

answered Jun 9, 2024 at 11:57


