I have a dataframe with column headings (and, in my real data, multi-level row indexes). I want to add a second index level to the columns, based on a list I have.

import pandas as pd

data = {"apple": [7, 5, 6, 4, 7, 5, 8, 6],
        "strawberry": [3, 5, 2, 1, 3, 0, 4, 2],
        "banana": [1, 2, 1, 2, 2, 2, 1, 3],
        "chocolate": [5, 8, 4, 2, 1, 6, 4, 5],
        "cake": [4, 4, 5, 1, 3, 0, 0, 3]
        }

df = pd.DataFrame(data)
food_cat = ["fv","fv","fv","j","j"]

I want something that looks like this:

[image: example desired output]

I tried the approach from "How to add a second level column header/index to dataframe by matching to dictionary values?", but couldn't get it working. It is also not ideal, because I would need to work out how to generate the dictionary automatically, and I don't have one.

I also tried adding the list as a row in the dataframe and converting that row to a second level index as in this answer using

df.loc[len(df)] = food_cat
df = pd.MultiIndex.from_arrays(df.columns, df.iloc[len(df)-1])

but got the error: "Check if lengths of all arrays are equal or not, TypeError: Input must be a list / sequence of array-likes."

I also tried using df = pd.MultiIndex.from_arrays(df.columns, np.array(food_cat)) with import numpy as np but got the same error.

I feel like this should be a simple task (it is for rows), and there are a lot of related questions, but I struggled to find an example I could adapt to my data.

  • Why not just use: df.columns = pd.MultiIndex.from_arrays([food_cat, df.columns]) Commented Aug 12, 2021 at 0:04
  • The first parameter of pd.MultiIndex.from_arrays, arrays, is a list of 1-D "array-like" objects. Commented Aug 12, 2021 at 0:07
  • Thanks @sammywemmy, if you post that as an answer I will accept it. I spent a couple of hours trying to find a working example before I gave up and posted. I am unsurprised that it really was simple. Commented Aug 12, 2021 at 0:10

1 Answer

pd.MultiIndex.from_arrays expects its first argument to be a single list (or list-like) containing one array per level, so wrap both level arrays in one list, outer level first:

df.columns = pd.MultiIndex.from_arrays([food_cat, df.columns])

df

     fv                           j
  apple strawberry banana chocolate cake
0     7          3      1         5    4
1     5          5      2         8    4
2     6          2      1         4    5
3     4          1      2         2    1
4     7          3      2         1    3
5     5          0      2         6    0
6     8          4      1         4    0
7     6          2      3         5    3
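
For reference, here is a self-contained sketch of the same approach that also names the two column levels and then selects by the new top level. The level names "category" and "food" are my own illustrative choices and do not come from the question. The original attempt most likely failed because df.columns was passed directly as arrays, so pandas iterated over the individual column names (plain strings, not array-likes), while the second positional argument was taken as sortorder.

```python
import pandas as pd

data = {
    "apple": [7, 5, 6, 4, 7, 5, 8, 6],
    "strawberry": [3, 5, 2, 1, 3, 0, 4, 2],
    "banana": [1, 2, 1, 2, 2, 2, 1, 3],
    "chocolate": [5, 8, 4, 2, 1, 6, 4, 5],
    "cake": [4, 4, 5, 1, 3, 0, 0, 3],
}
df = pd.DataFrame(data)
food_cat = ["fv", "fv", "fv", "j", "j"]

# Both level arrays go inside ONE list; the outer (top) level comes first.
# The level names "category" and "food" are illustrative assumptions.
df.columns = pd.MultiIndex.from_arrays(
    [food_cat, df.columns], names=["category", "food"]
)

# Select every column under one top-level label.
fv = df["fv"]

# Or take a cross-section by level name; the "category" level is dropped.
j = df.xs("j", axis=1, level="category")
```

Here fv holds the apple, strawberry and banana columns, and j holds chocolate and cake, each with a single-level column index again.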
2 Comments

Can I specify which columns the multi-index applies to? In the example given, how do I apply fv only to the banana and chocolate columns?
@thentangler kindly create a new question, with a clear example of your expected output dataframe. You can tag me when you post the question.
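
One possible reading of that follow-up, sketched under assumptions: every column in a MultiIndex needs some value at each level, so a column cannot simply be left out of the top level. A common workaround is a placeholder label for the columns that should not be grouped. The top_level mapping and the empty-string placeholder below are hypothetical choices for illustration, not something the answer author proposed.

```python
import pandas as pd

df = pd.DataFrame({
    "apple": [7, 5],
    "strawberry": [3, 5],
    "banana": [1, 2],
    "chocolate": [5, 8],
    "cake": [4, 4],
})

# Hypothetical mapping: only the columns that should get a top-level label.
top_level = {"banana": "fv", "chocolate": "fv"}

# Columns missing from the mapping fall back to an empty-string placeholder,
# because each column must have a value at every level.
labels = [top_level.get(col, "") for col in df.columns]

df.columns = pd.MultiIndex.from_arrays([labels, df.columns])

# Only the labelled columns appear under "fv".
fv = df["fv"]
```

The placeholder still shows up as a (blank) top-level label, so this is a display convenience rather than a true partial index.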
