2

I need to generate a pd.DataFrame with columns being composed by a list and a Multiindex object, and I need to do it before filling the final dataframe with data.

Say the columns are ['one', 'two'] and the multiindex obtained from from_product:

import pandas as pd

col_21 = ['day', 'month']
col_22 = ['a', 'b']

mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

I would like to get a list of columns which looks like this:

'one' | 'two' | ('day','a') | ('day','b') | ('month','a') | ('month','b')

One possible solution would be to use two different and separate Multiindex, one with a dummy column, both generate by from_product


col_11 = ['one', 'two']
col_12 = ['']
col_21 = ['day', 'month']
col_22 = ['a', 'b']

mult_1 =  pd.MultiIndex.from_product([ col_11, col_12 ])
mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

How could I get to this?

(one, '') | (two, '') | ('day','a') | ('day','b') | ('month','a') |  ('month','b')

I have tried several trivial solutions, but each gave me a different error or a wrong result

mult_1+mult_2 #TypeError: cannot perform __add__ with this index type: MultiIndex
pd.merge #TypeError: Can only merge Series or DataFrame objects, a <class 'list'> was passed
pd.MultiIndex.from_arrays([ mult_1, mult_2 ]) #NotImplementedError: isna is not defined for MultiIndex

Thank you for your advices

1 Answer 1

1

If possible, the best is create MultiIndex in index by columns one, two and then MultiIndex in columns by pairs - so not mixed non multiindex with multindex values:

col_21 = ['day', 'month']
col_22 = ['a', 'b']
mult_2 =  pd.MultiIndex.from_product([ col_21, col_22 ])

one = range(5)
two = list('ABCDE')
mult_3 =  pd.MultiIndex.from_arrays([ one, two], names=['one','two'])

df = pd.DataFrame(0, columns=mult_2, index=mult_3)
print (df)
        day    month   
          a  b     a  b
one two                
0   A     0  0     0  0
1   B     0  0     0  0
2   C     0  0     0  0
3   D     0  0     0  0
4   E     0  0     0  0

Use Index.append:

print (mult_1.append(mult_2))

MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )

Or Index.union with sort=False:

print (mult_1.union(mult_2, sort=False))

MultiIndex([(  'one',  ''),
            (  'two',  ''),
            (  'day', 'a'),
            (  'day', 'b'),
            ('month', 'a'),
            ('month', 'b')],
           )
Sign up to request clarification or add additional context in comments.

2 Comments

I tried index.append() on 2 multi-indices and it didn't work, are you sure it's not only for normal indices?
@Boris what is your pandas version?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.