3

I have a Pandas DataFrame with a column index like the one below:

+----+----+----+----+----+----+
|  A1|  A2|  A3|  B1|  B2|  B3|
+----+----+----+----+----+----+
...the data

What I would like to do is to change the column index of this DataFrame to a multi-index one, as shown in the format below, without modifying the data and just simply adding an upper level in the index (with values A and B).

+--------------+--------------+
|        A     |        B     |
+----+----+----+----+----+----+
|  A1|  A2|  A3|  B1|  B2|  B3|
+----+----+----+----+----+----+
...the data

I have tried to use the pandas.MultiIndex function but with no luck. How can this be solved?

1
  • So using pd.MultiIndex.from_arrays is necessary? Not df.columns = [df.columns.str[0], df.columns] ? Commented Sep 20, 2021 at 8:48

3 Answers

4

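You could extract the first letter of each column name separately and build a MultiIndex from the two levels:

# Upper level: the first character of each column name
multi_index_level_0 = [c[0] for c in df.columns]
# Lower level: the original column names
multi_index = [multi_index_level_0, df.columns.values]
df.columns = pd.MultiIndex.from_arrays(multi_index)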
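As a self-contained illustration of the approach above, here is a minimal sketch. The sample values in `df` are hypothetical, since the question does not show the actual data; the check at the end is only there to show that the values are untouched and only the column labels change.

```python
import pandas as pd

# Hypothetical sample data; the question does not show the real values.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6]],
    columns=["A1", "A2", "A3", "B1", "B2", "B3"],
)
values_before = df.to_numpy().copy()

# Upper level: first character of each column name.
upper = [c[0] for c in df.columns]
df.columns = pd.MultiIndex.from_arrays([upper, df.columns])

# Only the labels changed; the underlying values are the same.
assert (df.to_numpy() == values_before).all()

print(df.columns.nlevels)
print(df["A"].columns.tolist())
```

Selecting `df["A"]` afterwards returns the three columns grouped under the new upper-level label.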

1

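The simplest way is to extract the first character of each column name and assign both levels back as a nested list:

import pandas as pd

df = pd.DataFrame(0, columns=['A1','A2','A3','B1','B2','B3'], index=[0])

# Assigning a list of two arrays creates a two-level MultiIndex
df.columns = [df.columns.str[0], df.columns]
print(df)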
   A        B      
  A1 A2 A3 B1 B2 B3
0  0  0  0  0  0  0

If you need to extract all leading uppercase letters instead of only the first character:

df = pd.DataFrame(0, columns=['ADa1','ADs2','AD3','B1','B2','B3'], index=[0])

df.columns = [df.columns.str.extract('(^[A-Z]+)', expand=False), df.columns]
print (df)

    AD           B      
  ADa1 ADs2 AD3 B1 B2 B3
0    0    0   0  0  0  0

If you also need to set the names of the column levels, use MultiIndex.from_arrays:

df = pd.DataFrame(0, columns=['ADa1','ADs2','AD3','B1','B2','B3'], index=[0])

df.columns = pd.MultiIndex.from_arrays([df.columns.str.extract('(^[A-Z]+)', expand=False), 
                                       df.columns], 
                                       names=('a','b'))
print (df)

a   AD           B      
b ADa1 ADs2 AD3 B1 B2 B3
0    0    0   0  0  0  0
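Once the levels are named, they can be addressed by name. The following is a small sketch, assuming the same example frame and the level names `'a'` and `'b'` used above; the variable names `group_b` and `flat` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(0, columns=["ADa1", "ADs2", "AD3", "B1", "B2", "B3"], index=[0])
df.columns = pd.MultiIndex.from_arrays(
    [df.columns.str.extract(r"(^[A-Z]+)", expand=False), df.columns],
    names=("a", "b"),
)

# Select every column under one upper-level label, addressing the level by name.
group_b = df.xs("B", axis=1, level="a")

# Drop the upper level again to get back the original flat column index.
flat = df.droplevel("a", axis=1)

print(group_b.columns.tolist())
print(flat.columns.tolist())
```

This shows that the added level is purely a labelling layer: it can be used for selection and removed again without touching the data.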

0

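Here is an option using map. Note that set_axis returns a new DataFrame rather than modifying df in place, so assign the result back:

df = df.set_axis(df.columns.map(lambda x: (x[0], x)), axis=1)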

Output:

   A        B      
  A1 A2 A3 B1 B2 B3
0  0  0  0  0  0  0
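A short sketch of why this works, under the assumption that the frame is the same single-row example used in the other answers: mapping each column name to a tuple yields a MultiIndex, which should match one built explicitly with `pd.MultiIndex.from_tuples`. The names `result` and `explicit` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(0, columns=["A1", "A2", "A3", "B1", "B2", "B3"], index=[0])

# Index.map with a tuple-returning function produces a MultiIndex.
# set_axis returns a new DataFrame, so the result is kept in a new variable.
result = df.set_axis(df.columns.map(lambda x: (x[0], x)), axis=1)

# The same index built explicitly from (upper, lower) tuples.
explicit = pd.MultiIndex.from_tuples([(c[0], c) for c in df.columns])

print(isinstance(result.columns, pd.MultiIndex))
print(result.columns.equals(explicit))
print(df.columns.nlevels)  # the original frame keeps its flat columns
```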
