Given a DataFrame 'AB' with MultiIndex columns:

import pandas as pd

A = pd.DataFrame([[1, 5, 2], [2, 4, 4], [3, 3, 1], [4, 2, 2], [5, 1, 4]],
                 columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5])
B = pd.DataFrame([[3, 3, 3], [2, 2, 2], [4, 4, 4], [5, 5, 5], [6, 6, 6]],
                 columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5])

# Add a top level ('A' or 'B') above each frame's columns, then join side by side
A.columns = pd.MultiIndex.from_product([['A'], A.columns])
B.columns = pd.MultiIndex.from_product([['B'], B.columns])
AB = pd.concat([A, B], axis=1)

I would like to add a column 'new' to the level 'B', based on a condition of column ['B', 'C']. I'm looking to specifically use df.loc, like this:

AB['B', 'new'] = 0
AB.loc[AB['B', 'C'] >= 3, 'new'] = 1

The problem is that this creates a separate top-level column 'new' instead of filling the column ['B', 'new'].

The desired output is:

    A           B   
    A   B   C   A   B   C   new 
1   1   5   2   3   3   3   1
2   2   4   4   2   2   2   0
3   3   3   1   4   4   4   1
4   4   2   2   5   5   5   1
5   5   1   4   6   6   6   1

1 Answer

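Use tuples to reference columns in a MultiIndex. A bare 'new' is looked up as a top-level label, so it never resolves to ('B', 'new'); the full tuple does: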

AB[('B', 'new')] = 0
AB.loc[AB[('B', 'C')] >= 3, ('B', 'new')] = 1
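
To make the difference between the two keys concrete, here is a minimal self-contained sketch that rebuilds AB from the question and checks how each key is looked up. The variable names (mask, bare_key_found, and so on) are mine, not from the question, and this is only an illustration of the lookup, not a description of pandas internals.

```python
import pandas as pd

# Rebuild the question's frame so the snippet runs on its own.
A = pd.DataFrame([[1, 5, 2], [2, 4, 4], [3, 3, 1], [4, 2, 2], [5, 1, 4]],
                 columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5])
B = pd.DataFrame([[3, 3, 3], [2, 2, 2], [4, 4, 4], [5, 5, 5], [6, 6, 6]],
                 columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5])
A.columns = pd.MultiIndex.from_product([['A'], A.columns])
B.columns = pd.MultiIndex.from_product([['B'], B.columns])
AB = pd.concat([A, B], axis=1)

# Create the column and fill it, using the full tuple key both times.
AB[('B', 'new')] = 0
mask = AB[('B', 'C')] >= 3  # illustrative name for the condition
AB.loc[mask, ('B', 'new')] = 1

# A bare string is matched against the top column level, where only
# 'A' and 'B' exist, so 'new' is not found there.
bare_key_found = 'new' in AB.columns
# The tuple spells out the full path: top level 'B', second level 'new'.
tuple_key_found = ('B', 'new') in AB.columns

top_level_labels = sorted(set(AB.columns.get_level_values(0)))
new_values = AB[('B', 'new')].tolist()

print(bare_key_found, tuple_key_found)
print(top_level_labels)
print(new_values)
```

Because 'new' is not a top-level label, assigning through .loc with the bare string has nothing existing to write into, which is consistent with the extra column reported in the question.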

Alternatively, in a single line:

AB[('B', 'new')] = AB[('B', 'C')].ge(3).astype(int)

The resulting output:

   A        B          
   A  B  C  A  B  C new
1  1  5  2  3  3  3   1
2  2  4  4  2  2  2   0
3  3  3  1  4  4  4   1
4  4  2  2  5  5  5   1
5  5  1  4  6  6  6   1
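
As a quick sanity check, the sketch below runs both variants on copies of the question's frame and compares the resulting columns. The names two_step, one_line and same_result are hypothetical and only serve the comparison.

```python
import pandas as pd

# Rebuild the question's frame so the snippet runs on its own.
A = pd.DataFrame([[1, 5, 2], [2, 4, 4], [3, 3, 1], [4, 2, 2], [5, 1, 4]],
                 columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5])
B = pd.DataFrame([[3, 3, 3], [2, 2, 2], [4, 4, 4], [5, 5, 5], [6, 6, 6]],
                 columns=['A', 'B', 'C'], index=[1, 2, 3, 4, 5])
A.columns = pd.MultiIndex.from_product([['A'], A.columns])
B.columns = pd.MultiIndex.from_product([['B'], B.columns])
AB = pd.concat([A, B], axis=1)

# Variant 1: initialise with 0, then overwrite the matching rows via .loc.
two_step = AB.copy()
two_step[('B', 'new')] = 0
two_step.loc[two_step[('B', 'C')] >= 3, ('B', 'new')] = 1

# Variant 2: ge(3) gives a boolean Series; astype(int) maps True/False to 1/0.
one_line = AB.copy()
one_line[('B', 'new')] = one_line[('B', 'C')].ge(3).astype(int)

# Compare plain Python lists so platform-specific integer widths do not matter.
same_result = (two_step[('B', 'new')].tolist()
               == one_line[('B', 'new')].tolist())
print(same_result)

# Selecting the top level 'B' shows the new column next to A, B and C.
print(one_line['B'])
```

The one-liner avoids the separate initialisation step; the two-step form is handy when only some rows should be touched and the rest must keep whatever values they already have.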

1 Comment

Cool, I'd tried tuples before and for some reason I thought they didn't work. The bonus of your answer is learning about ge(). Thanks!
