I have a simple pandas DataFrame with MultiIndex columns. I'm trying to add additional sub-columns, but I get this warning:

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

I cannot manage to get the right indexing incantation to make this work.

I'm attaching a code fragment with a simple, non-hierarchical example of the sorts of columns I want to add. It is followed by a hierarchical example that shows how I can add new top-level columns but cannot properly manipulate individual sub-columns.

import pandas as pd
import numpy as np


#simple example that works: add two columns to a non-hierarchical frame
sdf = pd.DataFrame(np.random.randn(6,4),columns=list('ABCD'))
sdf['E'] = 7
sdf['F'] = sdf['A'].diff(-1)


#hierarchical example
df = pd.DataFrame({
    ('co1', 'price'): {0: 1, 1: 2, 2: 12, 3: 14, 4: 15},
    ('co1', 'size'): {0: 1, 1: 5, 2: 9, 3: 13, 4: 17},
    ('co2', 'price'): {0: 2, 1: 6, 2: 10, 3: 14, 4: 18},
    ('co2', 'size'): {0: 3, 1: 7, 2: 11, 3: 15, 4: 19},
})

df.index.names = ['run']
df.columns.names = ['security', 'characteristic']

#I can add a new top level column
df['newTopLevel?'] = "yes"


#I cannot manipulate values of existing sub-level columns
"""A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead"""

df['co1']['size'] = "gross"
df['co1']['price'] = df['co1']['price']*2


#I cannot add a new sub-level column
df['co1']['new_sub_col'] = "fails"

I seem to be missing some fundamental understanding of this issue, which is frustrating as I've read pretty closely the O'Reilly "Python for Data Analysis" book written by the pandas author! ugh.

1 Answer

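To avoid the warning, use .loc with a (top-level, sub-level) tuple as the column key, so that selection and assignment happen in a single indexing operation. Chained indexing such as df['co1']['size'] = ... first builds an intermediate object from df['co1'], which may be a copy, so the assignment may never reach df: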

In [11]: df.loc[:, ('co1', 'size')] = "gross"

In [12]: df.loc[:, ('co1', 'price')] *= 2

In [13]: df.loc[:, ('co1', 'new_sub_col')] = "fails"  # not anymore

In [14]: df
Out[14]:
security         co1          co2      newTopLevel?         co1
characteristic price   size price size              new_sub_col
run
0                  2  gross     2    3          yes       fails
1                  4  gross     6    7          yes       fails
2                 24  gross    10   11          yes       fails
3                 28  gross    14   15          yes       fails
4                 30  gross    18   19          yes       fails
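Note that the new sub-column is appended at the end rather than next to the other co1 columns. Here is a minimal, self-contained sketch of the same pattern as a plain script. It assumes a reduced version of the frame from the question (no 'newTopLevel?' column), and the value "added" is just a placeholder. It leaves the 'size' column alone, because recent pandas releases may warn when .loc writes a value of a different dtype into an existing column. The final sort_index call is an optional extra that regroups the columns by top level.

```python
import pandas as pd

# Reduced version of the frame from the question (assumption: same values,
# without the extra top-level column).
df = pd.DataFrame({
    ('co1', 'price'): [1, 2, 12, 14, 15],
    ('co1', 'size'): [1, 5, 9, 13, 17],
    ('co2', 'price'): [2, 6, 10, 14, 18],
    ('co2', 'size'): [3, 7, 11, 15, 19],
})
df.columns.names = ['security', 'characteristic']

# Modify an existing sub-column in a single indexing operation.
df.loc[:, ('co1', 'price')] *= 2

# Add a new sub-column under an existing top-level key.
df.loc[:, ('co1', 'new_sub_col')] = "added"

# The new column lands at the end; sorting the column index
# puts all 'co1' sub-columns next to each other again.
df_sorted = df.sort_index(axis=1)

print(df_sorted)
```

If the original column order matters, skip the sort and reorder explicitly instead.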
1 Comment

Subsequent to filing this question, I discovered that I could write: df['co1', 'price'] = "gross". Is there any reason to prefer this or the answer you suggested?
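As a rough sketch of how the two spellings compare: for whole-column assignment, df['co1', 'price'] and df.loc[:, ('co1', 'price')] address the same column, so the choice is mostly a matter of taste. The .loc form becomes necessary once only some rows should be changed. The frame below is a cut-down, hypothetical version of the one in the question, and the threshold of 10 is arbitrary.

```python
import pandas as pd

# Cut-down frame for illustration (assumption: two columns are enough here).
df = pd.DataFrame({
    ('co1', 'price'): [1, 2, 12, 14, 15],
    ('co2', 'size'): [3, 7, 11, 15, 19],
})

# Whole-column assignment with plain bracket indexing and a tuple key.
a = df.copy()
a['co1', 'price'] = a['co1', 'price'] * 2

# The same assignment spelled with .loc.
b = df.copy()
b.loc[:, ('co1', 'price')] = b.loc[:, ('co1', 'price')] * 2

same = a.equals(b)

# Only .loc can combine a row selection with the column key.
c = df.copy()
mask = c[('co2', 'size')] > 10
c.loc[mask, ('co1', 'price')] = 0

print(same)
print(c)
```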
