
I have code where a function/method accepts a Series (row from df) and is supposed to modify it in-place, such that changes are reflected in the original df. However, I seem unable to force the modification as a view rather than a copy. Information from the documentation and a related question on Stack Overflow do not resolve the issue as given by the example below:

import pandas as pd
pd.__version__ # 0.24.2

ROW_NAME = "r1"
COL_NAME = "B"
NEW_VAL = 100.0

# df I would like to modify in-place
df = pd.DataFrame({"A":[[1], [2], [3,4]], "B": [1.0, 2.0, 3.0]}, index=["r1", "r2", "r3"])

# a row (Series reference) is the input param to a function that should modify df in-place
record = df.loc[ROW_NAME]
record.loc[COL_NAME] = NEW_VAL
assert df.loc[ROW_NAME, COL_NAME] == NEW_VAL  # fails (AssertionError): df still holds 1.0

The line starting with record.loc results in the familiar warning: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame, which might make sense, except that record appears to reference df and can be modified in-place under some circumstances. An example of this:

record = df.loc[ROW_NAME]
record.loc["A"].append(NEW_VAL)  # mutates the list object stored in column A
assert NEW_VAL in df.loc["r1", "A"]  # True

My question is: how can I force a modification of the float value at df.loc[ROW_NAME, COL_NAME] in-place from the Series record? Bonus points for clarifying why it is possible to modify column A in-place but not column B in the examples above.


2 Answers


I think this behavior is confusing because record in this case is a shallow copy of your data frame row.

If you refer to this Stack Overflow post, it sounds like .loc[] is generally expected to return a copy and not a view, and that assignment will not work if the .loc calls are chained.

I did confirm that if you modify the original DataFrame directly, it works.

df.loc[ROW_NAME, COL_NAME] = NEW_VAL
assert(df.loc[ROW_NAME, COL_NAME] == NEW_VAL) # True

And as for the .append still working, this is why I mentioned the "shallow" copy behavior. Your new record copy still contains a reference to the original list in column A. See this post for a refresher on the difference between binding to a new object vs mutating an existing object.
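The binding-versus-mutating distinction can be illustrated without pandas at all. The following is only a minimal sketch: a plain dict stands in for the DataFrame row, and copy.copy stands in for whatever .loc does internally, which is an assumption made for illustration rather than a description of the pandas implementation.

```python
import copy

# A plain dict stands in for the original row (illustrative assumption,
# not how pandas stores data internally).
row = {"A": [1], "B": 1.0}

# Shallow copy: a new container that still refers to the same inner objects.
record = copy.copy(row)

# Mutating the shared list is visible through both names.
record["A"].append(100.0)

# Assigning to a key rebinds it in the copy only; the original is untouched.
record["B"] = 100.0

print(row)     # {'A': [1, 100.0], 'B': 1.0}
print(record)  # {'A': [1, 100.0], 'B': 100.0}
```

The list in "A" changes in both places because only one list object exists, while the float in "B" is replaced by a new binding that lives only in the copy.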


3 Comments

this is a good start, but I pass the series (record) to a function which should modify the df in-place. Is there a way to force record to be a view to the original df row?
is there a way you can pass the ROW_NAME used to create the series into the function so that you can modify the original dataframe? I don't think pandas provides views.
Looking at other references, I think the shallow copy is critical, as you've pointed out. At least one other reference suggests it is not possible to force return a view. It looks like my best option is to change the function signature to take the df and row_name.

Based on the sources linked in the question and a thorough reading of the documentation, it does not appear possible to enforce returning a view vs copy of a Series generated from a DataFrame row.

As @Lilith Schneider points out, the original confusion over this comes from the fact that record = df.loc["r1"] returns a shallow copy - some hybrid of a copy and view that may cause confusion and lead to unexpected behavior.
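The workaround suggested in the comments, changing the function signature to take the DataFrame and the row label instead of the Series, could be sketched roughly as follows. The function name update_record and its parameters are hypothetical, chosen only for this example; the write itself uses the same df.loc[row, col] assignment confirmed to work in the other answer.

```python
import pandas as pd


def update_record(df, row_label, col_name, new_val):
    """Hypothetical helper: write through the DataFrame itself,
    rather than through a Series obtained from df.loc[row_label]."""
    df.loc[row_label, col_name] = new_val


df = pd.DataFrame(
    {"A": [[1], [2], [3, 4]], "B": [1.0, 2.0, 3.0]},
    index=["r1", "r2", "r3"],
)

update_record(df, "r1", "B", 100.0)
assert df.loc["r1", "B"] == 100.0  # the change is visible in df
```

Because the assignment goes through a single .loc call on df, there is no intermediate Series whose copy-or-view status matters.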
