
I have the following code and I'm not sure how to rewrite it to avoid the SettingWithCopyWarning. Or should I just disable the warning?

The code works. I just want to assign the left attribute of each pd.cut interval to a new column if the number is positive, and the right attribute if it is negative.

import numpy as np
import pandas as pd


bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
test_data = [{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}]

df = pd.DataFrame(test_data)

df["Bin"] = 0.0
df["Bin"][df["Value"] > 0.0] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
df["Bin"][df["Value"] < 0.0] = [d['right'] for d in [{fn: getattr(f, fn) for fn in ['right']} for f in pd.cut(df["Value"], bins)]]

print(df)

Running the code produces

test.py:11: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["Bin"][df["Value"] > 0.0] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
test.py:12: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["Bin"][df["Value"] < 0.0] = [d['right'] for d in [{fn: getattr(f, fn) for fn in ['right']} for f in pd.cut(df["Value"], bins)]]
   ID  Value  Bin
0   1   -0.5 -0.5
1   2    1.5  1.0

2 Answers

Try this:

Edit:

If all values are positive, pd.cut(df.loc[df["Value"]<0,'Value'], bins, labels=bins[1:]) returns an empty Series (Series([], Name: Value, dtype: category)), and assigning it raises an error.

Suppressing the ValueError with contextlib.suppress avoids that:

from contextlib import suppress
with suppress(ValueError):
    df.loc[df["Value"] > 0.0,"Bin"] = pd.cut(df.loc[df["Value"]>0,'Value'], bins, labels=bins[:-1])
with suppress(ValueError):
    df.loc[df["Value"] < 0.0,"Bin"] = pd.cut(df.loc[df["Value"]<0,'Value'], bins, labels=bins[1:])

Here labels=bins[:-1] and labels=bins[1:] do the job of left and right in your original code: they label each bin with its left edge or its right edge, respectively.
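As an alternative that needs no exception handling, here is a minimal sketch that asks pd.cut for integer bin positions (labels=False) and looks the edges up in bins directly. The helper name edge_toward_zero is hypothetical, and the choice to return 0.0 for zeros and for values outside every bin is an assumption made to match the df["Bin"] = 0.0 default in the question.

```python
import numpy as np
import pandas as pd


def edge_toward_zero(values: pd.Series, bins) -> pd.Series:
    """Hypothetical helper: left edge for positive values, right edge for negative ones."""
    bins = np.asarray(bins, dtype=float)
    # labels=False returns the integer position of each value's bin (NaN if outside all bins).
    idx = pd.cut(values, bins, labels=False)
    # Use position 0 as a placeholder for NaN so the array lookup below stays valid.
    i = idx.fillna(0).astype(int).to_numpy()
    v = values.to_numpy()
    # bins[i] is the left edge of bin i, bins[i + 1] is its right edge.
    out = np.where(v > 0, bins[i], np.where(v < 0, bins[i + 1], 0.0))
    # Assumption: values outside every bin fall back to 0.0.
    out = np.where(idx.notna().to_numpy(), out, 0.0)
    return pd.Series(out, index=values.index)


bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
df = pd.DataFrame([{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}])

# A plain column assignment, so no chained indexing is involved.
df["Bin"] = edge_toward_zero(df["Value"], bins)
print(df)
```

Because the result is computed for every row at once, there is no per-sign assignment that could receive an empty selection when the data contains only positive or only negative values.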


1 Comment

I tried your code and it removes the warning, but it introduces a new issue. If test_data contains only positive numbers, the second line of your code raises "ValueError: Cannot set a Categorical with another, without identical categories". Is there a workaround for that?

You should replace the chained indexing (df["Bin"][mask]) with a single loc call:

df.loc[df["Value"] > 0.0, "Bin"] = [d['left'] for d in [{fn: getattr(f, fn) for fn in ['left']} for f in pd.cut(df["Value"], bins)]]
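One caveat with the line above: the list has one entry per row of df, while the mask usually selects fewer rows, so the lengths can disagree. A possible way around this, sketched below, is to assign a Series that shares the index of df, so that loc picks the matching rows by label. The names cut, codes, left and right are illustrative, and the sketch assumes every value falls inside one of the bins (a code of -1 would mean it does not).

```python
import numpy as np
import pandas as pd

bins = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
df = pd.DataFrame([{"ID": 1, "Value": -0.5}, {"ID": 2, "Value": 1.5}])

cut = pd.cut(df["Value"], bins)
intervals = cut.cat.categories        # IntervalIndex holding one interval per bin
codes = cut.cat.codes.to_numpy()      # position of each row's interval

# Assumption: no value lies outside the bins, i.e. no code is -1.
left = pd.Series(intervals.left.to_numpy()[codes], index=df.index)
right = pd.Series(intervals.right.to_numpy()[codes], index=df.index)

df["Bin"] = 0.0
# The right-hand side is a Series aligned on the index, so only the masked rows are used.
df.loc[df["Value"] > 0.0, "Bin"] = left
df.loc[df["Value"] < 0.0, "Bin"] = right
print(df)
```

Each assignment goes through one loc call on df itself, which is what avoids the SettingWithCopyWarning.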

3 Comments

I got ValueError: cannot set using a list-like indexer with a different length than the value
This means that the list you are creating isn't the same length as the set of rows that need to be filled. You might want to check how many rows match the condition first, and then make sure the list has the same length.
The thing is, this doesn't answer the question.
