Replace a value in the dataframe if it falls in between specific values

Question

I have a dataframe which can be constructed as:

import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 3, 2, 3, 6, 8],
                   'B': [4, 7, 1, 5, 6, 8, 3, 9],
                   'C': [1, 5, 3, 1, 6, 8, 9, 0],
                   'D': [6, 3, 7, 8, 9, 4, 2, 1]})

The df looks like:

    A   B   C   D
0   1   4   1   6
1   4   7   5   3
2   6   1   3   7
3   3   5   1   8
4   2   6   6   9
5   3   8   8   4
6   6   3   9   2
7   8   9   0   1

And there are 2 other variables which are to be used in substitution of values in the df:

mx = pd.Series([7, 8, 8, 7], index=["A", "B", "C", "D"])
dm = pd.Series([5, 8, 6, 4], index=["A", "B", "C", "D"])

PROBLEM: I want to replace every value in the dataframe that lies between the corresponding value in dm and the corresponding value in mx (both bounds inclusive, as the expected output shows) with the value from dm. For example, in column "D" every value between 4 and 7 should become 4.

So the expected output would look something like:

    A   B   C   D
0   1   4   1   4
1   4   7   5   3
2   5   1   3   4
3   3   5   1   8
4   2   6   6   9
5   3   8   6   4
6   5   3   9   2
7   8   9   0   1

I have tried using df.apply and df.update, but I'm unable to express the condition. It always raises ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Is there an efficient way to achieve this? Any help would be appreciated.

jezrael · Accepted Answer · 2021-07-15 13:25:48Z

2

Use DataFrame.mask. Compare the DataFrame with each Series using DataFrame.ge and DataFrame.le, combine the two boolean masks with & (element-wise AND), and pass dm as the replacement with axis=1 so that it is aligned with the columns:

df = df.mask(df.ge(dm) & df.le(mx), dm, axis=1)
print(df)
   A  B  C  D
0  1  4  1  4
1  4  7  5  3
2  5  1  3  4
3  3  5  1  8
4  2  6  6  9
5  3  8  6  4
6  5  3  9  2
7  8  9  0  1
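To make the one-liner easier to follow, here is a minimal sketch that splits it into steps and also reproduces the ValueError from the question. The names in_range, result and error_message are only illustrative, and the data is the same as in the question.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 3, 2, 3, 6, 8],
                   'B': [4, 7, 1, 5, 6, 8, 3, 9],
                   'C': [1, 5, 3, 1, 6, 8, 9, 0],
                   'D': [6, 3, 7, 8, 9, 4, 2, 1]})
mx = pd.Series([7, 8, 8, 7], index=["A", "B", "C", "D"])
dm = pd.Series([5, 8, 6, 4], index=["A", "B", "C", "D"])

# Comparing a DataFrame with a Series aligns the Series index with the
# DataFrame columns, so every column is tested against its own bounds.
in_range = df.ge(dm) & df.le(mx)   # boolean DataFrame, True where dm <= value <= mx

# Where in_range is True, take the replacement from dm (aligned on columns).
result = df.mask(in_range, dm, axis=1)

# A chained comparison on a whole column implicitly calls bool() on a
# Series, which is what raises the error quoted in the question.
error_message = None
try:
    dm["D"] <= df["D"] <= mx["D"]
except ValueError as err:
    error_message = str(err)
```

Using & instead of a chained comparison (or `and`) keeps the test element-wise, which is why the mask version does not hit the ambiguity error.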

edited Jul 15, 2021 at 13:25

answered Jul 15, 2021 at 13:15

jezrael


1 Comment

Ron Serruya Over a year ago

This is definitely nicer than my way. Just note that, according to the expected output, you should use ge and le instead of gt and lt.

Ron Serruya · Accepted Answer · 2021-07-15 13:23:18Z

0

I can't tell you if it's the best way, and it probably isn't, but this works:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'A': [1, 4, 6, 3, 2, 3, 6, 8],
   ...:                    'B': [4, 7, 1, 5, 6, 8, 3, 9],
   ...:                    'C': [1, 5, 3, 1, 6, 8, 9, 0],
   ...:                    'D': [6, 3, 7, 8, 9, 4, 2, 1]})

In [3]: mx = pd.Series([7, 8, 8, 7], index=["A", "B", "C", "D"])
   ...: dm = pd.Series([5, 8, 6, 4], index=["A", "B", "C", "D"])

In [4]: df
Out[4]:
   A  B  C  D
0  1  4  1  6
1  4  7  5  3
2  6  1  3  7
3  3  5  1  8
4  2  6  6  9
5  3  8  8  4
6  6  3  9  2
7  8  9  0  1

In [5]: for col in df.columns:
   ...:     df[col] = df[col].apply(lambda x: x if not dm[col]<=x<=mx[col] else dm[col])
   ...:

In [6]: df
Out[6]:
   A  B  C  D
0  1  4  1  4
1  4  7  5  3
2  5  1  3  4
3  3  5  1  8
4  2  6  6  9
5  3  8  6  4
6  5  3  9  2
7  8  9  0  1
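If the per-column loop is preferred, the element-wise lambda could probably be swapped for Series.between and Series.mask, which do the same comparison on the whole column at once. This is only a sketch of that variation, not a benchmarked recommendation; in_range is an illustrative name, and between is inclusive on both ends by default.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 3, 2, 3, 6, 8],
                   'B': [4, 7, 1, 5, 6, 8, 3, 9],
                   'C': [1, 5, 3, 1, 6, 8, 9, 0],
                   'D': [6, 3, 7, 8, 9, 4, 2, 1]})
mx = pd.Series([7, 8, 8, 7], index=["A", "B", "C", "D"])
dm = pd.Series([5, 8, 6, 4], index=["A", "B", "C", "D"])

for col in df.columns:
    # True where dm[col] <= value <= mx[col] (both bounds included)
    in_range = df[col].between(dm[col], mx[col])
    # Replace the in-range values with the lower bound for this column
    df[col] = df[col].mask(in_range, dm[col])
```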

answered Jul 15, 2021 at 13:23

Ron Serruya


1 Comment

jezrael Over a year ago

stackoverflow.com/questions/54432583/…
