
This may be a simple question, but I am having trouble finding a solution. I have a variable named T_wall that holds a pandas Series of numbers. Whenever a value in it is over 2,000, I would like that value to be replaced with 2,000.

I have tried an if statement but I continue to get errors. Any ideas? Thanks!

import pandas as pd
T_wall = pd.Series([1999.0, 2000.0, 2001.0, 2002.0, 2003.0])

if T_wall > 2000.0:
    T_wall = 2000.0

I am getting the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

  • What error are you getting? It should work. Commented Jan 5, 2021 at 21:54
  • I am getting this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). Commented Jan 5, 2021 at 21:55
  • So T_wall is a pandas Series. Commented Jan 5, 2021 at 21:57
  • It looks like T_wall is not an int or float. Please post a full example that can be copied into a Python interpreter and reproduces the error. I googled the error message and the first hit points to Stack Overflow: stackoverflow.com/questions/36921951/… Commented Jan 5, 2021 at 21:57
  • The error message you quote in the above comment tells you exactly what the problem is, and exactly how to fix it. Be sure you include the message in your question itself, going forward; also, a few words about how you understand that message and how you tried to apply its advice would help folks understand where you're coming from and how to better tailor answers to that perspective. Commented Jan 5, 2021 at 21:59

4 Answers

1

Taking everything from the comments: avoid using for loops in pandas as much as possible. Here you can use boolean masking:

T_wall[T_wall > 2000] = 2000

apply would also work, but it returns a new Series rather than changing T_wall, so assign the result back:

T_wall = T_wall.apply(lambda x: 2000 if x > 2000 else x)

2 Comments

Are you sure the apply examples is correct? Probably should be: T_wall.apply(lambda x: 2000 if x > 2000 else x)
@de1 you're right, it wasn't good. Updated
0

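T_wall is a pandas Series, so a plain if cannot test it as a whole. Instead, you can loop through the Series and apply the if statement to each element:

for i in range(len(T_wall)):
    # .iloc indexes by position, so this also works with a non-default index
    if T_wall.iloc[i] > 2000.0:
        T_wall.iloc[i] = 2000.0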

4 Comments

Consider an apply function :)
I would strongly advise against using a for loop in pandas. The apply approach will be much faster.
Or using mask? T_wall.mask(T_wall > 2000, 2000)
That worked, thank you. My apologies, as I am new to Python and to how pandas Series work.
0

The most idiomatic pandas answer would be:

T_wall[T_wall > 2000.0] = 2000.0

Example:

import pandas as pd
data = pd.Series([1, 2, 3, 4, 5])
data[data > 2] = 5

data:

0    1
1    2
2    5
3    5
4    5

Pandas docs: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#the-where-method-and-masking
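
For the specific case of capping values at a limit, pandas also provides Series.clip, which is not used in the snippets above. A minimal sketch, reusing the data from the question (the name capped is only illustrative):

```python
import pandas as pd

T_wall = pd.Series([1999.0, 2000.0, 2001.0, 2002.0, 2003.0])

# clip(upper=...) returns a new Series in which every value above the
# limit is replaced by the limit; T_wall itself is left unchanged.
capped = T_wall.clip(upper=2000.0)
print(capped.tolist())  # [1999.0, 2000.0, 2000.0, 2000.0, 2000.0]
```

Assign the result back (T_wall = T_wall.clip(upper=2000.0)) if you want to overwrite the original.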

0

To round out the answers with a complete example:

>>> import pandas as pd
>>> ser = pd.Series([1999, 2000, 2001, 2002, 2003])
>>> ser
0    1999
1    2000
2    2001
3    2002
4    2003
dtype: int64

Meaning of ser > 2000

>>> ser > 2000
0    False
1    False
2     True
3     True
4     True
dtype: bool

As you can see ser > 2000 returns a series itself, with True or False values, depending on whether the condition matched.
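
This is also why the if statement in the question fails: if needs a single True or False, and a boolean Series with several values has no unambiguous one. A minimal sketch of that behaviour (the names cond, raised, any_over and all_over are only illustrative):

```python
import pandas as pd

ser = pd.Series([1999, 2000, 2001, 2002, 2003])
cond = ser > 2000  # element-wise comparison: a boolean Series, not one bool

# Using the Series directly as an if condition raises ValueError.
try:
    if cond:
        pass
    raised = False
except ValueError:
    raised = True

# Reducing the Series to a single bool makes the intent explicit.
any_over = cond.any()  # True if at least one value exceeds 2000
all_over = cond.all()  # True only if every value exceeds 2000
print(raised, any_over, all_over)  # True True False
```

Neither any() nor all() replaces values; they only answer a yes/no question about the whole Series, which is why the approaches below work element-wise instead.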

There are several ways to then use that condition.

The mask function

mask accepts the condition and returns a new Series in which the values where the condition is True are replaced with the provided value (the original Series is unchanged unless you pass inplace=True). (See also the Mask section of the User Guide.)

>>> ser.mask(ser > 2000, 2000)
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

That is somewhat equivalent to:

>>> [(2000 if x > 2000 else x) for x in ser]
[1999, 2000, 2000, 2000, 2000]

The where function

where is the inverse of mask: it keeps the values where the condition is True and replaces the rest, so you need to invert the condition to achieve the same effect. Here the second argument is other, which provides the replacement value where the condition is False. (See also the Where section of the User Guide.)

>>> ser.where(ser <= 2000, 2000)
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

That is somewhat equivalent to:

>>> [(x if x <= 2000 else 2000) for x in ser]
[1999, 2000, 2000, 2000, 2000]
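
As a quick check of the two sections above, the following sketch runs mask and where side by side and confirms that both leave the original Series untouched (the names masked and whered are only illustrative):

```python
import pandas as pd

ser = pd.Series([1999, 2000, 2001, 2002, 2003])

masked = ser.mask(ser > 2000, 2000)    # replace where the condition is True
whered = ser.where(ser <= 2000, 2000)  # keep where True, replace elsewhere

print(masked.tolist() == whered.tolist())  # True
print(ser.tolist())  # [1999, 2000, 2001, 2002, 2003]
```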

Assignment via boolean indexing

You can also change the series directly via boolean indexing as indicated in other answers (adding for completeness):

>>> ser
0    1999
1    2000
2    2001
3    2002
4    2003
dtype: int64
>>> ser[ser > 2000] = 2000
>>> ser
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

(That would then be equivalent to ser.mask(ser > 2000, 2000, inplace=True))

The apply function

You could also use apply. Note that apply has no inplace parameter; it always returns a new Series, so assign the result back if you want to keep it:

>>> ser = pd.Series([1999, 2000, 2001, 2002, 2003])
>>> ser.apply(lambda x: 2000 if x > 2000 else x)
0    1999
1    2000
2    2000
3    2000
4    2000
dtype: int64

That allows you to use a regular Python function or expression. However, it will not be as efficient for large Series as the other approaches, because it calls the Python function once per value rather than doing the work inside pandas (vectorized).
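
To get a feel for the difference on your own machine, a rough sketch like the following can be used. The Series size, the random data and the names big, via_apply and via_mask are assumptions made for illustration, and the timings will vary from system to system:

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 random values between 1500 and 2500.
rng = np.random.default_rng(0)
big = pd.Series(rng.uniform(1500, 2500, size=100_000))

via_apply = big.apply(lambda x: 2000.0 if x > 2000 else x)  # one Python call per value
via_mask = big.mask(big > 2000, 2000.0)                     # vectorized

# Both approaches should produce the same values.
same = bool((via_apply == via_mask).all())
print("same result:", same)

# Time three runs of each approach; absolute numbers depend on the machine.
t_apply = timeit.timeit(lambda: big.apply(lambda x: 2000.0 if x > 2000 else x), number=3)
t_mask = timeit.timeit(lambda: big.mask(big > 2000, 2000.0), number=3)
print(f"apply: {t_apply:.4f}s, mask: {t_mask:.4f}s")
```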
