4

I have this df:

     X
0   13500
1   13600
2   BBOX-001
3   Mobi-1
4   15003
5   15004

I am trying to enter a new column. if x >15000 then the value is A, otherwise B. If X is non-numeric (BBOX-001, Mobi-1), then it should display the value in column X:

     X        Y
0   13500     B
1   13600     B
2   BBOX-001  BBOX-001
3   Mobi-1    Mobi-1
4   15003     A
5   15004     A

I have this below but how do I ignore the non-numeric values in column X?

df['Y'] = np.where(df['X'] > 15000, 'A', 'B')

2 Answers 2

3

When df['X'] contains a mix of numbers and strings, the dtype of the column will be object instead of a numeric dtype. The number-like items in df['X'] may be ints or floats or maybe even strings (it's unclear from your question). Many numeric operations such as df['X'] > 15000 may raise errors in this case.

To treat the number-like values as numbers, use pd.to_numeric to convert the column into a numeric dtype:

In [41]: numeric_X = pd.to_numeric(df['X'], errors='coerce')
In [43]: numeric_X
Out[43]: 
0    13500.0
1    13600.0
2        NaN
3        NaN
4    15003.0
5    15004.0
Name: X, dtype: float64

And you can also identify the string-like values by testing for NaNs:

is_stringlike = np.isnan(numeric_X)

import numpy as np
import pandas as pd

df = pd.DataFrame({'X': ['13500', '13600', 'BBOX-001', 'Mobi-1', '15003', '15004']})

numeric_X = pd.to_numeric(df['X'], errors='coerce')
is_stringlike = np.isnan(numeric_X)
conditions = [numeric_X > 15000, is_stringlike]
choices = ['A', df['X']]
df['Y'] = (np.select(conditions, choices, default='B'))
print(df)

yields

          X         Y
0     13500         B
1     13600         B
2  BBOX-001  BBOX-001
3    Mobi-1    Mobi-1
4     15003         A
5     15004         A
Sign up to request clarification or add additional context in comments.

Comments

1

You may achieve your goal with convert_objects:

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': ['13500', '13600', 'BBOX-001', 'Mobi-1', '15003', '15004']})
# Convert only numeric value to put it in comparison
df['Y'] = np.where(df.X.convert_objects(convert_numeric=True) > 15000, 'A', 'B')

print (df)

Output:

         X  Y
0     13500  B
1     13600  B
2  BBOX-001  B
3    Mobi-1  B
4     15003  A
5     15004  A

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.