1

I Want a Separate column which returns "Yes" if the column "ID" contains all numeric values and 'No' if it contains alphabets or alphanumeric values.

ID      Result
3965      Yes
wyq8      No
RO_123    No
CMD_      No
2976      Yes

3 Answers 3

3

You can use pd.Series.str.isnumeric here.

df['Result'] = np.where(df['ID'].str.isnumeric(), 'YES', 'NO')

       ID Result
0    3965    YES
1    wyq8     NO
2  RO_123     NO
3    CMD_     NO
4    2976    YES

There's a caveat with using isnumeric it doesn't identify float numbers.

test = pd.Series(["9.0", "9"])
test.str.isnumeric()

0    False
1     True
dtype: bool

If you strictly mark YES for int then use isnumeric else you can use pd.Series.str.fullmatch(available from version 1.1.0) here.

df['Result'] = np.where(df['ID'].str.fullmatch(r"\d+|\d+\.\d+", 'YES', 'NO')

For version <1.1.0 you use re.fullmatch

numeric_pat = re.compile(r"\d+|\d+\.\d+")
def numeric(val):
    match = numeric_pat.fullmatch(val)
    if match: return 'YES'
    else: return 'NO'

df['Result'] = df['ID'].apply(numeric)

Or we can use pd.to_numeric with boolean masking using pd.Series.isna

m = pd.to_numeric(df['ID'], errors='coerce').isna()
df['Result'] = np.where(m, 'NO', 'YES')

With errors parameter set to 'coerce' values which cannot be turned into numeic value will set to Nan.

test = pd.Series(['3965', 'wyq8', 'RO_123', 'CMD_', '2976'])
pd.to_numeric(test)

0    3965.0
1       NaN
2       NaN
3       NaN
4    2976.0
Name: ID, dtype: float64

Or you can build a custom function

def numeric(val):
    try:
        float(val)     # Using just `float` would suffice as int can be 
        return 'YES'   # converted to `float` so both `int`
                       # and `float` wouldnot raise any error
    except ValueError:
        return 'NO'

df['Result'] = df['ID'].apply(numeric)

Note: float handles scientic notation too, float("1e6") -> 1000000.0.

test = pd.Series(['1e6', '1', 'a 10', '1E6'])
test.apply(numeric)

0    YES
1    YES
2     NO
3    YES
dtype: object
Sign up to request clarification or add additional context in comments.

Comments

3

Check if ID contains non-digitsand reverse the Boolean selection using ~. Using np.where, allocate option

df['Result']=np.where(~df.ID.str.contains('(\D+)'),'Yes','N0')

     ID Result
0    3965    Yes
1    wyq8     N0
2  RO_123     N0
3    CMD_     N0
4    2976    Yes

As noted by @Cameron Riddell. You could also skip inverting the boolean and do the following;

df['Result']=np.where(df.ID.str.contains('(\D+)'),'No','Yes')

3 Comments

Instead of inverting the boolean array, why not just make "No" the True value and "Yes" the False value? np.where(df.ID.str.contains('(\D+)'),'No','Yes')
@Cameron Riddell we actually could. My philosophy is to make answer as retraceable as possible for the OP if they choose to run break it down. Good point though.
This would fail when there's a string '123 SO', using re.fullmatch would counter that example.
1

You can use .isnumeric() method:

df3["Result"] = df3["ID"].str.isnumeric().apply(lambda x: "No" if x == False else "Yes")

[UPDATE]: This method works only with integers numbers, please view the @Ch3steR answer for other cases.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.