
How do I add an extra column to a DataFrame that splits each value on "|" and converts the pieces to integers, but holds np.nan for values that are non-numeric strings?

Col1   
1|2|3
"string"

so that the result is:

Col1      ExtraCol
1|2|3     [1,2,3]
"string"  nan

I tried a long, contorted approach, but it failed:

df['extracol'] = df["col1"].str.strip().str.split("|").str[0].apply(lambda x: x.astype(np.float) if x.isnumeric() else np.nan).astype("Int32")
  • I would suggest writing a dedicated function that uses try-except to handle your specific case, and calling it from the lambda. Commented Dec 28, 2022 at 16:26
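The comment's suggestion could look roughly like the sketch below. The helper name `parse_int_list` is hypothetical (it is not from the question), and the sketch assumes that any value containing a piece `int()` cannot parse should become NaN as a whole.

```python
import numpy as np
import pandas as pd


def parse_int_list(value):
    """Return a list of ints for a pipe-separated value, else NaN.

    Hypothetical helper: the name and the "whole value becomes NaN"
    policy are assumptions, not part of the original question.
    """
    try:
        return [int(part) for part in str(value).strip().split("|")]
    except ValueError:
        # int() raises ValueError for pieces such as "string" or ""
        return np.nan


df = pd.DataFrame({"Col1": ["1|2|3", "string"]})
df["ExtraCol"] = df["Col1"].apply(parse_int_list)
print(df)
```

Because the conversion is attempted directly, there is no separate "is this numeric?" check to keep in sync with the parsing step. The resulting column has object dtype, since it mixes lists and NaN.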

2 Answers

Another possible solution:

import re
import numpy as np

df['ExtraCol'] = df['Col1'].apply(lambda x: [int(y) for y in re.split(
    r'\|', x)] if x.replace('|', '').isnumeric() else np.nan)

Output:

     Col1   ExtraCol
0   1|2|3  [1, 2, 3]
1  string        NaN

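You can use a regex with Series.str.fullmatch to find the rows whose whole value is a pipe-separated list of digits, then split those rows and convert each piece to int. Rows that do not match are left as NaN when the result is aligned back on the index:

mask = df['Col1'].str.fullmatch(r'\d+(?:\|\d+)*', na=False)
df['ExtraCol'] = df.loc[mask, 'Col1'].str.split('|').apply(lambda parts: [int(p) for p in parts])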
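One detail worth checking with this approach is how the regex is anchored. The sketch below, using made-up sample values, compares Series.str.match (anchored at the start of the string only) with Series.str.fullmatch (the whole string must fit the pattern), and then converts the selected rows to integer lists. The sample data and variable names are assumptions for illustration.

```python
import pandas as pd

# Made-up sample values, including one that only starts with digits
s = pd.Series(["1|2|3", "12abc", "string", "42"])

# str.match succeeds if the pattern matches at the start of the string
prefix_only = s.str.match(r"\d+")

# str.fullmatch requires the entire string to match the pattern
whole_value = s.str.fullmatch(r"\d+(?:\|\d+)*")

print(prefix_only.tolist())  # [True, True, False, True]
print(whole_value.tolist())  # [True, False, False, True]

# Split only the fully matching rows and convert each piece to int
ints = s[whole_value].str.split("|").apply(lambda parts: [int(p) for p in parts])
print(ints.tolist())  # [[1, 2, 3], [42]]
```

With a start-anchored match, a value such as "12abc" would be selected and then fail (or stay a string) at the conversion step, so matching the whole value is the safer filter.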
