I have a DataFrame df with a column ID in the pattern shown below. I want to return a string column containing the digits after the dash. For the example below, I need 01, 01, 02. I tried the command below and it failed. Since the DataFrame is very large, I think looping over it and extracting row by row would be inefficient. Please advise, thanks.

df['ID'].apply(lambda x: x.split('-')[1], axis=1)

error: <lambda>() got an unexpected keyword argument 'axis'

DP00010-01
DP00020-01
..........
DP00010-02

Update: Edchum's solution

df['ID'].str.split('-').str[1] 

works for me

1 Answer

Use the vectorised str.split method if you have a recent version of pandas:

In [26]:
df['val'].str.split('-').str[1]
Out[26]:
0    01
1    01
2    02
dtype: object
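
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt from the three sample IDs in the question, and the output column name `suffix` is just an illustrative choice, not something from the original post.

```python
import pandas as pd

# Sample data mirroring the IDs shown in the question
df = pd.DataFrame({"ID": ["DP00010-01", "DP00020-01", "DP00010-02"]})

# Split every string on '-' in one vectorised call, then take the
# second piece of each resulting list with the .str[1] accessor
df["suffix"] = df["ID"].str.split("-").str[1]

print(df["suffix"].tolist())
```

The result stays a string column, so leading zeros such as `01` are preserved.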

If the dash position is fixed, you could slice the string instead:

In [28]:    
df['val'].str[8:]
Out[28]:
0    01
1    01
2    02
Name: val, dtype: object

As to why your method failed: you were calling apply on a Series (df['ID'] is a Series, not a DataFrame), and Series.apply has no axis parameter. Dropping that argument works:

In [29]:
df['val'].apply(lambda x: x.split('-')[1])

Out[29]:
0    01
1    01
2    02
Name: val, dtype: object
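
To make the Series versus DataFrame distinction concrete, here is a small sketch that rebuilds the sample data from the question. The variable names `from_series` and `from_frame` are only illustrative. `axis=1` is meaningful on `DataFrame.apply`, where the function receives each row, while `Series.apply` hands the function one value at a time.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["DP00010-01", "DP00020-01", "DP00010-02"]})

# Series.apply: the lambda receives each string value, no axis argument
from_series = df["ID"].apply(lambda x: x.split("-")[1])

# DataFrame.apply with axis=1: the lambda receives each row as a Series,
# so the column has to be looked up inside the function
from_frame = df.apply(lambda row: row["ID"].split("-")[1], axis=1)

print(from_series.tolist())
print(from_frame.tolist())
```

Both give the same values. On a large frame the vectorised `.str` methods are usually the more natural choice, since `apply` calls a Python function once per element or row.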

1 Comment

I need the numbers after the "-" and the position of "-" is not fixed. I can't use fixed position to slice it.
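
When the dash can sit at any position, splitting or pattern matching does not depend on a fixed offset. The sketch below uses made-up IDs of different lengths (hypothetical data, not from the question) and shows two options: `str.rsplit` with `n=1` to split once from the right, and `str.extract` with a regular expression that captures the digits after the last dash.

```python
import pandas as pd

# Hypothetical IDs with the dash at varying positions
ids = pd.Series(["DP00010-01", "DP20-01", "DP0001000-02"])

# Split once from the right and keep the last piece
by_rsplit = ids.str.rsplit("-", n=1).str[-1]

# Capture the run of digits that follows the final dash;
# expand=False returns a Series rather than a one-column DataFrame
by_regex = ids.str.extract(r"-(\d+)$", expand=False)

print(by_rsplit.tolist())
print(by_regex.tolist())
```

The regex variant returns NaN for rows that do not match the pattern, which can be useful for spotting malformed IDs.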
