I have the following dataframe:
       contract
0    WTX1518X22
1  WTX1518X20.5
2    WTX1518X19
3  WTX1518X15.5
I need to add a new column containing everything following the last 'X' from the first column. So the result would be:
       contract result
0    WTX1518X22     22
1  WTX1518X20.5   20.5
2    WTX1518X19     19
3  WTX1518X15.5   15.5
So I figure I first need to find the string index position of the last 'X' (because there may be more than one 'X' in the string). Then get a substring containing everything following that index position for each row.
EDIT:
I have managed to get the index position of 'X' as required:
df['index_pos'] = df['contract'].str.rfind('X', start=0, end=None)
But I still can't seem to get a new column containing all characters following the 'X'. I am trying:
df['index_pos'] = df['index_pos'].convert_objects(convert_numeric=True)
df['result'] = df['contract'].str[df['index_pos']:]
But this just gives me an empty column called 'result'. This is strange because if I do the following then it works correctly:
df['result'] = df['contract'].str[8:]
So I just need a way to not hardcode '8' but to instead use the column 'index_pos'. Any suggestions?
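One way to use the per-row position instead of a hardcoded 8 is to slice each string in plain Python. This is only a sketch, assuming the sample data shown above. As far as I know, .str[start:] expects scalar bounds rather than a Series, which would explain the empty column.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"contract": ["WTX1518X22", "WTX1518X20.5", "WTX1518X19", "WTX1518X15.5"]}
)

# Position of the last 'X' in each string (-1 if there is no 'X')
df["index_pos"] = df["contract"].str.rfind("X")

# Slice row by row, starting one character after the last 'X'
df["result"] = [s[i + 1:] for s, i in zip(df["contract"], df["index_pos"])]
```

Note the i + 1: rfind returns the position of the 'X' itself (7 for these rows), so the substring has to start one character later, which matches the hardcoded 8.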
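df.contract.str.extract(".*X(.*)") probably already works. The leading .* is greedy, so it matches up to the last 'X', and the group captures everything after it.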
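A minimal sketch of the regex approach, assuming the sample data from the question. Passing expand=False is my addition so that a single capture group comes back as a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"contract": ["WTX1518X22", "WTX1518X20.5", "WTX1518X19", "WTX1518X15.5"]}
)

# Greedy ".*" runs up to the last 'X'; the group captures the remainder
df["result"] = df["contract"].str.extract(r".*X(.*)", expand=False)

# The captured values are strings; convert them if numbers are needed
df["result_num"] = pd.to_numeric(df["result"])
```

Rows without any 'X' would not match the pattern and would come back as NaN in result.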