I have the following ct_data DataFrame:

imjp_number,imct_id
182467224,'ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A'
307291224,'__gde66a472fe104ab381456ee059751d9d|Qujk8BKa0XkkpJMCstCYBw'
214278175,'mbKKbkKpiTsIAyCE8y07rw|e8133ceeca654d169532b4ad4de661d5'
tes123456,'tMyM0un_ptsHHC-lET6tkQ|87538a4436af47a7a9b8b9bc2b3ec5ba'
Not Found,'pXJjGxufodMVq5FBSzHc2A'

I tried the logic below, but it does not work:

ct_data['imjp_number']  = ct_data.loc[ct_data['imjp_number'].apply(lambda x: isinstance(x,int)), 'imjp_number']

Please suggest the best way to keep only the rows of ct_data whose imjp_number is an integer, i.e. to remove the rows with the 'tes123456' and 'Not Found' values in the imjp_number column.

1 Answer

>>> print(df.to_string()) 
  imjp_number                                                     imct_id
0   182467224     ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A
1   307291224  __gde66a472fe104ab381456ee059751d9d|Qujk8BKa0XkkpJMCstCYBw
2   214278175     mbKKbkKpiTsIAyCE8y07rw|e8133ceeca654d169532b4ad4de661d5
3   tes123456     tMyM0un_ptsHHC-lET6tkQ|87538a4436af47a7a9b8b9bc2b3ec5ba
4   Not Found                                      pXJjGxufodMVq5FBSzHc2A

>>> print(df.imjp_number.str.isdigit().to_string())
0     True
1     True
2     True
3    False
4    False

>>> print(df[df.imjp_number.str.isdigit()].to_string())
  imjp_number                                                     imct_id
0   182467224     ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A
1   307291224  __gde66a472fe104ab381456ee059751d9d|Qujk8BKa0XkkpJMCstCYBw
2   214278175     mbKKbkKpiTsIAyCE8y07rw|e8133ceeca654d169532b4ad4de661d5
>>>
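
As a script rather than a REPL session, the `str.isdigit` approach above might look like the sketch below. It assumes the column holds strings, as the printed output suggests; the `astype(str)` cast and the names `mask` and `digits_only` are my additions for illustration. Note that `isdigit` also rejects values with a sign or surrounding whitespace.

```python
import pandas as pd

# Rebuild the sample frame from the question (values stored as strings)
df = pd.DataFrame({
    "imjp_number": ["182467224", "307291224", "214278175", "tes123456", "Not Found"],
    "imct_id": [
        "ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A",
        "__gde66a472fe104ab381456ee059751d9d|Qujk8BKa0XkkpJMCstCYBw",
        "mbKKbkKpiTsIAyCE8y07rw|e8133ceeca654d169532b4ad4de661d5",
        "tMyM0un_ptsHHC-lET6tkQ|87538a4436af47a7a9b8b9bc2b3ec5ba",
        "pXJjGxufodMVq5FBSzHc2A",
    ],
})

# Cast to str first so the .str accessor also copes with real ints or NaN
mask = df["imjp_number"].astype(str).str.isdigit()

# Keep the matching rows, then convert the surviving strings to integers
digits_only = df[mask].copy()
digits_only["imjp_number"] = digits_only["imjp_number"].astype(int)

print(digits_only.to_string())
```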

Alternatively, using pd.to_numeric (from the second question I linked to in the comment):

>>> print(df.to_string())
  imjp_number                                                     imct_id
0   182467224     ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A
1   307291224  __gde66a472fe104ab381456ee059751d9d|Qujk8BKa0XkkpJMCstCYBw
2   214278175     mbKKbkKpiTsIAyCE8y07rw|e8133ceeca654d169532b4ad4de661d5
3   tes123456     tMyM0un_ptsHHC-lET6tkQ|87538a4436af47a7a9b8b9bc2b3ec5ba
4   Not Found                                      pXJjGxufodMVq5FBSzHc2A
>>>
>>> print(pd.to_numeric(df.imjp_number, errors='coerce').to_string())
0    182467224.0
1    307291224.0
2    214278175.0
3            NaN
4            NaN
>>>
>>> print(pd.to_numeric(df.imjp_number, errors='coerce').notnull().to_string())
0     True
1     True
2     True
3    False
4    False
>>>
>>> print(df[pd.to_numeric(df.imjp_number, errors='coerce').notnull()].to_string())
  imjp_number                                                     imct_id
0   182467224     ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A
1   307291224  __gde66a472fe104ab381456ee059751d9d|Qujk8BKa0XkkpJMCstCYBw
2   214278175     mbKKbkKpiTsIAyCE8y07rw|e8133ceeca654d169532b4ad4de661d5
>>>
>>> df = df[pd.to_numeric(df.imjp_number, errors='coerce').notnull()]              
>>> print(df.to_string())                                                           
  imjp_number                                                     imct_id
0   182467224     ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A
1   307291224  __gde66a472fe104ab381456ee059751d9d|Qujk8BKa0XkkpJMCstCYBw
2   214278175     mbKKbkKpiTsIAyCE8y07rw|e8133ceeca654d169532b4ad4de661d5
>>>
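
The session above leaves imjp_number as strings, and `pd.to_numeric` would also accept values such as '1.5'. If a true integer column is wanted, a minimal sketch could add a whole-number check and a cast. The names `numeric`, `is_whole` and `cleaned` are hypothetical, and the shortened frame reuses three rows from the question.

```python
import pandas as pd

# Three rows from the question: one valid number, two invalid values
df = pd.DataFrame({
    "imjp_number": ["182467224", "tes123456", "Not Found"],
    "imct_id": [
        "ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A",
        "tMyM0un_ptsHHC-lET6tkQ|87538a4436af47a7a9b8b9bc2b3ec5ba",
        "pXJjGxufodMVq5FBSzHc2A",
    ],
})

# Unparseable values become NaN instead of raising
numeric = pd.to_numeric(df["imjp_number"], errors="coerce")

# Keep rows that parsed and have no fractional part (NaN compares as False)
is_whole = numeric.notna() & (numeric % 1 == 0)

cleaned = df[is_whole].copy()
# Replace the string column with real 64-bit integers
cleaned["imjp_number"] = numeric[is_whole].astype("int64")

print(cleaned.dtypes)
```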

2 Comments

Where can I use inplace=True so that this logic persists in my entire DataFrame?
@DharmendraYadav - none of those methods have an inplace parameter.
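
To illustrate the point about inplace: boolean indexing returns a new DataFrame, so the usual way to make the filter persist is to bind the result back to the same name. The sketch below also shows why the `isinstance(x, int)` test from the question matches nothing, assuming the column holds strings as the answer's output suggests. The names `is_int_obj` and `mask` and the `reset_index` call are my additions.

```python
import pandas as pd

# Three rows from the question, stored as strings
ct_data = pd.DataFrame({
    "imjp_number": ["182467224", "tes123456", "Not Found"],
    "imct_id": [
        "ed3baabac3ce4d86801d8490ea474963|pXJjGxufodMVq5FBSzHc2A",
        "tMyM0un_ptsHHC-lET6tkQ|87538a4436af47a7a9b8b9bc2b3ec5ba",
        "pXJjGxufodMVq5FBSzHc2A",
    ],
})

# Every value is a str, so the isinstance check is False for all rows
is_int_obj = ct_data["imjp_number"].apply(lambda x: isinstance(x, int))

# Build the mask, then reassign the filtered frame to the same name
mask = ct_data["imjp_number"].str.isdigit()
ct_data = ct_data[mask].reset_index(drop=True)

print(ct_data.to_string())
```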
