Conditional delete in pandas dataframe

Question

I want to delete every row of a dataframe whose value contains a specific string.

Specifically, I want to delete the rows with an abnormal email address (one ending in .jpg).

Here's my code. What's wrong with it?

import pandas as pd

df = pd.DataFrame({'email':['[email protected]', '[email protected]', '[email protected]', '[email protected]']})

df

             email
0    [email protected]
1    [email protected]
2       [email protected]
3  [email protected]

for i, r in df.iterrows():
    if df.loc[i,'email'][-3:] == 'com':
        df.drop(df.index[i], inplace=True) 

Traceback (most recent call last):

  File "<ipython-input-84-4f12d22e5e4c>", line 2, in <module>
    if df.loc[i,'email'][-3:] == 'com':

  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1472, in __getitem__
    return self._getitem_tuple(key)

  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 870, in _getitem_tuple
    return self._getitem_lowerdim(tup)

  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 998, in _getitem_lowerdim
    section = self._getitem_axis(key, axis=i)

  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1911, in _getitem_axis
    self._validate_key(key, axis)

  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1798, in _validate_key
    error()

  File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1785, in error
    axis=self.obj._get_axis_name(axis)))

KeyError: 'the label [2] is not in the [index]'

sacuL · Accepted Answer · 2018-08-31 23:53:53Z

IIUC, you can do this rather than iterating through your frame with iterrows:

df = df[df.email.str.endswith('.com')]

which returns:

>>> df
             email
0    [email protected]
1    [email protected]
3  [email protected]
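
Since the question is about removing the .jpg rows specifically, the same kind of mask can also be inverted with ~. A minimal sketch, assuming hypothetical addresses (the originals are redacted above); the names kept and no_jpg are only for illustration:

```python
import pandas as pd

# Hypothetical addresses; the real ones are redacted in the question.
df = pd.DataFrame({'email': ['alice@example.com', 'bob@example.com',
                             'photo@example.jpg', 'carol@example.com']})

# Keep only the rows whose address ends in '.com'.
kept = df[df['email'].str.endswith('.com')]

# Or drop only the rows whose address ends in '.jpg' (~ negates the mask).
no_jpg = df[~df['email'].str.endswith('.jpg')]

print(kept.index.tolist())
print(no_jpg.index.tolist())
```

For this sample both masks select the same rows. They would differ on data with other endings, so pick the one that matches the intent: "keep .com" or "drop .jpg".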

Or, for larger dataframes, it is sometimes faster to skip the str methods provided by pandas and use a plain list comprehension with Python's built-in string methods:

df = df[[i.endswith('.com') for i in df.email]]
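
One caveat with the list comprehension: it calls .endswith on every element, so a non-string entry such as a missing value would raise. A hedged sketch of one way to guard against that, again with hypothetical addresses and an illustrative name (filtered):

```python
import pandas as pd

# Hypothetical data, including one missing address.
df = pd.DataFrame({'email': ['alice@example.com', 'photo@example.jpg', None]})

# isinstance() skips anything that is not a string before calling .endswith.
mask = [isinstance(e, str) and e.endswith('.com') for e in df['email']]
filtered = df[mask]

print(filtered['email'].tolist())
```

Rows with a missing address are treated here as not matching and are dropped. Whether that is the right choice depends on the data.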

2 Comments

Wookeun Lee Over a year ago

Thank you very much! I will try this method. Anyway, what's the problem with my code?

sacuL Over a year ago

Besides the fact that iterrows is kind of slow and clunky, not much. It would work if you replaced == with != and df.drop(df.index[i], inplace=True) with df.drop(i, inplace=True).
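
For context on the KeyError: df.index[i] looks the label up by position, and positions shift as soon as a row has been dropped. In the original loop, row 0 is dropped first; at i = 1, df.index[1] is then label 2, so row 2 is dropped; at i = 2, df.loc[2, 'email'] fails because label 2 is gone. A sketch of the loop with the two changes from the comment applied, using hypothetical addresses since the originals are redacted:

```python
import pandas as pd

# Hypothetical addresses; the real ones are redacted in the question.
df = pd.DataFrame({'email': ['alice@example.com', 'bob@example.com',
                             'photo@example.jpg', 'carol@example.com']})

for i, r in df.iterrows():
    # != instead of ==: drop the rows that do NOT end in 'com'.
    if df.loc[i, 'email'][-3:] != 'com':
        # Drop by label i, not by the position-based df.index[i].
        df.drop(i, inplace=True)

print(df.index.tolist())
```

This is shown only to explain the error. The boolean-mask version in the answer is the more idiomatic way to do it.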
