
In my dataset, I have a few rows which contain characters. I only need the rows which contain all integers. What is the best possible way to do this? Sample data below: e.g. I want to remove the 2nd and 3rd rows, as they contain 051A, 04A, and 08B respectively.

1   2017    0   321     3   20  42  18
2   051A    0   321     3   5   69  04A
3   460     0   1633    16  38  17  08B
4   1811    0   822     8   13  65  18
  • Do you need to check for integers versus floats [and other non-string types]? Commented Mar 10, 2018 at 0:28
  • No, I was only looking for integers. Thanks! Commented Mar 10, 2018 at 3:07
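
For reference, here is a minimal sketch that rebuilds the sample data above as a DataFrame so the answers below can be run directly; the variable name df and the default integer column labels are assumptions, not part of the original post.

import pandas as pd

# Columns 1 and 7 are built as strings because they contain the
# non-numeric values ("051A", "04A", "08B") that should be filtered out.
df = pd.DataFrame([
    [1, '2017', 0, 321,  3,  20, 42, '18'],
    [2, '051A', 0, 321,  3,  5,  69, '04A'],
    [3, '460',  0, 1633, 16, 38, 17, '08B'],
    [4, '1811', 0, 822,  8,  13, 65, '18'],
])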

5 Answers


Not sure if apply can be avoided here

df.apply(lambda x: pd.to_numeric(x, errors = 'coerce')).dropna()

   0       1  2    3  4   5   6     7
0  1  2017.0  0  321  3  20  42  18.0
3  4  1811.0  0  822  8  13  65  18.0

1 Comment

An option to avoid apply, though it's not as good: pd.to_numeric(df.stack(), 'coerce').unstack().dropna() (-: That said, this doesn't address floats. The OP said rows of all ints; if a float happened to be in a row, this would not drop it.
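
If rows containing genuine floats should also be dropped (the comment above points out that coerce-and-dropna alone keeps them), one hedged variation is to additionally require that every coerced value is a whole number. This is just a sketch, not part of the original answer:

num = df.apply(pd.to_numeric, errors='coerce')
# keep only rows where every cell parsed AND every parsed value is integral
res = df[num.notna().all(axis=1) & (num % 1 == 0).all(axis=1)]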

This is very similar to @jpp's solution but differs in the technique used to check whether a value is made up of digits.

df[df.applymap(lambda x: str(x).isdecimal()).all(1)].astype(int)

   0     1  2    3  4   5   6   7
0  1  2017  0  321  3  20  42  18
3  4  1811  0  822  8  13  65  18

Thanks to @jpp for suggesting isdecimal as opposed to isdigit
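
As a side note (a sketch added here for context, not from the original answer), str.isdigit accepts characters such as superscript digits that int() cannot parse, while str.isdecimal rejects them, which is why isdecimal is the safer check:

s = '2²'              # contains a superscript two
print(s.isdigit())    # True  -- isdigit counts superscripts as digits
print(s.isdecimal())  # False -- isdecimal only accepts plain decimal digits
# int(s) raises ValueError, so an isdigit-based filter could keep unparseable cells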

1 Comment

Since you mentioned isdecimal, that gave me an idea... not sure how fast it would be though.

For this task, as stated, try / except is a solution which should deal with all cases.

pd.DataFrame.applymap applies a function to each element in the dataframe.

def CheckInt(s):
    try: 
        int(s)
        return True
    except ValueError:
        return False

res = df[df.applymap(CheckInt).all(axis=1)].astype(int)

#    0     1  2    3  4   5   6   7
# 0  1  2017  0  321  3  20  42  18
# 3  4  1811  0  822  8  13  65  18
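
For illustration only (not part of the original answer), this is how CheckInt behaves on a few sample values; note that an actual float such as 1.1 passes, because int(1.1) truncates instead of raising:

print(CheckInt('2017'))  # True  -- parses cleanly
print(CheckInt('04A'))   # False -- int('04A') raises ValueError
print(CheckInt('1.1'))   # False -- int('1.1') raises ValueError
print(CheckInt(1.1))     # True  -- int(1.1) truncates to 1, no exception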

3 Comments

I like this answer a lot! df[df.applymap(lambda x: str(x).isdigit()).all(1)].astype(int)
'1.1'.isdigit() resolves to False for me. Also, when I said I like "this" answer, I meant yours (-:
Added an answer.

As an alternative to the other good answers, this solution uses the stack + unstack paradigm to avoid looping.

v = df.stack().astype(str)
v.where(v.str.isdecimal()).unstack().dropna().astype(int)

   0     1  2    3  4   5   6   7
0  1  2017  0  321  3  20  42  18
3  4  1811  0  822  8  13  65  18
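
To make the stack + unstack paradigm a little more concrete, here is the same chain broken into steps with comments; the names mask and out are assumptions for illustration:

v = df.stack().astype(str)   # long Series with a (row, column) MultiIndex, one entry per cell
mask = v.str.isdecimal()     # True only for cells made entirely of decimal digits
out = (v.where(mask)         # non-decimal cells become NaN
        .unstack()           # back to the original wide shape
        .dropna()            # drop any row that picked up a NaN
        .astype(int))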

1 Comment

This is definitely interesting, though the problem with isdecimal is that it doesn't handle floats. +1 :)

In one line, I think you can use the convert_objects function from pandas. It converts each object column to numeric; values that cannot be converted become NaN, and we finally drop those rows.

df = df.convert_objects(convert_numeric=True).dropna()

You can find more information in the pandas documentation.
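
Worth noting: convert_objects was deprecated in later pandas releases and eventually removed, so on a current pandas version the equivalent one-liner (a sketch, mirroring the accepted answer) would be:

df = df.apply(pd.to_numeric, errors='coerce').dropna()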

