2

To my surprise, I couldn't find this asked before.

My input data.frame

a <- c(1,5.3,3,1,-8,6,-1)
b <- c(4,-2,1.0,"1 2","-","1.2.3","x")
df <- data.frame(a,b)
df
   a     b
1  1     4
2  5.3  -2
3  3     1
4  1   1 2
5 -8     -
6  6 1.2.3
7 -1     x

Desired output

    a  b
1 1.0  4
2 5.3 -2
3 3.0  1

What I came up with

df[apply(df, 1, function(r) !any(is.na(as.numeric(r)))) ,]

It works, but it throws some ugly warnings:

    a  b
1 1.0  4
2 5.3 -2
3 3.0  1
Warning messages:
1: In FUN(newX[, i], ...) : NAs introduced by coercion
2: In FUN(newX[, i], ...) : NAs introduced by coercion
3: In FUN(newX[, i], ...) : NAs introduced by coercion
4: In FUN(newX[, i], ...) : NAs introduced by coercion

Is there a better way to do this, preferably in base R?

3
  • Is there any reason why your source data frame contains both text and non text data in the same column? Commented Nov 23, 2021 at 1:30
  • It's just to avoid wrong input from the user. Note that all values in b are considered as text, but some of them can (and should) be coerced to numeric. Commented Nov 23, 2021 at 8:27
  • This may look a bit silly now, but I just realized that suppressWarnings( df[apply(df, 1, function(r) !any(is.na(as.numeric(r)))) ,] ) is probably the best way to do this. Commented Dec 17, 2021 at 11:29

2 Answers

2

Note: edited after you altered your question.

I don't think the warnings are a big problem. They just tell you what you already know: that 4 character values return NA when coerced to numeric.

You could keep only the rows in which every value looks like a number (an optional minus sign, digits, and an optional decimal part), then convert the columns to numeric, using dplyr:

library(dplyr)

df %>% 
  filter(if_all(everything(), ~ grepl("^-?\\d+(\\.\\d+)?$", .x))) %>% 
  mutate(across(everything(), ~ as.numeric(.x)))

Result:

    a  b
1 1.0  4
2 5.3 -2
3 3.0  1
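
Since the question asks for base R if possible, here is a minimal sketch of the same regex filter without dplyr. It reuses the pattern from the answer above. The names num_pattern, looks_numeric and res are made up for this illustration, and the data frame is rebuilt from the question so the block runs on its own.

```r
# Input data from the question
a <- c(1, 5.3, 3, 1, -8, 6, -1)
b <- c(4, -2, 1.0, "1 2", "-", "1.2.3", "x")
df <- data.frame(a, b)

# Same pattern as the dplyr answer: optional minus, digits, optional decimal part
num_pattern <- "^-?\\d+(\\.\\d+)?$"

# One logical column per data frame column; TRUE where the value looks numeric
looks_numeric <- sapply(df, function(col) grepl(num_pattern, as.character(col)))

# Keep rows where every column matched, then convert every column to numeric
res <- df[rowSums(looks_numeric) == ncol(df), , drop = FALSE]
res[] <- lapply(res, function(col) as.numeric(as.character(col)))
res
```

No warnings are raised, because as.numeric only ever sees strings that already passed the pattern. The as.character calls are a guard for R versions before 4.0, where data.frame turned character columns into factors by default.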

1

A couple of base R solutions using strtoi (without warnings)

rowSums (all integer)

df[ !is.na( rowSums( sapply( df, strtoi ) ) ), ]

  a  b
1 1  4
2 5 -2
3 3  1

complete.cases (all integer)

df[ complete.cases( sapply( df, strtoi ) ), ]

  a  b
1 1  4
2 5 -2
3 3  1
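
To show why both variants only suit all-integer data, here is a small sketch of what strtoi returns for the kinds of values in the question. The vector x and the column name dropped are made up for this illustration.

```r
# A mix of integer strings, a decimal, and junk values like those in the question
x <- c("4", "-2", "1", "5.3", "1 2", "-", "1.2.3", "x")

# strtoi() parses each whole string as an integer (base 10 here)
# and returns NA, without a warning, when it cannot
parsed <- strtoi(x, base = 10L)

# dropped is TRUE for every string that is not a plain integer
data.frame(x, parsed, dropped = is.na(parsed))
```

The decimal "5.3" is rejected along with the junk values, which is why a row with a non-integer number disappears from the results above.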

EDIT after changes (some float)

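The next version uses a nested sapply so that every value is handled on its own rather than as part of a vector: decimal strings go through as.numeric and everything else through strtoi. This is important in cases where you have conflicting modes, i.e. where ifelse can't decide which one to return.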

df
    a     b
1   1     4
2 5.3    -2
3   3     1
4   1   1 2
5  -8     -
6   6   1.3
7  -1     x
8 2.5 1.2.3

data.frame( na.omit( sapply( df, function(x) 
  sapply( x, function(y) 
    ifelse(grepl("^-?\\d+\\.\\d+$", y), as.numeric(y), strtoi(y)) ) )))

      a    b
1   1.0  4.0
5.3 5.3 -2.0
3   3.0  1.0
6   6.0  1.3
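
For comparison, a column-wise sketch that combines complete.cases from above with the suppressWarnings idea from the question's comments. This is an alternative, not the method used in this answer. The data frame is reconstructed from the printed df above, and the names num and res are made up for this illustration.

```r
# Reconstructed from the printed df above (8 rows)
df <- data.frame(
  a = c(1, 5.3, 3, 1, -8, 6, -1, 2.5),
  b = c("4", "-2", "1", "1 2", "-", "1.3", "x", "1.2.3")
)

# Convert each column as a whole; values that are not numbers become NA.
# suppressWarnings() hides the expected "NAs introduced by coercion" warnings.
num <- as.data.frame(lapply(df, function(col) {
  suppressWarnings(as.numeric(as.character(col)))
}))

# Keep only the rows with no NA in any column
res <- num[complete.cases(num), ]
res
```

One difference to keep in mind: as.numeric is more permissive than the regular expression used above and also accepts strings such as "1e3" or "Inf", so this variant is only suitable if such inputs should count as numbers.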

1 Comment

Thanks for answering, but this solution also removes float numerics. I have edited the input data.frame of my question
