I am having difficulty writing a proper nested if statement in a user-defined function.

My sample data looks like this:

test <- data.frame(x=rev(0:10),y=10:20)

if_state <- function(x,y) {
  if (x==min(x) && y==max(y)) {
    "good"
  } else if (max(x)/2==y[which(y==15)]/3) {  # to find when x=5 and y=5 condition if it is true set class to "y==5"
    "y==5"
  }
    NA
}

   > test
    x  y
1  10 10
2   9 11
3   8 12
4   7 13
5   6 14
6   5 15
7   4 16
8   3 17
9   2 18
10  1 19
11  0 20

library(dplyr)
test %>%
  mutate(class = if_state(x,y))

    x  y class
1  10 10    NA
2   9 11    NA
3   8 12    NA
4   7 13    NA
5   6 14    NA
6   5 15    NA
7   4 16    NA
8   3 17    NA
9   2 18    NA
10  1 19    NA
11  0 20    NA

I don't know why the if statement is not working correctly. A related question: is there a base R function that works the same way as dplyr's case_when? Please see the comments below.

The expected output is:

    x  y class
1  10 10    NA
2   9 11    NA
3   8 12    NA
4   7 13    NA
5   6 14    NA
6   5 15    y==5
7   4 16    NA
8   3 17    NA
9   2 18    NA
10  1 19    NA
11  0 20    good

Comment: After the if statement, you are returning NA. You need to return explicitly, e.g. return("good"). (Commented Apr 30, 2018 at 18:45)

1 Answer

R functions return the last value evaluated during their invocation, even without an explicit call to return (see this answer for more detail). In your if_state function, NA sits outside the if/else if control flow, so it is always the last value evaluated and the function always returns NA, even when the if or else if condition is true. To return the branch values instead, move NA into an else clause:

if_state <- function(x,y) {
  if (x == min(x) && y == max(y)) {
    "good"
  } else if (max(x)/2 == y[which(y == 15)]/3) {
    "y==5"
  } else {
    NA 
  }
}
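
As a minimal sketch of the return-value behaviour described above, the two functions below (always_na and with_else are hypothetical names, not part of the question) take a single number rather than a column, so the effect of the trailing NA can be seen in isolation:

```r
# Hypothetical helper: NA is the last expression evaluated,
# so it is returned no matter which branch ran.
always_na <- function(x) {
  if (x > 0) {
    "positive"
  }
  NA
}

# Same logic with NA moved into an else branch: the if/else expression
# is now the last expression, so the value of the branch is returned.
with_else <- function(x) {
  if (x > 0) {
    "positive"
  } else {
    NA
  }
}

always_na(1)   # NA
with_else(1)   # "positive"
with_else(-1)  # NA
```

Both functions are called with one value at a time here; applying an if statement to a whole column is a separate issue, because if tests a single logical value.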

Note that when using dplyr, testing for multiple conditions to determine a return value is often more succinctly accomplished with case_when:

test %>% mutate(class = case_when(
  x == min(x) && y == max(y) ~ "good",
  max(x)/2 == y[which(y == 15)]/3 ~ "y == 5",
  TRUE ~ NA_character_
))

Edit: based on OP's clarification and eipi10's help, here is the final function:

if_state = function(x, y) {
  # Use element-wise & (not &&) so that every row is tested individually
  case_when(x == min(x) & y == max(y) ~ "good", 
            x == max(x)/2 & y/3 == 5 ~ "y==5", 
            TRUE ~ NA_character_)
}
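
Since the question also asks for a base R counterpart to case_when, here is one possible sketch using nested ifelse(), which tests each element of its input separately. In contrast, if() expects a single logical value, and recent versions of R raise an error when if() or && is given a longer vector. The name if_state_base is an assumption made for illustration; the data frame is the sample data from the question.

```r
# Sample data from the question
test <- data.frame(x = rev(0:10), y = 10:20)

# Hypothetical base R variant: nested ifelse() checks the conditions
# element-wise, in order, so each row gets its own class.
if_state_base <- function(x, y) {
  ifelse(x == min(x) & y == max(y), "good",
         ifelse(x == max(x) / 2 & y / 3 == 5, "y==5", NA_character_))
}

test$class <- if_state_base(test$x, test$y)
test
```

With this data, row 6 (x = 5, y = 15) is classed "y==5", row 11 (x = 0, y = 20) is classed "good", and the remaining rows are NA. Nested ifelse() becomes hard to read beyond a few conditions, which is where case_when tends to be the tidier choice.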

15 Comments

It looks like max(x)/2 == y[which(y == 15)]/3 is always TRUE, so the result will be "y==5" for any rows that don't satisfy the first condition. Maybe the OP actually wanted something like x==max(x)/2 & y/3==5 ~ "y==5"?
@cmaher Thank you for the explicit answer. When I run your new if_state, all class values come out as y==5?
See my comment above.
max(x)/2 returns 5 for every row in the data frame. y[which(y == 15)]/3 returns 5 for every row in the data frame. So the condition being evaluated is 5==5 which is always TRUE.
Wrap @cmaher's answer in a function call (and I've changed the second condition as well): if_state = function(x,y) {case_when( x == min(x) & y == max(y) ~ "good", x == max(x)/2 & y/3 == 5 ~ "y==5", TRUE ~ NA_character_ )}
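
The point made in the comments about the second condition can be checked directly. The sketch below rebuilds the question's sample data and evaluates each piece outside of mutate; the variable names always_true and per_row are assumptions introduced for illustration.

```r
# Sample data from the question
test <- data.frame(x = rev(0:10), y = 10:20)
x <- test$x
y <- test$y

max(x) / 2             # a single value: 5
y[which(y == 15)] / 3  # a single value: 5

# Comparing two single values gives one TRUE, which is then
# recycled to every row inside mutate()/case_when().
always_true <- max(x) / 2 == y[which(y == 15)] / 3
always_true            # TRUE

# Element-wise version: compares each row's own x and y.
per_row <- x == max(x) / 2 & y / 3 == 5
which(per_row)         # 6
```

Only row 6 (x = 5, y = 15) satisfies the element-wise condition, which matches the expected output in the question.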