How to apply a specified function to multiple variables in a data frame?

Question

I am quite new to writing functions and am working on a generic function that is to be applied to several, but not all, columns in a data frame. The function is supposed to conditionally transform the values of these specified columns.

Example data: df <- data.frame("Var1" = c(0:5), "Var2" = c(-5:0), "Var3" = c(0,0,0,0,0,0))

> df
  Var1 Var2 Var3
1    0   -5    0
2    1   -4    0
3    2   -3    0
4    3   -2    0
5    4   -1    0
6    5    0    0

Example function:

myFun <- function(x, na_value){
  x[x == na_value] <- NA
  x
}

Given that I want 0's to be transformed to NA for Var1 and Var2, but NOT Var3, I have written df$Var1 <- myFun(df$Var1, 0) and df$Var2 <- myFun(df$Var2, 0). Surely there has got to be a simpler way of doing this?

What I envision is something like myFun(Var1, Var2, 0) that transforms the 0's in Var1 and Var2 to NA without having to repeat the code for both variables. The function is to be applied to multiple data frames with different variable names and different na_values, which is why I wrote it in the first place. It works fine, but I would like to simplify it even more.

Another possibility: df[, c("Var1", "Var2")][df[, c("Var1", "Var2")] == 0] <- NA
– markus, Commented Jul 10, 2019 at 11:39

Philopolis · Accepted Answer · 2019-07-10 11:56:40Z

1

For one single dataframe, apply is the standard way to do this. For example here:

df[ , -3] <- apply(df[ , -3], FUN = myFun, na_value = 0, MARGIN = 2)
df
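
A variant of the line above that keeps df a data frame and picks the columns by name can be sketched with lapply. This is only a minimal sketch, assuming the df and myFun from the question; the variable name target_cols is illustrative and not part of the original post.

```r
# Example data and function from the question
df <- data.frame(Var1 = 0:5, Var2 = -5:0, Var3 = rep(0, 6))
myFun <- function(x, na_value) {
  x[x == na_value] <- NA
  x
}

# Columns to transform, chosen by name (illustrative variable name)
target_cols <- c("Var1", "Var2")

# lapply() returns a list with one element per selected column, which is
# assigned back into the same columns; na_value = 0 is passed on to myFun
df[target_cols] <- lapply(df[target_cols], myFun, na_value = 0)
df
```

Because only the named columns are touched, Var3 keeps its zeros, and the same call works for any other data frame once target_cols and na_value are changed.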

I don't know if your other dataframes are formatted exactly in the same way, however. But you can combine an apply and a lapply (or mapply) to do this operation on all your dataframes.

EDIT: Here is a more general (and a little ugly or old-fashioned) solution with a for loop:

## Define a list of two dataframes:
df <- data.frame("Var1" = c(0:5), "Var2" = c(-5:0), "Var3" = c(0,0,0,0,0,0))
df2 <- data.frame("VarA" = c(0:5), "VarB" = c(-5:0), "VarC" = c(3,3,3,3,3,3))
my_list <- list(df, df2)
## Colnames to consider, and missing values indicator, for each dataframe:
na_values <- list(0, 3) # NA = 0 in the first one, NA = 3 in the second
cols <- list(c("Var1", "Var2"), c("VarA", "VarB"))
## Define an R function to replace a given value by NA in selected columns of a dataframe:
replace_nas <- function(data, cols, na_value){
    ## data[cols] stays a dataframe even when cols names a single column
    data[cols] <- lapply(data[cols], FUN = function(x) {
        x[x == na_value] <- NA
        return(x)
    })
    return(data)
}
## Do this operation for each dataframe in "my_list" with a for loop:
res_list <- list()
for (k in seq_along(my_list)) {
    res_list[[k]] <- replace_nas(my_list[[k]], cols[[k]], na_values[[k]])
}
res_list

Probably not optimal, but it works!
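
If the explicit loop feels clumsy, the same operation can probably be written more compactly with Map, which calls a function on several lists in parallel. The following is a sketch under the same assumptions as the loop above (same two example dataframes, same columns and missing-value indicators), not a definitive implementation.

```r
# Same example dataframes as above
df  <- data.frame(Var1 = 0:5, Var2 = -5:0, Var3 = rep(0, 6))
df2 <- data.frame(VarA = 0:5, VarB = -5:0, VarC = rep(3, 6))

# Same idea as replace_nas above: set na_value to NA in the named columns
replace_nas <- function(data, cols, na_value) {
  data[cols] <- lapply(data[cols], function(x) {
    x[x == na_value] <- NA
    x
  })
  data
}

# Map() takes the k-th element of each list and passes them to replace_nas
# as data, cols and na_value; the result is a list of dataframes
res_list <- Map(
  replace_nas,
  list(df, df2),
  list(c("Var1", "Var2"), c("VarA", "VarB")),
  list(0, 3)
)
res_list[[2]]
```

In the second dataframe only VarA contains a 3 among the selected columns, so that single cell becomes NA while VarC is left alone.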

edited Jul 10, 2019 at 11:56

answered Jul 10, 2019 at 11:22

Philopolis


4 Comments

markus Over a year ago

IMHO it's better to use apply for arrays and lapply for data.frames, as the former will coerce your data.frame to a matrix.

Philopolis Over a year ago

Oops, you're right! :-) All columns were numeric here so it does not really matter, but this will be a serious problem in other situations.
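
A small sketch of the coercion issue discussed in these comments, using a made-up data frame (df_mixed and its column names are hypothetical, not from the question) with one numeric and one character column:

```r
# Hypothetical data frame with one numeric and one character column
df_mixed <- data.frame(num = c(0, 1, 2), label = c("a", "b", "c"))

# apply() first converts the data frame to a matrix, so every value
# becomes a character string before the function sees it
via_apply <- apply(df_mixed, 2, function(x) x)
class(via_apply[, "num"])   # "character"

# lapply() works column by column, so each column keeps its own type
via_lapply <- lapply(df_mixed, function(x) x)
class(via_lapply$num)       # "numeric"
```

With all-numeric columns, as in the question, the matrix conversion is harmless, which is why the apply version still gives the expected result there.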

VLarsen Over a year ago

Thanks, lapply works great for this simple example. Is there a way to identify the columns I want to apply the function by name instead of df[ , -3]?

Philopolis Over a year ago

I edited my previous answer to propose an ugly solution!

jay.sf · Accepted Answer · 2019-07-10 12:09:54Z

0

Since you're asking for a simpler solution, you could just identify the cells that are equal to zero, while excluding column 3, and set them to NA like so:

df[-3][df[-3] == 0] <- NA
#   Var1 Var2 Var3
# 1   NA   -5    0
# 2    1   -4    0
# 3    2   -3    0
# 4    3   -2    0
# 5    4   -1    0
# 6    5   NA    0
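
The same indexing idea can be wrapped in a small function that takes column names and the value to replace, which is close to the myFun(Var1, Var2, 0) call the question envisions. This is a minimal sketch; set_na is a hypothetical helper name, and it assumes the selected columns do not already contain NA.

```r
# Hypothetical helper: set every value equal to na_value to NA,
# but only in the columns named in cols
set_na <- function(data, cols, na_value) {
  data[cols][data[cols] == na_value] <- NA
  data
}

# Example data from the question
df <- data.frame(Var1 = 0:5, Var2 = -5:0, Var3 = rep(0, 6))
df <- set_na(df, c("Var1", "Var2"), 0)
df
```

For another data frame only the arguments change, for example set_na(df2, c("VarA", "VarB"), 3).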

answered Jul 10, 2019 at 12:09

jay.sf
