R mutate selection of dataframe columns using another dataframe with same named selection of columns

Question

Is there an elegant way to mutate a selection of columns in a dataframe (call it df1) based on the same selection of columns in another dataframe (df2) without doing a join?

The selected columns in df2 have the same names as in df1, and both dataframes have the same number of rows (same id columns).

In this code, please replace 'elegant_function' with your elegant function. The selection of columns is 'a' and 'b'. The 'ignore_me' column is the id column in both, which might tempt you to join the dataframes, however please ignore it instead.

df1 <- data.frame(ignore_me = 1:5, a = 1:5, b = 11:15)
df2 <- data.frame(ignore_me = 1:5, a = c(0, 1, 1, 0, 2), b = c(1, 0, 1, 2, 0))

fn <- function(x1, x2){
  if(x2 == 1){
    return(x1 - x2)
  }
  if(x2 == 2){
    return(x1 + x2)
  }
  x1
}
fn <- Vectorize(fn)

df <- elegant_function(
  df1 
  , df2
  , c("a", "b")
  , fn
  )

The output looks like this:

> df
  ignore_me a  b
1         1 1 10
2         2 1 12
3         3 2 12
4         4 4 16
5         5 7 15

Here is an example of an inelegant way to do this:

library(dplyr)  # provides %>%, select() and mutate()

df <- df1 %>% select(ignore_me) %>%
  mutate(
    a = fn(df1$a, df2$a)
    , b = fn(df1$b, df2$b)
    )

This is inelegant because each selected column requires a new line in the mutate call. It would be elegant if the selected columns could be passed to the function as a character vector, so that they can vary at run time.

There may be other columns in df1 and df2 to ignore as well; I've only included the 'ignore_me' column as an example of these.

You just talked of the value of column a in row 2. What of row 1? What of column b in row 1?
– Onyambu, Commented May 27, 2020 at 0:26
I still don't get what you are trying to do here. For the c("a", "b") columns in df1 and df2, how do you want to apply the function?
– Ronak Shah, Commented May 27, 2020 at 0:36
First, the function fn is applied to column 'a' in df1 and column 'a' in df2 (as the inputs x1, x2 to fn) to create column 'a' in df; then the function is applied to column 'b' ...
– Vlad, Commented May 27, 2020 at 0:41

Onyambu · Accepted Answer · 2020-05-27 02:48:23Z

Since we are to ignore the ignore_me column, we could do:

(-1)^df2 * df2 + df1
  ignore_me a  b
1         0 1 10
2         4 1 12
3         0 2 12
4         8 4 16
5         0 7 15

Only the columns other than ignore_me are meaningful here; ignore_me itself is altered by the arithmetic.
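The one-liner works because, for the values 0, 1 and 2 that occur in df2, (-1)^x2 * x2 evaluates to 0, -1 and +2, which are exactly the adjustments fn makes. A minimal sketch, assuming df2 only ever contains 0, 1 or 2 in the selected columns, restricts the same arithmetic to those columns so that ignore_me is left untouched (the names cols and res are introduced here for illustration only):

```r
df1 <- data.frame(ignore_me = 1:5, a = 1:5, b = 11:15)
df2 <- data.frame(ignore_me = 1:5, a = c(0, 1, 1, 0, 2), b = c(1, 0, 1, 2, 0))

# Hypothetical helper name: the columns to transform, chosen at run time
cols <- c("a", "b")

# Copy df1 and overwrite only the selected columns.
# Assumption: df2[cols] contains only 0, 1 or 2; for any other value
# this arithmetic no longer agrees with fn.
res <- df1
res[cols] <- (-1)^df2[cols] * df2[cols] + df1[cols]
print(res)
```

This still hard-codes the rule rather than taking fn as a parameter, so it is only a shortcut for this particular function.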

Update:

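# Apply FUN pairwise to each column named in colNames (dat1 vs dat2)
# and write the results back into dat1; all other columns are left as they are.
elegant_function <- function(dat1, dat2, colNames, FUN) {
  dat1[colNames] <- data.frame(Map(Vectorize(FUN), dat1[colNames], dat2[colNames]))
  dat1
}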
elegant_function(df1, df2, c("a", "b"), fn)

  ignore_me a  b
1         1 1 10
2         2 1 12
3         3 2 12
4         4 4 16
5         5 7 15
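If fn can be rewritten so that it is vectorized from the start, the Vectorize wrapper is no longer needed. The following is a sketch under that assumption, not a replacement for the function above: fn_vec and elegant_function2 are hypothetical names, and fn_vec is an ifelse-based rewrite of the question's fn. Since the question uses dplyr, the last part shows what the same idea might look like with across() and cur_column(), which assumes dplyr 1.0.0 or later and is skipped if dplyr is not installed.

```r
df1 <- data.frame(ignore_me = 1:5, a = 1:5, b = 11:15)
df2 <- data.frame(ignore_me = 1:5, a = c(0, 1, 1, 0, 2), b = c(1, 0, 1, 2, 0))

# Hypothetical vectorized rewrite of the question's fn:
# subtract x2 where x2 == 1, add x2 where x2 == 2, otherwise keep x1.
fn_vec <- function(x1, x2) {
  ifelse(x2 == 1, x1 - x2, ifelse(x2 == 2, x1 + x2, x1))
}

# Hypothetical variant of elegant_function that expects FUN to be
# vectorized already, so it calls FUN once per column pair.
elegant_function2 <- function(dat1, dat2, colNames, FUN) {
  dat1[colNames] <- Map(FUN, dat1[colNames], dat2[colNames])
  dat1
}

res <- elegant_function2(df1, df2, c("a", "b"), fn_vec)
print(res)

# Optional dplyr version (assumes dplyr >= 1.0.0 is installed):
# cur_column() gives the name of the column across() is currently
# processing, which is used to pick the matching column from df2.
if (requireNamespace("dplyr", quietly = TRUE)) {
  cols <- c("a", "b")
  res_dplyr <- dplyr::mutate(
    df1,
    dplyr::across(dplyr::all_of(cols), ~ fn_vec(.x, df2[[dplyr::cur_column()]]))
  )
  print(res_dplyr)
}
```

Both versions rely on df1 and df2 having their rows in the same order, as stated in the question, because no join is performed.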

edited May 27, 2020 at 2:48

answered May 27, 2020 at 0:43

Onyambu


5 Comments

Onyambu Over a year ago

@Vlad so what exactly is the issue? Does this not solve your problem?

Vlad Over a year ago

I've provided an example of an inelegant solution. Your idea hard-codes the function fn instead of taking it as a parameter, which doesn't solve my problem.

Onyambu Over a year ago

@Vlad, I reread your question. elegant_function cannot be elegant if fn is not vectorized in the first place.

Vlad Over a year ago

Almost there - just missing the ignore_me column which can be taken from df1.

Vlad Over a year ago

Just a small typo - replace df1 with dat1 in the function definition - otherwise works perfectly.
