Given this data:

df=data.frame(
  x1=c(2,0,0,NA,0,1,1,NA,0,1),
  x2=c(3,2,NA,5,3,2,NA,NA,4,5),
  x3=c(0,1,0,1,3,0,NA,NA,0,1),
  x4=c(1,0,NA,3,0,0,NA,0,0,1),
  x5=c(1,1,NA,1,3,4,NA,3,3,1))

I want to create an extra column min holding the row-wise minimum of selected columns using dplyr. That's easy when I spell out the column names:

df <- df %>% rowwise() %>% mutate(min = min(x2,x5))

But I have a large df with varying column names, so I need to match them from a character vector mycols. Other threads tell me to use the select helper functions, but I must be missing something. Here's matches:

mycols <- c("x2","x5")
df <- df %>% rowwise() %>%
  mutate(min = min(select(matches(mycols))))
Error: is.string(match) is not TRUE

And one_of:

mycols <- c("x2","x5")
 df <- df %>%
 rowwise() %>%
 mutate(min = min(select(one_of(mycols))))
Error: no applicable method for 'select' applied to an object of class "c('integer', 'numeric')"
In addition: Warning message:
In one_of(c("x2", "x5")) : Unknown variables: `x2`, `x5`

What am I overlooking? Should select_ work? It doesn't in the following:

df <- df %>%
   rowwise() %>%
   mutate(min = min(select_(mycols)))
Error: no applicable method for 'select_' applied to an object of class "character"

And likewise:

df <- df %>%
  rowwise() %>%
  mutate(min = min(select_(matches(mycols))))
Error: is.string(match) is not TRUE
  • You need to use SE version of dplyr verbs when using strings. In this case use select_() Commented Feb 19, 2017 at 19:50
  • Doesn't work as I expected it to work either: df <- df %>% rowwise() %>% mutate(min = min(select_(mycols))) yields "Error: no applicable method for 'select_' applied to an object of class "character"" Commented Feb 19, 2017 at 20:07
  • You get an error with matches because it takes a single string (a regex) as its argument, not a vector of strings. Commented Feb 19, 2017 at 21:15

2 Answers

6

Here's another, slightly more technical solution that uses the purrr package from the tidyverse, which is designed for functional programming.

First, the matches helper from dplyr takes a single regex string as its argument, not a vector. So one approach is to write a regex that matches all the columns you want. (In the code below you can use whichever dplyr select helper you prefer.)
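If the column names already live in a character vector, one way to get such a regex is to paste them together. This is only a sketch in base R: the all_names vector (including the extra name "x25") is made up for illustration, and it assumes the names contain no regex metacharacters that would need escaping.

```r
# Column names we want, as in the question
mycols <- c("x2", "x5")

# Build an anchored alternation so that only exact names match
pattern <- paste0("^(", paste(mycols, collapse = "|"), ")$")

# Hypothetical set of column names; "x25" is added to show why anchoring matters
all_names <- c("x1", "x2", "x3", "x4", "x5", "x25")
matched <- all_names[grepl(pattern, all_names)]

print(pattern)
print(matched)
```

The resulting string can be passed to matches(). Note that an unanchored pattern such as "x[25]" would also match a name like "x25".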

Second, purrr functions work well with dplyr once you understand the underlying functional-programming pattern.

Solution to your problem:


df=data.frame(
  x1=c(2,0,0,NA,0,1,1,NA,0,1),
  x2=c(3,2,NA,5,3,2,NA,NA,4,5),
  x3=c(0,1,0,1,3,0,NA,NA,0,1),
  x4=c(1,0,NA,3,0,0,NA,0,0,1),
  x5=c(1,1,NA,1,3,4,NA,3,3,1))


# regex matching only the x2 and x5 columns
mycols <- "x[25]"

library(dplyr)

df %>%
  mutate(min_x2_x5 =
           # select the columns you want from df
           select(., matches(mycols)) %>% 
           # pmap_dbl() calls min() once per row: a data frame is a list of
           # columns, and pmap iterates over those columns in parallel
           purrr::pmap_dbl(min)
         )
#>    x1 x2 x3 x4 x5 min_x2_x5
#> 1   2  3  0  1  1         1
#> 2   0  2  1  0  1         1
#> 3   0 NA  0 NA NA        NA
#> 4  NA  5  1  3  1         1
#> 5   0  3  3  0  3         3
#> 6   1  2  0  0  4         2
#> 7   1 NA NA NA NA        NA
#> 8  NA NA NA  0  3        NA
#> 9   0  4  0  0  3         3
#> 10  1  5  1  1  1         1

I won't go into more detail about purrr here, but it works fine in your case.
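For readers on a newer dplyr, the original question (a character vector of column names rather than a regex) can probably be handled without purrr. The sketch below assumes dplyr 1.0.0 or later, where c_across() and all_of() are available; the result name res is only for illustration.

```r
library(dplyr)

df <- data.frame(
  x1 = c(2, 0, 0, NA, 0, 1, 1, NA, 0, 1),
  x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
  x3 = c(0, 1, 0, 1, 3, 0, NA, NA, 0, 1),
  x4 = c(1, 0, NA, 3, 0, 0, NA, 0, 0, 1),
  x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

# Column names held as strings, as in the question
mycols <- c("x2", "x5")

res <- df %>%
  rowwise() %>%
  # c_across() gathers the selected columns of the current row into one vector
  mutate(min = min(c_across(all_of(mycols)))) %>%
  # drop the rowwise grouping again so later verbs behave normally
  ungroup()

print(res)
```

all_of() takes the vector of names as-is and errors if a name is missing, which is usually what you want when the names come from elsewhere.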


2

This was a bit trickier. With standard evaluation (SE) you need to pass the whole operation as a string.

mycols <- '(x2,x5)'
f <- paste0('min',mycols)
df <- df %>% rowwise() %>% mutate_(min = f)  # assign the result so df keeps the new column
df
# A tibble: 10 × 6
#      x1    x2    x3    x4    x5   min
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1      2     3     0     1     1     1
#2      0     2     1     0     1     1
#3      0    NA     0    NA    NA    NA
#4     NA     5     1     3     1     1
#5      0     3     3     0     3     3
#6      1     2     0     0     4     2
#7      1    NA    NA    NA    NA    NA
#8     NA    NA    NA     0     3    NA
#9      0     4     0     0     3     3
#10     1     5     1     1     1     1
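In more recent dplyr releases the underscore verbs such as mutate_() are deprecated. A rough equivalent of the string-building approach, assuming dplyr 0.7 or later with rlang installed, is to turn the names into symbols and splice them into the call. pmin() is used here instead of rowwise() plus min(), which is a swap on my part; without na.rm it gives the same row-wise result.

```r
library(dplyr)

df <- data.frame(
  x1 = c(2, 0, 0, NA, 0, 1, 1, NA, 0, 1),
  x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
  x3 = c(0, 1, 0, 1, 3, 0, NA, NA, 0, 1),
  x4 = c(1, 0, NA, 3, 0, 0, NA, 0, 0, 1),
  x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

mycols <- c("x2", "x5")

# syms() converts the strings to symbols; !!! splices them in as
# separate arguments, so this evaluates pmin(x2, x5) on whole columns
res_tidy <- df %>%
  mutate(min = pmin(!!!rlang::syms(mycols)))

print(res_tidy)
```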

1 Comment

Thanks! Now, I want the lowest non-NA value, so I needed to adjust this code a bit. Changing from min to pmin(na.rm=T) seems to work (adding na.rm=T to min() doesn't seem to work): f <- paste0('pmin(',mycols,',na.rm=T)'); df <- df %>% rowwise() %>% mutate_(min = f)
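A base R sketch of the same idea, for comparison: pmin() is already vectorised over rows, so it can be called once on the selected columns via do.call(), without rowwise() or building a string. The variable name min_non_na is only illustrative.

```r
df <- data.frame(
  x1 = c(2, 0, 0, NA, 0, 1, 1, NA, 0, 1),
  x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
  x3 = c(0, 1, 0, 1, 3, 0, NA, NA, 0, 1),
  x4 = c(1, 0, NA, 3, 0, 0, NA, 0, 0, 1),
  x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

mycols <- c("x2", "x5")

# df[mycols] is a list of columns; appending na.rm = TRUE passes it on to pmin()
min_non_na <- do.call(pmin, c(df[mycols], na.rm = TRUE))
df$min <- min_non_na

print(df)
```

One difference worth knowing: for a row where every selected value is NA, pmin(..., na.rm = TRUE) returns NA, whereas min(..., na.rm = TRUE) returns Inf with a warning.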
