53

I want to pass arrange() {dplyr} a vector of variable names to sort on. Usually I just type in the variables I want, but I'm trying to make a function where the sorting variables can be input as a function parameter.

df <- structure(list(var1 = c(1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 4L
  ), var2 = structure(c(10L, 1L, 8L, 3L, 5L, 4L, 7L, 9L, 2L, 6L
  ), .Label = c("b", "c", "f", "h", "i", "o", "s", "t", "w", "x"
  ), class = "factor"), var3 = c(7L, 5L, 5L, 8L, 5L, 8L, 6L, 7L, 
  5L, 8L), var4 = structure(c(8L, 5L, 1L, 4L, 7L, 4L, 3L, 6L, 9L, 
  2L), .Label = c("b", "c", "d", "e", "f", "h", "i", "w", "y"), 
  class = "factor")), .Names = c("var1", "var2", "var3", "var4"), 
  row.names = c(NA, -10L), class = "data.frame")

# this is the normal way to arrange df with dplyr
df %>% arrange(var3, var4)

# but none of these (below) work for passing a vector of variables
vector_of_vars <- c("var3", "var4")
df %>% arrange(vector_of_vars)
df %>% arrange(get(vector_of_vars))
df %>% arrange(eval(parse(text = paste(vector_of_vars, collapse = ", "))))
Imo, use of %>% should be saved for chaining, as it's pretty ugly (for single actions, <- or = works just fine). Commented Jan 29, 2015 at 1:15

6 Answers 6

39

Hadley hasn't made this obvious in the help file--only in his NSE vignette. The versions of the functions followed by underscores use standard evaluation, so you pass them vectors of strings and the like.

If I understand your problem correctly, you can just replace arrange() with arrange_() and it will work.

Specifically, pass the vector of strings as the .dots argument when you do it.

> df %>% arrange_(.dots=c("var1","var3"))
   var1 var2 var3 var4
1     1    i    5    i
2     1    x    7    w
3     1    h    8    e
4     2    b    5    f
5     2    t    5    b
6     2    w    7    h
7     3    s    6    d
8     3    f    8    e
9     4    c    5    y
10    4    o    8    c

========== Update March 2018 ==============

Using the standard evaluation versions in dplyr as I have shown here is now considered deprecated. You can read Hadley's programming vignette for the new way. Basically you will use !! to unquote one variable or !!! to unquote a vector of variables inside of arrange().

When you pass those columns, if they are bare, quote them using quo() for one variable or quos() for a vector. Don't use quotation marks. See the answer by Akrun.

If your columns are already strings, then make them names using rlang::sym() for a single column or rlang::syms() for a vector. See the answer by Christos. You can also use as.name() for a single column. Unfortunately as of this writing, the information on how to use rlang::sym() has not yet made it into the vignette I link to above (eventually it will be in the section on "variadic quasiquotation" according to his draft).
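
Since the question asks for a function that takes the sorting variables as a parameter, here is a minimal sketch of how the string-based approach might be wrapped up. The helper name `sort_by_vars` and the small example data frame are made up for illustration, and the sketch assumes dplyr and rlang are installed.

```r
library(dplyr)

# Hypothetical helper: sort `data` by the columns named in the character vector `vars`
sort_by_vars <- function(data, vars) {
  # rlang::syms() turns each string into a symbol;
  # !!! splices the resulting list into arrange() as separate arguments
  data %>% arrange(!!!rlang::syms(vars))
}

# Small made-up data frame, just to exercise the helper
example_df <- data.frame(var1 = c(2L, 1L, 2L, 1L), var3 = c(5L, 7L, 4L, 6L))
sorted <- sort_by_vars(example_df, c("var1", "var3"))
```

Because `vars` is an ordinary character vector, it can hold any number of column names.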

7 Comments

I was thinking this as well, but if you do df %>% arrange_(vector_of_vars), it seems to ignore the second element and sorts only on the first element. However, if you do df %>% arrange_(vector_of_vars[1], vector_of_vars[2]), then it sorts on both values. I assume there's a more elegant approach than the second method, but I'm not sure what it is.
arrange_() does seem to ignore the second column. @eipi10 your solution would work, but the problem is that there can be arbitrary number of elements in vector_of_vars.
Ah, this works: df %>% arrange_(.dots = vector_of_vars). farnsy, if you make this change I'll give you credit for the answer
@farnsy What if you want to sort it in descending order? how to pass the desc parameter? I haven't figured out!
vector_of_vars <- c("desc(var3)", "var4");df %>% arrange_(.dots=vector_of_vars)
21

In the quosures spirit:

df %>% arrange(!!! rlang::syms(c("var1", "var3")))

For a single variable, it would look like:

df %>% arrange(!! rlang::sym("var1"))

20

In the new version of dplyr (0.6.0, soon to be released), we can make use of quosures:

library(dplyr)
vector_of_vars <- quos(var1, var3)
df %>%
    arrange(!!! vector_of_vars)
#   var1 var2 var3 var4
#1     1    i    5    i
#2     1    x    7    w
#3     1    h    8    e
#4     2    b    5    f
#5     2    t    5    b
#6     2    w    7    h
#7     3    s    6    d
#8     3    f    8    e
#9     4    c    5    y
#10    4    o    8    c

When there is more than one variable, we use quos(); for a single variable, we use quo(). quos() returns a list of quoted variables, and inside arrange() we unquote that list with !!! so that it gets evaluated.
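
If the goal is a function that accepts bare column names, one possible sketch is to capture them with `rlang::enquos()` and splice them the same way. The function name `sort_bare` and the example data are hypothetical, and the sketch assumes a version of rlang that provides `enquos()`.

```r
library(dplyr)

# Hypothetical helper: sort `data` by bare column names passed through `...`
sort_bare <- function(data, ...) {
  # enquos() captures each argument in ... as a quosure without evaluating it
  vars <- rlang::enquos(...)
  # !!! splices the list of quosures into arrange()
  data %>% arrange(!!!vars)
}

# Small made-up data frame, just to exercise the helper
example_df <- data.frame(var1 = c(2L, 1L, 2L, 1L), var3 = c(5L, 7L, 4L, 6L))
sorted_bare <- sort_bare(example_df, var1, var3)
```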

1 Comment

... which is now deprecated again... 1: Unquoting language objects with '!!!' is soft-deprecated as of rlang 0.3.0. Please use '!!' instead. It's mindblowing (to stay polite) how many functions are constantly being deprecated in the tidyverse... I'll go back to Base R for my long term code I think...
20

I think now you can just use dplyr::arrange_at().

library(dplyr)

### original
head(iris)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa

### arranged
iris %>% 
  arrange_at(c("Sepal.Length", "Sepal.Width")) %>% 
  head()
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          4.3         3.0          1.1         0.1  setosa
# 2          4.4         2.9          1.4         0.2  setosa
# 3          4.4         3.0          1.3         0.2  setosa
# 4          4.4         3.2          1.3         0.2  setosa
# 5          4.5         2.3          1.3         0.3  setosa
# 6          4.6         3.1          1.5         0.2  setosa

1 Comment

This worked for me. Holy crap there are so many syntax changes over the years for something so fundamental.
9

It's a little dense, but I think the best approach now is to use across() along with a tidyselect function, e.g. all_of():

df <- structure(list(var1 = c(1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 4L
  ), var2 = structure(c(10L, 1L, 8L, 3L, 5L, 4L, 7L, 9L, 2L, 6L
  ), .Label = c("b", "c", "f", "h", "i", "o", "s", "t", "w", "x"
  ), class = "factor"), var3 = c(7L, 5L, 5L, 8L, 5L, 8L, 6L, 7L, 
  5L, 8L), var4 = structure(c(8L, 5L, 1L, 4L, 7L, 4L, 3L, 6L, 9L, 
  2L), .Label = c("b", "c", "d", "e", "f", "h", "i", "w", "y"), 
  class = "factor")), .Names = c("var1", "var2", "var3", "var4"), 
  row.names = c(NA, -10L), class = "data.frame")

vector_of_vars <- c("var3", "var4")

df %>% arrange(across(all_of(vector_of_vars)))
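
As a rough sketch, the same pattern can be put inside a function, and passing `desc` as the function argument of `across()` gives a descending sort on every selected column. The helper name `arrange_by`, its `descending` argument, and the example data are made up for illustration; the sketch assumes a dplyr version that provides `across()` (1.0.0 or later).

```r
library(dplyr)

# Hypothetical helper: sort by the columns named in `vars`, optionally descending
arrange_by <- function(data, vars, descending = FALSE) {
  if (descending) {
    # desc() is applied to each selected column before sorting
    data %>% arrange(across(all_of(vars), desc))
  } else {
    data %>% arrange(across(all_of(vars)))
  }
}

# Small made-up data frame, just to exercise the helper
example_df <- data.frame(var3 = c(7L, 5L, 8L, 5L), var4 = c("w", "f", "e", "b"))
asc <- arrange_by(example_df, c("var3", "var4"))
dsc <- arrange_by(example_df, c("var3", "var4"), descending = TRUE)
```

all_of() raises an error if any name in `vars` is not a column of the data, which is usually what you want inside a function.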

3

Try this:

df %>% do(do.call(arrange_, . %>% list(.dots = vector_of_vars)))

and actually this can be written more simply as:

df %>% arrange_(.dots = vector_of_vars)

although at this point I think it's the same as farnsy's implied solution.

2 Comments

This didn't work for me, see my post.
arrange_ is deprecated, quosures way seems to be the way to go
