0

I have a loop in which each iteration should sort a dataframe by a different list of columns. The sorting works when I hardcode the sort columns, but I want to pass the column names through a variable that holds a vector of names, and I could not find a way to do this.

DT <- data.frame(avar = c(1, 4, 3, 1), bvar = c("a", "f", "s", "b"), cvar = c(3, 4, 5, 2))

sort1 <- c("avar", "cvar")
sort2 <- c("avar", "bvar")
sorting <- list(sort1, sort2)
DT2 <- list()

for (i in 1:2) {
  sorter <- sorting[[i]]
  # THE FOLLOWING SOLUTION WORKS!!!
  DT2[[i]] <- DT[do.call(order, DT[sorter]), ]
}

What I mean by sorting by c("avar", "cvar") is that the data should be sorted by avar first and then, where two values of avar are equal, by cvar. In other words, sorting by that vector should produce a single sorted dataframe, not a list. The same applies to sorting by c("avar", "bvar"). Above, "ps1" stands for one of the proposed solutions. It gives me a DT2[[1]] that is a list of two dataframes, which is not what I need. DT2 should be a list of two dataframes, and DT2[[1]] should be one dataframe.

I also want to stress that I need this sorting to happen inside a loop, not by passing the whole list ("sorting") to a single command. In other words, the first iteration should sort the data by the first item of the sorting list, which is the vector "sorter" in my code. In the real application, the data in different iterations is not the same dataset.

After the first loop, DT2[[1]] should be sorted as follows:

avar  bvar  cvar
1     b     2
1     a     3
3     s     5
4     f     4    

After the second loop, DT2[[2]] should be sorted as follows:

avar  bvar  cvar
1     a     3
1     b     2
3     s     5
4     f     4    

In real life I may also have a different number of sorting columns in different iterations.

Regarding the proposed solutions that use the "map" function: I have some geospatial packages loaded (mapproj, fiftystater, geofacet), so "map" does not work as suggested unless I unload those packages. Is there a way to qualify the call so that it uses the "map" function the answers intend rather than the geospatial one?

Thank you for your help!

4
  • The sample code you give is not valid R code. For example sorting <-(sort1,sort2) will throw an error. Ditto for for (i = 1:2). This may suggest that there are deeper issues with general R syntax. Perhaps you need to take a step back and familiarise yourself with the basics of R first; you can find a lot of free material on the web. A good starting point could be An introduction to R. Commented Apr 10, 2019 at 2:40
  • Thank you. I corrected the syntax in the post. Commented Apr 10, 2019 at 2:47
  • Thanks. Corrected that too. This is a much simplified version of a 5000-line script. Sorry for the typos. Commented Apr 10, 2019 at 2:56
  • "Is there a way to qualify to use native "map" function rather than geospatial map function?" Just prefix it with the package name: purrr::map. Commented Apr 10, 2019 at 21:51

4 Answers

1

Using base R, we can apply order to the columns selected by each element of sorting, via do.call. lapply then returns a list of dataframes, one per element of sorting:

lapply(sorting, function(x) DT[do.call(order, DT[x]), ])


#[[1]]
#  avar bvar cvar
#4    1    b    2
#1    1    a    3
#3    3    s    5
#2    4    f    4

#[[2]]
#  avar bvar cvar
#1    1    a    3
#4    1    b    2
#3    3    s    5
#2    4    f    4
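
One caveat with `do.call(order, DT[x])` is that the column names are passed to `order` as argument names, so a column that happened to be called `decreasing` or `method` would be read as an option rather than a sort key. Below is a minimal sketch of a wrapper that avoids this by dropping the names first. The name `sort_by_cols` is hypothetical, not a base R function, and the column check is an extra assumption about what is useful here:

```r
# Hypothetical helper: sort a data frame by the columns named in `cols`.
# Earlier names take priority; later names only break ties.
sort_by_cols <- function(df, cols) {
  missing_cols <- setdiff(cols, names(df))
  if (length(missing_cols) > 0) {
    stop("unknown column(s): ", paste(missing_cols, collapse = ", "))
  }
  # unname() keeps column names from being matched to order()'s own arguments
  keys <- unname(as.list(df[cols]))
  df[do.call(order, keys), , drop = FALSE]
}

DT <- data.frame(avar = c(1, 4, 3, 1),
                 bvar = c("a", "f", "s", "b"),
                 cvar = c(3, 4, 5, 2))

by_a_then_c <- sort_by_cols(DT, c("avar", "cvar"))
by_a_then_b <- sort_by_cols(DT, c("avar", "bvar"))
```

Because `cols` is an ordinary character vector, it can have a different length on every call, which suits a loop where the number of sort columns varies.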

9 Comments

Thank you. I edited my original post (adding last three paragraphs) on why proposed solutions do not work.
@Edith I haven't used any map function in my answer, but for the other answers you can write packageName::functionName to call a function from a specific package explicitly, for example purrr::map.
When I use your solution, lapply(sorting[[1]], function(x) DT[do.call(order, DT[x]), ]), I get a list as output: the first item is the data sorted by avar, and the second item is the data sorted by cvar. The second paragraph of my edited post explains what I want to happen instead.
@Edith I get the same expected output as shown in your post with DT2 <- lapply(sorting, function(x) DT[do.call(order, DT[x]), ]). DT2[[1]] and DT2[[2]] are the same two dataframes as shown.
When I use your solution, DT2[[1]] is a list of two items: the data sorted by avar and by cvar respectively. I will edit the code to show exactly how I added that line.
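
The difference between the two results described in these comments appears to come from what `lapply` iterates over. Here is a small sketch using the data from the question; `sort_one` is a name introduced only for this illustration. `lapply(sorting, ...)` hands each whole vector of names to `order`, whereas `lapply(sorting[[1]], ...)` hands over one column name at a time:

```r
DT <- data.frame(avar = c(1, 4, 3, 1),
                 bvar = c("a", "f", "s", "b"),
                 cvar = c(3, 4, 5, 2))
sorting <- list(c("avar", "cvar"), c("avar", "bvar"))

# Illustrative helper: sort DT by the column names in x
sort_one <- function(x) DT[do.call(order, DT[x]), ]

# Iterating over the list: x is c("avar", "cvar"), then c("avar", "bvar"),
# so each result element is one data frame sorted by both columns.
per_vector <- lapply(sorting, sort_one)

# Iterating over sorting[[1]]: x is "avar", then "cvar",
# so the result is two data frames, each sorted by a single column.
per_column <- lapply(sorting[[1]], sort_one)

# Inside a loop, index the list and call the sort directly, without lapply.
in_loop <- sort_one(sorting[[1]])
```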
1

A dplyr+purrr solution

library(purrr)
library(dplyr)
map(sorting, ~arrange(DT, !!!syms(.x)))
#[[1]]
#  avar bvar cvar
#1    1    b    2
#2    1    a    3
#3    3    s    5
#4    4    f    4
#
#[[2]]
#  avar bvar cvar
#1    1    a    3
#2    1    b    2
#3    3    s    5
#4    4    f    4

2 Comments

Thank you. I edited my original post (adding last three paragraphs) on why proposed solutions do not work.
@Edith Use purrr::map instead of map.
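
The `pkg::name` syntax mentioned in these comments selects a function from a specific package regardless of what other attached packages define under the same name. Here is a rough sketch of the idea. The masking `map` below is a stand-in defined only for illustration, and the `lapply` fallback is an assumption so that the snippet still runs where purrr is not installed:

```r
# Stand-in for a `map` from some other attached package that masks purrr's.
map <- function(...) stop("this is not purrr::map")

add_one <- function(x) x + 1

if (requireNamespace("purrr", quietly = TRUE)) {
  # The qualified call bypasses the masking definition above.
  res <- purrr::map(list(1, 2), add_one)
} else {
  # Fallback so the example runs without purrr installed.
  res <- lapply(list(1, 2), add_one)
}

# Shows which entries on the search path define something called "map".
utils::find("map")
```

Qualifying the call this way means the geospatial packages can stay loaded.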
0

Here is a method with setorderv from data.table, which takes the sort columns as a character vector

library(data.table)
Map(setorderv, replicate(2, copy(DT), simplify = FALSE), sorting)
#[[1]]
#  avar bvar cvar
#4    1    b    2
#1    1    a    3
#3    3    s    5
#2    4    f    4

#[[2]]
#  avar bvar cvar
#1    1    a    3
#4    1    b    2
#3    3    s    5
#2    4    f    4

Or use arrange_at from dplyr, which takes the column names directly and so avoids the tidy-evaluation syntax (!!!syms)

library(tidyverse)
map(sorting, ~ DT %>%
                 arrange_at(.x))
#[[1]]
#  avar bvar cvar
#1    1    b    2
#2    1    a    3
#3    3    s    5
#4    4    f    4

#[[2]]
#  avar bvar cvar
#1    1    a    3
#2    1    b    2
#3    3    s    5
#4    4    f    4
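
In dplyr 1.0.0 and later the scoped verbs such as `arrange_at` are superseded, and the same result can be obtained with `across()` and `all_of()`. Below is a sketch of that variant on the question's data, written as a loop because that is what the question asks for. It assumes dplyr 1.0.0 or newer is installed:

```r
library(dplyr)

DT <- data.frame(avar = c(1, 4, 3, 1),
                 bvar = c("a", "f", "s", "b"),
                 cvar = c(3, 4, 5, 2))
sorting <- list(c("avar", "cvar"), c("avar", "bvar"))

DT2 <- list()
for (i in seq_along(sorting)) {
  # all_of() selects the columns named in the character vector;
  # arrange() sorts by them in the order given, later ones breaking ties.
  DT2[[i]] <- arrange(DT, across(all_of(sorting[[i]])))
}
```

`all_of()` raises an error if any requested column is missing, which can help catch typos in the sort lists.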

4 Comments

Hi, thank you. I have two problems related to this solution: (1) I have some geospatial packages loaded as well (fiftystater, geofacet, mapproj), and it looks like the "map" function from these packages takes over the map function used here. I get an error on the "map" statement unless I unload those geospatial packages. Is there a way to qualify the call so that it uses the intended map function rather than the one that comes with the geospatial tools?
Problem (2) is more directly related to this issue. When I say to sort by c("avar", "bvar"), I do not want two separate outputs. I want the data to be sorted first by avar and then (if two or more values of avar are the same) by bvar. At the moment, using this proposed solution, I get a list as output: the first item is the data sorted by avar, and the second item is the data sorted by bvar.
@Edith Use purrr::map(sorting, ~ DT %>% ...) to sort out the issue
0
The following solution works for me! Other proposed solutions I tried failed to sort by two variables in a given vector simultaneously.

DT <-data.frame(avar = c(1,4,3,1), bvar = c("a","f","s","b"), cvar = c(3,4,5,2))

sort1 <-c("avar", "cvar")
sort2 <-c("avar", "bvar")
sorting <-list(sort1,sort2)
DT2<-list()

for (i in 1:2) {
    #THE FOLLOWING SOLUTION WORKS!!!
    DT2[[i]] <- DT[do.call(order,DT[as.character(sorting[[i]])]),]       
}
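
The loop above hardcodes `1:2` and a single dataframe. Below is a sketch of the same idea for the case mentioned in the question, where each iteration has its own dataset and its own number of sort columns. The `datasets` list and its second element are made up for illustration:

```r
DT <- data.frame(avar = c(1, 4, 3, 1),
                 bvar = c("a", "f", "s", "b"),
                 cvar = c(3, 4, 5, 2))

# Hypothetical inputs: one dataset and one vector of sort columns per iteration
datasets <- list(DT, DT[c("avar", "bvar")])
sorting  <- list(c("avar", "cvar"), "bvar")

DT2 <- vector("list", length(sorting))
for (i in seq_along(sorting)) {
  d <- datasets[[i]]
  # order() receives one argument per sort column, however many there are
  DT2[[i]] <- d[do.call(order, d[sorting[[i]]]), ]
}
```

Using `seq_along(sorting)` rather than `1:2` keeps the loop correct if the number of sort lists changes.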
