4

Given a data frame like:

df <- data.frame(z_a = 1:2,
                 z_b = 1:2,
                 y_a = 3:4,
                 y_b = 3:4)

I can select columns whose names contain a given character with:

library(dplyr)
df %>% select(contains("a"), contains("b"))

  z_a y_a z_b y_b
1   1   3   1   3
2   2   4   2   4

Note that the column order has changed: columns containing a come before columns containing b.

I'd like to select columns whose names contain any of the characters in a vector, and have the columns reordered to follow that vector.

searchfor <- letters[1:2]

Using searchfor, I'd like to make the following expression and use it in a select statement:

E <- quote(contains(searchfor[1]), contains(searchfor[2]))
df %>% select_(E) 
  • This is a slightly different question than stackoverflow.com/questions/29018292/…, but it has the same solution. Commented Jul 9, 2017 at 13:58
  • Here's a more direct comparison: stackoverflow.com/questions/25923392/… Commented Jul 9, 2017 at 14:15
  • @wibeasley given the clarification to my original post, the below answers answer my question more closely than the other posts. Thanks! Commented Jul 9, 2017 at 17:10

4 Answers

4

We can do

df %>% 
   select_at(vars(matches(paste(searchfor, collapse="|")))) %>%
   select(order(sub(".*_", "", names(.))))

6 Comments

Not quite the behavior I was looking for. df %>% select(contains("a"), contains("b")) changes the order of the columns, which is the output I wanted. I'll make it clear in my post.
Thanks. Now I need to figure out what you did.
@ChiPak In the first select I used a regex to extract those columns. In the second, I removed the prefix from each name, ordered by what was left, and selected the columns in that order. Thanks for your note
the second only works if I want alphabetical ordering, is that right? If I wanted arbitrary ordering (determined by order of searchfor), it would not work in that case?
@ChiPak You can add a factor with levels for a general case
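
The factor suggestion in the last comment could look roughly like the sketch below. It is not the answerer's own code. It assumes that the text after the last underscore of each column name is exactly one element of searchfor, and the names matched, suffix and result are only illustrative.

```r
library(dplyr)

df <- data.frame(z_a = 1:2, z_b = 1:2, y_a = 3:4, y_b = 3:4)
searchfor <- c("b", "a")  # arbitrary, non-alphabetical order

# Keep only the columns matching any search term (regex alternation "b|a")
matched <- select(df, matches(paste(searchfor, collapse = "|")))

# Assumption: the text after the last underscore is exactly one search term
suffix <- sub(".*_", "", names(matched))

# A factor whose levels follow searchfor sorts in that order, not alphabetically
result <- matched[order(factor(suffix, levels = searchfor))]
names(result)
```

Because the levels are taken from searchfor, reversing searchfor reverses the column groups without touching the rest of the code.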
2

purrr solution:

library(purrr)
ind_lgl <- map(letters[1:2], ~ grepl(.x, names(df), fixed = TRUE)) %>%
  pmap_lgl(`|`)

df[ind_lgl]

With the pipe:

df %>%
  `[`(map(letters[1:2], ~ grepl(.x, names(df), fixed = TRUE)) %>%
        pmap_lgl(`|`))
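
As far as I can tell, pmap_lgl(`|`) only works for exactly two search terms, because `|` takes two arguments. A possible generalisation, sketched here with purrr::reduce, folds any number of logical vectors together. The three-element searchfor below is made up for illustration and is not from the question.

```r
library(purrr)

df <- data.frame(z_a = 1:2, z_b = 1:2, y_a = 3:4, y_b = 3:4)
searchfor <- c("z", "_a", "q")  # hypothetical terms; "q" matches nothing

# One logical vector per search term, folded pairwise with `|`
ind_lgl <- reduce(map(searchfor, ~ grepl(.x, names(df), fixed = TRUE)), `|`)

# Columns matching at least one term, in their original order
df[ind_lgl]
```

Like the original, this keeps the columns in the data frame's order rather than the order of the search terms.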

If you want to get the right order:

rank <- map(letters[1:2], ~ grepl(.x, names(df), fixed = TRUE)) %>%
  pmap(c) %>%
  map(which)


ind_chr <- tibble(colnames = names(df), rank) %>%  # tibble() replaces the deprecated data_frame(); needs library(dplyr)
  mutate(l = lengths(rank)) %>%
  filter(l > 0) %>%
  mutate(rank = unlist(map(rank, ~ .x[[1]]))) %>%
  arrange(rank) %>%
  pull(colnames)


df[ind_chr]

But it is not pretty...
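
For comparison, the same ordering can probably be had with much less machinery in base R. This is only a sketch, with hits and ind as illustrative names: collect the matching column positions term by term, then flatten them.

```r
df <- data.frame(z_a = 1:2, z_b = 1:2, y_a = 3:4, y_b = 3:4)
searchfor <- letters[1:2]

# Column positions matching each term, visited in searchfor order
hits <- lapply(searchfor, function(s) grep(s, names(df), fixed = TRUE))

# Flatten; unique() keeps the first occurrence if a column matches several terms
ind <- unique(unlist(hits))

df[ind]
```

A column that matches more than one term ends up in the group of the first term it matches.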

2 Comments

Not quite the behavior I was looking for. df %>% select(contains("a"), contains("b")) changes the order of the columns, which is the output I wanted. Should have made that more clear in my post
Not pretty...but useful for me to study anyways. You've earned my upvote...
1

I don't fully understand the requirement, but is this the solution you are after?

select(df, matches("a|b"))

1 Comment

Close...two things I wanted. First, use a vector of character elements searchfor as arguments to contains in select. You have not used searchfor in your statement. Second, the statements should reorder the columns based on the match, such that the order of searchfor should determine the column order of the output.
0

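Self answer: here's a solution with select_ that still uses contains, in case anyone else is interested:

library(dplyr)
# Build the string 'c(contains("a"),contains("b"))' from searchfor
# (sapply works on the character vector directly, so iterators is not needed)
s <- paste0("c(", paste0(sapply(searchfor, function(x) paste0("contains(\"", x, "\")")), collapse = ","), ")")
# select_() parses the string and uses it as the selection expression
df %>% select_(., s)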

  z_a y_a z_b y_b
1   1   3   1   3
2   2   4   2   4
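
select_() has been deprecated since dplyr 0.7.0, so on a current installation something like the following may be preferable. It is a sketch that assumes rlang is installed: it builds one unevaluated contains() call per search term and splices them into select(). The names sel and result are only illustrative.

```r
library(dplyr)
library(rlang)

df <- data.frame(z_a = 1:2, z_b = 1:2, y_a = 3:4, y_b = 3:4)
searchfor <- letters[1:2]

# Build one unevaluated call per term: contains("a"), contains("b")
sel <- lapply(searchfor, function(s) expr(contains(!!s)))

# Splice the calls into select(); column groups follow the order of searchfor
result <- select(df, !!!sel)
names(result)
```

This is the programmatic equivalent of typing select(df, contains("a"), contains("b")) by hand, so it reorders the columns the same way.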
