Selecting rows from a column whose name is in a variable

Question

I have a dataset with a named column. The name of the column is also in a variable. I would like to be able to select all rows that satisfy a condition on the column "col".

Here I would like to select all rows whose value in the "col" column matches the condition "< 2".

name = "col"
dataset = data.frame(col = 1:3)

I tried using "eval" inside subset, and the "select" function from the dplyr package, but neither does what I want (or I misused them).

Is there a simple way to do this?

Why not just reference the column number rather than the name?
– SteveM, Commented Nov 12, 2020 at 16:55
I would like to write a function and let the user decide the structure of the file they pass in, so I know neither the column name nor its number in advance.
– Pyxel, Commented Nov 12, 2020 at 19:11

Eric Burden · Accepted Answer · 2020-11-12 19:05:35Z

1

If you're new(ish) to R, I recommend the tidyverse set of packages, including the ever-useful dplyr, for problems like this: the resulting code tends to be more immediately readable and understandable. You can install the whole set, dplyr included, with install.packages('tidyverse'). With that installed, to answer your question:

library(dplyr)

df <- data.frame(
  col = c(0:10),
  another_col = c(10:20),
  third_col = c(25:35)
)

dynamic_name <- "col"

filter_at(df, dynamic_name, ~ .x < 2)

Note: The tidyverse family of packages typically accepts the formula syntax (that ~ expression) as a way to write anonymous (lambda) functions, so ~ .x < 2 is a function that returns TRUE if the value passed in is less than 2.
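To make the note about the formula shorthand concrete, here is a small sketch that converts the formula into a function and compares it with an explicitly written one. It assumes the rlang package is available (it is installed alongside dplyr); the names less_than_two, explicit and values are made up for illustration.

```r
library(rlang)

# Convert the formula shorthand into an ordinary function
less_than_two <- as_function(~ .x < 2)

# The same predicate written out explicitly
explicit <- function(x) x < 2

values <- c(0, 1, 2, 3)

# Both should give the same logical vector, one element per input value
identical(less_than_two(values), explicit(values))
```

dplyr performs this kind of conversion internally, so there is normally no need to call as_function() yourself; it is only used here to show what the shorthand stands for.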

Pyxel Over a year ago

Thank you! Indeed, I'm new to R and learning on my own, so many useful packages slip by me. The documentation says 'filter_at' is superseded; do you know how to do the same thing using 'across' (which is apparently the new function to use, if I understood correctly)?

Eric Burden Over a year ago

I wouldn't worry about it too much. across() is very new and I wouldn't expect the *_at, *_if, *_all functions to go away any time soon. If you're looking for learning materials, there's some really good (mostly free) stuff here: rstudio.com/resources/books
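For readers who would still rather avoid the superseded *_at functions, here is a hedged sketch of two alternatives. It assumes a reasonably recent dplyr (if_all() needs version 1.0.4 or later, as far as I know) and reuses the data frame from the answer; by_pronoun and by_if_all are illustrative names only.

```r
library(dplyr)

df <- data.frame(col = 0:10, another_col = 10:20, third_col = 25:35)
dynamic_name <- "col"

# Option 1: the .data pronoun looks a column up by its name held in a string
by_pronoun <- filter(df, .data[[dynamic_name]] < 2)

# Option 2: if_all() applies the predicate to every selected column;
# all_of() turns the character vector into a column selection
by_if_all <- filter(df, if_all(all_of(dynamic_name), ~ .x < 2))

by_pronoun
by_if_all
```

Both calls should keep only the rows where col is 0 or 1. The .data form is probably the simpler choice for a single column, while if_all() extends naturally when dynamic_name holds several column names.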

gaut · Accepted Answer · 2020-11-12 16:48:38Z

0

Please see below.

a <- 1:5
b <- 6:10
namevar <- "a"
df <- data.frame(a,b)
df[df[,namevar] %in% c(1:3),]
#>   a b
#> 1 1 6
#> 2 2 7
#> 3 3 8

What happens here is that df[,namevar] %in% c(1:3) returns a logical vector with one TRUE or FALSE per row, depending on whether the condition is met for that row.

Passing this logical vector as the row index to df then yields all the rows where the condition is TRUE.

For more details about %in%, see help("%in%"); is.element() is the equivalent function form (see ?is.element).
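The same logical-indexing idea can be applied to the "< 2" condition from the question and wrapped in a function, since the asker wants the caller to supply the column name. This is only a sketch: rows_below and its arguments are hypothetical names, not an existing function.

```r
# Hypothetical helper: keep the rows of `data` whose `column` value is below `threshold`
rows_below <- function(data, column, threshold) {
  # Fail early if the requested column does not exist
  stopifnot(column %in% names(data))

  # Logical vector with one element per row
  keep <- data[[column]] < threshold

  # which() skips rows where the comparison is NA;
  # drop = FALSE keeps the result a data frame even when there is only one column
  data[which(keep), , drop = FALSE]
}

name <- "col"
dataset <- data.frame(col = 1:3)

rows_below(dataset, name, 2)
```

The drop = FALSE matters for the one-column dataset in the question: without it, base R simplifies the result of dataset[rows, ] to a plain vector rather than a data frame.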
