I have a dataset with a named column. The name of the column is also in a variable. I would like to be able to select all rows that satisfy a condition on the column "col".

Here I would like to select all rows whose value in the "col" column matches the condition "< 2".

name = "col"
dataset = data.frame(col = 1:3)

I tried to use "eval" in subset, or the "select" function of the dplyr package, but it does not do what I want (or I misused it).

Is there a simple way to do this?

  • Try: dataset[ dataset[[ name ]] < 2, ] Commented Nov 12, 2020 at 16:47
  • Why not just reference the column number rather than the name? Commented Nov 12, 2020 at 16:55
  • I would like to create a function and let the user decide the file structure to pass. Thus I don't know name nor the column number. Commented Nov 12, 2020 at 19:11

2 Answers

If you're new(ish) to R, I'm going to recommend using the tidyverse set of packages, including the ever-useful dplyr for problems like this so you can have more immediately readable and understandable code. You can get this package using install.packages('tidyverse'). With that installed, to answer your question:

library(dplyr)

df <- data.frame(
  col = c(0:10),
  another_col = c(10:20),
  third_col = c(25:35)
)

dynamic_name <- "col"

filter_at(df, dynamic_name, ~ .x < 2)

Note: The tidyverse family of packages typically accepts the formula syntax (the ~ expression) as a way to write anonymous (lambda) functions, so ~ .x < 2 is a function that returns TRUE if the value passed in is less than 2.
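Since filter_at() is marked as superseded in the dplyr documentation, here is a minimal sketch of two other ways to filter on a column whose name is stored in a string. It assumes dplyr 1.0.4 or later (needed for if_all()); the names res_pronoun and res_if_all are only illustrative.

```r
library(dplyr)

df <- data.frame(
  col = 0:10,
  another_col = 10:20,
  third_col = 25:35
)

dynamic_name <- "col"

# Option 1: the .data pronoun looks the column up by its string name
res_pronoun <- filter(df, .data[[dynamic_name]] < 2)

# Option 2: if_all() with all_of() applies the predicate to the named column(s)
res_if_all <- filter(df, if_all(all_of(dynamic_name), ~ .x < 2))
```

Both calls should keep only the rows where col is 0 or 1. The .data[[...]] form is probably the simpler choice for a single column, while if_all() generalises to a character vector of several column names.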

2 Comments

Thank you! Indeed, I'm new to R and learning on my own, so many useful packages slip by. 'filter_at' is said to be superseded in the documentation. Do you know how to do the same thing using 'across' (which is apparently the new function to use, if I understood correctly)?
I wouldn't worry about it too much. across() is very new and I wouldn't expect the *_at, *_if, *_all functions to go away any time soon. If you're looking for learning materials, there's some really good (mostly free) stuff here: rstudio.com/resources/books

please see below.

a <- 1:5
b <- 6:10
namevar <- "a"
df <- data.frame(a,b)
df[df[,namevar] %in% c(1:3),]
#>   a b
#> 1 1 6
#> 2 2 7
#> 3 3 8

What happens here is that df[,namevar] %in% c(1:3) returns a logical vector with one element per row: TRUE where the condition is met and FALSE otherwise.

Then passing this boolean vector as indices to df yields all the rows where the condition is TRUE.

For more details about %in%, see help("%in%").
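The same logical indexing can be wrapped in a function so that the caller supplies the column name, as the question asks. This is only a sketch: filter_below and its threshold argument are hypothetical names, and the "< threshold" condition is taken from the question's "< 2" example.

```r
# Hypothetical helper: keep the rows of `data` whose column `name` is below `threshold`
filter_below <- function(data, name, threshold) {
  # Fail early if `name` is not a single existing column name
  stopifnot(is.character(name), length(name) == 1, name %in% names(data))
  keep <- data[[name]] < threshold
  # which() drops rows where the comparison is NA instead of returning NA-filled rows
  data[which(keep), , drop = FALSE]
}

dataset <- data.frame(col = 1:3)
name <- "col"
result <- filter_below(dataset, name, 2)
```

The drop = FALSE argument matters for the question's one-column dataset: without it, row subsetting a data frame that has a single column returns a plain vector rather than a data frame.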
