10

I want to select all columns in my dataframe which I have stored in a string variable. For example:

v1 <- rnorm(100)
v2 <- rnorm(100)
v3 <- rnorm(100)
df <- data.frame(v1,v2,v3)

I want to accomplish the following:

df[,c('v1','v2')]

But I want to use a variable instead of c('v1','v2'). These all fail:

select.me <- "'v1','v2'"
df[,select.me]
df[,c(select.me)]
df[,c(paste(select.me,sep=''))]

Thanks for any help with a simple question.

2 Answers

22

The great irony here is that when you said "I want to do this" the first expression should have succeeded,

df[,c('v1','v2')]
> str( df[,c('v1','v2')] )
'data.frame':   100 obs. of  2 variables:
 $ v1: num  -0.3347 0.2113 0.9775 -0.0151 -1.8544 ...
 $ v2: num  -1.396 -0.95 -1.254 0.822 0.141 ...

whereas all the later attempts would fail. I later realized that you didn't know that you could use select.me <- c('v1','v2') ; df[ , select.me]. You could also use these forms which might be safer in some instances:

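df[ , names(df) %in% select.me] # logical indexing
pat <- paste0("^(", paste(select.me, collapse="|"), ")$")  # grep uses only the first element of a pattern vector, so build one regex
df[ , grep(pat, names(df) ) ]  # numeric indexing
df[ , grepl(pat, names(df) ) ]  # logical indexing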
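
One case where the logical form might be safer, shown as a minimal sketch: a requested name that is not a column. The name 'v9', the variable wanted, and the small 5-row df are assumptions made up for illustration, not part of the question. Character indexing stops with an error, while %in% simply skips the missing name.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
df <- data.frame(v1 = rnorm(5), v2 = rnorm(5), v3 = rnorm(5))
wanted <- c("v1", "v9")  # "v9" is a hypothetical name that is not in df

# Character indexing raises an error when a requested name is missing
by_name <- tryCatch(df[ , wanted], error = function(e) "error")

# Logical indexing keeps only the names that exist;
# drop = FALSE keeps the result a data frame even with a single column
by_logical <- df[ , names(df) %in% wanted, drop = FALSE]
names(by_logical)
```

Whether silently skipping a missing name counts as safer depends on the situation; if a typo should be caught, the error from character indexing is the more useful behavior.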

Any of those can be used with negation (!logical) or minus (-numeric) to retrieve the complement, whereas you cannot negate a character index. If you wanted to go down one level in understandability and were willing to change the select.me value to a valid R expression, you could do this:

select.me <- "c('v1','v2')"
df[ , eval(parse(text=select.me)) ]

Not that I recommend this... just to let you know that such is possible after you "learn to walk". It would also have been possible (although rather baroque) using your original quoted string to pull out the information (although I think this just illustrates why your first version is superior):

select.me <- "'v1','v2'"
df [ , scan(textConnection(select.me), what="", sep=",") ]
> str( df [ , scan(textConnection(select.me), what="", sep=",") ] )
Read 2 items
'data.frame':   100 obs. of  2 variables:
 $ v1: num  -0.3347 0.2113 0.9775 -0.0151 -1.8544 ...
 $ v2: num  -1.396 -0.95 -1.254 0.822 0.141 ...
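
Returning to the point about complements, here is a minimal sketch of selecting every column except those named in select.me. The small integer df and the names keep, idx, rest_logical, rest_numeric and neg_char are stand-ins invented for illustration.

```r
df <- data.frame(v1 = 1:3, v2 = 4:6, v3 = 7:9)  # small stand-in for the question's df
select.me <- c("v1", "v2")

keep <- names(df) %in% select.me           # logical vector, one entry per column
rest_logical <- df[ , !keep, drop = FALSE] # negate the logical index

idx <- which(keep)                         # numeric positions of the matches
rest_numeric <- df[ , -idx, drop = FALSE]  # negative numeric index drops them

# A character vector cannot be negated: -select.me is an error
neg_char <- tryCatch(df[ , -select.me], error = function(e) "error")
```

One caution with the numeric form: if nothing matches, idx is empty and -idx selects no columns at all, so the logical form is probably the more robust of the two.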

3 Comments

+1 beat me to eval(parse(...)). scan has a text argument, btw.
Hmmm. Right you are: scan(text=select.me, what="", sep=",") ... Is that 'text' argument how read.table handles its text argument now? Must be. And why doesn't readLines accept a 'text' argument?
They added a "text" formal and check to see if "file" is missing. Seems that could have been done with readLines, too.
13

This is basic R syntax; perhaps you need to read the introductory manual:

select.me <- c('v1','v2')
df[,select.me]
