Question:

I have a particular problem where I want to subset a given dataframe columnwise where the column names are stored in another dataframe.

Example using mtcars dataset:

options(stringsAsFactors = FALSE)

col_names <- c("hp,disp", "disp,hp,mpg")
df_col_names <- as.data.frame(col_names)

vec <- df_col_names[1,] # first row contains "hp" and "disp"
mtcars_new <- mtcars[, c("hp", "disp")] ## assuming that vec gives colnames

I also tried wrapping each of the words in double quotes using the following:

Attempted solution:

options(stringsAsFactors = FALSE)

col_names <- c("hp,disp", "disp,hp,mpg")
df_col_names <- as.data.frame(col_names)

df_col_names$col_names <- gsub("(\\w+)", '"\\1"', df_col_names$col_names)
vec <- df_col_names[1,]
vec2 <- gsub("(\\w+)", '"\\1"', vec)

mtcars_new <- mtcars[,vec2] ## this should be same as mtcars[, c("hp", "disp")]

Expected Solution

mtcars_new <- mtcars[,vec2] is equal to mtcars_new <- mtcars[, c("hp", "disp")]

  • Not clear what kind of output you want. Is it one data frame with columns hp,disp and another one with disp,hp,mpg? Please show how you want it to be. Commented Nov 13, 2018 at 10:50
  • OK, editing the question to make it clearer: it's hp,disp. Commented Nov 13, 2018 at 10:50

2 Answers

Here's another way to do this:

library(stringr)  # provides str_split()

col_names <- c("hp,disp", "disp,hp,mpg")

vec2 <- unlist(str_split(col_names[[1]],','))
mtcars_new <- mtcars[,vec2]

This picks the first element of the col_names vector, splits it on the separator, and unlists the result (because str_split() returns a list). The resulting character vector of names is then used to subset the mtcars data frame.
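To see why splitting works where adding quote characters does not, it helps to compare the lengths of the two results. The snippet below is a minimal sketch using base R's strsplit() in place of str_split() (an assumption made only to avoid the stringr dependency; for this input both return the same pieces). The variable names first, quoted and parts are illustrative and not from the original post.

```r
first <- "hp,disp"

# Adding quote characters changes the text but not the length:
# the result is still ONE string, so it cannot name two columns
quoted <- gsub("(\\w+)", '"\\1"', first)
length(quoted)   # 1

# Splitting on the comma yields one element per column name
parts <- unlist(strsplit(first, ","))
length(parts)    # 2

# The split vector selects the same columns as the hand-written version
same <- identical(mtcars[, parts], mtcars[, c("hp", "disp")])
same             # TRUE
```

Indexing needs a character vector with one element per column, which is what the split produces; the quotes typed in c("hp", "disp") are R syntax, not part of the names.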

1 Comment

I accepted @Ronak answer but I upvoted yours. Thanks so much!

Do you need this?

lapply(strsplit(as.character(df_col_names$col_names), ","), function(x) mtcars[x])

#[[1]]
#                     hp  disp
#Mazda RX4           110 160.0
#Mazda RX4 Wag       110 160.0
#Datsun 710           93 108.0
#Hornet 4 Drive      110 258.0
#Hornet Sportabout   175 360.0
#.....

#[[2]]
#                     disp  hp  mpg
#Mazda RX4           160.0 110 21.0
#Mazda RX4 Wag       160.0 110 21.0
#Datsun 710          108.0  93 22.8
#Hornet 4 Drive      258.0 110 21.4
#Hornet Sportabout   360.0 175 18.7
#....

Here, we split each entry of col_names on the comma (",") and use lapply to subset mtcars with each resulting vector of names. This returns a list of data frames whose length equals the number of rows in df_col_names.
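If the list is going to be used later, it may be convenient to name its elements after the strings they came from. The following is a rough sketch of that idea, not part of the original answer; the names specs and subsets are illustrative.

```r
df_col_names <- data.frame(
  col_names = c("hp,disp", "disp,hp,mpg"),
  stringsAsFactors = FALSE
)

# One character vector of column names per row of df_col_names
specs <- strsplit(df_col_names$col_names, ",", fixed = TRUE)

# One data frame per row, each holding only the requested columns
subsets <- lapply(specs, function(cols) mtcars[cols])

# Label each data frame with the string that produced it
names(subsets) <- df_col_names$col_names

names(subsets[["disp,hp,mpg"]])
```

Each subset can then be retrieved by its original string, for example subsets[["hp,disp"]], instead of by position.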


If you want to subset only the first row, you could do

mtcars[strsplit(as.character(df_col_names$col_names[1]), ",")[[1]]]
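The same one-liner can be wrapped in a small helper that also tolerates spaces after the commas and fails early on a misspelled name. This is only a sketch under those assumptions; subset_by_spec is a hypothetical function, not something from base R or from the answer above.

```r
# Hypothetical helper: select the columns listed in a comma-separated string
subset_by_spec <- function(data, spec) {
  # Split on commas and drop surrounding whitespace, e.g. "hp, disp"
  cols <- trimws(strsplit(spec, ",", fixed = TRUE)[[1]])

  # Report unknown names explicitly before attempting to subset
  unknown <- setdiff(cols, names(data))
  if (length(unknown) > 0) {
    stop("Unknown column(s): ", paste(unknown, collapse = ", "))
  }

  data[cols]
}

mtcars_new <- subset_by_spec(mtcars, "hp, disp")
```

Using data[cols] rather than data[, cols] keeps the result a data frame even when the string names a single column.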
