This one has been bugging me for a couple of days now, and I haven't had any luck on Stack Exchange yet. Essentially, I have two tables: one table defines which columns (by column number) to select from the second table. My initial plan was to paste the column numbers together into a string and pass that string into a subsetting call. However, when I convert the string with as.character, the subset fails, i.e.:

# Data Sets, Variable_Selection: Table of Columns to Select from Variable_Table

VARIABLE_SELECTION <- data.frame(Set.1 = c(3,1,1,1,1), Set.2 = c(0,3,2,2,2), Set.3 = c(0,0,3,4,3),
                                 Set.4 = c(0,0,0,5,4), Set.5 = c(0,0,0,0,5))

VARIABLE_TABLE <- data.frame(Var.1 = runif(100,0,10), Var.2 = runif(100,-100,100), Var.3 = runif(100,0,1),
                             Var.4 = runif(100,-1000,1000), Var.5 = runif(100,-1,1), Var.6 = runif(100,-10,10))

# String each row together into a comma-separated character string of columns to select

VARIABLE_STRING <- apply(VARIABLE_SELECTION,1,paste,sep = ",",collapse = " ")
VARIABLE_STRING <- gsub(" ",",",VARIABLE_STRING)
VARIABLE_STRING <- data.frame(VAR_STRING = gsub(",0","",VARIABLE_STRING))

# Will actually be part of an lapply call, but a single row is selected here for demonstration:

VARIABLE_SINGLE_SET <- as.character(VARIABLE_STRING[4,])

# Subset table for selected columns

VARIABLE_TABLE_SUB_SELECT <- VARIABLE_TABLE[,c(VARIABLE_SINGLE_SET)]

#  Error Returned:
#  Error in `[.data.frame`(VARIABLE_TABLE, , c(VARIABLE_SINGLE_SET)) : 
#  undefined columns selected

I know the text formatting is the problem, but I can't find a workaround. Any suggestions?

2 Answers

You should avoid subsetting by column number and select by variable name instead, or at least keep your index as an integer vector (there is no need to coerce it to a string).
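
As a rough sketch of the "select by name" idea, assuming a small stand-in table (`tbl` and `selection_row` are made-up names, and the values are illustrative only), the numeric positions can be translated to column names once and the names used for the subset:

```r
# Small stand-in for VARIABLE_TABLE (hypothetical values, same column names)
tbl <- data.frame(Var.1 = 1:3, Var.2 = 4:6, Var.3 = 7:9,
                  Var.4 = 10:12, Var.5 = 13:15, Var.6 = 16:18)

# Row 4 of VARIABLE_SELECTION from the question; 0 marks an unused slot
selection_row <- c(1, 2, 4, 5, 0)

# Keep the index numeric: drop the zeros, no string round trip needed
idx <- as.integer(selection_row[selection_row != 0])

# Translate positions to names, then subset by name
col_names <- names(tbl)[idx]
sub_by_name <- tbl[, col_names, drop = FALSE]
```

Subsetting by name keeps working if columns are later reordered, which is probably the main reason to prefer it over positions.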

First, to stay with your approach, this corrects your code. Basically, I split your string on the commas and coerce the pieces to a numeric vector:

VARIABLE_TABLE[,as.numeric(unlist(strsplit(
        VARIABLE_SINGLE_SET,',')))]
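
To make it clearer why the original call fails and what the conversion does, here is a step-by-step sketch on a small stand-in table (`tbl`, `sel_string`, and the values are assumptions for illustration, not the question's random data):

```r
# Small stand-in for VARIABLE_TABLE (hypothetical values, same column names)
tbl <- data.frame(Var.1 = 1:3, Var.2 = 4:6, Var.3 = 7:9,
                  Var.4 = 10:12, Var.5 = 13:15, Var.6 = 16:18)

# What VARIABLE_SINGLE_SET holds for row 4 in the question
sel_string <- "1,2,4,5"

# c("1,2,4,5") is still a single string, so `[` looks for a column with
# that literal name and fails; the error message is captured here
failed <- tryCatch(tbl[, c(sel_string)],
                   error = function(e) conditionMessage(e))

# strsplit() returns a list of character pieces, unlist() flattens it,
# and as.numeric() turns the pieces into usable column positions
idx <- as.numeric(unlist(strsplit(sel_string, ",")))

# drop = FALSE keeps a data frame even if only one column is selected
sub <- tbl[, idx, drop = FALSE]
```

The `drop = FALSE` is not in the answer's one-liner; it only matters for rows like the first one in the question, where a single column is selected.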

1 Comment

Great!!!! Thanks very much, I thought it'd be something like this. It works, Cheers.

Does this give the desired result?

lapply(VARIABLE_SELECTION, function(x) VARIABLE_TABLE[ , x[x != 0], drop = FALSE])

This produces a list in which each element is the subset of 'VARIABLE_TABLE' selected by one column of 'VARIABLE_SELECTION' (shown here for a 'VARIABLE_TABLE' with only two rows):

# $Set.1
#       Var.3    Var.1  Var.1.1  Var.1.2  Var.1.3
# 1 0.09536403 5.593292 5.593292 5.593292 5.593292
# 2 0.09086404 6.339074 6.339074 6.339074 6.339074
# 
# $Set.2
#        Var.3    Var.2  Var.2.1  Var.2.2
# 1 0.09536403 65.81870 65.81870 65.81870
# 2 0.09086404 66.79157 66.79157 66.79157
# 
# $Set.3
#        Var.3     Var.4    Var.3.1
# 1 0.09536403 -674.6672 0.09536403
# 2 0.09086404 -576.7986 0.09086404
# 
# $Set.4
#        Var.5     Var.4
# 1  0.5155411 -674.6672
# 2 -0.9593219 -576.7986
# 
# $Set.5
#        Var.5
# 1  0.5155411
# 2 -0.9593219
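
Note that the call above treats each column of 'VARIABLE_SELECTION' as one set, whereas the question builds its strings row by row. If the rows are what define the sets, a variant along these lines might be closer to what is needed (a sketch only; `row_sets` is a made-up name and the two-row table is just for brevity):

```r
set.seed(1)  # only so the illustrative values are reproducible

VARIABLE_SELECTION <- data.frame(Set.1 = c(3,1,1,1,1), Set.2 = c(0,3,2,2,2),
                                 Set.3 = c(0,0,3,4,3), Set.4 = c(0,0,0,5,4),
                                 Set.5 = c(0,0,0,0,5))

# Two-row stand-in for the question's 100-row VARIABLE_TABLE
VARIABLE_TABLE <- data.frame(Var.1 = runif(2, 0, 10), Var.2 = runif(2, -100, 100),
                             Var.3 = runif(2, 0, 1), Var.4 = runif(2, -1000, 1000),
                             Var.5 = runif(2, -1, 1), Var.6 = runif(2, -10, 10))

# Iterate over row numbers so that each ROW of VARIABLE_SELECTION is one set
row_sets <- lapply(seq_len(nrow(VARIABLE_SELECTION)), function(i) {
  idx <- unname(unlist(VARIABLE_SELECTION[i, ]))  # row i as a plain numeric vector
  VARIABLE_TABLE[, idx[idx != 0], drop = FALSE]   # drop the 0 placeholders
})

names(row_sets[[4]])  # columns picked by row 4 (1, 2, 4, 5)
```

This avoids building and re-parsing the comma-separated strings altogether, since the indices stay numeric from start to finish.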

1 Comment

Also works perfectly, the solution in the previous answer will probably be more appropriate for my immediate purposes, but this'll be really handy. Cheers
