I have two data frames, shown below.

#Dataframe 1
colname value
col1    0.45
col2    -0.2
col3    -0.4
col4    0.1

#Dataframe 2
col1 col2 col3 col4
1    5    9    5
45   29   43   9
34   33   56   3
2    67   76   1

First, I want to select all rows of dataframe 1 that have value > 0.3 or value < -0.3. Second, I want to select the columns of dataframe 2 whose names appear in the colname column of those rows. So columns col1 and col3 of dataframe2 should be selected into a new data frame, like below.

col1  col3 
1     9   
45    43   
34    56   
2     76   

The solution I thought of is to first select the relevant column names, as in the code below.

library(sqldf)
features = sqldf('select colname from dataframe1 where value > 0.3 or value < -0.3')

After this, I want to build a string in a for loop that looks like the one below, and paste it into a sqldf query to select the right columns from dataframe2. However, I don't know how to build this string. Does anyone know how to do this, or have another solution?

  stringValue = "col1, col3, col4"
   sprintf("SELECT %s FROM dataframe2", stringValue)
  • paste(features$colname, collapse = ",") should work. Commented Jan 3, 2020 at 16:23

3 Answers

With your current dataframe1 only col1 and col3 will get selected.

library(sqldf)
features = sqldf('select colname from dataframe1 where value > 0.3 or value < -0.3')
sqldf(sprintf("SELECT %s FROM dataframe2", paste0(features$colname, collapse = ", ")))


#       col1 col3
#    1    1    9
#    2   45   43
#    3   34   56
#    4    2   76
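
To see what actually gets sent to sqldf, it can help to build the query string on its own first. This is just a small sketch in base R; `cols` is a stand-in I made up for `features$colname`, so it runs without sqldf loaded.

```r
# 'cols' is a hypothetical stand-in for features$colname
cols <- c("col1", "col3")

# Collapse the names into one comma-separated string,
# then drop that string into the SQL template
query <- sprintf("SELECT %s FROM dataframe2", paste0(cols, collapse = ", "))

print(query)
# [1] "SELECT col1, col3 FROM dataframe2"
```

No for loop is needed, since `paste0(..., collapse = ", ")` joins the whole vector in one call.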

data

#Dataframe 1
dataframe1 <- read.table(text = 'colname value
col1    0.45
col2    -0.2
col3    -0.4
col4    0.1', header = TRUE, sep = "")

#Dataframe 2
dataframe2 <- read.table(text = 'col1 col2 col3 col4
1    5    9    5
45   29   43   9
34   33   56   3
2    67   76   1', header = TRUE, sep = "")
2 Comments

The last sqldf line can be written fn$sqldf("SELECT `toString(features$colname)` FROM dataframe2") or as cols <- toString(features$colname); fn$sqldf("SELECT $cols FROM dataframe2")
Also the first sqldf statement could be written features <- sqldf('select colname from dataframe1 where not value between -0.3 and 0.3')
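
A quick base R check of the two shortcuts in these comments, as a sketch on example values copied from the question: `toString()` gives the same string as `paste(..., collapse = ", ")`, and since SQL's BETWEEN is inclusive on both ends, `not value between -0.3 and 0.3` keeps the same rows as the original `or` condition. The variable names here are made up for illustration.

```r
# Hypothetical stand-in for features$colname
cols <- c("col1", "col3")

# toString() is shorthand for paste(x, collapse = ", ")
same_string <- identical(toString(cols), paste(cols, collapse = ", "))

# Values taken from dataframe1 in the question
value <- c(0.45, -0.2, -0.4, 0.1)

# Original condition
outside <- value > 0.3 | value < -0.3

# Base R equivalent of SQL's "not value between -0.3 and 0.3"
not_between <- !(value >= -0.3 & value <= 0.3)

print(same_string)
print(identical(outside, not_between))
```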
A base R way of doing this:

> mask <- dataframe1$value > 0.3 | dataframe1$value < -0.3
> dataframe2[, mask]

  col1 col3
1    1    9
2   45   43
3   34   56
4    2   76
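
Note that the logical mask only lines up if the rows of dataframe1 are in the same order as the columns of dataframe2. If that is not guaranteed, selecting by name might be safer. A minimal sketch, assuming `colname` is stored as character (the data frames are rebuilt here only to make the example self-contained, and `keep` and `selected` are names I made up):

```r
# Rebuild the example data from the question
dataframe1 <- data.frame(
  colname = c("col1", "col2", "col3", "col4"),
  value   = c(0.45, -0.2, -0.4, 0.1),
  stringsAsFactors = FALSE
)
dataframe2 <- data.frame(
  col1 = c(1, 45, 34, 2),
  col2 = c(5, 29, 33, 67),
  col3 = c(9, 43, 56, 76),
  col4 = c(5, 9, 3, 1)
)

# abs(value) > 0.3 is the same test as value > 0.3 | value < -0.3
keep <- dataframe1$colname[abs(dataframe1$value) > 0.3]

# Index by name; drop = FALSE keeps a data frame even if only one column matches
selected <- dataframe2[, keep, drop = FALSE]
print(selected)
```

If `colname` is a factor (the default in `read.table` before R 4.0), wrapping it in `as.character()` first should avoid indexing by the factor's integer codes.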

Using dplyr (not sure if it is relevant), you can do:

library(dplyr)

dataframe2 %>%
  select(one_of(dataframe1 %>% filter(value > 0.3 | value < -0.3) %>% pull(colname) %>% as.character()))

This works by filtering dataframe1 down to the rows that meet the condition, pulling their colname values as strings, and then selecting the columns of dataframe2 whose names match one_of those strings.
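
In newer versions of dplyr, `one_of()` is superseded by `all_of()` and `any_of()`. Here is a sketch of the same idea with `all_of()`, assuming dplyr 1.0.0 or later is installed. The data frames are rebuilt so the example runs on its own, and `keep` and `selected` are just illustrative names.

```r
library(dplyr)

# Rebuild the example data from the question
dataframe1 <- data.frame(
  colname = c("col1", "col2", "col3", "col4"),
  value   = c(0.45, -0.2, -0.4, 0.1),
  stringsAsFactors = FALSE
)
dataframe2 <- data.frame(
  col1 = c(1, 45, 34, 2),
  col2 = c(5, 29, 33, 67),
  col3 = c(9, 43, 56, 76),
  col4 = c(5, 9, 3, 1)
)

# Names of the columns whose value falls outside [-0.3, 0.3]
keep <- dataframe1 %>%
  filter(value > 0.3 | value < -0.3) %>%
  pull(colname)

# all_of() errors if a name in 'keep' is missing from dataframe2;
# any_of() would silently skip missing names instead
selected <- dataframe2 %>% select(all_of(keep))
print(selected)
```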
