I have a data frame with 800 columns. I want to select rows from it using a condition on every column. How can I do that without writing one huge which() call like data[which(data$V_1 < bound_1 & ... & data$V_n < bound_n),]?

This is a fragment of my data frame

    type_Browser os_name_Windows XP ua_family_Chrome ua_name_Chrome0
 [1,]   0.06453172         0.09318651       0.09849316        0.1962756
 [2,]   0.06453172         0.09318651       0.09849316        0.1962756
 [3,]   0.06453172         0.09318651       0.00000000        0.0000000
 [4,]   0.06453172         0.00000000       0.00000000        0.0000000
 [5,]   0.06453172         0.00000000       0.09849316        0.1962756
 [6,]   0.06453172         0.09318651       0.00000000        0.0000000
 [7,]   0.06453172         0.00000000       0.00000000        0.0000000
 [8,]   0.06453172         0.09318651       0.00000000        0.0000000
 [9,]   0.06453172         0.00000000       0.09849316        0.1962756
[10,]   0.06453172         0.09318651       0.00000000        0.0000000

This is a fragment of centers of clusters after kmeans

       type_Browser os_name_Windows XP ua_family_Chrome ua_name_Chrome 0
    1     0.9973870          0.9014791        0.8885468        0.9162910
    2     0.1370203          0.9323763        0.3940263        0.8250081
    3     0.7121533          0.9541988        0.1418068        0.6568214
    4     0.9998909          0.9881944        0.9959341        0.3181853
    5     0.9278844          0.9796447        0.9247542        0.9510941
    6     0.9784205          0.8586415        0.8902691        0.8210114
    7     0.7115432          0.9930360        0.9652756        0.9735471
    8     0.9907865          0.9896360        0.9910279        0.9781258
    9     0.9967735          0.9919486        0.9921240        0.9702438
    10    0.9998825          0.9940538        0.9970676        0.9839453

Then I make two bounds

lowerBound = centers - eps;
upperBound = centers + eps;

Then I want to select the rows that lie in [centers - eps, centers + eps].

for(i in 1:k){
  ithLB = lowerBound[i,];
  ithUB = upperBound[i,];
  # I want to replace this expression with something more reasonable.
  ithKernel <- data[which(data[,1] >= ithLB[1] & data[,1] <= ithUB[1] & ... & data[,812] >= ithLB[812] & data[,812] <= ithUB[812]),]
}
  • Could you provide a small example? What are these bound_1 to bound_n? If bound is a vector that holds the bound1, bound2, etc. values, perhaps data[Reduce('&', Map('<', data, bound)),] (not tested, though). Commented Feb 19, 2015 at 16:05
  • @FedorenkaKristina Have you tried the above code? I would put the lowerBound in a vector. Your dataset seems to be a matrix. In that case, convert it to a data.frame with as.data.frame and try with Map. Commented Feb 19, 2015 at 16:20
  • @FedorenkaKristina I didn't understand the input variables for lowerBound = centers - eps; Commented Feb 19, 2015 at 16:27
  • @akrun thanks a lot! I'll try your solution. What exactly didn't you understand about lowerBound? Commented Feb 19, 2015 at 16:34
  • @Fedorenka Kristina How the centers and eps relate to the data shown. Commented Feb 19, 2015 at 16:35

1 Answer

You could try

data[Reduce(`&`,Map('<', data, bound)),]
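
To see what that one-liner does, here is a minimal sketch on a toy data frame. The column values and the bound vector are made up for illustration; the only assumption is that bound holds one upper bound per column, in column order.

```r
# Toy data: three columns, one upper bound per column (illustrative values).
data  <- data.frame(V_1 = c(1, 5, 2), V_2 = c(3, 1, 9), V_3 = c(0, 4, 1))
bound <- c(4, 8, 2)

# Map() pairs column j with bound[j] and returns one logical vector per column.
per_column <- Map(`<`, data, bound)

# Reduce() combines those vectors with `&`, giving one logical value per row.
keep <- Reduce(`&`, per_column)

# The hand-written condition, for comparison.
manual <- data$V_1 < bound[1] & data$V_2 < bound[2] & data$V_3 < bound[3]

# Rows that satisfy every column's condition.
selected <- data[keep, ]
```

Because Map() walks the columns of a data frame like the elements of a list, the same two calls work unchanged for 800 columns.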

If the bounds are stored as separate objects "bound_1", "bound_2", ..., "bound_N", collect them into a list first

 bound <- mget(paste('bound', 1:ncol(data), sep="_"))

and use the same code as above

Another, less optimal option is to build the condition with paste and evaluate it with eval(parse()) (not recommended)

str1 <- paste(paste(paste0('data$',paste('V', 1:ncol(data), sep="_")),
  paste('bound', 1:ncol(data), sep="_"), sep=" < "), collapse=" & ")
data[eval(parse(text=str1)),]

data

set.seed(153)
data <- as.data.frame(matrix(sample(0:8, 5*20, replace=TRUE), ncol=5))
colnames(data) <- paste('V', 1:ncol(data), sep="_")
bound <- sample(1:15, 5, replace=TRUE)

In case you have separate objects "bound_1", "bound_2", etc. instead of a vector

bound_1 <- 6
bound_2 <- 8
bound_3 <- 7
bound_4 <- 7
bound_5 <- 14
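
For the two-sided case in the question (rows within eps of a cluster center in every column), the same Map/Reduce idea might be extended as sketched below. The toy data, centers, and eps values are invented for illustration, and in_box is a hypothetical helper name, not something from the question or from base R.

```r
# Hypothetical toy inputs standing in for the question's objects.
data    <- data.frame(a = c(0.10, 0.50, 0.90, 0.12),
                      b = c(0.20, 0.55, 0.80, 0.95))
centers <- matrix(c(0.1, 0.2,
                    0.9, 0.8), nrow = 2, byrow = TRUE)
eps     <- 0.05
lowerBound <- centers - eps
upperBound <- centers + eps

# TRUE for rows of `data` that lie inside [lb, ub] in every column.
# lb and ub must each hold one value per column, in column order.
in_box <- function(data, lb, ub) {
  Reduce(`&`, Map(function(x, lo, hi) x >= lo & x <= hi, data, lb, ub))
}

# One data frame of matching rows per cluster center.
kernels <- lapply(seq_len(nrow(centers)), function(i) {
  data[in_box(data, lowerBound[i, ], upperBound[i, ]), , drop = FALSE]
})
```

Here kernels[[i]] plays the role of ithKernel from the question's loop. If the data is a matrix rather than a data frame, convert it with as.data.frame first so that Map iterates over columns.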