I have a data frame that contains 10 variables (ID1 - ID10), each of which has 5 x values (A, B, C, D, E):

library(plotly)
library(data.table)

set.seed(1)
dat <- data.frame(ID = paste0("ID",1:10), A = runif(10), B = runif(10), C = runif(10), D = runif(10), E = runif(10))
dat$ID <- as.character(dat$ID)
datt <- data.frame(t(dat))
names(datt) <- as.matrix(datt[1, ])
datt <- datt[-1, ]
datt[] <- lapply(datt, function(x) type.convert(as.character(x)))
setDT(datt, keep.rownames = TRUE)[]
colnames(datt)[1] <- "x"
dat_long <- melt(datt, id.vars ="x" )

This creates a data frame in the following format (only the first 7 rows are shown):

   x variable     value
1: A      ID1 0.2655087
2: B      ID1 0.2059746
3: C      ID1 0.9347052
4: D      ID1 0.4820801
5: E      ID1 0.8209463
6: A      ID2 0.3721239
7: B      ID2 0.1765568

I am simply trying to grab only the rows from this data frame that have variable values of ID1 or ID2. This should result in 10 rows (since each ID has 5 x values A, B, C, D, E). However, upon doing:

dat_long[dat_long$variable==c("ID1","ID2"),]

I only receive 6 rows. Specifically, I only receive 3 out of the 5 x values (A, C, E):

   x variable     value
1: A      ID1 0.2655087
2: C      ID1 0.9347052
3: E      ID1 0.8209463
4: A      ID2 0.3721239
5: C      ID2 0.2121425
6: E      ID2 0.6470602

I tried to change the variable column of the data frame from a Factor to a character as follows:

dat_long$variable = as.character(dat_long$variable)
dat_long[dat_long$variable==c("ID1","ID2"),]

But this results in the exact same problem. When I run which() commands, I see the same problem still:

which(dat_long$variable==c("ID1","ID2"))

Do you have any suggestions on how to remedy this problem? When I do:

str(c("ID1","ID2"))

I get the following:

chr [1:2] "ID1" "ID2"

I probably need to keep the ID keys in the format shown above. The reason is that I am using a Shiny application, and the input value of the ID keys arrives in this format. The ID keys can vary in both combination and number. For instance, the input could have three IDs (e.g. c("ID1", "ID2", "ID5")). Hence, I need a solution that works with a character vector of ID keys in the above format.

Any advice would be greatly appreciated!

Comment: ...%in% c("ID1","ID2") (Oct 21, 2016 at 19:26)

1 Answer

As @bergant suggested, you should probably use the %in% operator. Alternatively, if you want to take advantage of data.table, you can do a fast keyed lookup:

setkey(dat_long,variable)
dat_long[J(c("ID1","ID2"))]

    x variable     value
 1: A      ID1 0.2655087
 2: B      ID1 0.2059746
 3: C      ID1 0.9347052
 4: D      ID1 0.4820801
 5: E      ID1 0.8209463
 6: A      ID2 0.3721239
 7: B      ID2 0.1765568
 8: C      ID2 0.2121425
 9: D      ID2 0.5995658
10: E      ID2 0.6470602
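
For reference, here is a minimal sketch of why `==` drops rows and how `%in%` avoids it. `==` compares element by element and recycles the shorter vector, so odd-numbered rows are only compared with "ID1" and even-numbered rows only with "ID2". The small data frame below is a made-up stand-in for dat_long (the values are arbitrary), and `ids` is a hypothetical placeholder for whatever character vector the Shiny input supplies. The same subsetting syntax should also work when dat_long is a data.table.

```r
# Made-up stand-in for dat_long: 4 IDs x 5 x values (values are arbitrary)
dat_long <- data.frame(
  x        = rep(c("A", "B", "C", "D", "E"), times = 4),
  variable = rep(c("ID1", "ID2", "ID3", "ID4"), each = 5),
  value    = seq_len(20),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the Shiny input; any length works with %in%
ids <- c("ID1", "ID2")

# `==` recycles ids along the column:
# row 1 vs "ID1", row 2 vs "ID2", row 3 vs "ID1", row 4 vs "ID2", ...
recycled <- dat_long[dat_long$variable == ids, ]

# `%in%` checks every row against the whole set of ids
matched <- dat_long[dat_long$variable %in% ids, ]

nrow(recycled)  # 6  (only A, C, E for each ID, as in the question)
nrow(matched)   # 10 (all five x values for ID1 and ID2)
```

Because `%in%` does not depend on the length of `ids`, the same line should keep working if the input becomes c("ID1", "ID2", "ID5") or any other combination.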