Subset a dataframe by multiple factor levels [duplicate]

Question

How can I avoid using a loop to subset a dataframe based on multiple factor levels?

In the following example my desired output is a dataframe. The dataframe should contain the rows of the original dataframe where the value in "Code" equals one of the values in "selected".

Working example:

#sample data
Code<-c("A","B","C","D","C","D","A","A")
Value<-c(1, 2, 3, 4, 1, 2, 3, 4)
data<-data.frame(Code, Value) #cbind() would coerce Value to character

selected<-c("A","B") #want rows where Code is A or B

#Begin subsetting
result<-data[which(data$Code==selected[1]),]
s1<-2
while(s1<length(selected)+1)
{
  result<-rbind(result,data[which(data$Code==selected[s1]),])
  s1<-s1+1
}

This is a toy example of a much larger dataset, so "selected" may contain a great number of elements and the data a great number of rows. Therefore I would like to avoid the loop.

Metrics · Accepted Answer · 2013-10-20 22:31:03Z

44

You can use %in%

  data[data$Code %in% selected,]
  Code Value
1    A     1
2    B     2
7    A     3
8    A     4
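
A minimal sketch of why this works, using the sample data from the question. The data frame is rebuilt here with stringsAsFactors = TRUE so that Code is a factor; that is an assumption made to match the title, not something the question specifies. %in% returns one TRUE/FALSE per row, and that logical vector serves as the row index.

```r
# Sample data from the question; stringsAsFactors = TRUE makes Code a factor
# (an assumption, to match the "factor levels" in the title).
data <- data.frame(
  Code  = c("A", "B", "C", "D", "C", "D", "A", "A"),
  Value = c(1, 2, 3, 4, 1, 2, 3, 4),
  stringsAsFactors = TRUE
)
selected <- c("A", "B")

# One logical value per row: TRUE where Code is any of the selected values.
keep <- data$Code %in% selected

# Logical row index: keeps the rows where `keep` is TRUE, in original order.
result <- data[keep, ]

# Unused levels ("C", "D") stay attached to the factor after subsetting;
# droplevels() removes them if that matters downstream.
result <- droplevels(result)
```

The droplevels() step is optional; it only matters if later code (tables, plots, models) would otherwise still see the empty levels.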

edited Oct 20, 2013 at 22:31

answered Oct 20, 2013 at 22:11

Metrics


Joe · Accepted Answer · 2016-10-23 09:43:46Z

5

Here's another:

data[data$Code == "A" | data$Code == "B", ]

It's also worth mentioning that the subsetting vector doesn't have to be a column of the data frame, as long as it matches the data frame's rows in length and order. Here the data frame was built from the free-standing vector Code, which is still in the workspace. So,

data[Code == "A" | Code == "B", ]

also works, which is one of the really useful things about R.
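
The bare-Code form depends on a vector named Code existing outside the data frame, which is true in the question's script but not in general (for example, when the data were read from a file). A hedged sketch of that dependency, with with() shown as one way to refer to the column by name; the variable names bare and scoped are illustrative only:

```r
# Build the data frame without leaving free-standing Code/Value vectors behind.
data <- data.frame(
  Code  = c("A", "B", "C", "D", "C", "D", "A", "A"),
  Value = c(1, 2, 3, 4, 1, 2, 3, 4)
)

# A bare `Code` is looked up in the calling environment, not inside `data`,
# so this errors unless a separate Code vector happens to exist.
bare <- tryCatch(data[Code == "A" | Code == "B", ], error = function(e) NULL)

# with() evaluates the condition inside the data frame, so the column is found.
scoped <- data[with(data, Code == "A" | Code == "B"), ]
```

Referring to the column explicitly (data$Code or with()) avoids silently subsetting by a stale or unrelated vector of the same name.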

edited Oct 23, 2016 at 9:43

answered Oct 17, 2016 at 15:00

Joe


1 Comment

JacaByte Over a year ago

The second part did not work for me in Jupyter notebook.

Jilber Urbina · Accepted Answer · 2013-10-20 22:38:43Z

4

Try this:

> data[match(as.character(data$Code), selected, nomatch = 0) > 0, ]
  Code Value
1    A     1
2    B     2
7    A     3
8    A     4
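
For context, a small sketch of what match() returns for the sample data: positions within selected (with nomatch filling the misses), not a logical mask. Comparing those positions with 0 gives the mask needed for row subsetting, which is the same thing %in% computes. The names pos, mask, and same are illustrative.

```r
Code <- c("A", "B", "C", "D", "C", "D", "A", "A")
selected <- c("A", "B")

# Position of each Code within `selected`; 0 where there is no match.
pos <- match(Code, selected, nomatch = 0)

# Used directly as a row index, `pos` would pick rows 1, 2, 1, 1
# (zeros are dropped), i.e. row 1 repeated instead of rows 7 and 8.
# Turning it into a logical mask selects the matching rows themselves.
mask <- pos > 0

# The mask agrees with %in% for these inputs.
same <- identical(mask, Code %in% selected)
```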

edited Oct 20, 2013 at 22:38

answered Oct 20, 2013 at 22:05

Jilber Urbina
