Let's say we have a data frame like this:

df <- data.frame(species = c("Passer", "Turdus", "Turdus", "Gallus", "Anas", "Anas"),
                 season = c("breeding", "breeding", "non-breeding", "non-breeding", "breeding", "non-breeding"), 
                 value = seq(1, 12, 2), 
                 drop = c("no", "no", "dog", "dog", "cat", "cat"))
  species       season value drop
1  Passer     breeding     1   no
2  Turdus     breeding     3   no
3  Turdus non-breeding     5  dog
4  Gallus non-breeding     7  dog
5    Anas     breeding     9  cat
6    Anas non-breeding    11  cat

I would like to keep only the rows for species that occur in both seasons, i.e. both breeding and non-breeding.

The outcome should look like this:

  species       season value drop
2  Turdus     breeding     3   no
3  Turdus non-breeding     5  dog
5    Anas     breeding     9  cat
6    Anas non-breeding    11  cat

I tried filter() a couple of times but couldn't make it work. Would a loop be the way to go? Thanks!

1 Answer

If each species can appear at most twice (once per season), you can count the rows per species and keep the species that have two:

library(dplyr)

df %>% 
  add_count(species) %>% 
  filter(n == 2) %>% 
  select(-n)

  species       season value drop
1  Turdus     breeding     3   no
2  Turdus non-breeding     5  dog
3    Anas     breeding     9  cat
4    Anas non-breeding    11  cat
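
Note that this counts rows, not seasons, so it only works under the assumption stated above. As a rough sketch of where it breaks, take a hypothetical data frame df2 (made up for illustration, not from the question) in which "Corvus" has two rows from the same season. Counting rows keeps it, while counting distinct seasons with n_distinct() drops it:

```r
library(dplyr)

# Hypothetical data: "Corvus" has two rows but only one season.
df2 <- data.frame(species = c("Turdus", "Turdus", "Corvus", "Corvus"),
                  season  = c("breeding", "non-breeding", "breeding", "breeding"),
                  value   = 1:4)

# Counting rows per species keeps Corvus, although it has no non-breeding row.
by_row_count <- df2 %>%
  add_count(species) %>%
  filter(n == 2) %>%
  select(-n)

# Counting distinct seasons per species drops Corvus.
by_season_count <- df2 %>%
  group_by(species) %>%
  filter(n_distinct(season) == 2) %>%
  ungroup()
```

If your data can contain repeated seasons for a species, the per-season check is probably the safer choice.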

Another way is to count the distinct seasons per species rather than the rows:

df %>% 
  group_by(species) %>% 
  filter(nlevels(factor(season)) == 2)
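
If the season names are known in advance, you could also test for them by name instead of counting levels. This is only a sketch of that variant: the vector required is a name introduced here for illustration, and it assumes the two seasons are spelled exactly as in the question.

```r
library(dplyr)

# Same data as in the question.
df <- data.frame(species = c("Passer", "Turdus", "Turdus", "Gallus", "Anas", "Anas"),
                 season = c("breeding", "breeding", "non-breeding", "non-breeding", "breeding", "non-breeding"),
                 value = seq(1, 12, 2),
                 drop = c("no", "no", "dog", "dog", "cat", "cat"))

# Seasons every retained species must have (illustrative name).
required <- c("breeding", "non-breeding")

result <- df %>%
  group_by(species) %>%
  # Keep a species only if each required season occurs in its rows.
  filter(all(required %in% season)) %>%
  ungroup()
```

Unlike a count of exactly 2, this would still keep a species if a third season value were ever added to the data, as long as both required seasons are present.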

1 Comment

The second alternative did the trick. Many thanks!
