How do I filter rows based on two values within a column?

Question

I'm totally new to R and am trying to solve this using the dplyr package. I want to keep only the countries that have both Import and Export values and view them separately. I have tried several approaches, such as select and filter, without success.

Country  Year  Quantity  Description  Import/Export
A        2001  10        Frozen       Export
B        2001  50        Fresh        Import
B        2004  20        Frozen       Export
C        2003  30        Frozen       Import
C        2005  40        Fresh        Export
C        2006  60        Frozen       Import
D        2007  290       Fresh        Import

Ideally, the end result should be this:

Country  Year  Quantity  Description  Import/Export
B        2001  50        Fresh        Import
B        2004  20        Frozen       Export
C        2003  30        Frozen       Import
C        2005  40        Fresh        Export
C        2006  60        Frozen       Import

GuedesBF · Accepted Answer · 2021-09-11 13:52:58Z

3

We can group_by() Country, then use filter() to keep only the groups that have any "Import/Export" == 'Import' and any "Import/Export" == 'Export'.

library(dplyr)

df %>% group_by(Country) %>%
        filter(any(`Import/Export`=='Import') & 
               any(`Import/Export`=='Export')) %>%
        ungroup()

# A tibble: 5 x 5
  Country  Year Quantity Description `Import/Export`
  <chr>   <dbl>    <dbl> <chr>       <chr>          
1 B        2001       50 Fresh       Import         
2 B        2004       20 Frozen      Export         
3 C        2003       30 Frozen      Import         
4 C        2005       40 Fresh       Export         
5 C        2006       60 Frozen      Import
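
To see why countries A and D drop out, it can help to compute the two any() checks per country before filtering. This is only an illustrative sketch: the column names has_import, has_export and keep are made up here, and the data frame is rebuilt with just the two relevant columns.

```r
library(dplyr)

# Same rows as the question's data, reduced to the two relevant columns.
# check.names = FALSE keeps the non-syntactic name `Import/Export` intact.
df <- data.frame(
  Country = c("A", "B", "B", "C", "C", "C", "D"),
  `Import/Export` = c("Export", "Import", "Export", "Import",
                      "Export", "Import", "Import"),
  check.names = FALSE
)

# One row per country, showing the two flags that the filter() condition
# above combines with &. The flag names are hypothetical.
flags <- df %>%
  group_by(Country) %>%
  summarise(
    has_import = any(`Import/Export` == "Import"),
    has_export = any(`Import/Export` == "Export"),
    keep = has_import & has_export
  )
flags
```

Inside a grouped filter(), a single TRUE or FALSE per group is recycled to every row of that group, so a group is kept or dropped as a whole.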

data

structure(list(Country = c("A", "B", "B", "C", "C", "C", "D"), 
    Year = c(2001, 2001, 2004, 2003, 2005, 2006, 2007), Quantity = c(10, 
    50, 20, 30, 40, 60, 290), Description = c("Frozen", "Fresh", 
    "Frozen", "Frozen", "Fresh", "Frozen", "Fresh"), `Import/Export` = c("Export", 
    "Import", "Export", "Import", "Export", "Import", "Import"
    )), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame"
))

edited Sep 11, 2021 at 13:52

answered Sep 11, 2021 at 13:00


5 Comments

Gabriel Liu Over a year ago

Amazing!! What would be the benefit of presenting the data as shown in the 2nd part?

GuedesBF Over a year ago

what 2nd part??

Gabriel Liu Over a year ago

structure(list(Country = c("A", "B", "B", "C", "C", "C", "D"), Year = c(2001, 2001, 2004, 2003, 2005, 2006, 2007), Quantity = c(10, 50, 20, 30, 40, 60, 290), Description = c("Frozen", "Fresh", "Frozen", "Frozen", "Fresh", "Frozen", "Fresh"), `Import/Export` = c("Export", "Import", "Export", "Import", "Export", "Import", "Import")), row.names = c(NA, -7L), class = c("tbl_df", "tbl", "data.frame"))

GuedesBF Over a year ago

This is your data in a reproducible form. Running that code recreates your data frame, which makes the data easier to share. You can generate the same code with dput(data).

GuedesBF Over a year ago

You should always share your data this way; it makes it much easier for others to test the answers and manipulate your data.
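
As a rough illustration of the dput() advice in the comments above, the sketch below prints the code for a small data frame and checks that evaluating that code gives the same object back. The data frame df_small is a hypothetical stand-in for your own data.

```r
# A small hypothetical data frame standing in for your own data.
df_small <- data.frame(
  Country = c("A", "B"),
  Year = c(2001, 2004),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; that output is what
# you would paste into a question.
dput(df_small)

# Round trip: capture the printed code, evaluate it, and compare the
# result with the original object.
code <- paste(capture.output(dput(df_small)), collapse = "\n")
rebuilt <- eval(parse(text = code))
all.equal(df_small, rebuilt)
```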

Ronak Shah · Accepted Answer · 2021-09-11 13:40:00Z

3

Using the data from @GuedesBF's answer, here is another dplyr way to keep the groups that have both 'Import' and 'Export'.

library(dplyr)

df %>%
  group_by(Country) %>%
  filter(all(c('Import', 'Export') %in% `Import/Export`)) %>%
  ungroup()

# Country  Year Quantity Description `Import/Export`
#  <chr>   <dbl>    <dbl> <chr>       <chr>          
#1 B        2001       50 Fresh       Import         
#2 B        2004       20 Frozen      Export         
#3 C        2003       30 Frozen      Import         
#4 C        2005       40 Fresh       Export         
#5 C        2006       60 Frozen      Import
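
The condition is easiest to read from the inside out. The sketch below applies it to two hypothetical vectors, country_a and country_b, which mirror the values that countries A and B have in the data.

```r
# Hypothetical per-country vectors, mirroring countries A and B in the data.
country_a <- c("Export")
country_b <- c("Import", "Export")

# %in% checks each required value against the group's values.
c("Import", "Export") %in% country_a        # FALSE TRUE

# all() is TRUE only when every required value was found.
all(c("Import", "Export") %in% country_a)   # FALSE
all(c("Import", "Export") %in% country_b)   # TRUE
```

One possible advantage of this form is that requiring more values only means extending the vector on the left of %in%.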

answered Sep 11, 2021 at 13:40


akrun · Accepted Answer · 2021-09-11 19:09:26Z

1

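Using data.table: for each Country, .I returns the group's row numbers only when its `Import/Export` values include both 'Import' and 'Export', and those row numbers are then used to subset the table.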

library(data.table)
setDT(df)[df[, .I[all(c('Import', 'Export') %in% `Import/Export`)], Country]$V1]
   Country Year Quantity Description Import/Export
1:       B 2001       50       Fresh        Import
2:       B 2004       20      Frozen        Export
3:       C 2003       30      Frozen        Import
4:       C 2005       40       Fresh        Export
5:       C 2006       60      Frozen        Import
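
The one-liner above can be split into two steps to make the role of .I visible. This is a sketch on a reduced copy of the data; the names dt, idx and result are chosen here for illustration.

```r
library(data.table)

# Reduced copy of the question's data (two relevant columns only).
dt <- data.table(
  Country = c("A", "B", "B", "C", "C", "C", "D"),
  `Import/Export` = c("Export", "Import", "Export", "Import",
                      "Export", "Import", "Import")
)

# Step 1: per Country, .I holds that group's row numbers in dt.
# Indexing .I with a single TRUE keeps them all; FALSE keeps none.
idx <- dt[, .I[all(c("Import", "Export") %in% `Import/Export`)],
          by = Country]$V1
idx  # 2 3 4 5 6

# Step 2: use the collected row numbers to subset the table.
result <- dt[idx]
result
```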

answered Sep 11, 2021 at 19:09


danlooo · Accepted Answer · 2021-09-11 13:03:58Z

0

library(tidyverse)

data <- tribble(
  ~Country, ~Year, ~Quantity, ~Description, ~`Import/Export`,
  "A", 2001, 10, "Frozen", "Export",
  "B", 2001, 50, "Fresh", "Import",
  "B", 2004, 20, "Frozen", "Export",
  "C", 2003, 30, "Frozen", "Import",
  "C", 2005, 40, "Fresh", "Export",
  "C", 2006, 60, "Frozen", "Import",
  "D", 2007, 290, "Fresh", "Import"
)
data
#> # A tibble: 7 x 5
#>   Country  Year Quantity Description `Import/Export`
#>   <chr>   <dbl>    <dbl> <chr>       <chr>          
#> 1 A        2001       10 Frozen      Export         
#> 2 B        2001       50 Fresh       Import         
#> 3 B        2004       20 Frozen      Export         
#> 4 C        2003       30 Frozen      Import         
#> 5 C        2005       40 Fresh       Export         
#> 6 C        2006       60 Frozen      Import         
#> 7 D        2007      290 Fresh       Import

selected_countries <-
  data %>%
  mutate(is_there = TRUE) %>%
  distinct(Country, `Import/Export`, is_there) %>%
  pivot_wider(names_from = "Import/Export", values_from = is_there) %>%
  filter(!is.na(Export) & !is.na(Import)) %>%
  pull(Country) %>%
  unique()
selected_countries
#> [1] "B" "C"

data %>% filter(Country %in% selected_countries)
#> # A tibble: 5 x 5
#>   Country  Year Quantity Description `Import/Export`
#>   <chr>   <dbl>    <dbl> <chr>       <chr>          
#> 1 B        2001       50 Fresh       Import         
#> 2 B        2004       20 Frozen      Export         
#> 3 C        2003       30 Frozen      Import         
#> 4 C        2005       40 Fresh       Export         
#> 5 C        2006       60 Frozen      Import

Created on 2021-09-11 by the reprex package (v2.0.1)

answered Sep 11, 2021 at 13:03


2 Comments

Gabriel Liu Over a year ago

Love it! What is the purpose of adding a column?

danlooo Over a year ago

It's to indicate that the property is "there". pivot_wider() requires a names column and a values column. After the pivot, we have all the info needed in one row per country.
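
To make the role of the helper column concrete, here is a sketch of the intermediate wide table on a reduced copy of the data. The names wide and both are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Reduced copy of the question's data (two relevant columns only).
data <- data.frame(
  Country = c("A", "B", "B", "C", "C", "C", "D"),
  `Import/Export` = c("Export", "Import", "Export", "Import",
                      "Export", "Import", "Import"),
  check.names = FALSE
)

# One row per country, with one column per value of `Import/Export`.
# Combinations that never occur are filled with NA.
wide <- data %>%
  distinct(Country, `Import/Export`) %>%
  mutate(is_there = TRUE) %>%
  pivot_wider(names_from = `Import/Export`, values_from = is_there)
wide

# Countries with no NA in either column have both kinds of record.
both <- wide %>%
  filter(!is.na(Import), !is.na(Export)) %>%
  pull(Country)
both
```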
