
I have these two tables:

   <A>                       <B>
a1    a2                     b1   
ABC   CAFE                   AB
ABD   DRINK                  BF
ABF   CAFE                   ..
ABFF  DRINK
..     ..

I would like to build a summary table of the rows in table A whose a1 contains each string in table B, like this:

library(dplyr)
library(stringr)

A1 <- A %>%
filter(str_detect(a1, "AB")) %>%
group_by(a2) %>%
summarize(n())

A2 <- A %>%
filter(str_detect(a1, "BF")) %>%
group_by(a2) %>%
summarize(n())

However, I would have to repeat this code for every value in B, so I would like a function that passes each entry of table B to str_detect. How do I write such a function?

  • lapply(B$b1, function(x) A %>% filter(str_detect(a1, x)) %>% group_by(a2) %>% summarize(n())) Commented Dec 29, 2017 at 2:04
  • Is it right to use 'function(x)A'? Commented Dec 29, 2017 at 2:09
  • Why not? A is not a parameter of the function, so it is looked up in the global environment (.GlobalEnv). Try it out; if it doesn't work, I'm sure someone will give you a correct method: lapply(B$b1, function(x) A %>% filter(str_detect(a1, x)) %>% group_by(a2) %>% summarize(n())) Commented Dec 29, 2017 at 2:10
  • YES! I'm going to study the lapply function now Commented Dec 29, 2017 at 2:13

2 Answers


Here I designed a function called count_fun, which takes four arguments: dat is a data frame like A, Scol is the column containing the strings, Gcol is the grouping column, and String is the pattern to test for. See https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html to learn how to write functions that use dplyr.

library(dplyr)
library(stringr)

count_fun <- function(dat, Scol, Gcol, String){

  Scol <- enquo(Scol)
  Gcol <- enquo(Gcol)

  dat2 <- dat %>%
    filter(str_detect(!!Scol, String)) %>%
    group_by(!!Gcol) %>%
    summarize(n())
  return(dat2)
}

count_fun(A, a1, a2, "AB")
# # A tibble: 2 x 2
#   a2    `n()`
#   <chr> <int>
# 1 CAFE      2
# 2 DRINK     2

count_fun(A, a1, a2, "BF")
# # A tibble: 2 x 2
#   a2    `n()`
#   <chr> <int>
# 1 CAFE      1
# 2 DRINK     1
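As an aside, here is a minimal sketch of the same idea using the `{{ }}` ("curly-curly") operator, which replaces the enquo() plus !! pair. It assumes rlang 0.4.0 or later (any recent dplyr). The name count_fun2 and the column name n are assumptions made for illustration, so the output column is called n rather than `n()`.

```r
library(dplyr)
library(stringr)

# Same sample data as in the DATA section of this answer
A <- data.frame(
  a1 = c("ABC", "ABD", "ABF", "ABFF"),
  a2 = c("CAFE", "DRINK", "CAFE", "DRINK"),
  stringsAsFactors = FALSE
)

# count_fun2 is a hypothetical name; {{ }} captures and unquotes the
# column argument in one step, so no enquo() call is needed
count_fun2 <- function(dat, Scol, Gcol, String) {
  dat %>%
    filter(str_detect({{ Scol }}, String)) %>%
    group_by({{ Gcol }}) %>%
    summarize(n = n())   # name the count column explicitly
}

res_ab <- count_fun2(A, a1, a2, "AB")
res_bf <- count_fun2(A, a1, a2, "BF")
```

It is called the same way as count_fun, for example count_fun2(A, a1, a2, "AB").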

We can then apply count_fun using lapply to loop through every element in B.

lapply(B$b1, function(x){
  count_fun(A, a1, a2, x)
})

# [[1]]
# # A tibble: 2 x 2
#   a2    `n()`
#   <chr> <int>
# 1 CAFE      2
# 2 DRINK     2
# 
# [[2]]
# # A tibble: 2 x 2
#   a2    `n()`
#   <chr> <int>
# 1 CAFE      1
# 2 DRINK     1
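The list returned by lapply does not record which pattern produced which table. One possible way to keep track, sketched below, is to name the list elements after the patterns and stack them with dplyr::bind_rows. The names results, combined, and the column names pattern and n are assumptions chosen for this example, and the pipeline is written inline so the block runs on its own.

```r
library(dplyr)
library(stringr)

# Same sample data as in the DATA section of this answer
A <- data.frame(
  a1 = c("ABC", "ABD", "ABF", "ABFF"),
  a2 = c("CAFE", "DRINK", "CAFE", "DRINK"),
  stringsAsFactors = FALSE
)
B <- data.frame(b1 = c("AB", "BF"), stringsAsFactors = FALSE)

# Run the same filter/count once per pattern in B$b1
results <- lapply(B$b1, function(x) {
  A %>%
    filter(str_detect(a1, x)) %>%
    group_by(a2) %>%
    summarize(n = n())
})

# Name each list element after its pattern, then stack into one table;
# .id = "pattern" stores the element names in a new column
names(results) <- B$b1
combined <- bind_rows(results, .id = "pattern")
```

The result is a single data frame with one row per pattern and a2 combination, which is usually easier to filter or plot than a list of small tables.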

DATA

A <- read.table(text = "a1    a2
ABC   CAFE
                ABD   DRINK 
                ABF   CAFE
                ABFF  DRINK
                ",
                header = TRUE, stringsAsFactors = FALSE)

B <- data.frame(b1 = c("AB", "BF"), stringsAsFactors = FALSE)

2 Comments

Excuse me, I would now like to get the proportion of each type in the summarized table rather than the count. I changed the count_fun function, but it does not work. How do I get the proportion (%) of each type inside the function?
This could be a new question, but before you ask it, please search SO to see if there are posts about calculating percentages with dplyr.
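For reference, one possible way to get percentages is sketched below. It is not taken from this thread: prop_fun and the column names n and percent are hypothetical. It relies on summarize() dropping the single grouping level, so sum(n) in the following mutate() is the total over all matching rows.

```r
library(dplyr)
library(stringr)

# Same sample data as in the DATA section of the answer above
A <- data.frame(
  a1 = c("ABC", "ABD", "ABF", "ABFF"),
  a2 = c("CAFE", "DRINK", "CAFE", "DRINK"),
  stringsAsFactors = FALSE
)

# prop_fun is a hypothetical variant of count_fun that adds a percent column
prop_fun <- function(dat, Scol, Gcol, String) {
  Scol <- enquo(Scol)
  Gcol <- enquo(Gcol)

  dat %>%
    filter(str_detect(!!Scol, String)) %>%
    group_by(!!Gcol) %>%
    summarize(n = n()) %>%                 # result is no longer grouped
    mutate(percent = 100 * n / sum(n))     # share of all matching rows
}

res <- prop_fun(A, a1, a2, "AB")
```

With the sample data, the four rows matching "AB" split evenly between CAFE and DRINK, so each gets 50 percent.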

I guess this solved your issue:

 lapply(B$b1,function(x)A%>%filter(str_detect(a1, x)) %>% group_by(a2) %>% summarize(n()))
