1

I have a data frame with four columns: two columns indicate participation in a sport, while the other two columns indicate whether the player passed each of their two fitness exams.

dat <- data.frame(SOCCER = sample(0:1, 10, replace = T),
                  BASEBALL = sample(0:1, 10, replace = T),
                  TEST_1_PASS = sample(0:1, 10, replace = T),
                  TEST_2_PASS = sample(0:1, 10, replace = T))

I would like to obtain a list containing contingency tables for each sport and exam. I know that I can accomplish this using the following code, which uses nested lapply statements, but this strikes me as inefficient. Can anyone propose a more elegant solution that doesn't use nesting?

results <- lapply(c("SOCCER", "BASEBALL"), function(x) {
  lapply(c("TEST_1_PASS", "TEST_2_PASS"), function(y){
    table(sport = dat[[x]], pass = dat[[y]])
  })
})

Thanks as always!

5
  • do you necessarily want the contigency tables or the mean and number of success per sport? Commented Feb 6, 2020 at 17:21
  • Contingency tables specifically. Need them for a whole heap of fisher tests Commented Feb 6, 2020 at 17:26
  • Also, where is PLAYER indicator? Please dput actual data as advised: How to make a great R reproducible example. Commented Feb 6, 2020 at 17:32
  • I don't think your nesting is inefficient... you need to call table 4 times. It doesn't really matter if that's inside one loop/lapply over 4 items or 2 nested loops/lapplys with 2 items each. Commented Feb 6, 2020 at 17:33
  • You could clean up the syntax a little if you switched to purrr (e.g., use ~ for anonymous functions, and you'd get nicer names in the resulting lists using purrr::map instead of lapply), but it'd basically be the same. Commented Feb 6, 2020 at 17:34

2 Answers 2

4

The double lapply gets all pairwise combinations of the columns in each of the columns' vectors, like @Gregor wrote in a comment

I don't think your nesting is inefficient... you need to call table 4 times. It doesn't really matter if that's inside one loop/lapply over 4 items or 2 nested loops/lapplys with 2 items each.

But here is another way, with one of the loops in disguise as expand.grid.

cols <- expand.grid(x = c("SOCCER", "BASEBALL"), 
                    y = c("TEST_1_PASS", "TEST_2_PASS"),
                    stringsAsFactors = FALSE)
Map(function(.x, .y)table(dat[[.x]], dat[[.y]]), cols$x, cols$y)
Sign up to request clarification or add additional context in comments.

3 Comments

I guess you can also do do.call(Map, c(f = function(...) table(dat[c(...)]), cols))
@akrun Nice! Post as an answer? The tables have names, unlike in my answer.
It is just a slight modification of your answer.
0

Consider re-formatting your data into long format (i.e., tidy data), merging the two data pieces of sport and exam, then run by (a rarely used member of apply family as object-oriented wrapper to tapply) for all subset combinations between the two returning a named header report of results:

# RESHAPE EACH DATA SECTION (SPORT AND EXAM) INTO LONG FORMAT
df_list <- lapply(list(c("SOCCER", "BASEBALL"), c("TEST_1_PASS", "TEST_2_PASS")), function(cols)
                   reshape(cbind(PLAYER = row.names(dat), dat[cols]), 
                           varying = cols, v.names = "VALUE", 
                           times = cols, timevar = "INDICATOR", 
                           idvar = "PLAYER", ids = NULL,
                           new.row.names = 1:1E4, direction = "long")
           )

# CROSS JOIN (ALL COMBINATION PAIRINGS)
final_df <- Reduce(function(x,y) merge(x, y, by="PLAYER", suffixes=c("_SPORT", "_EXAM")), df_list)
final_df

# RUN TABLES FOR EACH SUBSET COMBINATION
tables_list <- with(final_df, by(final_df, list(INDICATOR_SPORT, INDICATOR_EXAM), function(sub)
                                  table(sport = sub$VALUE_SPORT, pass = sub$VALUE_EXAM)
                                )
                   )    

Output

tables_list

# : BASEBALL
# : TEST_1_PASS
#      pass
# sport 0 1
#     0 3 4
#     1 2 1
# ------------------------------------------------------------ 
# : SOCCER
# : TEST_1_PASS
#      pass
# sport 0 1
#     0 2 0
#     1 3 5
# ------------------------------------------------------------ 
# : BASEBALL
# : TEST_2_PASS
#      pass
# sport 0 1
#     0 3 4
#     1 1 2
# ------------------------------------------------------------ 
# : SOCCER
# : TEST_2_PASS
#      pass
# sport 0 1
#     0 2 0
#     1 2 6

Online Demo

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.