I'm trying to use the following function to iterate through a dataframe and return the count of each value in every row:

library(dplyr)
library(tidyr)
row_freq <- function(df_input,row_input){
  print(df_input)
  vec <- unlist(df_input %>% 
                  select(-1) %>% 
                  slice(row_input), use.names = FALSE)
  r <- data.frame(table(vec)) %>% 
    pivot_wider(values_from = Freq, names_from = vec)
  return(r)
}

This works fine if I use a single row from the dataframe:

sample_df <- data.frame(id = c(1,2,3,4,5), obs1 = c("A","A","B","B","B"),
                        obs2 = c("B","B","C","D","D"), obs3 = c("A","B","A","D","A"))
row_freq(sample_df, 1)

  id obs1 obs2 obs3
1  1    A    B    A
2  2    A    B    B
3  3    B    C    A
4  4    B    D    D
5  5    B    D    A
# A tibble: 1 × 2
      A     B
  <int> <int>
1     2     1

But when iterating over rows using purrr::map_dfr, it seems to reduce df_input to only the id column instead of using the entire dataframe as the argument, which I found quite strange:

purrr::map_dfr(sample_df, row_freq, 1:5)
[1] 1 2 3 4 5
 Error in UseMethod("select") : 
no applicable method for 'select' applied to an object of class "c('double', 'numeric')"

I'm looking for help with 1) why this is happening, 2) how to fix it, and 3) any alternative approaches or functions that may already do what I'm trying to do more efficiently.

1 Answer

map_dfr() iterates over its first argument, so when the arguments are not named, pass the row indices first and supply sample_df inside the function call:

purrr::map_dfr(1:5, ~ row_freq(sample_df, .x))

-output

# A tibble: 5 × 4
      A     B     C     D
  <int> <int> <int> <int>
1     2     1    NA    NA
2     1     2    NA    NA
3     1     1     1    NA
4    NA     1    NA     2
5     1     1    NA     1

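Or name the arguments, so that .x = 1:5 is the vector being iterated over and df_input = sample_df is forwarded to row_freq through the dots: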

purrr::map_dfr(df_input = sample_df, .f = row_freq, .x = 1:5)

-output

# A tibble: 5 × 4
      A     B     C     D
  <int> <int> <int> <int>
1     2     1    NA    NA
2     1     2    NA    NA
3     1     1     1    NA
4    NA     1    NA     2
5     1     1    NA     1

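The reason is that the first argument of map() is .x:

map(.x, .f, ...)

When sample_df is passed first without a name, it is matched to .x, and map_dfr() loops over its columns, because a data.frame (like a tibble or data.table) is a list of columns with additional attributes. The first call is therefore row_freq(sample_df$id, 1:5), which prints the id vector and then fails because select() has no method for a numeric vector.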
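
To see the column-wise iteration directly, here is a minimal sketch using the sample_df from the question. stringsAsFactors = FALSE is set explicitly only so the result is the same on older R versions, and the names n_elements, col_classes and row_ids are illustrative:

```r
library(purrr)

sample_df <- data.frame(
  id   = c(1, 2, 3, 4, 5),
  obs1 = c("A", "A", "B", "B", "B"),
  obs2 = c("B", "B", "C", "D", "D"),
  obs3 = c("A", "B", "A", "D", "A"),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, so its length is the number of columns
n_elements <- length(sample_df)   # 4, not 5

# map_chr() calls class() once per column; the first element it sees is id
col_classes <- map_chr(sample_df, class)

# Iterating over row indices instead gives one call per row
row_ids <- map_int(seq_len(nrow(sample_df)), ~ .x)
```

Here col_classes has one entry per column and row_ids has one entry per row, which is the difference between the failing call and the working one.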

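On the question of alternatives, one possible approach, offered as a sketch rather than a drop-in replacement, is to skip the row-by-row loop: reshape to long format, count the values per id, and spread back to wide. It assumes id uniquely identifies each row, and the names obs, value and row_counts are chosen here for illustration:

```r
library(dplyr)
library(tidyr)

sample_df <- data.frame(
  id   = c(1, 2, 3, 4, 5),
  obs1 = c("A", "A", "B", "B", "B"),
  obs2 = c("B", "B", "C", "D", "D"),
  obs3 = c("A", "B", "A", "D", "A"),
  stringsAsFactors = FALSE
)

row_counts <- sample_df %>%
  # One row per (id, observation) pair
  pivot_longer(-id, names_to = "obs", values_to = "value") %>%
  # Number of times each value occurs within each id
  count(id, value) %>%
  # One column per distinct value; combinations that never occur become NA
  pivot_wider(names_from = value, values_from = n)
```

Unlike the map_dfr() output, the result keeps the id column. Adding values_fill = 0L to pivot_wider() would give zeros instead of NA for values that do not occur in a row.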