I have two dataframes. The first (games) shows, for each of several games, the year and which players accomplished certain unspecified goals (player1, player2, player3). The second (rankings) shows the ranking of each player in a given year.
My goal is to add a column to the games dataframe, indicating the average ranking of all players who accomplished those goals in each game.
A reproducible example:
set.seed(0)
players <- c("Abe", "Bob", "Chris", "John", "Jane", "Linda", "Mason", "Zoe", "NA")
years <- c(2000:2005)
season <- sample(years, 20, replace = TRUE)
player1 <- sample(players, 20, replace = TRUE)
player2 <- sample(players, 20, replace = TRUE)
player3 <- sample(players, 20, replace = TRUE)
games <- data.frame(season, player1, player2, player3, stringsAsFactors = FALSE)
rankings <- data.frame(replicate(6, sample(1:5, 8, replace = TRUE)))
colnames(rankings) <- years
ranked_players <- players[-9]
rankings <- cbind(ranked_players, rankings)
games is the first dataframe, showing the year of the game (season) and who was player1, player2, and player3. Not every game has a player in every role.
rankings is the second dataframe, showing the ranking, from 1 to 5, of each player in a given year.
For each game in games, I want to look up the ranking of whoever played as player1, player2, and player3 in that game's season, and then average those rankings.
To calculate the ranking, I tried this function:
library(dplyr)  # provides filter() and select()
calc_ranking <- function(x, y) {
z <- select(filter(rankings, ranked_players==x), c(y))
z <- as.integer(z[1,1])
z
}
It appears to work. Now I need to apply it to each player who played a game in games, for every year.
I tried this loop:
new_col <- mapply(calc_ranking, games$player1, games$season)
but it doesn't work. It gives me this error:
Error in inds_combine(.vars, ind_list) : Position must be between 0 and n
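One possible explanation, offered as an assumption rather than a confirmed diagnosis: the season values are numeric, so select() may be reading 2000 as a column position rather than as the column named "2000". A minimal base-R sketch that looks the column up by name instead (the two-player table and its values are made up for illustration, and this calc_ranking is a stand-in for the dplyr version above):

```r
# Hypothetical miniature rankings table, for illustration only
rankings <- data.frame(
  ranked_players = c("Abe", "Bob"),
  `2000` = c(3L, 1L),
  `2001` = c(5L, 2L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Look the year up by column *name*: convert it to character first,
# so that 2001 is not treated as "the 2001st column"
calc_ranking <- function(x, y) {
  z <- rankings[rankings$ranked_players == x, as.character(y)]
  # Players without a row in rankings (e.g. the string "NA") give NA
  if (length(z) == 0) NA_integer_ else as.integer(z[1])
}

single <- calc_ranking("Bob", 2001)
several <- mapply(calc_ranking, c("Abe", "NA"), c(2000, 2001))
```

Here single holds Bob's 2001 ranking and several holds one value per (player, season) pair, with NA for the player who has no row in rankings.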
However, even if it worked, with this solution I would have to repeat the loop 3 times to create 3 columns, one for each role (player1, player2, and player3), and then create the column I really want (the average of the 3 columns). I suspect there is a more efficient way to do this without repeating the loop (assuming I can fix it). This would be very useful, because in my real dataset I have 13 "roles" for which I have to calculate the ranking.
I hope this second question is better than my first. Apologies for any mistakes; I'm only 1 week into learning R (which is my first experience with coding in general).
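To make the "all roles at once" idea concrete, here is a rough base-R sketch under some assumptions: every role column starts with "player", a missing player is stored as a name that has no row in rankings, and the small games and rankings tables below are invented for illustration. The names rank_mat, role_cols, role_ranks, and avg_ranking are hypothetical.

```r
# Hypothetical miniature versions of games and rankings
games <- data.frame(
  season  = c(2000, 2001, 2001),
  player1 = c("Abe", "Bob", "NA"),
  player2 = c("Bob", "NA", "NA"),
  stringsAsFactors = FALSE
)
rankings <- data.frame(
  ranked_players = c("Abe", "Bob"),
  `2000` = c(3L, 1L),
  `2001` = c(5L, 2L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Ranking matrix with players as row names and years as column names
rank_mat <- as.matrix(rankings[, -1])
rownames(rank_mat) <- rankings$ranked_players

# Pick up every role column by name, however many there are
role_cols <- grep("^player", names(games), value = TRUE)

# For each role, look up all (player, season) pairs in one step;
# a player with no row in rankings yields NA
role_ranks <- sapply(role_cols, function(col) {
  row_idx <- match(games[[col]], rownames(rank_mat))
  col_idx <- match(as.character(games$season), colnames(rank_mat))
  rank_mat[cbind(row_idx, col_idx)]
})

# Average over the roles that were actually filled;
# a game with no ranked players at all gives NaN
games$avg_ranking <- rowMeans(role_ranks, na.rm = TRUE)
```

This relies on matrix indexing with a two-column index, so there is no per-role repetition to write out. Note that sapply() only returns a matrix here because games has more than one row.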
Thanks a lot!