I would like to calculate the average exam score of each student and add this as a new column to a data frame:

library(dplyr)

my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)

students <- as.data.frame(my_students)
scores <- as.data.frame(student_exam)
scores <- cbind(scores, score_exam)

new_frame <- students %>% mutate(avg_score = (scores %>% filter(student_exam == my_students) %>% mean(score_exam)))

But the code above gives the following error:

Error in Ops.factor(student_exam, my_students) : 
  level sets of factors are different

I assume it has to do with filter(student_exam == my_students). How would I do this in dplyr?

  • Not very clear what the filter tries to do. All your students have a score in your example. Something like this would work in your case: df = data.frame(student_exam, score_exam); df %>% group_by(student_exam) %>% mutate(avg_score = mean(score_exam)) %>% ungroup() Commented May 9, 2020 at 18:29
  • @AntoniosK this would remove Sam from the result, if I am correct. I need Sam to remain. If a student has no grades, the average should just be NA. Commented May 9, 2020 at 18:56

1 Answer

You need to define two data frames that share a column with the same name, here "name". You can then use group_by and summarize to group the scores by student and compute each student's average, and right_join the result to the students data frame so that every student is kept. On older R versions, where data.frame() turns strings into factors by default, the join may warn that the two name columns have different factor levels; this happens because not every student in your class has an exam score. Sam has no scores, so his average score is NA.

library(dplyr)

my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)

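# Store names as character (not factor) so the join keys compare cleanly
students <- data.frame(name = my_students, stringsAsFactors = FALSE)
scores <- data.frame(name = student_exam, score = score_exam, stringsAsFactors = FALSE)

# Average score per student, then right_join to keep every student (Sam gets NA)
avg_scores <- scores %>%
  group_by(name) %>%
  summarize(avgScore = mean(score)) %>%
  right_join(students, by = "name")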
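
Since the question asks for the average as a new column on the students data frame, the same idea can also be written starting from students with a left_join. This is only a sketch of an alternative, not the only way to do it: it assumes the names are stored as character vectors, and it reuses the new_frame and avg_score names from the question.

```r
library(dplyr)

# Same example data as in the question, with names kept as character
students <- data.frame(name = c("John", "Lisa", "Sam"), stringsAsFactors = FALSE)
scores <- data.frame(
  name  = c("John", "Lisa", "John", "John"),
  score = c(7, 6, 7, 6),
  stringsAsFactors = FALSE
)

# One row per student that has at least one score
avg_scores <- scores %>%
  group_by(name) %>%
  summarize(avg_score = mean(score))

# left_join keeps every row of students; students without scores get NA
new_frame <- students %>%
  left_join(avg_scores, by = "name")

print(new_frame)
```

Because the join starts from students, the row order of students is preserved and Sam stays in the result with avg_score set to NA.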
2 Comments

I would like to keep Sam in the resulting data frame. I assume this removes him? I'd like it to just say NA if there are no scores for Sam.
@RuudVerhoef that is the result of this approach.
