Adding column based on data in other data frame

Question

I would like to calculate the average exam score of each student and add this as a new column to a data frame:

library(dplyr)

my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)

students <- as.data.frame(my_students)
scores <- as.data.frame(student_exam)
scores <- cbind(scores, score_exam)

new_frame <- students %>% mutate(avg_score = (scores %>% filter(student_exam == my_students) %>% mean(score_exam)))

But the code above gives the following error:

Error in Ops.factor(student_exam, my_students) : 
  level sets of factors are different

I assume it has to do with filter(student_exam == my_students). How would I do this in dplyr?

It is not very clear what the filter is trying to do. All your students have a score in your example. Something like this would work in your case: df = data.frame(student_exam, score_exam); df %>% group_by(student_exam) %>% mutate(avg_score = mean(score_exam)) %>% ungroup()
– AntoniosK, Commented May 9, 2020 at 18:29
@AntoniosK this would remove Sam from the result, if I am correct. I need Sam to remain. If there are no grades for a student, it should just say NA.
– SecretIndividual, Commented May 9, 2020 at 18:56

mcz · Accepted Answer · 2020-05-09 18:31:13Z

2

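Define both data frames with a shared key column, here called name. You can then use group_by and summarize to compute the average score per student, and right_join the result onto students. A right join keeps every row of students, so a student without scores is not dropped: Sam's average score is NA. If the name columns are factors with different levels (the default for data.frame before R 4.0), dplyr may warn about the mismatched levels when joining. Creating the columns as character with stringsAsFactors = FALSE avoids that, and by = "name" makes the join key explicit.

library(dplyr)

my_students <- c("John", "Lisa", "Sam")
student_exam <- c("John", "Lisa", "John", "John")
score_exam <- c(7, 6, 7, 6)

# Keep the names as character; before R 4.0 data.frame() would turn them into factors
students <- data.frame(name = my_students, stringsAsFactors = FALSE)
scores <- data.frame(name = student_exam, score = score_exam, stringsAsFactors = FALSE)

# Average per student, then right_join keeps every row of students (Sam gets NA)
avg_scores <- scores %>%
  group_by(name) %>%
  summarize(avgScore = mean(score)) %>%
  right_join(students, by = "name")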
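
The same result can also be reached by starting from students and using left_join, which reads closer to the "add a column to students" wording of the question. This is only a sketch under the same sample data; avg_by_student is a helper name introduced here for illustration, and new_frame and avg_score are taken from the question.

```r
library(dplyr)

# Sample data from the question, with names stored as character
students <- data.frame(name = c("John", "Lisa", "Sam"), stringsAsFactors = FALSE)
scores <- data.frame(
  name = c("John", "Lisa", "John", "John"),
  score = c(7, 6, 7, 6),
  stringsAsFactors = FALSE
)

# One row per student that has at least one score
avg_by_student <- scores %>%
  group_by(name) %>%
  summarize(avg_score = mean(score))

# left_join keeps every row of students, in their original order;
# students with no match in avg_by_student get NA
new_frame <- students %>%
  left_join(avg_by_student, by = "name")
```

Here new_frame has one row per student in the order John, Lisa, Sam, and Sam's avg_score is NA because he has no rows in scores.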

2 Comments

SecretIndividual Over a year ago

I would like to keep Sam in the resulting data frame. I assume this removes him? I'd like it to just say NA if there are no scores for Sam.

mcz Over a year ago

@RuudVerhoef That is exactly what this approach produces: Sam stays in the result with NA.
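
To see which students have no scores at all (and will therefore end up with NA), one option is dplyr's anti_join. This is a small sketch on the same sample data; no_scores is a name made up for the example.

```r
library(dplyr)

# Sample data from the question, with names stored as character
students <- data.frame(name = c("John", "Lisa", "Sam"), stringsAsFactors = FALSE)
scores <- data.frame(
  name = c("John", "Lisa", "John", "John"),
  score = c(7, 6, 7, 6),
  stringsAsFactors = FALSE
)

# anti_join returns the rows of students that have no matching name in scores
no_scores <- anti_join(students, scores, by = "name")
```

With this data, no_scores contains a single row for Sam.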
