1

First, I apologize for the somewhat non-descriptive title but as you can probably see below, the problem is not easy to sum up in a single question.

I have a data set of peer evaluations, where group members rated each other on a set of criteria. Unfortunately, I wasn't able to pre-populate respondents' group members in the the survey so I ended up with a very jumbled data set that looks like this:

data = data.frame(gm1 = c("Lin", "Bill", "Cass", "Dan"), 
                  gm2 = c("Bill", "Lin", "Bill", "Lin"),
                  gm3 = c("Cass", "Dan", "Lin", "Cass"),
                  gm4 = c("Dan", "Cass", "Dan", "Bill"),
                  crit1_1 = c(5, 2, 4, 4), 
                  crit1_2 = c(5, 3, 3, 3), 
                  crit1_3 = c(4, 2, 4, 4), 
                  crit1_4 = c(5, 5, 4, 4))

where gm1 is the respondent and crit1 represents their self-ratings and gm2-4 are their group members with crit1_2-4 representing their ratings of each member on the first criteria (there are multiple). I need to calculate the average of the group's ratings for each member but as you can see, there is no consistent pattern for who is gm1-4. For example, for Lin, I need an average of Bill, Cass, and Dan's ratings of Lin.

I know this is a bit of a mess but I hoping there is a relatively simple way of transforming this data so that I can perform the calculations that I need to.

I've thought about assigning each group member a number like so

key = data.frame(gm1 = c("Lin", "Bill", "Cass", "Dan"), 
                 num = c(1, 2, 3, 4))
data1 = data[ , c(1,2,3,4)]
data1[] <- key$num[match(unlist(data), key$gm1)]
data2 = cbind(data1, data[ , -c(1,2,3,4)])

and then sorting columns gm1-4 so that the pattern is always the same for each criteria. However, I do not know how to sort gm1-4 for each row in such a way that creates corresponding changes in crit1_1-4.

Any suggestions would be much appreciated!

1
  • If the first column represents thier self-evaluation it means that the other columns are the evaluations of thier peers. You only need to group the data by gm1 and calculate the mean of crit1_2-4. Commented Apr 19, 2021 at 19:06

2 Answers 2

1

Perhaps this might be more manageable in long form. With pivot_longer from tidyr, you can extend your data for each group member and criterion. The filter and select will remove the "self-rating" data.

library(tidyverse)

data %>%
  pivot_longer(cols = starts_with("crit"), 
               values_to = "rating", 
               names_to = c("criteria", "rating_number"), 
               names_pattern = "crit(\\d+)_(\\d+)") %>%
  filter(rating_number != 1) %>%
  select(-gm1) %>%
  pivot_longer(cols = starts_with("gm"), 
               names_pattern = "gm(\\d+)", 
               names_to = "member_number", 
               values_to = "member_name") %>%
  filter(member_number == rating_number) %>%
  group_by(member_name, criteria) %>%
  summarise(mean_rating = mean(rating))

Output

  member_name criteria mean_rating
  <chr>       <chr>          <dbl>
1 Bill        1               4   
2 Cass        1               4.33
3 Dan         1               3.67
4 Lin         1               3.33
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you for this. It appears though that this method is calculating the average of each member's ratings of their team members, rather than their team members' ratings of them . For example, Bill gave himself a 2, and rated his team members Lin = 3, Dan = 2, and Cass = 5, which I think is how the mean rating is being calculated in your solution. Instead, I need to average Bill's team members' ratings of Bill, so Bill's mean rating should really be (5 + 3 + 4)/3 = 4.
@AddisonMaerz Please see edited answer and let me know if this is what you had in mind.
1

base

data = data.frame(gm1 = c("Lin", "Bill", "Cass", "Dan"), 
                  gm2 = c("Bill", "Lin", "Bill", "Lin"),
                  gm3 = c("Cass", "Dan", "Lin", "Cass"),
                  gm4 = c("Dan", "Cass", "Dan", "Bill"),
                  crit1_1 = c(5, 2, 4, 4), 
                  crit1_2 = c(5, 3, 3, 3), 
                  crit1_3 = c(4, 2, 4, 4), 
                  crit1_4 = c(5, 5, 4, 4))



cols_gm <- sapply(data, is.character)
cols_crit <- sapply(data, is.numeric)


crit <- data[cols_crit]
gm <- data[cols_gm]
nm <- unique(unlist(gm))


sapply(nm, function(x) {
  coord <- matrix(c(seq(nrow(data)), max.col(gm == x)), nrow = nrow(data))
  coord <- coord[!(coord[, 2] == 1), ]
  mean(crit[coord])
})
#>      Lin     Bill     Cass      Dan 
#> 3.333333 4.000000 4.333333 3.666667

Created on 2021-04-19 by the reprex package (v2.0.0)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.