I have a long-format repeated measures dataset, similar to this:

    ID     Stimuli   Score   Correct
   <fct>   <chr>     <dbl>   <int>
 1 1       A1        0.046   1
 2 1       A1        0.037   1
 3 1       A2       -0.261   0
 4 1       A2        0.213   0
 5 1       A3        0.224   0
 6 1       A3        0.001   1
 7 2       A1       -1.38    0
 8 2       A1       -0.81    0
 9 2       A2       -0.03    1
10 2       A2        0.88    0
11 2       A3       -0.00    1
12 2       A3        0.49    0

I created the Correct variable based on whether the Score in each row falls within a specific range: Correct is 1 if Score is between -0.10 and +0.10, and 0 otherwise.
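For reference, that flagging step could look roughly like the sketch below. The data frame name `toy` and its four scores are made up for illustration, and treating the bounds as inclusive is an assumption.

```r
# Hypothetical toy data; only the Score column matters for this step
toy <- data.frame(Score = c(0.046, -0.261, 0.001, 0.88))

# 1 when Score lies in [-0.10, 0.10] (inclusive bounds assumed), else 0
toy$Correct <- as.integer(toy$Score >= -0.10 & toy$Score <= 0.10)

toy$Correct
# [1] 1 0 1 0
```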

What I want now is to update Correct within each combination of ID and Stimuli (A1, A2, A3). Specifically, whenever ANY row in a group has Correct equal to 1, all rows in that group should become 1, BUT ONLY for that stimulus and ID. In the example above, rows 1-2 would stay the same (1,1) and rows 3-4 would stay the same (0,0), but rows 5-6 (stimulus A3 for ID 1) would both become 1. For ID 2, A1 would stay (0,0), while A2 and A3 would each become (1,1).

I've tried several things, but I can't think of an easy way to do this. There are similar posts about replacing values in a data frame, but I haven't seen one that does it conditionally on the values of other variables in the same data frame.

2 Answers

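You can use dplyr::group_by() with any(Correct == 1). The unary + converts the logical result of any() to an integer (TRUE becomes 1, FALSE becomes 0), and mutate() recycles that single value across every row of the group: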

library(dplyr)

df %>%
  group_by(ID, Stimuli) %>% 
  mutate(Correct = +any(Correct == 1))

#------
      ID Stimuli  Score Correct
   <int> <chr>    <dbl>   <int>
 1     1 A1       0.046       1
 2     1 A1       0.037       1
 3     1 A2      -0.261       0
 4     1 A2       0.213       0
 5     1 A3       0.224       1
 6     1 A3       0.001       1
 7     2 A1      -1.38        0
 8     2 A1      -0.81        0
 9     2 A2      -0.03        1
10     2 A2       0.88        1
11     2 A3       0           1
12     2 A3       0.49        1

Data

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L), Stimuli = c("A1", "A1", "A2", "A2", "A3", "A3", "A1", 
"A1", "A2", "A2", "A3", "A3"), Score = c(0.046, 0.037, -0.261, 
0.213, 0.224, 0.001, -1.38, -0.81, -0.03, 0.88, 0, 0.49), Correct = c(1L, 
1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"))
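
If you would rather not load dplyr, the same per-group logic can be sketched in base R with ave(). This is an alternative to the approach above, not a replacement for it; the data frame name `toy` is made up here so it does not clash with `df`, and it carries only the columns the grouping needs.

```r
# Hypothetical copy of the grouping columns and flags from the example data
toy <- data.frame(
  ID      = rep(1:2, each = 6),
  Stimuli = rep(rep(c("A1", "A2", "A3"), each = 2), times = 2),
  Correct = c(1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L)
)

# ave() applies FUN within each ID x Stimuli group and writes the
# (recycled) result back in the original row order
toy$Correct <- ave(
  toy$Correct, toy$ID, toy$Stimuli,
  FUN = function(x) as.integer(any(x == 1))
)

toy$Correct
# [1] 1 1 0 0 1 1 0 0 1 1 1 1
```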

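This should also work, and is simpler. Because Correct only contains 0 and 1, the group maximum is 1 exactly when at least one row in the group is 1: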

library(dplyr)

df %>%
  group_by(ID, Stimuli) %>% 
  mutate(Correct = max(Correct))
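
As a quick sanity check, the sketch below compares max() with +any(x == 1) on a single group's values. The vectors `x` and `y` are made up for illustration. The two agree on clean 0/1 data, but they differ if a group contains an NA, which may or may not matter for your data.

```r
# One hypothetical group with clean 0/1 flags: both approaches give 1
x <- c(0L, 1L, 0L)
res_max <- max(x)
res_any <- +any(x == 1)

# One hypothetical group containing a missing flag
y <- c(0L, NA, 1L)
na_max     <- max(y)                # NA: max() propagates missing values
na_any     <- +any(y == 1)          # 1: a single TRUE is enough for any()
na_max_rm  <- max(y, na.rm = TRUE)  # 1: drop NAs explicitly
```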
