Replace values on a variable based on other variables in dataframe in R

Question

I have a long-format repeated measures dataset, similar to this:

   ID    Stimuli  Score Correct
   <fct> <chr>    <dbl>   <int>
 1 1     A1       0.046       1
 2 1     A1       0.037       1
 3 1     A2      -0.261       0
 4 1     A2       0.213       0
 5 1     A3       0.224       0
 6 1     A3       0.001       1
 7 2     A1      -1.38        0
 8 2     A1      -0.81        0
 9 2     A2      -0.03        1
10 2     A2       0.88        0
11 2     A3      -0.00        1
12 2     A3       0.49        0

I created the Correct variable based on whether the Score in each row falls within a specific range: 1 if Score is between -.10 and +.10, otherwise 0.
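
For reference, a flag like this can be built with a vectorized comparison. The sketch below is only an assumption about how Correct was created (the original code is not shown), and it treats both bounds as inclusive, which the question does not specify.

```r
# Scores from the example data
Score <- c(0.046, 0.037, -0.261, 0.213, 0.224, 0.001,
           -1.38, -0.81, -0.03, 0.88, 0, 0.49)

# Assumed rule: 1 if -0.10 <= Score <= 0.10, otherwise 0.
# The logical comparison is converted to integer 0/1.
Correct <- as.integer(Score >= -0.10 & Score <= 0.10)

print(Correct)
```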

What I want now is to update Correct within each combination of ID and stimulus (A1, A2, A3). Specifically, whenever ANY row of Correct is 1 for a given stimulus and ID, all rows should become 1, BUT ONLY for that stimulus and ID. In the example above, rows 1-2 of Correct would stay the same (1, 1), rows 3-4 would stay the same (0, 0), but rows 5-6 (stimulus A3, ID 1) would both become 1. For ID 2, stimulus A1 would stay (0, 0), while stimuli A2 and A3 would each become (1, 1).

I've tried several things but can't think of an easy way to do this. There are similar posts about replacing values in a data frame, but I haven't seen one that does it based on specific values of other variables in the same data frame.

nniloc · Accepted Answer · 2021-02-05 00:23:07Z

You can group by ID and Stimuli with dplyr::group_by() and then use any(Correct == 1) inside mutate(). The unary + converts the logical result to 1 or 0:

library(dplyr)

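df %>%
  group_by(ID, Stimuli) %>%                 # one group per ID/stimulus pair
  mutate(Correct = +any(Correct == 1)) %>%  # 1 if any row in the group is 1, else 0
  ungroup()                                 # drop the grouping afterwards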

#------
      ID Stimuli  Score Correct
   <int> <chr>    <dbl>   <int>
 1     1 A1       0.046       1
 2     1 A1       0.037       1
 3     1 A2      -0.261       0
 4     1 A2       0.213       0
 5     1 A3       0.224       1
 6     1 A3       0.001       1
 7     2 A1      -1.38        0
 8     2 A1      -0.81        0
 9     2 A2      -0.03        1
10     2 A2       0.88        1
11     2 A3       0           1
12     2 A3       0.49        1
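
The expression inside mutate() can be unpacked on a single group. This is an illustrative base R sketch, not part of the original answer; the vectors are the Correct values of two ID/stimulus groups from the example, and the variable names are made up for the illustration.

```r
# Correct values for one group: ID 1, stimulus A3
correct_a3 <- c(0L, 1L)

flag  <- any(correct_a3 == 1)  # TRUE: at least one row in the group is 1
value <- +flag                 # unary plus coerces TRUE to the integer 1

# A group with no 1s (ID 1, stimulus A2) collapses to 0 instead
value_a2 <- +any(c(0L, 0L) == 1)

print(flag)
print(value)
print(value_a2)
```

Inside a grouped mutate(), that single value is recycled to every row of the group, which is why both rows of a group end up with the same flag.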

Data

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L), Stimuli = c("A1", "A1", "A2", "A2", "A3", "A3", "A1", 
"A1", "A2", "A2", "A3", "A3"), Score = c(0.046, 0.037, -0.261, 
0.213, 0.224, 0.001, -1.38, -0.81, -0.03, 0.88, 0, 0.49), Correct = c(1L, 
1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"))

AnilGoyal · Accepted Answer · 2021-02-05 00:20:22Z

Since Correct only contains 0s and 1s, simply taking the maximum within each group should also work:

library(dplyr)

df %>%
  group_by(ID, Stimuli) %>%       # one group per ID/stimulus pair
  mutate(Correct = max(Correct))  # group maximum is 1 if any row is 1
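
On complete 0/1 data the max() and any() approaches should give the same result, but they can differ when a group contains NA. The sketch below checks this in base R, using ave() as a stand-in for group_by() plus mutate(); that substitution is only for illustration and is not taken from either answer.

```r
# Grouping columns and Correct values from the example data
ID      <- rep(1:2, each = 6)
Stimuli <- rep(c("A1", "A1", "A2", "A2", "A3", "A3"), times = 2)
Correct <- c(1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L)

# ave() applies FUN within each ID x Stimuli combination and returns
# a vector aligned with the original rows
by_max <- ave(Correct, ID, Stimuli, FUN = max)
by_any <- ave(Correct, ID, Stimuli, FUN = function(x) +any(x == 1))

print(by_max)
print(identical(by_max, by_any))  # same result on complete 0/1 data

# With a missing value in a group, the two approaches differ
with_na <- c(1L, NA)
print(max(with_na))        # NA, unless na.rm = TRUE is passed
print(+any(with_na == 1))  # 1, because one value is known to be 1
```

If Correct can contain NA, max(Correct, na.rm = TRUE) may be the safer variant of the second answer.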
