I would like to create a new variable called change_index, defined as (outcome1 at time 3 - outcome1 at time 1) / outcome1 at time 1.

How do I go about doing this? I tried doing the following

outcome1t0 <- data %>%
filter(time == "1") %>%
select(outcome1)

outcome1t12 <- data %>%
filter(time == "3") %>%
select(outcome1)

data$newvariable <- (outcome1t0 - outcome1t12) / outcome1t0

but I get the following error

Error in `$<-.data.frame`(`*tmp*`, bicind, value = list(bicep = c(13.3591525423729,  : 
replacement has 20 rows, data has 60

I realize this happens because the filtered data frames are smaller than the original, since they contain fewer rows. Should I just create a new data frame with the change index? How do I go about doing this?

I have to calculate this change index for many variables in columns (many outcomes). Is there a way to automate this process?

Thanks for reading.

   subject treatment time outcome1 outcome2
1       1         a    1       80       15
2       1         a    2       75       14
3       1         a    3       74       12
4       2         b    1       90       16
5       2         b    2       81       15
6       2         b    3       76       15

EDIT 1

I tried the following, as suggested below, with the names changed to match my data

ancestral1 %>%
group_by(subject) %>% 
mutate(bicep0 = bicep[time == 0],
     bicep12 = bicep[time == 12], 
     bicepind = (bicep12 - bicep0) / bicep12)

I get the following error

Error in mutate_impl(.data, dots) : 
Column `bicep0` must be length 1 (the group size), not 0

EDIT 2

Tried the new suggestion, still the same error

ancestral1 %>% 
group_by(subject) %>% 
mutate(bicep0 = if(any(time == 5)) bicep[time == 5] else NA, 
     bicep12 = bicep[time == 3], 
     bicepind = (bicep0 - bicep12) / bicep0)

Error in mutate_impl(.data, dots) : 
Column `bicep12` must be length 1 (the group size), not 0
  • The reason for the error is that, after you filter, each of the two objects has fewer rows than the original data Commented Sep 21, 2018 at 15:30
  • In the example you showed, both 'subject' have the 1 and 3. If it is not the case, it will result in error. You may have to change the example and also show the expected output in that case Commented Sep 21, 2018 at 15:55
  • Thanks, in my data set all subjects have outcomes at only times 0,6,12 weeks. There are about 40 subjects. I am not sure what is going wrong. Commented Sep 21, 2018 at 16:03
  • Please check the code data %>% group_by(subject) %>% mutate(outcome1t0 = if(any(time == 5)) outcome1[time == 5] else NA, outcome1t2 = outcome1[time == 3], newvariable = (outcome1t0 - outcome1t2) / outcome1t0) Commented Sep 21, 2018 at 16:04
  • Thanks, I tried that and got the same error. I have updated the main post. Commented Sep 21, 2018 at 16:12
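
The "must be length 1 (the group size), not 0" error means that, for at least one subject, the subset such as bicep[time == 3] matched no rows. A minimal diagnostic sketch follows. The data frame below is hypothetical (times recorded as 0, 6 and 12 as described in the comments, with one subject missing the last visit); only the column names are taken from the question.

```r
library(dplyr)

# Hypothetical data: times are 0, 6 and 12, and subject 3 has no week-12 row
ancestral1 <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2, 3, 3),
  time    = c(0, 6, 12, 0, 6, 12, 0, 6),
  bicep   = c(30, 31, 33, 28, 28, 29, 35, 36)
)

# Check how time is stored and which values actually occur
str(ancestral1$time)
unique(ancestral1$time)

# List subjects that do not have exactly one baseline and one final row
problems <- ancestral1 %>%
  group_by(subject) %>%
  summarise(n_t0 = sum(time == 0), n_t12 = sum(time == 12)) %>%
  filter(n_t0 != 1 | n_t12 != 1)
problems
```

If unique() shows values other than the ones used in the comparison (for example a factor or strings such as "0 weeks"), or if problems is not empty, the subsetting inside mutate() will fail for those subjects.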

1 Answer


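Instead of filtering into separate data frames, we group by subject and create the new variables with mutate, so every row of a subject receives the same value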

data %>%
  group_by(subject) %>% 
  mutate(outcome1t0 = outcome1[time == 1],
       outcome1t2 = outcome1[time == 3], 
       newvariable = (outcome1t0 - outcome1t2) / outcome1t0) %>%
  select(-outcome1t0, -outcome1t2)
# A tibble: 6 x 6
# Groups:   subject [2]
#  subject treatment  time outcome1 outcome2 newvariable
#    <int> <chr>     <int>    <int>    <int>       <dbl>
#1       1 a             1       80       15       0.075
#2       1 a             2       75       14       0.075
#3       1 a             3       74       12       0.075
#4       2 b             1       90       16       0.156
#5       2 b             2       81       15       0.156
#6       2 b             3       76       15       0.156
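
To apply the same calculation to several outcome columns and keep the result, one option is across(), which assumes dplyr 1.0.0 or later (newer than the version used in this thread). This is a sketch on the example data from the question, not a tested solution for the real data set; the "_change" suffix is an arbitrary naming choice. Assigning the pipeline back to data is what stores the new columns in the environment.

```r
library(dplyr)

# Example data from the question
data <- data.frame(
  subject   = c(1, 1, 1, 2, 2, 2),
  treatment = c("a", "a", "a", "b", "b", "b"),
  time      = c(1, 2, 3, 1, 2, 3),
  outcome1  = c(80, 75, 74, 90, 81, 76),
  outcome2  = c(15, 14, 12, 16, 15, 15)
)

# Same formula as above, applied to each listed outcome column;
# the result is assigned back so the new columns are kept
data <- data %>%
  group_by(subject) %>%
  mutate(across(
    c(outcome1, outcome2),
    ~ (.x[time == 1] - .x[time == 3]) / .x[time == 1],
    .names = "{.col}_change"
  )) %>%
  ungroup()

data
```

Like the code above, this assumes every subject has exactly one row at time 1 and one at time 3.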

Comments

Thanks @akrun, I tried doing this but get an error. I have updated my main post to show the error.
@DiscoR If you have only unique value of time for each 'subject' it should work
@DiscoR Create an if/else condition to handle a missing time value, e.g. when there is no time 5: data %>% group_by(subject) %>% mutate(outcome1t0 = if(any(time == 5)) outcome1[time == 5] else NA, outcome1t2 = outcome1[time == 3], newvariable = (outcome1t0 - outcome1t2) / outcome1t0)
Okay, Thanks a lot. I think it worked. I am not seeing the new variable in the data frame that I pull from the environment though. How do I create this new column and save it to data frame in the environment?
@DiscoR You should probably use first/last instead of hard-coding the times, eg DT %>% group_by(subject) %>% summarise(change = if (n() == 1) NA_real_ else (last(outcome1) - first(outcome1))/first(outcome1)) I guess there's some summarise_at or mutate_at if you try to tackle multiple outcomes this way
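
The first/last idea from the last comment can be sketched for several outcomes at once. The version below uses across() (dplyr 1.0.0 or later) rather than summarise_at, sorts by time so that first and last are the earliest and latest visits, and returns one row per subject. The DT data frame is hypothetical: it is the example data plus a made-up subject 3 with a single visit. Note that it computes (last - first) / first as in the comment, which has the opposite sign to the answer's formula.

```r
library(dplyr)

# Example data plus a hypothetical subject 3 with only one visit
DT <- data.frame(
  subject  = c(1, 1, 1, 2, 2, 2, 3),
  time     = c(1, 2, 3, 1, 2, 3, 1),
  outcome1 = c(80, 75, 74, 90, 81, 76, 70),
  outcome2 = c(15, 14, 12, 16, 15, 15, 11)
)

change_by_subject <- DT %>%
  arrange(subject, time) %>%   # ensure first/last follow time order
  group_by(subject) %>%
  summarise(across(
    starts_with("outcome"),
    # NA when a subject has fewer than two measurements
    ~ if (length(.x) < 2) NA_real_ else (last(.x) - first(.x)) / first(.x),
    .names = "{.col}_change"
  ))

change_by_subject
```

Because nothing is hard-coded to a particular time value, this avoids the length-0 error when time is recorded as 0, 6 and 12 instead of 1, 2 and 3.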