Imagine I have the following tibble as my data. It has columns with names like name1.x and name2.x; my real data set contains dozens of such columns.

MWE

library(tibble)

test <- tibble(subject = c(1,1,1, 2, 2, 2, 2, 3, 3), 
               name1.x = c(2, 4, 5, 2, 3, 6, 8, 1.2, 10), 
               name2.x = c(1, 6, 8, 1.2, 9, 6, 66, 1.2, 20), 
               visit = factor(c("base", "v1", "v2", "base", "v1", "v2", "v3", "base", "v1")))

  subject name1.x name2.x visit
    <dbl>   <dbl>   <dbl> <fct>
1       1     2       1   base 
2       1     4       6   v1   
3       1     5       8   v2   
4       2     2       1.2 base 
5       2     3       9   v1   
6       2     6       6   v2   
7       2     8      66   v3   
8       3     1.2     1.2 base 
9       3    10      20   v1   

What do I want?

I want to rescale the values in my target columns relative to their base visit and store the results in new columns named after the originals. All values belonging to a subject should be divided by that subject's base value. For subject 1, the first value of name1.x is 2, which is the base value, so it gives 2/2 = 1; the second value is 4, which gives 4/2 = 2; and so on. The rescaled values should go into a new column named after its source column: name1.x becomes name1.y and name2.x becomes name2.y. How can I achieve this using the tidyverse?

1 Answer

If I am reading your question correctly, the following is one way to solve it. First, I group the data by subject. Then, within each subject, I divide every column whose name contains "name" by the value in the row where visit is "base". Finally, I rename the new columns to get the desired names. I hope this helps.

library(dplyr)

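# Within each subject, divide every "name" column by its value at the base visit.
# Naming the function y makes mutate_at() append "_y" (name1.x_y, name2.x_y).
group_by(test, subject) %>% 
  mutate_at(vars(contains("name")),
            .funs = list(y = ~./.[visit == "base"])) %>% 
  # Drop the "x_" from the new names so that name1.x_y becomes name1.y.
  rename_at(vars(contains("_")),
            .funs = list(~sub(x = ., pattern = "(?<=.)._", replacement = "", perl = TRUE)))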

#  subject name1.x name2.x visit name1.y name2.y
#    <dbl>   <dbl>   <dbl> <fct>   <dbl>   <dbl>
#1       1     2       1   base     1        1  
#2       1     4       6   v1       2        6  
#3       1     5       8   v2       2.5      8  
#4       2     2       1.2 base     1        1  
#5       2     3       9   v1       1.5      7.5
#6       2     6       6   v2       3        5  
#7       2     8      66   v3       4       55  
#8       3     1.2     1.2 base     1        1  
#9       3    10      20   v1       8.33    16.7
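
For readers on dplyr 1.0.0 or later, where mutate_at() and rename_at() are superseded, the same idea can be sketched with across() and rename_with(). This is not the code from the answer above, only an equivalent under stated assumptions: every subject has exactly one "base" row, and the target columns all start with "name" and end in ".x". The name result is chosen here purely for illustration.

```r
library(dplyr)
library(tibble)

test <- tibble(subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
               name1.x = c(2, 4, 5, 2, 3, 6, 8, 1.2, 10),
               name2.x = c(1, 6, 8, 1.2, 9, 6, 66, 1.2, 20),
               visit = factor(c("base", "v1", "v2", "base", "v1", "v2", "v3", "base", "v1")))

result <- test %>%
  group_by(subject) %>%
  # Divide each target column by its base-visit value within the subject;
  # .names appends "_y", so the new columns are name1.x_y and name2.x_y.
  mutate(across(starts_with("name"),
                ~ .x / .x[visit == "base"],
                .names = "{.col}_y")) %>%
  ungroup() %>%
  # Turn the literal suffix ".x_y" into ".y" (fixed = TRUE avoids regex surprises).
  rename_with(~ sub(".x_y", ".y", .x, fixed = TRUE), ends_with(".x_y"))

result
```

If a subject had no "base" row, the denominator would be empty and mutate() would stop with an error, so it may be worth checking the data for that case first.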

2 Comments

It works. What does the tilde (~) do, and why is it needed? Could you please explain? I ended up using a different naming method: dplyr::rename_at(vars(ends_with(".x_y")), ~gsub(".x_y", ".res", .))
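@doctorate The ~ turns the expression into a formula, which dplyr treats as an anonymous function with . standing for each selected column. Without it, R would try to evaluate the expression immediately instead of applying it column by column, so you need it. As for the renaming, if that naming scheme suits you, go for it.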
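
As a small illustration of the tilde question, the sketch below uses rlang::as_function(), which converts a formula into an ordinary function, to show that a one-sided formula behaves like an anonymous function in a simple case. The names halve_lambda and halve_plain are made up for this example, and the division by 2 merely stands in for the base-visit division used in the answer.

```r
library(rlang)

# A formula lambda: "." (or ".x") stands for the argument passed in.
halve_lambda <- as_function(~ . / 2)

# The equivalent ordinary anonymous function.
halve_plain <- function(x) x / 2

values <- c(2, 4, 6)
identical(halve_lambda(values), halve_plain(values))
```

Inside mutate_at() the formula is additionally evaluated with access to the other columns of the current group, which is why visit can be referenced in the answer's code.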
