I have a data set with column names like below:

colnames(samp) 

[1] "RESPID"             "SAMPLE"             "Weight"             "Q1"                 "Q19A_1"            
 [6] "Q19B_1"             "Q19C_1"             "Q19E_1"             "Q19F_1"             "RECORDERLOOP_Q20_1"
[11] "RECORDERLOOP_Q20_2" "RECORDERLOOP_Q20_3" "RECORDERLOOP_Q20_4" "Q20_1_1"            "Q20_2_1"           
[16] "Q20_3_1" 

For the column names that start with "Q19" or "Q20" (i.e. a certain pattern), I want to remove _1 (i.e. _ and the number).

I know how it works for one column (e.g. Q19). It would be something like this:

library(dplyr)

samp_subset = samp %>%
  select(dplyr::contains("Q19")) 

colnames(samp_subset) = sub('.{2}$', '', colnames(samp_subset))  # drop the last two characters

However, I don't know how to restrict the renaming to certain columns only (e.g. those starting with Q19 or Q20, but not RESPID, SAMPLE, etc.).

2 Answers

Using dplyr, you can try rename_at:

library(dplyr)
df %>%  rename_at(vars(matches("^Q19|^Q20")), ~sub("_\\d+$", "", .))
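
In dplyr 1.0.0 and later, rename_at is superseded by rename_with, so the same idea can be written as below. This is only a sketch: the one-row data frame samp is a hypothetical stand-in built from a few of the column names in the question, and the cell values are made up.

```r
library(dplyr)

# Hypothetical one-row data frame; only the column names matter here
samp <- data.frame(RESPID = 1, Q1 = 2, Q19A_1 = 3,
                   RECORDERLOOP_Q20_1 = 4, Q20_1_1 = 5)

# Strip a trailing "_<digits>" only from columns starting with Q19 or Q20
renamed <- samp %>%
  rename_with(~ sub("_\\d+$", "", .x), .cols = matches("^Q(19|20)"))

names(renamed)
# "RESPID" "Q1" "Q19A" "RECORDERLOOP_Q20_1" "Q20_1"
```

Because the pattern is anchored with ^, RECORDERLOOP_Q20_1 is left alone even though it contains "Q20".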

Using base R, I think it would involve two steps: identify the columns, then replace their names.

vals <- grep("^Q19|^Q20", names(df))
names(df)[vals] <- sub("_\\d+$", "", names(df)[vals])
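
To check the two base R steps against the column names from the question, here is a small self-contained sketch. The data frame is a made-up one-row placeholder (all zeros); only the names are taken from the question.

```r
# Column names copied from the question
nms <- c("RESPID", "SAMPLE", "Weight", "Q1", "Q19A_1", "Q19B_1", "Q19C_1",
         "Q19E_1", "Q19F_1", "RECORDERLOOP_Q20_1", "RECORDERLOOP_Q20_2",
         "RECORDERLOOP_Q20_3", "RECORDERLOOP_Q20_4", "Q20_1_1", "Q20_2_1",
         "Q20_3_1")

# Placeholder data: one row of zeros, one column per name
samp <- as.data.frame(matrix(0, nrow = 1, ncol = length(nms)))
names(samp) <- nms

# Step 1: positions of the columns whose names start with Q19 or Q20
vals <- grep("^Q19|^Q20", names(samp))

# Step 2: remove the trailing "_<digits>" from those names only
names(samp)[vals] <- sub("_\\d+$", "", names(samp)[vals])

names(samp)[vals]
# "Q19A" "Q19B" "Q19C" "Q19E" "Q19F" "Q20_1" "Q20_2" "Q20_3"
```

Q1 and the RECORDERLOOP_Q20_* columns keep their original names, since neither matches the anchored pattern.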

We can use str_remove from stringr together with rename_at:

library(dplyr)
library(stringr)
df %>%
    rename_at(vars(matches("^Q(19|20)")), ~ str_remove(., "_\\d+$"))
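
As a side note on why the answers use "_\\d+$" rather than the '.{02}$' pattern from the question: removing a fixed two characters only works while the suffix is a single digit. The short base R comparison below illustrates this; the name "Q19A_10" is a hypothetical example and does not appear in the question's data.

```r
x <- c("Q19A_1", "Q19A_10", "Q20_1_1")   # "Q19A_10" is hypothetical

# Fixed-width removal: drops the last two characters, whatever they are
fixed_width <- sub(".{2}$", "", x)
fixed_width
# "Q19A" "Q19A_" "Q20_1"

# Pattern-based removal: drops a trailing underscore plus all digits after it
by_pattern <- sub("_\\d+$", "", x)
by_pattern
# "Q19A" "Q19A" "Q20_1"
```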
