Changing variable names in R based on key in another dataframe

Question

Suppose I have a dataset in R with variables named dog, cat, and cow. Each variable holds one value per respondent (i.e. it is 1 if the respondent owns one and 0 if they don't):

household_ID   dog   cat   cow
00001          0     1     1
00002          1     0     1
00003          0     0     0

Suppose I have another dataset where one column contains my current variables, and another column contains new variable names such that each row contains the new name that should replace the old name:

oldname  newname
dog      canine
cat      feline
cow      bovine

My goal in this oversimplified example is to replace the variable names of the first dataset using the second dataset. I'm imagining some kind of loop where you replace var = newname if var = oldname but I can't get the syntax right and I'm a little stumped. Here's what I'm hypothetically after:

household_ID   canine   feline   bovine
00001          0        1        1
00002          1        0        1
00003          0        0        0

Fixed it. I realized I couldn't use fish when I couldn't think of a word like "canine" or "feline" for "fish". Dumb.
– bricevk, Commented Jun 23, 2021 at 17:48

akrun · Accepted Answer · 2021-06-23 18:00:36Z

If 'cow' is the third 'oldname' value in the second dataset, we can use rename_with:

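library(dplyr)
# all_of() selects the columns listed in df2$oldname;
# the lambda returns their new names in the same order
df1 <- df1 %>%
     rename_with(~ df2$newname, all_of(df2$oldname))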

-output

df1
   household_ID canine feline bovine
1            1      0      1      1
2            2      1      0      1
3            3      0      0      0

Alternatively, we can use setnames from data.table:

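library(data.table)
setDT(df1)  # convert df1 to a data.table by reference
setnames(df1, df2$oldname, df2$newname)  # rename in place, pairing old and new names by position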
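
For readers who prefer to avoid extra packages, the same lookup can be sketched in base R with match(). This is not part of the original answer; it is a minimal illustration that rebuilds the example data inline, and the helper variable idx is a name chosen here for illustration. Columns that do not appear in the key (such as household_ID) keep their names.

```r
# Example data mirroring the question (same shape as df1 and df2 in this answer).
df1 <- data.frame(household_ID = 1:3,
                  dog = c(0L, 1L, 0L),
                  cat = c(1L, 0L, 0L),
                  cow = c(1L, 1L, 0L))
df2 <- data.frame(oldname = c("dog", "cat", "cow"),
                  newname = c("canine", "feline", "bovine"),
                  stringsAsFactors = FALSE)

# For each column of df1, find its row in the key (NA if the column is not listed).
idx <- match(names(df1), df2$oldname)

# Rename only the columns that were found; the rest are left untouched.
found <- !is.na(idx)
names(df1)[found] <- df2$newname[idx[found]]

names(df1)
# [1] "household_ID" "canine"       "feline"       "bovine"
```

Because the lookup goes through match(), the result does not depend on the key rows being in the same order as the columns of df1.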

Update

If the OP's data have columns such as 'dog1', 'dog2', 'cat1', 'cat2', etc., and the goal is to rename them to 'canine1', 'canine2', etc., we can use str_replace_all:

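library(stringr)
library(tibble)
# deframe() turns df2 into a named vector (names = oldname, values = newname),
# which str_replace_all() treats as pattern = replacement pairs
names(df1new)[-1] <- str_replace_all(names(df1new)[-1], deframe(df2))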

-output

df1new
  household_ID canine1 feline1 bovine1 canine2 feline2
1            1       0       1       1       0       1
2            2       1       0       1       1       0
3            3       0       0       0       0       0
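
One caveat: str_replace_all treats each old name as a pattern, so it would also rewrite a column that merely contains one (a hypothetical 'cowbell1' would become 'bovinebell1'). If that is a concern, a stricter variant can be sketched in base R by splitting each name into a base and a trailing numeric suffix and looking up only the base. This is an assumption-laden sketch rather than the answer's method: it presumes the index is always a run of digits at the end of the name, and it uses the column names from the follow-up comment.

```r
df2 <- data.frame(oldname = c("dog", "cat", "cow"),
                  newname = c("canine", "feline", "bovine"),
                  stringsAsFactors = FALSE)

# Column names from the follow-up comment, plus the ID column.
old_names <- c("household_ID", "dog1", "dog2", "cat1", "cow1", "cow2", "cow3")

# Split each name into its base (trailing digits removed) and the digit suffix.
base_part   <- sub("[0-9]+$", "", old_names)
suffix_part <- substring(old_names, nchar(base_part) + 1)

# Look up whole base names in the key; unmatched names (household_ID) give NA.
idx   <- match(base_part, df2$oldname)
found <- !is.na(idx)

# Rebuild only the matched names as new base + original suffix.
new_names <- old_names
new_names[found] <- paste0(df2$newname[idx[found]], suffix_part[found])

new_names
# [1] "household_ID" "canine1" "canine2" "feline1" "bovine1" "bovine2" "bovine3"
```

The result can then be assigned with names(df) <- new_names on a data frame whose columns are in the same order as old_names.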

data

df1 <- structure(list(household_ID = 1:3, dog = c(0L, 1L, 0L), cat = c(1L, 
0L, 0L), cow = c(1L, 1L, 0L)), class = "data.frame", row.names = c(NA, 
-3L))

df2 <- structure(list(oldname = c("dog", "cat", "cow"), newname = c("canine", 
"feline", "bovine")), class = "data.frame", row.names = c(NA, 
-3L))

df1new <- structure(list(household_ID = 1:3, dog1 = c(0L, 1L, 0L), cat1 = c(1L, 
0L, 0L), cow1 = c(1L, 1L, 0L), dog2 = c(0, 1, 0), cat2 = c(1, 
0, 0)), row.names = c(NA, -3L), class = "data.frame")

edited Jun 23, 2021 at 18:00

answered Jun 23, 2021 at 17:47

1 Comment

bricevk Over a year ago

That's great, thank you. Let me add one more nuance. Suppose df1 actually contains variables dog1, dog2, cat1, cow1, cow2, and cow3, and we have the same df2. So df2 doesn't contain the old and new variable names with the index at the end, only the base "dog", "cat", or "cow". Any thoughts on how to get from there to a place where my new variable names are canine1, canine2, feline1, etc.?
