
I want to set values in an R data frame based on matching row names to column names. The row names are var1, var2, var3, var4, etc., and the column names are x-var1-t1, x-var2-t1, x-var1-t4, x-var3-t1, x-var3-t7, etc. Each row name needs to match the "varN" part of the leading "x-varN" in a column name. For example, the var1 row should match x-var1-t1 and x-var1-t4.

So this data frame:

      x-var1-t1   x-var2-t1   x-var1-t4   x-var3-t1   x-var3-t7
var1          0           0           0           0           0
var2          0           0           0           0           0
var3          0           0           0           0           0
var4          0           0           0           0           0

would change to this:

      x-var1-t1   x-var2-t1   x-var1-t4   x-var3-t1   x-var3-t7
var1          1           0           1           0           0
var2          0           1           0           0           0
var3          0           0           0           1           1
var4          0           0           0           0           0

What's the best way to do this?

2 Answers


We can use sapply to loop over the row names of df and grepl to test which column names contain each row name, then set the matching cells to 1.

df[] <- t(sapply(rownames(df), function(x) as.numeric(grepl(x, colnames(df)))))
df

#     x.var1.t1 x.var2.t1 x.var1.t4 x.var3.t1 x.var3.t7
#var1         1         0         1         0         0
#var2         0         1         0         0         0
#var3         0         0         0         1         1
#var4         0         0         0         0         0

Or, as suggested by @Dan Y, we can skip the anonymous function and make this more compact:

df[] <- +t(sapply(rownames(df), grepl, colnames(df)))
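One caveat with grepl here: it treats each row name as a regular expression and matches it anywhere in the column name, so var1 would also match a column such as x-var10-t1. If that matters, a minimal sketch along the following lines may help. It assumes the column names follow the scheme "x-<row name>-<suffix>" with literal hyphens (the data, the extra x-var10-t1 column, and the helper matches_row are hypothetical, made up for illustration).

```r
# Hypothetical data: an extra column, x-var10-t1, to show the difference.
rows <- c("var1", "var2", "var3", "var4")
cols <- c("x-var1-t1", "x-var2-t1", "x-var1-t4", "x-var3-t1", "x-var10-t1")
m <- matrix(0, nrow = length(rows), ncol = length(cols),
            dimnames = list(rows, cols))
df <- data.frame(m, check.names = FALSE)  # keep the hyphens in the names

# Assumed naming scheme: "x-<row name>-<suffix>".
# startsWith() compares literal prefixes, so no regex escaping is needed.
matches_row <- function(rn) startsWith(colnames(df), paste0("x-", rn, "-"))

# sapply() returns one column per row name, so transpose before assigning;
# the unary + turns the logical matrix into integers.
df[] <- +t(sapply(rownames(df), matches_row))
df
```

If your column names were converted to x.var1.t1 style when the data frame was created, the prefix would need to be built with "." instead of "-".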

We can use adist to compare the row names to the column names.

 dat[] = +(!do.call(adist, c(partial = TRUE, dimnames(dat))))
 dat
     x.var1.t1 x.var2.t1 x.var1.t4 x.var3.t1 x.var3.t7
var1         1         0         1         0         0
var2         0         1         0         0         0
var3         0         0         0         1         1
var4         0         0         0         0         0

This is equivalent to:

  (adist(rownames(dat),colnames(dat),partial=TRUE)==0)+0

Adding 0 converts the result from logical to numeric. Multiplying by 1 works as well; both are identity operations that leave the values unchanged. adist with partial = TRUE uses the same approximate substring matching as agrep.
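To make the coercion step concrete, here is a small sketch on made-up vectors (rn and cn are hypothetical, not the data from the question). It shows that the distance is 0 exactly where the row name occurs as a substring, and what type each conversion produces.

```r
rn <- c("var1", "var4")            # hypothetical row names
cn <- c("x-var1-t1", "x-var2-t1")  # hypothetical column names

# With partial = TRUE the distance is 0 where a row name is a substring
# of a column name, and positive otherwise.
d <- adist(rn, cn, partial = TRUE)
hit <- d == 0                      # logical matrix

storage.mode(hit)       # "logical"
storage.mode(hit + 0)   # "double"
storage.mode(hit * 1)   # "double"
storage.mode(+hit)      # "integer"
```

All three conversions give the same 0/1 values; they differ only in whether the result is stored as double or integer.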
