R data table: Assign a value to column based on reference column

Question

I would like to assign a value into a column from a larger table, using another column as a reference.

E.g. data:

library(data.table)
dt <- data.table(N = 1:5, GPa1 = sample(0:5, 5), GPa2 = sample(5:15, 5),
                 GPb1 = sample(0:20, 5), GPb2 = sample(0:10, 5),
                 id = c("b", "a", "b", "b", "a"))

   N GPa1 GPa2 GPb1 GPb2 id
1: 1    4   10    7    0  b
2: 2    5   15   19    7  a
3: 3    1    5   20    5  b
4: 4    0   13    3    4  b
5: 5    3    7    8    1  a

The idea is to get new columns Val1 and Val2. Any GP column ending in 1 is eligible for Val1 and any ending in 2 is eligible for Val2. The value to be inserted into the column is determined, per row, by the id column.

So for Val1 you'd draw on the GPb1 column for row 1, then GPa1, GPb1, GPb1 again and finally GPa1.

The final result would be:

   N GPa1 GPa2 GPb1 GPb2 id Val1 Val2
1: 1    4   10    7    0  b   7    0
2: 2    5   15   19    7  a   5   15
3: 3    1    5   20    5  b  20    5
4: 4    0   13    3    4  b   3    4
5: 5    3    7    8    1  a   3    7

I did get the answer, but only in quite a few lines after melting the table, and I'm sure there must be a more elegant way to do this in data.table. I was initially frustrated that paste0 can't be used to pick the column inside data.table, because this returns the column name as a string rather than its value:

dt[1,paste0("GP",id,"1")]

but:

# The following gives a vector that is correct for Val1 (and works for 2)
diag(as.matrix(dt[,.SD,.SDcols=dt[,paste0("GP",id,"1")]]))

# I think the answer lies in `set`, but I've not had any luck.
for (i in 1:nrow(dt)) set(dt, i=dt[i,.SD,.SDcols=dt[,paste0("GP",id,"2")]], j=i, value=0)

The data is quite ugly this way so perhaps it's better to just use the melt method.
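
For what it's worth, the `diag` line above builds an n-by-n matrix only to keep its diagonal. A lighter variant, sketched here with the printed example values hard-coded so that it is reproducible, indexes a matrix of the GP columns with (row, column) pairs instead. The helper name `pick` is made up for illustration, and the sketch assumes every id has a matching `GP<id>1` and `GP<id>2` column.

```r
library(data.table)

# Example values copied from the printed table above (no sampling involved).
dt <- data.table(
  N    = 1:5,
  GPa1 = c(4L, 5L, 1L, 0L, 3L),   GPa2 = c(10L, 15L, 5L, 13L, 7L),
  GPb1 = c(7L, 19L, 20L, 3L, 8L), GPb2 = c(0L, 7L, 5L, 4L, 1L),
  id   = c("b", "a", "b", "b", "a")
)

gp_cols <- grep("^GP", names(dt), value = TRUE)
m <- as.matrix(dt[, .SD, .SDcols = gp_cols])

# pick() is a hypothetical helper: for each row it takes the column named
# "GP<id><suffix>" using base R matrix indexing with (row, column) pairs.
pick <- function(suffix) {
  cols <- match(paste0("GP", dt$id, suffix), colnames(m))
  m[cbind(seq_len(nrow(m)), cols)]
}

val1 <- pick(1)
val2 <- pick(2)
dt[, c("Val1", "Val2") := .(val1, val2)]
print(dt)
```

This still converts the GP columns to a matrix, so it only makes sense when they all share one type.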

Roland · Accepted Answer · 2018-04-12 07:38:39Z

dt[id == "a", c("Val1", "Val2") := .(GPa1, GPa2)]
dt[id == "b", c("Val1", "Val2") := .(GPb1, GPb2)]
#   N GPa1 GPa2 GPb1 GPb2 id Val1 Val2
#1: 1    2   13    5    8  b    5    8
#2: 2    3    8    7    2  a    3    8
#3: 3    5   11   19    1  b   19    1
#4: 4    4    5    6    9  b    6    9
#5: 5    1   15    1   10  a    1   15
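
If there are more than two id values, the same idea can be written as a loop rather than one line per letter. The following is only a sketch: it hard-codes the values printed in the question so the result is reproducible, and it assumes every id has exactly the columns `GP<id>1` and `GP<id>2`. The loop variable `g` is an illustrative name.

```r
library(data.table)

# Example values copied from the table printed in the question.
dt <- data.table(
  N    = 1:5,
  GPa1 = c(4L, 5L, 1L, 0L, 3L),   GPa2 = c(10L, 15L, 5L, 13L, 7L),
  GPb1 = c(7L, 19L, 20L, 3L, 8L), GPb2 = c(0L, 7L, 5L, 4L, 1L),
  id   = c("b", "a", "b", "b", "a")
)

# One assignment by reference per distinct id value: the rows with that id
# receive the values of the two matching GP columns.
for (g in unique(dt$id)) {
  dt[id == g, c("Val1", "Val2") := .SD, .SDcols = paste0("GP", g, 1:2)]
}
print(dt)
```

The first pass creates Val1 and Val2 with NA in the rows not yet matched, and later passes fill those rows in.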

3 Comments

Roland Over a year ago

The result differs because OP did not use set.seed.

Sam Over a year ago

Thanks, this is very simple. I guess I didn't think of this because my raw data goes up to GPh8, and the number of integers is not always the same for each letter, so it would require 8 individually tailored lines of code. Shall I adjust the original question to reflect this?

Roland Over a year ago

No, that would be a different question, so post a new question. The best answer then would almost certainly involve a reorganization/reshaping of your data.
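
As a rough illustration of what such a reshaping could look like (not necessarily what a dedicated answer would recommend), the sketch below melts the GP columns to long format, keeps the rows whose letter matches id, and casts back to wide. It assumes column names of the form `GP<letters><digits>` and reuses the values printed in the question. The names `gp_col`, `grp`, `target`, `wide` and `result` are illustrative.

```r
library(data.table)

# Example values copied from the table printed in the question.
dt <- data.table(
  N    = 1:5,
  GPa1 = c(4L, 5L, 1L, 0L, 3L),   GPa2 = c(10L, 15L, 5L, 13L, 7L),
  GPb1 = c(7L, 19L, 20L, 3L, 8L), GPb2 = c(0L, 7L, 5L, 4L, 1L),
  id   = c("b", "a", "b", "b", "a")
)

gp_cols <- grep("^GP", names(dt), value = TRUE)

# Long format: one row per (N, GP column) pair.
long <- melt(dt, id.vars = c("N", "id"), measure.vars = gp_cols,
             variable.name = "gp_col", value.name = "value",
             variable.factor = FALSE)

# Split names such as "GPb2" into the group letter ("b") and the target
# column name ("Val2").
long[, grp := sub("^GP([a-z]+)([0-9]+)$", "\\1", gp_col)]
long[, target := paste0("Val", sub("^GP([a-z]+)([0-9]+)$", "\\2", gp_col))]

# Keep only the rows whose group letter matches the row's id, then go wide.
wide <- dcast(long[grp == id], N ~ target, value.var = "value")

result <- merge(dt, wide, by = "N")
print(result)
```

When the letters do not all have the same number of columns, the Val columns that a given id lacks come out as NA for its rows.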
