
I would like to assign a value into a column from a larger table, using another column as a reference.

E.g. data:

library(data.table)
dt <- data.table(N = 1:5, GPa1 = sample(0:5, 5), GPa2 = sample(5:15, 5),
                 GPb1 = sample(0:20, 5), GPb2 = sample(0:10, 5),
                 id = c("b", "a", "b", "b", "a"))

   N GPa1 GPa2 GPb1 GPb2 id
1: 1    4   10    7    0  b
2: 2    5   15   19    7  a
3: 3    1    5   20    5  b
4: 4    0   13    3    4  b
5: 5    3    7    8    1  a

The idea is to get new columns Val1 and Val2. Any GP column ending in 1 is eligible for Val1 and any ending in 2 is eligible for Val2. The value to be inserted into each new column is determined, row by row, by the id column.

So you can see for Val1, you'd draw on the GPb1 column, then GPa1, GPb1, GPb1 again and finally GPa1.

The final result would be:

   N GPa1 GPa2 GPb1 GPb2 id Val1 Val2
1: 1    4   10    7    0  b   7    0
2: 2    5   15   19    7  a   5   15
3: 3    1    5   20    5  b  20    5
4: 4    0   13    3    4  b   3    4
5: 5    3    7    8    1  a   3    7

I did get the answer, but it took quite a few lines after melting the table, and I'm sure there must be a more elegant way to do this in data.table. I was initially frustrated that paste0 does not select a column inside data.table; this returns the column name as a string, not the column's value:

dt[1,paste0("GP",id,"1")]

but;

# The following gives a vector that is correct for Val1 (and works for 2)
diag(as.matrix(dt[,.SD,.SDcols=dt[,paste0("GP",id,"1")]]))

# I think the answer lies in `set`, but I've not had any luck.
for (i in 1:nrow(dt)) set(dt, i=dt[i,.SD,.SDcols=dt[,paste0("GP",id,"2")]], j=i, value=0)

The data is quite ugly this way so perhaps it's better to just use the melt method.


1 Answer

dt[id == "a", c("Val1", "Val2") := .(GPa1, GPa2)]
dt[id == "b", c("Val1", "Val2") := .(GPb1, GPb2)]
#   N GPa1 GPa2 GPb1 GPb2 id Val1 Val2
#1: 1    2   13    5    8  b    5    8
#2: 2    3    8    7    2  a    3    8
#3: 3    5   11   19    1  b   19    1
#4: 4    4    5    6    9  b    6    9
#5: 5    1   15    1   10  a    1   15
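
The two assignments above follow the same pattern, so they can be folded into a loop over the distinct id values. The following is only a sketch, not part of the original answer: it uses a fixed copy of the question's example table (so the output is reproducible without set.seed), and the helper names g and src are made up for illustration. It also assumes every id has exactly one column ending in 1 and one ending in 2.

```r
library(data.table)

# Fixed copy of the example table from the question (no random sampling)
dt <- data.table(N = 1:5,
                 GPa1 = c(4L, 5L, 1L, 0L, 3L), GPa2 = c(10L, 15L, 5L, 13L, 7L),
                 GPb1 = c(7L, 19L, 20L, 3L, 8L), GPb2 = c(0L, 7L, 5L, 4L, 1L),
                 id = c("b", "a", "b", "b", "a"))

# One subset assignment per id value; .SDcols picks the source columns by name
for (g in unique(dt$id)) {
  src <- paste0("GP", g, 1:2)   # e.g. "GPa1", "GPa2" (hypothetical helper)
  dt[id == g, c("Val1", "Val2") := .SD, .SDcols = src]
}

print(dt)
```

This still needs the same number of columns per letter, so it does not cover ragged layouts where the letters have different numbers of suffixes.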

3 Comments

The result differs because OP did not use set.seed.
Thanks, this is very simple. I guess I didn't think of this because my raw data goes up to GPh8, and each letter does not always have the same number of integer suffixes, so it would take 8 individually tailored lines of code. Shall I adjust the original question to reflect this?
No, that would be a different question, so post a new question. The best answer then would almost certainly involve a reorganization/reshaping of your data.
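
For illustration only, the reshaping idea mentioned in this comment might look roughly like the sketch below. It is an assumption about what such a reshape could be, not the commenter's code: it works on a fixed copy of the question's table, assumes column names of the form GP + one letter + one digit, and the names long, grp, k, target, picked and wide are invented for the example.

```r
library(data.table)

# Fixed copy of the example table from the question (no random sampling)
dt <- data.table(N = 1:5,
                 GPa1 = c(4L, 5L, 1L, 0L, 3L), GPa2 = c(10L, 15L, 5L, 13L, 7L),
                 GPb1 = c(7L, 19L, 20L, 3L, 8L), GPb2 = c(0L, 7L, 5L, 4L, 1L),
                 id = c("b", "a", "b", "b", "a"))

# Long format: one row per (N, GP column)
long <- melt(dt, id.vars = c("N", "id"),
             variable.name = "col", value.name = "val",
             variable.factor = FALSE)

# Split "GPa1" into the letter ("a") and the suffix ("1");
# assumes a single letter and a single digit
long[, grp := substr(col, 3, 3)]
long[, k := substr(col, 4, 4)]
long[, target := paste0("Val", k)]

# Keep only the rows whose letter matches that row's id, then go back to wide
picked <- long[grp == id]
wide <- dcast(picked, N ~ target, value.var = "val")

# Attach the new columns to the original table by reference
dt[wide, on = "N", c("Val1", "Val2") := .(i.Val1, i.Val2)]

print(dt)
```

Because the letter and suffix are parsed from the column names, the filtering step does not need one line per letter; letters with fewer suffixes would simply produce NA in the missing Val columns after dcast.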
