8

I have a data table:

library(data.table)
(f <- data.table(id1=c(1,2,3,1,2,3),
                 id2=as.factor(c("a","a","b","c","b","d")),
                 v=1:6,
                 key=c("id1","id2")))
   id1 id2 v
1:   1   a 1
2:   1   c 4
3:   2   a 2
4:   2   b 5
5:   3   b 3
6:   3   d 6
> str(f)
Classes ‘data.table’ and 'data.frame':  6 obs. of  3 variables:
 $ id1: num  1 1 2 2 3 3
 $ id2: Factor w/ 4 levels "a","b","c","d": 1 3 1 2 2 4
 $ v  : int  1 4 2 5 3 6
 - attr(*, "sorted")= chr  "id1" "id2"
 - attr(*, ".internal.selfref")=<externalptr> 

How do I add the "missing" rows?

I.e., for each existing id1 I want all possible values of id2 to be present (with v=0).

So, I need to add 6 rows (3 possible values of id1 * 4 possible values of id2 - 6 existing rows).

2 Answers 2

21

I'd get the unique values in id1 and id2 and do a join using data.table's cross join function CJ as follows:

# if you've already set the key:
ans <- f[CJ(id1, id2, unique=TRUE)][is.na(v), v := 0L][]

# or, if f is not keyed:
ans <- f[CJ(id1 = id1, id2 = id2, unique=TRUE), on=.(id1, id2)][is.na(v), v := 0L][]

ans
Sign up to request clarification or add additional context in comments.

Comments

2
f[, {
  tab = table(id2)
  x = as.numeric(tab)
  x[x != 0] = v
  list(id2 = names(tab), v = x)
}, by = id1]

##    id1 id2 v
##  1:   1   a 1
##  2:   1   b 0
##  3:   1   c 4
##  4:   1   d 0
##  5:   2   a 2
##  6:   2   b 5
##  7:   2   c 0
##  8:   2   d 0
##  9:   3   a 0
## 10:   3   b 3
## 11:   3   c 0
## 12:   3   d 6

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.