I have a data table like dt below. It's mostly complete, but has a few missing values that I'm trying to fill in a reasonable way.
set.seed(2015)
library(data.table)
dt <- data.table(id = 1:10,
                 x = sample(letters[1:3], 10, replace = TRUE),
                 y = sample(letters[4:6], 10, replace = TRUE),
                 key = "id")
dt[sample(10, 3), y := ""]
dt
    id x y
 1:  1 a f
 2:  2 c
 3:  3 a d
 4:  4 a
 5:  5 a f
 6:  6 b f
 7:  7 b
 8:  8 a d
 9:  9 b f
10: 10 b e
For each missing y, I would like to set y to the most frequent non-blank y value among the rows that share its x value. In the case of a tie, pick one of the tied values at random. If the x group has no non-blank y values at all, leave y blank. In this example, my data table should be transformed to
    id x y
 1:  1 a f
 2:  2 c
 3:  3 a d
 4:  4 a d
 5:  5 a f
 6:  6 b f
 7:  7 b f
 8:  8 a d
 9:  9 b f
10: 10 b e
or
    id x y
 1:  1 a f
 2:  2 c
 3:  3 a d
 4:  4 a f
 5:  5 a f
 6:  6 b f
 7:  7 b f
 8:  8 a d
 9:  9 b f
10: 10 b e
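(The y value in row 4 could become d or f, because d and f are tied at two occurrences each among the rows with x equal to a. Row 2 stays blank because it is the only row with x equal to c.)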
I couldn't figure out an efficient way to do this.
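To make the rule above concrete, here is a minimal sketch of it as a grouped update. It is only one possible reading of the requirement, not a tuned solution. The helper name pick_mode and the temporary column fill are hypothetical names introduced for illustration, and the table is typed in by hand to match the printed dt so that the result does not depend on the random number generator.

```r
library(data.table)

# Hypothetical helper (not from the post): most frequent non-blank value in v,
# with ties broken at random; returns "" when v has no non-blank values.
pick_mode <- function(v) {
  v <- v[!is.na(v) & v != ""]
  if (length(v) == 0L) return("")
  counts <- table(v)
  winners <- names(counts)[counts == max(counts)]
  # sample.int() on the index avoids surprises when there is a single winner
  winners[sample.int(length(winners), 1L)]
}

# Same values as the dt printed above, entered by hand.
dt <- data.table(
  id = 1:10,
  x  = c("a", "c", "a", "a", "a", "b", "b", "a", "b", "b"),
  y  = c("f", "",  "d", "",  "f", "f", "",  "d", "f", "e"),
  key = "id"
)

# One fill value per x group, then assign it to the blank rows only.
dt[, fill := pick_mode(y), by = x]
dt[y == "", y := fill]
dt[, fill := NULL]
dt
```

Note that this sketch draws the tie-break once per x group, so every blank row in a group receives the same value. If each blank row should get its own random draw among the tied winners, the helper would need to be called per row instead; that choice is an assumption here.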