Another solution
dt[, .(dt[, 1], Freq = Prop * 1000)]
gender Freq
1: Male 490
2: Female 510
Some benchmark of the options given in all answers
Note that I increased the sample data quite a bit, but I was just curious about the differences between the methods for other data sets as well.
Transform is very slow here and not recommended, the other methods are quite similar and the power of .SD and .SDcols are the fastest, although in this case, keeping all your rows and do not update anything by reference using the first method is hardly slower.
set.seed(42)
dt <- data.table(
gender = rep(LETTERS[1:25], 40000),
Prop = runif(n = 1000000))
library(rbenchmark)
benchmark(
"dt[, .(dt[, 1], Freq = Prop * 1000)]" = {
dt[, .(dt[, 1], Freq = Prop * 1000)]
},
"dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = 1]" = {
dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = 1]
},
"dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = -\"Prop\"]" = {
dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = -"Prop"]
},
"dt[, transform(.SD, Freq = Prop * 1000, Prop = NULL)]" = {
dt[, transform(.SD, Freq = Prop * 1000, Prop = NULL)]
},
"transform(dt, Freq = Prop * 1000, Prop = NULL)" = {
transform(dt, Freq = Prop * 1000, Prop = NULL)
},
replications = 1000,
columns = c("test", "replications", "elapsed", "relative")
)
# test replications elapsed relative
# 1 dt[, .(dt[, 1], Freq = Prop * 1000)] 1000 18.66 1.112
# 3 dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = -"Prop"] 1000 17.02 1.014
# 2 dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = 1] 1000 16.78 1.000
# 4 dt[, transform(.SD, Freq = Prop * 1000, Prop = NULL)] 1000 333.51 19.875
# 5 transform(dt, Freq = Prop * 1000, Prop = NULL) 1000 329.41 19.631
Sidenote
Keep in mind that creating the column by reference is like a 5-fold faster
dt[, Freq := Prop * 1000] and OP uses the argument the table is re-used later. I would suggest always to do all calculations and preparations by reference on the table when it gains in speed. You can always subset your output from there.
# test replications elapsed relative
# 1 dt[, .(dt[, 1], Freq = Prop * 1000)] 1000 16.25 5.783
# 2 dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = 1] 1000 13.33 4.744
# 3 t[, Freq := Prop * 1000] 1000 2.81 1.000