I want to get the unique numeric id values across multiple numeric id columns. My goal is to summarize the flow of changes in a database as users modify multiple tables; in my example, from table A to B and then back to A.

I know I could do this by combining the columns into one list, but I want to make use of data.table internals to improve efficiency if possible.

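library(data.table)

set.seed(1)
# tbl_A_create_uid has length 2 and is recycled to match the 4 rows
dt <- data.table(tbl_A_create_uid = sample(1:2),
                 tbl_A_update_uid = sample(1:4))
dt[, tbl_B_create_uid := tbl_A_update_uid]
dt[, tbl_B_update_uid := sample(1:4)]
# append the table B updaters as new table A creators; missing columns are filled with NA
dt_after_update <- rbind(dt,
                         data.table(tbl_A_create_uid = dt[, tbl_B_update_uid]),
                         use.names = TRUE,
                         fill = TRUE)
dt_after_update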
# > dt_after_update
#    tbl_A_create_uid tbl_A_update_uid tbl_B_create_uid tbl_B_update_uid
# 1:                1                3                3                4
# 2:                2                4                4                2
# 3:                1                1                1                3
# 4:                2                2                2                1
# 5:                4               NA               NA               NA
# 6:                2               NA               NA               NA
# 7:                3               NA               NA               NA
# 8:                1               NA               NA               NA

Wanted: a vector or data.table containing the unique values, e.g., c(1,2,3,4)

1 Answer

Would this work?

melt(dt_after_update)[, unique(value)]  # ignore the warning

If you don't want the NAs:

melt(dt_after_update)[!is.na(value), unique(value)]  # ignore the warning
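
The warning appears because neither id.vars nor measure.vars is supplied, so melt() has to guess them. As a minimal sketch, assuming you would rather avoid the warning than ignore it, the measure columns can be named explicitly. The small table dt_demo below is a made-up stand-in for dt_after_update, not the asker's data:

```r
library(data.table)

# Hypothetical stand-in for dt_after_update, including NAs from fill = TRUE
dt_demo <- data.table(
  tbl_A_create_uid = c(1L, 2L, 4L),
  tbl_A_update_uid = c(3L, 4L, NA),
  tbl_B_update_uid = c(4L, 2L, NA)
)

# Naming every column as a measure variable means nothing has to be guessed;
# na.rm = TRUE drops the rows whose value is NA
long <- melt(dt_demo, measure.vars = names(dt_demo), na.rm = TRUE)

# Unique ids across all columns, sorted for readability
uids <- sort(unique(long$value))
uids
```

This still requires all measure columns to share a compatible type, since they are stacked into a single value column.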

3 Comments

Thanks - yes, I didn't realize you could melt without giving column names in that way. Great solution.
Another option without melting: dt_after_update[, unique(unlist(lapply(.SD, unique)))]
Great idea to use melt(), but this only works if every column in the dataset is numeric, integer, or logical. The warning from melt() says "all non-numeric/integer/logical type columns are considered id.vars" when both id.vars and measure.vars are NULL. The suggestion by @chinsoon12 works independently of data type.
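
To illustrate the point about column types, here is a hedged sketch of the lapply(.SD, unique) approach on a table that also contains a character column. The table dt_mixed, the note column, and the "_uid$" naming pattern are assumptions made for this example; .SDcols restricts the computation to the id columns so that the character column does not get mixed into the result:

```r
library(data.table)

# Hypothetical table with two id columns and one non-numeric column
dt_mixed <- data.table(
  tbl_A_create_uid = c(1L, 2L, 4L),
  tbl_A_update_uid = c(3L, 4L, NA),
  note             = c("a", "b", "c")
)

# Select the id columns by name (assumes they all end in "_uid")
uid_cols <- grep("_uid$", names(dt_mixed), value = TRUE)

# De-duplicate each column first, then de-duplicate the combined vector
uids <- dt_mixed[, unique(unlist(lapply(.SD, unique), use.names = FALSE)),
                 .SDcols = uid_cols]

# Drop the NA and sort for readability
uids <- sort(uids[!is.na(uids)])
uids
```

De-duplicating per column before unlist() keeps the intermediate vector small when the columns contain many repeated ids.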
