Subsetting an R data.table using another data.table

Question

My question is similar to "Subsetting a data.table using another data.table" and "Subset a data.table by matching columns of another data.table".

dt is the same as in those questions:

dt

   id year event
1:  2 2005     1
2:  2 2006     1
3:  2 2007     1
4:  4 2008     1
5:  4 2009     1
6:  2 2005     0
7:  4 2006     0
8:  4 2007     0
9:  2 2008     0

dt <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009,2005:2008),
                 event = rep(1:0, times=c(5, 4)))

However, dt1 is slightly different:

dt1

   year performance  event
1: 2005        1000      1
2: 2006        1001      1
3: 2007        1002      1
4: 2008        1003      1
5: 2009        1004      1
6: 2005        1005      0
7: 2006        1006      0
8: 2007        1007      0
9: 2008        1008      0

dt1 <- data.table(year = c(2005:2009,2005:2008), performance = 1000:1008,
                  event = rep(1:0, times=c(5, 4)))

I want to split dt1 based on dt's id, with the rows of each piece grouped by event. The desired output is two separate data.tables:

dt1.sub1
   year performance  event
1: 2005        1000      1
2: 2006        1001      1
3: 2007        1002      1
4: 2005        1005      0
5: 2008        1008      0


dt1.sub2
   year performance  event
1: 2008        1003      1
2: 2009        1004      1
3: 2006        1006      0
4: 2007        1007      0

Is there a way to achieve this without using merge?

No, I made a mistake: everything is the same in dt and dt1, except that dt has an additional id column. I want to split dt1 based on dt's id.
– morningfin, Commented Apr 13, 2016 at 21:32
You should edit and clean up your question. It is unclear what you are asking.
– jangorecki, Commented Apr 13, 2016 at 23:01

akrun · Accepted Answer · 2016-04-14 02:40:38Z

2

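We can use split to create a list of data.tables, with dt$id as the grouping vector. This pairs rows by position, so it relies on dt and dt1 having their rows in the same order (as they do here).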

lst <- split(dt1, dt$id)
names(lst) <- paste0('dt1.sub', seq_along(lst))
lst
#$dt1.sub1
#   year performance event
#1: 2005        1000     1
#2: 2006        1001     1
#3: 2007        1002     1
#4: 2005        1005     0
#5: 2008        1008     0

#$dt1.sub2
#   year performance event
#1: 2008        1003     1
#2: 2009        1004     1
#3: 2006        1006     0
#4: 2007        1007     0

It is better to work within the list. However, if it is really needed, separate data.table objects can be created in the global environment with list2env:

list2env(lst, envir = .GlobalEnv)
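
As an illustration of working within the list, here is a minimal, self-contained sketch built on the question's data. The alignment check and the names mean_perf and combined are additions for this example and not part of the original answer.

```r
library(data.table)

# Data from the question
dt  <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009, 2005:2008),
                  event = rep(1:0, times = c(5, 4)))
dt1 <- data.table(year = c(2005:2009, 2005:2008), performance = 1000:1008,
                  event = rep(1:0, times = c(5, 4)))

# split(dt1, dt$id) pairs rows by position, so first confirm that the
# columns the two tables share line up row for row
stopifnot(identical(dt$year, dt1$year), identical(dt$event, dt1$event))

lst <- split(dt1, dt$id)
names(lst) <- paste0("dt1.sub", seq_along(lst))

# Work within the list: one summary value per subset
mean_perf <- sapply(lst, function(d) mean(d$performance))

# Or stack the pieces back into one table, with the list names
# stored in a new "subset" column
combined <- rbindlist(lst, idcol = "subset")
```

Keeping the pieces in lst means any later step can be applied to every subset with a single sapply or lapply call, rather than being repeated for dt1.sub1, dt1.sub2, and so on.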

edited Apr 14, 2016 at 2:40

answered Apr 14, 2016 at 2:33


eddi · Accepted Answer · 2016-04-14 14:35:31Z

2

# Join dt1 to dt on year and event to recover id, then collect each id's rows into a list
dt[dt1, on = c('year', 'event')][, .(list(.SD)), by = id]$V1
#[[1]]
#   year event performance
#1: 2005     1        1000
#2: 2006     1        1001
#3: 2007     1        1002
#4: 2005     0        1005
#5: 2008     0        1008
#
#[[2]]
#   year event performance
#1: 2008     1        1003
#2: 2009     1        1004
#3: 2006     0        1006
#4: 2007     0        1007
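
The join in this answer can also be combined with split to get a named list. The following is a sketch, not part of the original answer: it assumes a data.table version whose split method accepts the by argument (1.9.8 or later, as far as I know), and the uniqueness check and the names joined and lst2 are introduced here for illustration only.

```r
library(data.table)

# Data from the question
dt  <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009, 2005:2008),
                  event = rep(1:0, times = c(5, 4)))
dt1 <- data.table(year = c(2005:2009, 2005:2008), performance = 1000:1008,
                  event = rep(1:0, times = c(5, 4)))

# The join can only recover a single id per dt1 row if each
# (year, event) pair occurs once in dt
stopifnot(anyDuplicated(dt, by = c("year", "event")) == 0L)

# Attach id to every row of dt1; rows come back in dt1's order
joined <- dt[dt1, on = c("year", "event")]

# split() with `by` names the list after the id values ("2" and "4" here);
# keep.by = FALSE drops the id column from each piece
lst2 <- split(joined, by = "id", keep.by = FALSE)
```

Unlike splitting by position, this version does not depend on dt and dt1 sharing a row order, only on year and event together identifying a row in dt.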

answered Apr 14, 2016 at 14:35
