
My question is similar to "Subsetting a data.table using another data.table" and "Subset a data.table by matching columns of another data.table".

dt is the same as in those questions:

dt

   id year event
1:  2 2005     1
2:  2 2006     1
3:  2 2007     1
4:  4 2008     1
5:  4 2009     1
6:  2 2005     0
7:  4 2006     0
8:  4 2007     0
9:  2 2008     0

dt <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009,2005:2008),
                 event = rep(1:0, times=c(5, 4)))

However, dt1 is slightly different:

dt1

   year performance  event
1: 2005        1000      1
2: 2006        1001      1
3: 2007        1002      1
4: 2008        1003      1
5: 2009        1004      1
6: 2005        1005      0
7: 2006        1006      0
8: 2007        1007      0
9: 2008        1008      0

dt1 <- data.table(year = c(2005:2009,2005:2008), performance = 1000:1008,
                  event = rep(1:0, times=c(5, 4)))

I want to split dt1 based on dt's id, grouped by event. The desired output is two separate data.tables:

dt1.sub1
   year performance  event
1: 2005        1000      1
2: 2006        1001      1
3: 2007        1002      1
4: 2005        1005      0
5: 2008        1008      0


dt1.sub2
   year performance  event
1: 2008        1003      1
2: 2009        1004      1
3: 2006        1006      0
4: 2007        1007      0

Is there a way to achieve this without using merge?

  • No, I made a mistake: everything is the same in dt and dt1, except that dt has an additional id column. I want to split dt1 based on dt's id. Commented Apr 13, 2016 at 21:32
  • No, I don't think so. Commented Apr 13, 2016 at 21:52
  • You should edit/clean up your question. It is unclear what you are asking. Commented Apr 13, 2016 at 23:01

2 Answers


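We can use split to create a list of data.tables. This relies on dt and dt1 having their rows in the same order, because row i of dt1 is assigned to the group given by dt$id[i].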

lst <- split(dt1, dt$id)
names(lst) <- paste0('dt1.sub', seq_along(lst))
lst
#$dt1.sub1
#   year performance event
#1: 2005        1000     1
#2: 2006        1001     1
#3: 2007        1002     1
#4: 2005        1005     0
#5: 2008        1008     0

#$dt1.sub2
#   year performance event
#1: 2008        1003     1
#2: 2009        1004     1
#3: 2006        1006     0
#4: 2007        1007     0

It is usually better to keep working within the list. However, if separate data.table objects are really needed, they can be created in the global environment with list2env:

list2env(lst, envir = .GlobalEnv)
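
As a rough illustration of working within the list instead of exporting objects, the sketch below first checks that dt and dt1 are row-aligned (the assumption split relies on here), then applies the same per-event summary to each piece with lapply. The column name mean_perf and the id column name piece are hypothetical, chosen only for this example.

```r
library(data.table)

dt <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009,2005:2008),
                 event = rep(1:0, times=c(5, 4)))
dt1 <- data.table(year = c(2005:2009,2005:2008), performance = 1000:1008,
                  event = rep(1:0, times=c(5, 4)))

# split() pairs row i of dt1 with dt$id[i], so both tables must be row-aligned.
stopifnot(nrow(dt) == nrow(dt1),
          identical(dt$year, dt1$year),
          identical(dt$event, dt1$event))

lst <- split(dt1, dt$id)
names(lst) <- paste0("dt1.sub", seq_along(lst))

# Apply the same summary to every piece without leaving the list.
# mean_perf is a hypothetical column name used for illustration.
summaries <- lapply(lst, function(d) d[, .(mean_perf = mean(performance)), by = event])

# Optionally stack the per-piece results; "piece" records which element each row came from.
combined <- rbindlist(summaries, idcol = "piece")
```

Keeping the pieces in one list means any later step can be written once and applied to all of them, and rbindlist can stack the results again when a single table is more convenient.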

Alternatively, join dt with dt1 on year and event, then collect the rows for each id into a list:

dt[dt1, on = c('year', 'event')][, .(list(.SD)), by = id]$V1
#[[1]]
#   year event performance
#1: 2005     1        1000
#2: 2006     1        1001
#3: 2007     1        1002
#4: 2005     0        1005
#5: 2008     0        1008
#
#[[2]]
#   year event performance
#1: 2008     1        1003
#2: 2009     1        1004
#3: 2006     0        1006
#4: 2007     0        1007
