
In the following snippet, data.table does not seem to recognize logicals when used in i.

All my attempts to reproduce the problem in a minimal example failed, which is why I am posting the complete section here. I suspect it is related to the part "as.logical(cumsum(CURRENT_TRIP))", but that is just a gut feeling.

# Testdata
timetable <- data.table(rbind(
    c("r1", "t1_1", "p1", 10, 10),
    c("r1", "t1_1", "p2", 11, 11),
    c("r1", "t1_1", "p3", 12, 12),
    c("r1", "t1_1", "p4", 13, 13),
    c("r1", "t1_1", "p5", 14, 14),
    c("r1", "t1_1", "p6", 15, 15),
    c("r1", "t1_1", "p7", 16, 16),
    c("r1", "t1_1", "p8", 17, 17),
    c("r1", "t1_1", "p9", 18, 18),
    c("r1", "t1_1", "p10", 19, 19),

    c("r2", "t2", "p11", 9, 9),
    c("r2", "t2", "p12", 10, 10),
    c("r2", "t2", "p3", 11, 11),
    c("r2", "t2", "p13", 12, 12),
    c("r2", "t2", "p14", 13, 13),
    c("r2", "t2", "p15", 14, 14),
    c("r2", "t2", "p16", 15, 15),
    c("r2", "t2", "p17", 16, 16),
    c("r2", "t2", "p18", 17, 17)
  ))
setnames(timetable, c("ROUTE", "TRIP", "STOP", "ARRIVAL", "DEPARTURE"))
timetable[, ':='(ARRIVAL = as.integer(ARRIVAL), DEPARTURE = as.integer(DEPARTURE))]


# Input
startStation <- "p3"
startTime <- 8

setorder(timetable, TRIP, ARRIVAL)
timetable[, ID := .I]

timetable[,':='(ARR_ROUND_PREV = Inf, ARR_ROUND = Inf, ARR_BEST = Inf, MARKED = F, CURRENT_TRIP = F)]
timetable[STOP == startStation, ':='(ARR_ROUND_PREV = startTime, ARR_ROUND = startTime, ARR_BEST = startTime, MARKED = T)]

routes <- timetable[MARKED == T, unique(ROUTE)] 
ids <- timetable[MARKED == T & DEPARTURE > ARR_ROUND, .(ID = ID[DEPARTURE == min(DEPARTURE)]), by = ROUTE][, ID]

timetable[ID %in% ids, CURRENT_TRIP := T]
timetable[, MARKED := F]

trips <- timetable[CURRENT_TRIP == T, unique(TRIP)]
timetable[TRIP %in% trips, CURRENT_TRIP := as.logical(cumsum(CURRENT_TRIP)), by = TRIP]

# ?
timetable
nrow(timetable[CURRENT_TRIP == T]) #8
sum(timetable$CURRENT_TRIP == T) #15

# but 
nrow(timetable[CURRENT_TRIP > 0]) #15
nrow(timetable[CURRENT_TRIP == 1L]) #15

Any ideas?

The problem shows up with both data.table 1.9.6 and the newest 1.9.7, using R 3.2.3 on 64-bit Windows.

Fab

  • Seems like a bug to me. You can set options(datatable.auto.index = FALSE) or use nrow(timetable[(CURRENT_TRIP == T)]). Btw, your way of creating the initial data.table is stupid. Don't use rbind/cbind for this. Commented Dec 23, 2015 at 10:24
  • Besides the construction of your data.table, you are using single quotes (':=') instead of backticks around the :=. Furthermore: I can't reproduce the problem. Commented Dec 23, 2015 at 10:42
  • Could you point out where you do have a problem? as.logical(cumsum(CURRENT_TRIP)) is working as expected (using data.table 1.9.7 & R 3.2.2 on OSX). Commented Dec 23, 2015 at 10:47
  • @Jaap The problem is the result of nrow(timetable[CURRENT_TRIP == T]), which should be 15, not 8. (But I admit, the example is reproducible, but far from minimal.) Commented Dec 23, 2015 at 10:51
  • @Roland that is indeed strange. Commented Dec 23, 2015 at 11:22
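
The two suggestions from the first comment can be sketched together. This is only an illustration under assumptions: the table `tt` and the names `n_no_index`, `n_parens` and `n_base` are hypothetical, and a table this small is not expected to trigger the bug. It shows building the table column by column (so ARRIVAL is integer from the start, with no rbind() coercion to character) and the two suggested ways of subsetting.

```r
library(data.table)

# Hypothetical three-stop table, built column by column. Each column keeps
# its own type, so no as.integer() conversion is needed afterwards.
tt <- data.table(
  ROUTE        = "r1",                 # length-1 value is recycled
  STOP         = c("p1", "p2", "p3"),
  ARRIVAL      = 10:12,
  CURRENT_TRIP = c(FALSE, TRUE, TRUE)
)

# Suggestion 1: switch off automatic indexing for the subset,
# then restore the previous option value.
old <- options(datatable.auto.index = FALSE)
n_no_index <- nrow(tt[CURRENT_TRIP == TRUE])
options(old)

# Suggestion 2: wrap the condition in parentheses, as proposed in the comment.
n_parens <- nrow(tt[(CURRENT_TRIP == TRUE)])

# Reference count computed with plain base R on the column.
n_base <- sum(tt$CURRENT_TRIP)
```

If all three counts agree on your real table, the subset is behaving; if only the plain `tt[CURRENT_TRIP == TRUE]` form disagrees, that points at the subsetting optimisation rather than at the data.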

1 Answer


You have exactly the same bug that I have!

See: Strange issue with data.table row search

I also could not reproduce it with a minimal example.

My workaround for your code is to change how the column CURRENT_TRIP is set:

timetable[ID %in% ids]$CURRENT_TRIP <- T
timetable[, MARKED := F]

trips <- timetable[CURRENT_TRIP == T, unique(TRIP)]
timetable[TRIP %in% trips]$CURRENT_TRIP <- timetable[,as.logical(cumsum(CURRENT_TRIP)), by = TRIP]$V1

# ?
timetable
nrow(timetable[CURRENT_TRIP == T]) #8
sum(timetable$CURRENT_TRIP == T) #15

# but 
nrow(timetable[CURRENT_TRIP > 0]) #15
nrow(timetable[CURRENT_TRIP == 1L]) #15

Using the dT[, Column := T] notation to set columns also caused the same issue for me. I am not sure why, and I am in touch with the creator of data.table to get this fixed.
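
A small consistency check can help tell whether a table is affected. The sketch below rests on an assumption that this thread does not confirm: that the mismatch comes from an automatically created index on the column that no longer matches its values after a grouped := update. The table `dt` and the names `n_subset` and `n_vector` are hypothetical, and on a fixed data.table version both counts simply agree.

```r
library(data.table)

# Hypothetical toy table, not the data from the question.
dt <- data.table(
  grp  = c("a", "a", "a", "b", "b", "b"),
  flag = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE)
)

# A first subset on flag may create an automatic index on that column.
invisible(dt[flag == TRUE])

# Grouped update with the same shape as the CURRENT_TRIP update above.
dt[, flag := as.logical(cumsum(flag)), by = grp]

n_subset <- nrow(dt[flag == TRUE])  # goes through data.table's subsetting in i
n_vector <- sum(dt$flag)            # plain base R on the column

if (n_subset != n_vector) {
  # Assumed remedy: drop all secondary indices, then subset again.
  setindex(dt, NULL)
  n_subset <- nrow(dt[flag == TRUE])
}
```

Calling `indices(dt)` before and after the update shows which indices exist at each point, which is useful information to include in a bug report.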


3 Comments

To add: this was tested on the latest stable data.table release at the time of writing (1.9.6) with R 3.2.2 on Mac OS X. Both the bugged code and the fix behaved as expected.
Hi, the bug was reported and accepted by the package developers. See here for updates: github.com/Rdatatable/data.table/issues/1479
Thank you! This is nice!
