I would like to speed up the following R code. The loop assigns a new individual label (column newID) each time the variable change equals 1. Any ideas on how to improve the loop would be greatly appreciated.
Here is the code:
## Build the data frame
dat <- expand.grid(x = 1:1000, ID = as.character(seq(0, 4000, 1)))
dat$change <- 0
dat[which(dat$x == 1), c("change")] <- 1
dat[which(dat$x == 300), c("change")] <- 1
dat[which(dat$x == 700), c("change")] <- 1
dat[1, c("change")] <- 0
## Add a column "newID"
dat$newID <- NA
index <- c(1, which(dat$change == 1), nrow(dat))
j <- 1
i <- 1
system.time(while (j < length(index)) {
    print(paste(j, "/", length(index), sep = " "))
    i <- ifelse((j > 1) && (dat[index[j], c("ID")] != dat[index[j - 1], c("ID")]), 1, i)
    ## print(i)
    if (j == length(index) - 1) {
        dat[seq(index[j], index[j + 1], by = 1), c("newID")] <- paste("Ind ", dat[index[j], c("ID")], "|", i, sep = "")
    } else {
        dat[seq(index[j], index[j + 1] - 1, by = 1), c("newID")] <- paste("Ind ", dat[index[j], c("ID")], "|", i, sep = "")
    }
    j <- j + 1
    i <- i + 1
})
## summary(dat)
Here is an example:
The input data frame has 3 columns. In particular, ID is the ID number of each individual and change takes the value of 1 when the individual is renewed.
    x ID change
1   1  0      0
2   2  0      0
3   3  0      0
4   4  0      0
5   5  0      1
6   6  0      0
7   7  0      1
8   8  0      0
9   9  0      0
10 10  0      0
11  1  1      1
12  2  1      0
13  3  1      0
14  4  1      0
15  5  1      1
16  6  1      0
17  7  1      1
18  8  1      0
19  9  1      0
20 10  1      0
21  1  2      1
22  2  2      0
23  3  2      0
24  4  2      0
25  5  2      1
26  6  2      0
27  7  2      1
28  8  2      0
29  9  2      0
30 10  2      0
The variable newID is created as follows:
Each time change is equal to 1, a new individual starts: newID combines the original ID number with a counter that increases by 1 at each change and restarts at 1 for each ID. Thus, in the example, the expected result is:
    x ID change   newID
1   1  0      0 Ind 0|1
2   2  0      0 Ind 0|1
3   3  0      0 Ind 0|1
4   4  0      0 Ind 0|1
5   5  0      1 Ind 0|2
6   6  0      0 Ind 0|2
7   7  0      1 Ind 0|3
8   8  0      0 Ind 0|3
9   9  0      0 Ind 0|3
10 10  0      0 Ind 0|3
11  1  1      1 Ind 1|1
12  2  1      0 Ind 1|1
13  3  1      0 Ind 1|1
14  4  1      0 Ind 1|1
15  5  1      1 Ind 1|2
16  6  1      0 Ind 1|2
17  7  1      1 Ind 1|3
18  8  1      0 Ind 1|3
19  9  1      0 Ind 1|3
20 10  1      0 Ind 1|3
21  1  2      1 Ind 2|1
22  2  2      0 Ind 2|1
23  3  2      0 Ind 2|1
24  4  2      0 Ind 2|1
25  5  2      1 Ind 2|2
26  6  2      0 Ind 2|2
27  7  2      1 Ind 2|3
28  8  2      0 Ind 2|3
29  9  2      0 Ind 2|3
30 10  2      0 Ind 2|3
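
For reference, the rule above can also be written without a loop. The following is only a sketch on the 30-row example, not a tested replacement for the full code. It assumes that the rows are grouped by ID and that the first row of every ID starts individual 1, even when change is 0 there (as in row 1). The names start and counter are made up for illustration. If an ID could begin without a change, the original loop would carry the previous label over, and this sketch would behave differently.

```r
## Rebuild the 30-row example shown above (x = 1..10, ID = 0..2)
dat <- expand.grid(x = 1:10, ID = as.character(0:2))
dat$change <- as.integer(dat$x %in% c(1, 5, 7))
dat$change[1] <- 0L  # the very first row has change == 0, as in the example

## Assumption: a new individual starts where change == 1
## or at the first row of each ID
start <- dat$change == 1 | !duplicated(dat$ID)

## Running count of starts within each ID (restarts at 1 for every ID)
counter <- ave(as.integer(start), dat$ID, FUN = cumsum)

## Same label format as the loop: "Ind <ID>|<counter>"
dat$newID <- paste0("Ind ", dat$ID, "|", counter)
```

The same three steps (start, counter, paste0) should apply unchanged to the full data frame built with x = 1:1000 and 4001 IDs, since they only rely on the columns ID and change.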