
I am new to programming and have gotten stuck. I want to calculate the hourly temperature variation of an object throughout the year using several variables that change every hour. The original data contains 60 columns and 8760 rows.

I got the desired output using a for loop, but the model takes a long time to run. I wonder whether there is a way to replace the loop with functions, which I suspect would also speed up the calculation.

Here is a small reproducible example to show what I did.

library(data.table)

table <- data.table("A" = c(1), "B" = c(1:5), "C" = c(10))

table
   A B  C
1: 1 1 10
2: 1 2 10
3: 1 3 10
4: 1 4 10
5: 1 5 10

The for loop:

for (j in 2:nrow(table)) {
  table$A[j] = (table$A[j-1] + table$B[j-1]) * table$B[j]
  table$C[j] = table$B[j] * table$A[j]
}

I got the output as I desired:

     A B    C
1:   1 1   10
2:   4 2    8
3:  18 3   54
4:  84 4  336
5: 440 5 2200

but in my case the whole program took 15 minutes to run (the real model, not this example).

So I tried to replace the for loop with a function:

library(dplyr)  # provides %>%, mutate() and lag()

table <- data.table("A" = c(1), "B" = c(1:5), "C" = c(10))

myfun <- function(df){
  df = df %>% mutate(A = (lag(A) + lag(B)) * B, 
                     C = B * A)
  return(df)
}

myfun(table)

But the output was

   A B   C
1 NA 1  NA
2  4 2   8
3  9 3  27
4 16 4  64
5 25 5 125

It seems that the function refers to the rows of the original table, not to the rows updated during the calculation. Is there any way to obtain the desired output using functions? This is my first R project, so any help is very much appreciated. Thank you.
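
The mismatch described above can be reproduced without dplyr. The following is only a sketch in base R, with the vectors `A` and `B` as hypothetical stand-ins for the table columns and `c(NA, head(x, -1))` assumed as an equivalent of `lag(x)`. It suggests that a vectorised expression shifts the original column once, so no row ever sees an updated value of `A`:

```r
# Hypothetical stand-ins for the columns of the example table
A <- rep(1, 5)
B <- 1:5

# Base R equivalent of dplyr::lag(): shift the ORIGINAL column down by one row
lag_A <- c(NA, head(A, -1))
lag_B <- c(NA, head(B, -1))

# Every element is computed from the original A, never from an updated value
vectorised_A <- (lag_A + lag_B) * B
vectorised_A
# NA  4  9 16 25
```

These are the same values as the A column in the output above, which is why a recurrence of this kind needs either a loop or an accumulating function.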

  • I guess this is just a toy example, but if not, A will grow very fast: O(n!) numbers will very soon get too big. Commented Nov 11, 2022 at 13:49
  • Yes, this example is just to understand the possible approaches. With the answers I would like to reprogram my model, and I will definitely update the results. Commented Nov 11, 2022 at 13:54

3 Answers

3

A much faster alternative using data.table. Note that the calculation of C can be separated from the calculation of A so we can do less within the loop:

for (i in 2:nrow(table)) {
  set(table, i = i, j = "A", value = with(table, (A[i-1] + B[i-1]) * B[i]))
}
table[-1, C := A * B]
table

#        A     B     C
#    <num> <int> <num>
# 1:     1     1    10
# 2:     4     2     8
# 3:    18     3    54
# 4:    84     4   336
# 5:   440     5  2200
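
A related idea, offered only as a sketch and not as part of the answer above, is to pull the columns out as plain vectors, run the recurrence on those, and write the result back once. The names `df`, `a` and `b` are hypothetical, and a base R data.frame is assumed here so the example needs no packages; the same `$` assignments also work on a data.table. Whether it helps on the real 8760-row model is worth timing.

```r
# Same toy data as in the question, here as a plain data.frame (an assumption
# made so that the sketch runs without any packages)
df <- data.frame(A = 1, B = 1:5, C = 10)

# Work on plain vectors so each iteration avoids touching the table itself
a <- df$A
b <- df$B
for (i in seq_along(a)[-1]) {
  a[i] <- (a[i - 1] + b[i - 1]) * b[i]
}

# Write back once; row 1 of C keeps its initial value, as in the question
df$A <- a
df$C[-1] <- (a * b)[-1]
df
```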

1

Here's a solution using purrr::accumulate2 which lets you use the result of the previous computation as the input to the next one:

library(data.table)
library(purrr)
library(magrittr)

table <- data.table("A" = c(1), "B" = c(1:5), "C" = c(10))

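# ..1 is the accumulated A, ..3 is the row index; the extra final element
# (an NA, because B has no row n + 1) is dropped by extract()
table$A <- accumulate2(
  table$A,
  seq_along(table$A),
  ~ (..1 + table$B[..3]) * table$B[..3 + 1],
  .init = table$A[1]
) %>%
  unlist() %>%
  extract(1:nrow(table))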
  
table$C <- table$B * table$A

table
#      A B    C
# 1:   1 1    1
# 2:   4 2    8
# 3:  18 3   54
# 4:  84 4  336
# 5: 440 5 2200
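
For comparison, here is a minimal sketch of the same recurrence with `purrr::accumulate` over the row indices 2 to n, which avoids producing the extra trailing element. The names `A0`, `B` and `prev_a` are hypothetical, and the columns are assumed to be available as plain vectors:

```r
library(purrr)

A0 <- 1    # starting value of A (row 1)
B  <- 1:5

# Each step receives the previous A and the current row index i
A <- accumulate(
  seq_along(B)[-1],
  function(prev_a, i) (prev_a + B[i - 1]) * B[i],
  .init = A0
)
A <- unlist(A)  # harmless if the result is already a plain vector
A
# 1 4 18 84 440
```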


1

You can try Reduce, as below:

dt[
  ,
  A := Reduce(function(x, Y) (x + Y[2]) * Y[1],
    asplit(embed(B, 2), 1),
    init = A[1],
    accumulate = TRUE
  )
][
  ,
  C := A * B
]

which updates dt as

> dt
     A B    C
1:   1 1    1
2:   4 2    8
3:  18 3   54
4:  84 4  336
5: 440 5 2200

data

dt <- data.table("A" = c(1), "B" = c(1:5), "C" = c(10))
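
To see what `embed(B, 2)` feeds into `Reduce`, here is a small sketch in base R. The names `pairs`, `prev_a` and `k` are hypothetical, and the rows are indexed directly instead of going through `asplit`, purely for readability:

```r
B <- 1:5

# Each row pairs the current value B[t] (column 1) with the previous one B[t-1] (column 2)
pairs <- embed(B, 2)
pairs
#      [,1] [,2]
# [1,]    2    1
# [2,]    3    2
# [3,]    4    3
# [4,]    5    4

# Fold over the rows, carrying the previous A along and keeping every intermediate result
A <- Reduce(
  function(prev_a, k) (prev_a + pairs[k, 2]) * pairs[k, 1],
  seq_len(nrow(pairs)),
  accumulate = TRUE,
  1  # init: the starting value of A
)
A
# 1 4 18 84 440
```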
