3

I have a simple time series dataset with 10 variables - I want to create a for loop (or a function) that creates a 'change from previous month' variable and a 'percent change from previous month variable' for each variable in the timeseries (except for the date). I know I can simply write code for each particular column but would like to optimize this since there are a lot of columns.

Here is what my data looks like, "Date", "Sales", "Price" are some column names:

+----+---+---+---+---+---+---+---+--
| Date       |   Sales   |  Price  | 
+----+---+---+---+---+---+---+---+--
| 01Aug2019  | 4         | 15      |
| 01Sept2019 | 6         | 30      |
| 01Oct2019  | 10        | 44      |
+----+---+---+---+---+---+---+---+--

Here is what I want it to look like with the use of the for loop (or any function)

+----+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| Date       |   Sales   |  chg_Sales  | pct_chg_Sales |   Price |  chg_Price  | pct_chg_Price| 
+----+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 01Aug2019  | 4         | NA          |NA             |  15     | NA          |NA            |
| 01Sept2019 | 6         | 2           |50%            |  30     | 15          |100%          |
| 01Oct2019  | 10        | 4           |66%            |  44     | 14          |46%           |
+----+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

I tried the code below but it did not work

add_column <- function (x, y){
  setDT (x)[,pct_chg_y:= (y - shift (y,1, type="lag")/shift (,1, type="lag")*100]

}

2 Answers 2

2

Here is an option with data.table where we specify the columns of interest in .SDcols, create the 'chg_' column by subtracting the .SD (Subset of Data.table) from the lag i.e. shift of .SD, then in the second step, create the 'pct_chg, by dividing the shift with the 'chg_' columns using Map

nm1 <- c("Sales", "Price")
setDT(df1)[,  paste0("chg_", nm1)  :=  .SD - shift(.SD), .SDcols = nm1]
df1[, paste0("pct_chg_", nm1) :=   
      Map(function(x, y)  100 * (y/shift(x)), .SD, mget(paste0("chg_", nm1))),
               .SDcols = nm1]
df1
#         Date Sales Price chg_Sales chg_Price pct_chg_Sales pct_chg_Price
#1:  01Aug2019     4    15        NA        NA            NA            NA
#2: 01Sept2019     6    30         2        15      50.00000     100.00000
#3:  01Oct2019    10    44         4        14      66.66667      46.66667

data

df1 <- structure(list(Date = c("01Aug2019", "01Sept2019", "01Oct2019"
), Sales = c(4, 6, 10), Price = c(15, 30, 44)), 
        class = "data.frame", row.names = c(NA, 
-3L))
Sign up to request clarification or add additional context in comments.

Comments

1
library(dplyr)
library(scales)

df1 %>% 
  arrange(Date) %>% 
  mutate_at(.vars = c("Sales", "Price"), list(chg = ~(. - lag(.)),
                                              pct_chg = ~percent((. - lag(.))/lag(.))))

  #         Date Sales Price Sales_chg Price_chg Sales_pct_chg Price_pct_chg
  # 1 2019-08-01     4    15        NA        NA           NA%           NA%
  # 2 2019-09-01     6    30         2        15         50.0%        100.0%
  # 3 2019-10-01    10    44         4        14         66.7%         46.7%

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.