I am breaking my previous questions down into smaller steps that build towards what I ultimately need. The current step is to loop over each unique source (the Source column in the first table below) and determine whether a mechanical system is switched on or off.

For example, I have been given the following profile, which tells me during which hours of a typical weekday the system is on in each of the 4 seasons. Note that some sources are on for more than one period during a day, which is why Stack 2 appears twice, once per period.

[Image: profile table]

I have created some sample dates and would like to loop through each unique source and state, for each hour, whether the system is on or off based on the information in the profile table. So far I have created the following table:

[Image: dates table]

And the code below will create the above table:

# set system locale for reproducibility; this must run before the weekday
# abbreviations are generated, and the locale name may differ on your platform
Sys.setlocale(category = "LC_TIME", locale = "en_US.UTF-8")

# create dates table
dates <- data.frame(dates = seq(
  from = as.POSIXct("2010-1-1 0:00", tz = "UTC"),
  to = as.POSIXct("2012-12-31 23:00", tz = "UTC"),
  by = "hour"))

# add year, month, day, hour and weekday columns
dates$year <- format(dates[,1], "%Y") # year
dates$month <- format(dates[,1], "%m") # month
dates$day <- format(dates[,1], "%d") # day
dates$hour <- format(dates[,1], "%H") # hour
dates$weekday <- format(dates[,1], "%a") # weekday

# calculate season column

# d() returns the row index of a "Mon-dd" label in the lookup table
d = function(month_day) which(lut$month_day == month_day)
# 2012 is a leap year, so the lookup table also covers Feb-29
lut <- data.frame(all_dates = as.POSIXct("2012-1-1", tz = "UTC") + ((0:365) * 3600 * 24),
                  season = NA)
lut <- within(lut, { month_day = strftime(all_dates, "%b-%d", tz = "UTC") })

lut[c(d("Jan-01"):d("Mar-15"), d("Nov-08"):d("Dec-31")), "season"] = "winter"
lut[c(d("Mar-16"):d("Apr-30")), "season"] = "spring"
lut[c(d("May-01"):d("Sep-27")), "season"] = "summer"
lut[c(d("Sep-28"):d("Nov-07")), "season"] = "autumn"
rownames(lut) = lut$month_day

dates = within(dates, {
  # strftime() converts to the local time zone unless tz is given explicitly
  season = lut[strftime(dates, "%b-%d", tz = "UTC"), "season"]
})

What I am trying to do now is add a column to the right for each unique value in the Source column of the profile table and work out, based on the criteria below, whether the system was on or off for each hour in the dataset.

I am struggling with how to do something similar to a vlookup with multiple if conditions and write the result into the new columns. For my sample data, the loop should create 2 columns, because the Source column has only 2 unique sources, Stack 1 and Stack 2. The tricky part is the conditional logic.

As an example, the first row of table 2 should match the value of its season column against the profile table and check whether that hour falls within the period of that season in which the system is on. If it does, the value should be 'on'; otherwise it should be 'off'. The result should look like the 2 columns in red font in the figures below:

An example day in winter: [Image: example winter day]

An example day in spring: [Image: example spring day]

I have managed to get the unique values of the column with the following code:

values <- unique(profile$Source)

But I cannot get any further with a for loop.

Could anyone give me any advice on how to write the loop so that it adds 2 more columns to table 2, one per unique source?

Below is the typical weekly 'profile' data table that I am using:

> dput(profile)
structure(list(`Source no` = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Source = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L), .Label = c("Stack 1", "Stack 2"), class = "factor"), 
    Period = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Day = structure(c(2L, 
    6L, 7L, 5L, 1L, 3L, 4L, 2L, 6L, 7L, 5L, 1L, 3L, 4L, 2L, 6L, 
    7L, 5L, 1L, 3L, 4L), .Label = c("Fri", "Mon", "Sat", "Sun", 
    "Thu", "Tue", "Wed"), class = "factor"), `Spring On` = c(0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 15L, 
    15L, 15L, 15L, 15L, 15L, 15L), `Spring Off` = c(23L, 23L, 
    23L, 23L, 23L, 23L, 23L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 18L, 
    18L, 18L, 18L, 18L, 18L, 18L), `Summer On` = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L), .Label = "off", class = "factor"), `Summer Off` = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L), .Label = "off", class = "factor"), `Autumn On` = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L), .Label = "off", class = "factor"), `Autumn Off` = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L), .Label = "off", class = "factor"), `Winter On` = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L), .Label = c("0", "off"), class = "factor"), 
    `Winter Off` = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("23", 
    "off"), class = "factor")), .Names = c("Source no", "Source", 
"Period", "Day", "Spring On", "Spring Off", "Summer On", "Summer Off", 
"Autumn On", "Autumn Off", "Winter On", "Winter Off"), class = "data.frame", row.names = c(NA, 
-21L))

many thanks

5 Comments

  • Your setup code does not work. Check this line of code: dates = data.frame(dates =seq(as.Date('2010-01-01'),as.Date('2012-12-31'),by = "hour")) Commented Sep 15, 2015 at 13:13
  • Apologies I put the wrong code for that, please see the corrected one now, thanks: dates =data.frame(dates=seq(from=as.POSIXct("2010-1-1 0:00", tz="UTC"), to=as.POSIXct("2012-12-31 23:00", tz="UTC"), by="hour")) Commented Sep 15, 2015 at 13:44
  • What does this line mean? "if the hour same is the value in the 'season' column then look up the hour, day of the week and return if the system is switched on or not." Commented Sep 15, 2015 at 13:49
  • What you should do is post in your question what the finished product should look like for your example. That way, there is no confusion as to what you are hoping to get in the end. Commented Sep 15, 2015 at 13:50
  • Hi Pierre, as an example the first line of the table 2 should match the value of the season column with the 'profile' table and see if that hour falls within the period of that particular season when the system will be on. If it falls in within the stated period then say 'on' and if outside just say 'off'. I am just posting another table to what I should see the final answer to be, thanks Commented Sep 15, 2015 at 13:59

1 Answer

To transfer the on/off information from profile to dates, you have to transform the profile data and then join it with dates. For the following steps I used the data.table package.

1) Load the data.table package and transform the datasets into data.tables (which are enhanced dataframes):

library(data.table)

setDT(profile)
setDT(dates)

2) Re-format the values in the profile dataset:

# set the 'off' values to NA
profile[profile=="off"] <- NA
# make sure that all the remaining values are numeric (which wasn't the case)
profile <- profile[, lapply(.SD, as.character), by=.(Source,Period,Day)][, lapply(.SD, as.numeric), by=.(Source,Period,Day)]
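
The conversion goes through as.character first because calling as.numeric directly on a factor returns the internal level codes rather than the labels. A minimal base R illustration, using a made-up vector winter_on that only mimics the `Winter On` column:

```r
# Hypothetical factor resembling the `Winter On` column (not the real data)
winter_on <- factor(c("0", "0", "off", "off"))

# Direct conversion returns the integer level codes, not the labels
codes <- as.numeric(winter_on)

# Going through character recovers the real hours; "off" is set to NA first
winter_chr <- as.character(winter_on)
winter_chr[winter_chr == "off"] <- NA
hours <- as.numeric(winter_chr)

print(codes)  # level codes: 1 1 2 2
print(hours)  # hours: 0 0 NA NA
```

Setting "off" to NA before the numeric conversion also avoids the coercion warning that as.numeric("off") would otherwise raise.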

3) Create datasets for each season with one row for each hour in which one (or both) of the Sources is on. I only did this for Spring and Winter because Summer and Autumn have only off/NA values (we will deal with those later):

pr.spring <- profile[, .(season = "spring",
                         hour = c(`Spring On`:(`Spring Off`-1))),
                     by=.(Source,Period,Day)]
pr.winter <- profile[!is.na(`Winter On`), .(season = "winter",
                                            hour = c(`Winter On`:(`Winter Off`-1))),
                     by=.(Source,Period,Day)]

Note that I used `Spring Off` - 1. That is because I assumed that an off value of 23 means the Stack is shut off at 23:00, so with -1 the 22nd hour is included but the 23rd is not. You can change this if needed.
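
To make the effect of the -1 concrete, here is a small sketch in base R. The helper expand_hours and its include_off_hour argument are hypothetical names introduced only for this illustration; they are not part of the steps above.

```r
# Hypothetical helper: expand an on/off pair into the hours the system runs
expand_hours <- function(on, off, include_off_hour = FALSE) {
  last <- if (include_off_hour) off else off - 1
  # No hours if either bound is missing or the window is empty
  if (is.na(on) || is.na(off) || last < on) return(integer(0))
  on:last
}

expand_hours(15, 18)                           # 15 16 17
expand_hours(15, 18, include_off_hour = TRUE)  # 15 16 17 18
expand_hours(6, 7)                             # 6
```

With the default, a 6 to 7 window yields the single hour 6, which matches the off - 1 convention; switching the flag would treat the off hour as still running.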

4) Bind the datasets from step 3 together and prepare the resulting dataset for a dcast operation:

prof <- rbindlist(list(pr.spring,pr.winter))
prof <- prof[, .(weekday = Day, season, Source = gsub(" ",".",Source), hour = sprintf("%02d",hour))]
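
The sprintf and gsub calls exist so that the keys match on both sides: the hour column in dates is a zero-padded string produced by format(..., "%H"), and a source name with a space is awkward as a column name. A short base R check on made-up values:

```r
# Hours as they would appear in the profile table (hypothetical sample)
profile_hours <- c(0, 6, 15)
hour_key <- sprintf("%02d", as.integer(profile_hours))

# Hour as stored in the dates table, taken from one sample timestamp
stamp <- as.POSIXct("2010-03-19 06:00", tz = "UTC")
dates_hour <- format(stamp, "%H")

# Source names turned into syntactic column names
source_key <- gsub(" ", ".", c("Stack 1", "Stack 2"))

print(hour_key)                 # "00" "06" "15"
print(dates_hour %in% hour_key) # TRUE
print(source_key)               # "Stack.1" "Stack.2"
```

Without the zero padding, "6" would never match "06" and the join would silently return NA for those hours.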

5) Transform the dataset from step 4 to a dataset with columns for each Stack and change the weekday column to character. The latter is needed for the join operation in the following step because the weekday column in the dates dataset is also a character column:

profw <- dcast(prof, weekday + season + hour ~ Source, value.var = "hour", fun.aggregate = length, fill = 0)
profw[, weekday := as.character(weekday)]
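
As a rough illustration of what the dcast call does, the sketch below applies the same arguments to a tiny made-up table (toy is hypothetical data, not the real schedule). It assumes the data.table package is installed.

```r
library(data.table)

# Toy long-format schedule: one row per hour in which a source is on
toy <- data.table(
  weekday = c("Mon", "Mon", "Mon"),
  season  = c("spring", "spring", "spring"),
  hour    = c("06", "06", "07"),
  Source  = c("Stack.1", "Stack.2", "Stack.1")
)

# One column per source; length() counts matching rows, fill = 0 marks "off"
toy_wide <- dcast(toy, weekday + season + hour ~ Source,
                  value.var = "hour", fun.aggregate = length, fill = 0)
print(toy_wide)
```

Here hour 06 gets a 1 for both stacks, while hour 07 gets a 1 for Stack.1 and the fill value 0 for Stack.2.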

6) Join the two datasets together and fill the missing values with 0's (remember I said "we will deal with those later" in step 3):

dates.new <- profw[dates, on=c("weekday", "season", "hour")][is.na(Stack.1), `:=` (Stack.1 = 0, Stack.2 = 0)]

The resulting dataset now has a column for each Stack, with a value for every date in the dates dataset, where 1 = "on" and 0 = "off".
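
If the literal labels 'on' and 'off' are preferred over 1 and 0, the indicator columns can be recoded afterwards. A minimal base R sketch on a made-up slice (res is hypothetical, and the columns are assumed to share the "Stack." prefix):

```r
# Hypothetical slice of the joined result, with 1 = on and 0 = off
res <- data.frame(hour = c("05", "06", "07"),
                  Stack.1 = c(1, 1, 1),
                  Stack.2 = c(0, 1, 0))

# Columns holding the indicators (assumed to start with "Stack.")
stack_cols <- grep("^Stack\\.", names(res), value = TRUE)

# Recode each indicator column to "on"/"off"
res[stack_cols] <- lapply(res[stack_cols],
                          function(x) ifelse(x == 1, "on", "off"))
print(res)
```

Selecting the columns by pattern means the same lines keep working when the real data has many more sources.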


A snapshot from the resulting dataset:

> dates.new[weekday=="Fri" & hour=="03" & month %in% c("03","04","09")]
    weekday season hour Stack.1 Stack.2               dates year month day
 1:     Fri winter   03       1       1 2010-03-05 03:00:00 2010    03  05
 2:     Fri winter   03       1       1 2010-03-12 03:00:00 2010    03  12
 3:     Fri spring   03       1       0 2010-03-19 03:00:00 2010    03  19
 4:     Fri spring   03       1       0 2010-03-26 03:00:00 2010    03  26
 5:     Fri spring   03       1       0 2010-04-02 03:00:00 2010    04  02
 6:     Fri spring   03       1       0 2010-04-09 03:00:00 2010    04  09
 7:     Fri spring   03       1       0 2010-04-16 03:00:00 2010    04  16
 8:     Fri spring   03       1       0 2010-04-23 03:00:00 2010    04  23
 9:     Fri spring   03       1       0 2010-04-30 03:00:00 2010    04  30
10:     Fri summer   03       0       0 2010-09-03 03:00:00 2010    09  03
11:     Fri summer   03       0       0 2010-09-10 03:00:00 2010    09  10
12:     Fri summer   03       0       0 2010-09-17 03:00:00 2010    09  17
13:     Fri summer   03       0       0 2010-09-24 03:00:00 2010    09  24
14:     Fri winter   03       1       1 2011-03-04 03:00:00 2011    03  04
15:     Fri winter   03       1       1 2011-03-11 03:00:00 2011    03  11
16:     Fri spring   03       1       0 2011-03-18 03:00:00 2011    03  18
17:     Fri spring   03       1       0 2011-03-25 03:00:00 2011    03  25
18:     Fri spring   03       1       0 2011-04-01 03:00:00 2011    04  01
19:     Fri spring   03       1       0 2011-04-08 03:00:00 2011    04  08
20:     Fri spring   03       1       0 2011-04-15 03:00:00 2011    04  15
21:     Fri spring   03       1       0 2011-04-22 03:00:00 2011    04  22
22:     Fri spring   03       1       0 2011-04-29 03:00:00 2011    04  29
23:     Fri summer   03       0       0 2011-09-02 03:00:00 2011    09  02
24:     Fri summer   03       0       0 2011-09-09 03:00:00 2011    09  09
25:     Fri summer   03       0       0 2011-09-16 03:00:00 2011    09  16
26:     Fri summer   03       0       0 2011-09-23 03:00:00 2011    09  23
27:     Fri autumn   03       0       0 2011-09-30 03:00:00 2011    09  30
28:     Fri winter   03       1       1 2012-03-02 03:00:00 2012    03  02
29:     Fri winter   03       1       1 2012-03-09 03:00:00 2012    03  09
30:     Fri spring   03       1       0 2012-03-16 03:00:00 2012    03  16
31:     Fri spring   03       1       0 2012-03-23 03:00:00 2012    03  23
32:     Fri spring   03       1       0 2012-03-30 03:00:00 2012    03  30
33:     Fri spring   03       1       0 2012-04-06 03:00:00 2012    04  06
34:     Fri spring   03       1       0 2012-04-13 03:00:00 2012    04  13
35:     Fri spring   03       1       0 2012-04-20 03:00:00 2012    04  20
36:     Fri spring   03       1       0 2012-04-27 03:00:00 2012    04  27
37:     Fri summer   03       0       0 2012-09-07 03:00:00 2012    09  07
38:     Fri summer   03       0       0 2012-09-14 03:00:00 2012    09  14
39:     Fri summer   03       0       0 2012-09-21 03:00:00 2012    09  21
40:     Fri autumn   03       0       0 2012-09-28 03:00:00 2012    09  28
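
For comparison, the loop over unique sources that the question asks about can be sketched in base R. This is only an illustration under simplifying assumptions: sched is a hand-made tidy version of the profile (one row per source, season and on/off window), the weekday is ignored because the sample profile is identical for every day, and obs is a tiny made-up sample of hourly records.

```r
# Hypothetical tidy schedule derived by hand from the sample profile
sched <- data.frame(
  Source = c("Stack 1", "Stack 2", "Stack 2", "Stack 1", "Stack 2"),
  season = c("spring", "spring", "spring", "winter", "winter"),
  on     = c(0, 6, 15, 0, 0),
  off    = c(23, 7, 18, 23, 23)
)

# Small made-up sample of hourly records
obs <- data.frame(season = c("spring", "spring", "winter", "summer"),
                  hour   = c(6, 10, 3, 3))

for (src in unique(sched$Source)) {
  win <- sched[sched$Source == src, ]
  is_on <- vapply(seq_len(nrow(obs)), function(i) {
    # Windows of this source for the season of record i
    w <- win[win$season == obs$season[i], ]
    # On if the hour lies in any window; the off hour itself counts as off
    any(obs$hour[i] >= w$on & obs$hour[i] < w$off)
  }, logical(1))
  # One new column per source, e.g. "Stack.1"
  obs[[make.names(src)]] <- ifelse(is_on, "on", "off")
}
print(obs)
```

A row-by-row loop like this is easy to follow but will be slow on three years of hourly data, which is one reason the join above is the approach I would use.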

1 Comment

Hi @Jaap, that makes clear sense to me, and I now have a very good idea of how to approach my main datasets, which have many more sources and years. A good start for understanding the approach. Thanks again.
