I am trying to create a data frame (BOS.df) to explore the structure of a future analysis before I receive the actual data. In this scenario, let's say there are 4 restaurants looking to run ad campaigns (the "Restaurant" variable). The total number of days that each campaign will last is cmp.lngth. I want random numbers for how much they are billing for the ads (ra.num). The ad campaigns start on StartDate. Ultimately, I want to create a data frame that cycles through each restaurant and adds one row per day of the ad campaign, each with a random billing number.

#Create Data Placeholders
set.seed(123)
Restaurant <- c('B1', 'B2', 'B3', 'B4')
cmp.lngth <- 42
ra.num <- rnorm(cmp.lngth, mean = 100, sd = 10)
StartDate <- as.Date("2017-07-14")


BOS.df <- data.frame(matrix(NA, nrow =0, ncol = 3))
colnames(BOS.df) <- c("Restaurant", "Billings", "Date")


for(i in 1:length(Restaurant)){
  for(z in 1:cmp.lngth){
    BOS.row <- c(as.character(Restaurant[i]),ra.num[z],StartDate + 
    cmp.lngth[z]-1)
    BOS.df <- rbind(BOS.df, BOS.row)
  }
}

My code is not functioning correctly right now. The column names are incorrect, and the data is not being placed correctly if at all. The output comes through as follows:

  X.B1. X.94.3952435344779. X.17402.
1    B1    94.3952435344779    17402
2    B1                <NA>     <NA>
3    B1                <NA>     <NA>
4    B1                <NA>     <NA>
5    B1                <NA>     <NA>
6    B1                <NA>     <NA>

How can I obtain the correct output? Is there a more efficient way than using a for loop?

  • The spelling mistakes in lenght(Restuarant) won't help. And cmp.lngth[z] makes no sense as cmp.lngth is a single number, not a vector - you probably just want z here. Commented Jul 27, 2017 at 16:37
  • Hey, Andrew. Thanks for the feedback. The spelling mistakes came from me translating the code into my submission so that it is not remotely identifiable. Commented Jul 27, 2017 at 19:50
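
To illustrate the point about cmp.lngth[z], here is a small sketch of the two things that go wrong inside the original loop: indexing a length-one vector past its first element returns NA, and c() coerces mixed types to character, which is why the date appears as the number 17402. The name bad.row is introduced here for illustration only.

```r
cmp.lngth <- 42
StartDate <- as.Date("2017-07-14")

# cmp.lngth holds a single number, so any index above 1 returns NA
cmp.lngth[1]   # 42
cmp.lngth[2]   # NA

# c() builds an atomic vector, so the number and the Date are coerced to character
bad.row <- c("B1", 94.3952435344779, StartDate + cmp.lngth[1] - 1)
class(bad.row) # "character"
bad.row[3]     # "17402": the Date reduced to days since 1970-01-01
```

Because every row after the first is built from an NA date, and rbind() takes its column names from the first character vector it receives, the result has the mangled headers and NA cells shown in the question.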

2 Answers

Using expand.grid:

cmp.lngth <- 2
StartDate <- as.Date("2017-07-14")

set.seed(1)
df1 <- data.frame(expand.grid(Restaurant, seq(cmp.lngth) + StartDate))
colnames(df1) <- c("Restaurant", "Date")
df1$Billings <- rnorm(nrow(df1), mean = 100, sd = 10)
df1 <- df1[ order(df1$Restaurant, df1$Date), ]

df1
#   Restaurant       Date  Billings
# 1         B1 2017-07-15  93.73546
# 5         B1 2017-07-16 103.29508
# 2         B2 2017-07-15 101.83643
# 6         B2 2017-07-16  91.79532
# 3         B3 2017-07-15  91.64371
# 7         B3 2017-07-16 104.87429
# 4         B4 2017-07-15 115.95281
# 8         B4 2017-07-16 107.38325
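
A slightly more compact variant, offered as a sketch rather than a definitive version: naming the arguments of expand.grid() removes the need for data.frame() and colnames(). It assumes the first billing day should be StartDate itself (hence the - 1), and the name df2 and the stringsAsFactors = FALSE choice are illustrative assumptions.

```r
Restaurant <- c("B1", "B2", "B3", "B4")
cmp.lngth <- 2
StartDate <- as.Date("2017-07-14")

set.seed(1)
# One row per restaurant/day combination; named arguments become column names
df2 <- expand.grid(Restaurant = Restaurant,
                   Date = StartDate + seq_len(cmp.lngth) - 1,
                   stringsAsFactors = FALSE)
# One independent random billing value per row
df2$Billings <- rnorm(nrow(df2), mean = 100, sd = 10)
# Sort so that each restaurant's days are listed together
df2 <- df2[order(df2$Restaurant, df2$Date), ]
```

With stringsAsFactors = FALSE the Restaurant column stays character; drop that argument if a factor is preferred.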

3 Comments

Thanks! I received an error message, but I still get the output I want. The error: Error in order(NULL, c(17362, 17362, 17362, 17362, 17363, 17363, 17363, : argument 1 is not a vector
Works fine for me. Judging from the error, I would check the class of the object Restaurant before you run the code. You can also simplify it a bit by naming the variables in expand.grid() and dropping the call to data.frame(): df1 <- expand.grid(Restaurant = Restaurant, Date = seq(cmp.lngth) + StartDate)
Thanks, atiretoo! I went through and incorporated these changes. I also checked the class of Restaurants and converted it from character to factor.
2

You can use rbind, but here is another way to do it: preallocate the data frame and fill it in row by row.
Note that the data frame needs cmp.lngth*length(Restaurant) rows (one per restaurant per day), not cmp.lngth.

#Create Data Placeholders
set.seed(123)
Restaurant <- c('B1', 'B2', 'B3', 'B4')
cmp.lngth <- 42
ra.num <- rnorm(cmp.lngth, mean = 100, sd = 10)
StartDate <- as.Date("2017-07-14")


BOS.df <- data.frame(matrix(NA, nrow = cmp.lngth*length(Restaurant), ncol = 3))
colnames(BOS.df) <- c("Restaurant", "Billings", "Date")

count <- 1
for(name in Restaurant){
    for(z in 1:cmp.lngth){
        # Assign a list rather than c(): c() would coerce every value to character
        BOS.df[count, ] <- list(name, ra.num[z], as.character(StartDate + z - 1))
        count <- count + 1
    }
}

I would also recommend looking at the tidyverse package and using add_row() with a tibble instead of a data frame. Here is some sample code:

library(tidyverse)
BOS.tb <- tibble(Restaurant = character(),
                 Billings = numeric(),
                 Date = character())

for(name in Restaurant){
    for(z in 1:cmp.lngth){
        # add_row() takes named columns, so each value keeps its own type
        BOS.tb <- add_row(BOS.tb, 
                          Restaurant = name, 
                          Billings = ra.num[z], 
                          Date = as.character(StartDate + z - 1))
    }
}
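
If the loop itself is not required, the same table can be sketched without one by repeating each vector with rep(). This is only one possible approach; the name BOS.vec is made up for this example, and like the loops above it reuses the same ra.num values for every restaurant.

```r
set.seed(123)
Restaurant <- c("B1", "B2", "B3", "B4")
cmp.lngth <- 42
ra.num <- rnorm(cmp.lngth, mean = 100, sd = 10)
StartDate <- as.Date("2017-07-14")

BOS.vec <- data.frame(
  # Each restaurant name repeated once per campaign day
  Restaurant = rep(Restaurant, each = cmp.lngth),
  # The full billing sequence repeated once per restaurant
  Billings   = rep(ra.num, times = length(Restaurant)),
  # Campaign days starting on StartDate, repeated once per restaurant
  Date       = rep(StartDate + seq_len(cmp.lngth) - 1, times = length(Restaurant)),
  stringsAsFactors = FALSE
)

str(BOS.vec)
```

Here Billings stays numeric and Date stays a Date, so no conversion is needed afterwards.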
