splitted is a list of data frames coming from a split() on the main data frame.
After splitting, I'm applying a function to every data frame in the splitted list.
Here is the function:
library(dplyr)  # provides %>%, arrange(), top_n(), select(), filter(), row_number()

getCustomer <- function(df, numberOfProducts = 3) {
  # Per-customer summary values
  Gender <- unique(df$gender)
  Segment <- unique(df$Segment)
  Net_Discount <- sum(df$Discount * df$Sales)
  Number_of_Discounts <- sum(df$Discount > 0)
  Customer.ID <- unique(df$Customer.ID)
  Sales <- sum(df$Sales)
  Profit <- sum(df$Profit)
  lat <- mean(df$lat)
  lon <- mean(df$lon)
  # Sort by order date, then keep the top rows
  # (top_n() ranks by the last column when wt is not given)
  productsData <- df %>% arrange(Order.Date) %>% top_n(n = numberOfProducts)
  Products <- 0
  Products_Category <- 0
  Products_Order_Date <- 0
  # Pull product ID, category and order date from row j of productsData
  for (j in 1:numberOfProducts) {
    Products[j] <- productsData %>% select(Product.ID) %>% filter(row_number() == j)
    Products_Category[j] <- productsData %>% select(Category) %>% filter(row_number() == j)
    Products_Order_Date[j] <- productsData %>% select(Order.Date) %>% filter(row_number() == j)
    names(Products)[j] <- paste("Product", j)
    names(Products_Category)[j] <- paste("Category Product", j)
    names(Products_Order_Date)[j] <- paste("Order Date Product", j)
  }
  output <- data.frame(Customer.ID, Gender, Segment, Net_Discount, Number_of_Discounts, Sales, Profit,
                       Products, Products_Category, Products_Order_Date, lon, lat)
  # One row per customer
  return(output[1, ])
}
I get the right answer for any single element of splitted, for example:
getCustomer(splitted[[687]],2)
I can even build the list by hand:
customer <- list()
customer[[1]]<- getCustomer(splitted[[1]],2)
customer[[2]]<- getCustomer(splitted[[2]],2)
.
.
.
customer[[1576]]<- getCustomer(splitted[[1576]],2)
That is, I can effectively build the whole customer list by assigning element by element.
However, I certainly don't have time for that (1576 single-row data frames to assign to the customer list), so I'm trying:
customer <- list()
for (i in 1:length(splitted)){
customer[[i]]<-getCustomer(splitted[[i]],2)
}
After running this last chunk of code, I get:
Error in data.frame(Customer.ID, Gender, Segment, Net_Discount, Number_of_Discounts, : arguments imply differing number of rows: 0, 1
I can't understand this error, since I can build the customer list one element at a time.
I would appreciate your help.
Solution
Editing this question to let you know the problem was indeed that some data frames in splitted had no rows. So I removed them (only 3).
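l <- integer(length(splitted))  # row count of each data frame
for (i in seq_along(splitted)){
  l[i] <- nrow(splitted[[i]])
}
indices <- which(l == 0)  # positions of the empty data frames
splitted <- splitted[-indices]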
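The same filter can probably be written without the explicit loop. A minimal sketch, assuming a toy `splitted` list (the data is made up for illustration, and `keep` is a hypothetical name). Logical indexing also sidesteps a pitfall of negative indexing: when no element is empty, `indices` has length 0 and `splitted[-indices]` returns an empty list.

```r
# Toy stand-in for the real list: element "b" has zero rows
splitted <- list(
  a = data.frame(Customer.ID = "A1", Sales = 10),
  b = data.frame(Customer.ID = character(0), Sales = numeric(0)),
  c = data.frame(Customer.ID = "C3", Sales = 5)
)

# TRUE for every element that has at least one row
keep <- vapply(splitted, nrow, integer(1)) > 0

# Logical indexing keeps everything when nothing is empty
splitted <- splitted[keep]
```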
Just had to delete 3 samples. Got no error this time. Thank you all for your time.
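For anyone hitting the same message, here is a minimal sketch of why a zero-row data frame triggers it (the `empty` data frame is made up for illustration, with only two of the original columns). `unique()` on an empty column has length 0, while `sum()` of an empty column is still a single value, so `data.frame()` receives columns of length 0 and 1.

```r
# Hypothetical zero-row data frame with two of the original columns
empty <- data.frame(Customer.ID = character(0), Sales = numeric(0))

id    <- unique(empty$Customer.ID)  # length 0
sales <- sum(empty$Sales)           # length 1 (the value 0)

# Combining a length-0 and a length-1 column fails;
# tryCatch() captures the error message instead of stopping
result <- tryCatch(
  data.frame(Customer.ID = id, Sales = sales),
  error = function(e) conditionMessage(e)
)
print(result)
```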