I am trying to learn LSTMs with keras in R, and I do not fully understand the conventions used in keras.
I have a dataset in which the first 3 columns are the inputs and the last column is the output.
Based on this, I am trying to build a stateless LSTM as follows:
library(keras)
model <- keras_model_sequential()
model %>%
  layer_lstm(units = 1024, input_shape = c(1, 3), return_sequences = TRUE) %>%
  layer_lstm(units = 1024, return_sequences = FALSE) %>%
  # linear activation on the last layer, as the output is a real number
  layer_dense(units = 1, activation = "linear")
model %>% compile(loss = "mse", optimizer = "rmsprop")
The model summary looks like this:
Layer (type)          Output Shape        Param #
=====================================================
lstm_1 (LSTM)         (None, 1, 1024)     4210688
_____________________________________________________
lstm_2 (LSTM)         (None, 1024)        8392704
_____________________________________________________
dense_3 (Dense)       (None, 1)           1025
=====================================================
Total params: 12,604,417
Trainable params: 12,604,417
Non-trainable params: 0
_____________________________________________________
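As a sanity check on the summary, the parameter counts can be reproduced by hand. The sketch below assumes the standard LSTM layout (4 gates, each with an input kernel, a recurrent kernel and a bias vector); the helper name lstm_params is made up for illustration and is not part of keras.

```r
# Trainable parameters of one LSTM layer, assuming 4 gates that each have
# an input kernel, a recurrent kernel and a bias vector.
# lstm_params is a hypothetical helper, not a keras function.
lstm_params <- function(units, input_dim) {
  4 * (units * (input_dim + units) + units)
}

lstm_1 <- lstm_params(units = 1024, input_dim = 3)     # first layer sees 3 features
lstm_2 <- lstm_params(units = 1024, input_dim = 1024)  # second layer sees 1024 outputs
dense  <- 1024 * 1 + 1                                 # 1024 weights plus one bias

print(c(lstm_1 = lstm_1, lstm_2 = lstm_2, dense = dense,
        total = lstm_1 + lstm_2 + dense))
```

The four values agree with the Param # column and the total in the summary, so the layers are wired for 3 input features as intended.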
I am trying to train the model as follows:
history <- model %>% fit(dt[, 1:3], dt[, 4], epochs = 50, shuffle = FALSE)
However, I get the following error when I execute the code:
Error in py_call_impl(callable, dots$args, dots$keywords) : ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (3653, 3)
I am not sure what I am missing here.
Update: After looking around on the internet, it seems that I need to reshape the dataset into a 3-dimensional array of shape (batchsize, timestep, #features). However, I am not using any batches, so I am not sure how to reshape my data.
Update on 29.01.2018: This is what worked for me. I used input_shape = c(1, 3) in my first LSTM layer, as I have 1 timestep and 3 features and I am not using any batches. I then reshaped my data to match, using the following function:
# data is the original training matrix; its last column is the output
reshapeDt <- function(data) {
  rows <- nrow(data)
  cols <- ncol(data) - 1  # number of input features
  dt <- array(dim = c(rows, 1, cols))
  for (i in seq_len(rows)) {
    dt[i, 1, ] <- data[i, 1:cols]  # one timestep per row
  }
  dt
}
With this function, the call to fit looks like this:
model %>% fit(reshapeDt(dt), dt[, 4], epochs = 50, shuffle = FALSE)
Here, dim(reshapeDt(dt)) returns number_of_rows_in_dt 1 3.
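For what it is worth, the same reshape can probably be done without the loop. This is only a sketch in base R: dt_demo is a made-up stand-in for dt (5 rows, 3 inputs, 1 output) and reshape_vectorized is a hypothetical name. It relies on array() filling in column-major order, so each row of the matrix ends up as one sample with a single timestep.

```r
# Hypothetical stand-in for dt: 5 rows, 3 input columns and 1 output column.
set.seed(1)
dt_demo <- matrix(rnorm(20), nrow = 5, ncol = 4)

# Loop-free reshape: rows stay as samples, and a timestep axis of length 1
# is inserted between the samples axis and the features axis.
reshape_vectorized <- function(data) {
  x <- as.matrix(data[, -ncol(data), drop = FALSE])  # drop the output column
  array(x, dim = c(nrow(x), 1, ncol(x)))
}

x3 <- reshape_vectorized(dt_demo)
print(dim(x3))                           # 5 1 3
print(all(x3[, 1, ] == dt_demo[, 1:3]))  # TRUE: every row is preserved
```

The first dimension here is simply the number of rows (samples). input_shape = c(1, 3) describes only the last two dimensions, which is why no batch size has to be chosen for the reshape.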
