I am using features of variable length videos to train one layer LSTM. Video sizes are changing from 10 to 35 frames. I am using batch size of 1. I have the following code:
lstm_model = LSTMModel(4096, 4096, 1, 64)
for step, (video_features, label) in enumerate(data_loader):
bx = Variable(score.view(-1, len(video_features), len(video_features[0]))) #examples = 1x12x4096, 1x5x4096
output = lstm_model(bx)
Lstm model is;
class LSTMModel(nn.Module):
def __init__(self, input_size, hidden_size, num_layers, num_classes):
super(LSTMModel, self).__init__()
self.l1 = nn.LSTM(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers, batch_first=True)
self.out = nn.Linear(hidden_size, num_classes)
def forward(self, x):
r_out, (h_n, h_c) = self.l1(x, None) #None represents zero initial hidden state
out = self.out(r_out[:, -1, :])
return out
I just want to ask; am I doing the right for training LSTM with variable size input. The code works okay and loss decreases but I am not sure if I am doing the right thing. Because I haven't used LSTMs in Pytorch before.