
I recently learned about ResNet's skip connections, and found that this kind of structure can improve training a lot; it is also used in convolutional networks such as U-Net. However, I don't know how to implement a similar structure in an LSTM autoencoder network; it looks like I am running into dimensional problems. I'm implementing it with Keras, but I keep getting errors. Here is the network code:

# lstm autoencoder: recreate the input sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
from keras.regularizers import l1
from keras.optimizers import Adam
import keras.backend as K

n_in = 100  # length of the input sequences (placeholder value)

model = Sequential()
model.add(LSTM(512, activation='selu', input_shape=(n_in, 1), return_sequences=True))
model.add(LSTM(256, activation='selu', return_sequences=True))
model.add(LSTM(20, activation='selu'))  # bottleneck: keeps only the last hidden state
model.add(RepeatVector(n_in))           # repeat the encoding once per output timestep
model.add(LSTM(20, activation='selu', return_sequences=True))
model.add(LSTM(256, activation='selu', return_sequences=True))
model.add(LSTM(512, activation='selu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))    # one reconstructed value per timestep
plot_model(model=model, show_shapes=True)

Just like the skip connection diagrams for ResNet or U-Net, I'm trying to modify the network as shown in my diagram: the network without the blue lines is the original network.

The output of an encoder LSTM layer would also be combined (concatenated, or added?) with the previous layer's output to form the input of the corresponding decoder LSTM layer. As the picture shows, the corresponding layers are symmetric. Is such a connection possible? I'm new to the Keras API and to skip-connection structures, so I don't know how to implement it.

1 Answer


First, you need to switch from the Sequential API to the functional API. The functional API lets you build arbitrary input and output connections between layers, instead of a single stack.

Learn more about the functional API in: https://keras.io/guides/functional_api/
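
For a quick sense of the difference, a minimal functional-API model looks like this (arbitrary layer sizes; each layer is called on a tensor and returns a tensor, so outputs can be routed anywhere, not just to the next layer in a stack):

from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(10,))                    # example input shape
hidden = Dense(32, activation='relu')(inputs)  # call the layer on a tensor
outputs = Dense(1)(hidden)
model = Model(inputs, outputs)                 # wire inputs to outputs explicitly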

As for building skip connections from LSTM layers: it is as easy as building a skip for any other kind of layer. Here is some sample code:

from keras.layers import Input, LSTM, Dense, Add
from keras.models import Model

input_shape = (50, 1)  # (timesteps, features); substitute your own shape

inputs = Input(shape=input_shape)
a = LSTM(32, return_sequences=True)(inputs)

x = LSTM(64, return_sequences=True)(a)  # main1
a = LSTM(64, return_sequences=True)(a)  # skip1: projects the skip path to 64 units

x = LSTM(64, return_sequences=True)(x)  # main1
x = LSTM(64, return_sequences=True)(x)  # main1

b = Add()([a, x])  # main1 + skip1

x = LSTM(128, return_sequences=True)(b)  # main2
b = LSTM(128, return_sequences=True)(b)  # skip2: projects the skip path to 128 units

x = LSTM(128, return_sequences=True)(x)  # main2
x = LSTM(128, return_sequences=True)(x)  # main2

c = Add()([b, x])  # main2 + skip2

x = LSTM(256, return_sequences=False)(c)  # collapse the sequence into one vector

x = Dense(512, activation='relu')(x)
x = Dense(128, activation='relu')(x)

x = Dense(2, activation='softmax')(x)
model = Model(inputs, x)

This code produces a network with two residual blocks; plotting it with plot_model shows each Add layer merging the main path with the skip path.


As you can see, each Add layer receives as inputs the output of the main path plus the tensor from just before the block (a in the first block).

Since Add requires all of its inputs to have the same shape, you must add an extra LSTM on the skip side to make the shapes at the start and the end of each block match (the same concept as the projection shortcuts in the original ResNet).
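
If you would rather concatenate than add (your question mentions both), Concatenate merges along the feature axis and does not need matching unit counts, only matching timestep counts. A minimal sketch, with arbitrary layer sizes:

from keras.layers import Input, LSTM, Concatenate

inputs = Input(shape=(50, 1))  # (timesteps, features); example shape
a = LSTM(32, return_sequences=True)(inputs)
x = LSTM(64, return_sequences=True)(a)
# No projection layer needed: the merged tensor has 32 + 64 = 96 features per timestep.
merged = Concatenate()([a, x])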

Of course, you should experiment with this network, adding different kinds of layers, Dropout, regularizers, activations, or whatever works for your case. This is only a stub network to show skip connections with LSTMs.
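
Applied to your autoencoder, a U-Net-style variant could feed each encoder layer's output sequence into the symmetric decoder layer via Concatenate. This is only a sketch, reusing the layer sizes from your original code and a placeholder n_in; be aware that skips around the bottleneck let the decoder copy information directly from the encoder, which can make the reconstruction task easier than you intend:

from keras.layers import Input, LSTM, Dense, RepeatVector, TimeDistributed, Concatenate
from keras.models import Model

n_in = 100  # sequence length (placeholder value)

inputs = Input(shape=(n_in, 1))

# Encoder: keep references to the intermediate sequences for the skips.
e1 = LSTM(512, activation='selu', return_sequences=True)(inputs)
e2 = LSTM(256, activation='selu', return_sequences=True)(e1)
z = LSTM(20, activation='selu')(e2)  # bottleneck vector

# Decoder: each layer also sees the output of its symmetric encoder layer.
d = RepeatVector(n_in)(z)
d = LSTM(20, activation='selu', return_sequences=True)(d)
d = Concatenate()([d, e2])  # skip from the 256-unit encoder layer
d = LSTM(256, activation='selu', return_sequences=True)(d)
d = Concatenate()([d, e1])  # skip from the 512-unit encoder layer
d = LSTM(512, activation='selu', return_sequences=True)(d)
outputs = TimeDistributed(Dense(1))(d)

model = Model(inputs, outputs)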

The rest is pretty much the same as any other network you have already trained.
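
For example, compiling and fitting works as usual; here X is assumed to be your data array of shape (samples, n_in, 1), and since this is an autoencoder the input and the target are the same:

model.compile(optimizer='adam', loss='mse')
model.fit(X, X, epochs=50, batch_size=32)  # reconstruct the input sequences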
