
I recently learned about ResNet's skip connections, and found that this kind of structure can improve training a lot; it is also used in convolutional networks such as U-Net. However, I don't know how to implement a similar structure in an LSTM autoencoder network; it looks like I am running into dimensional problems. I'm implementing it with Keras, but I keep getting errors. Here is the network code:

# lstm autoencoder: recreate the input sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
from keras.regularizers import l1
from keras.optimizers import Adam
import keras.backend as K

n_in = 100  # length of the input sequences (placeholder value)

model = Sequential()
model.add(LSTM(512, activation='selu', input_shape=(n_in, 1), return_sequences=True))
model.add(LSTM(256, activation='selu', return_sequences=True))
model.add(LSTM(20, activation='selu'))  # bottleneck: keeps only the last hidden state
model.add(RepeatVector(n_in))           # repeat the encoding once per output timestep
model.add(LSTM(20, activation='selu', return_sequences=True))
model.add(LSTM(256, activation='selu', return_sequences=True))
model.add(LSTM(512, activation='selu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))    # one reconstructed value per timestep
plot_model(model=model, show_shapes=True)

Just like the skip connection diagrams for ResNet or U-Net, I'm trying to modify the network as shown in my diagram: the network without the blue lines is the original network.

The output of an encoder LSTM layer would also be combined (concatenated, or added?) with the previous layer's output to form the input of the corresponding decoder LSTM layer. As the picture shows, the corresponding layers are symmetric. Is such a connection possible? I'm new to the Keras API and to skip-connection structures, so I don't know how to implement it.

1 Answer


First, you need to switch from the Sequential API to the functional API. The functional API lets you build arbitrary input and output connections between layers, instead of a single stack.

Learn more about the functional API in: https://keras.io/guides/functional_api/
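
For a quick sense of the difference, a minimal functional-API model looks like this (arbitrary layer sizes; each layer is called on a tensor and returns a tensor, so outputs can be routed anywhere, not just to the next layer in a stack):

from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(10,))                    # example input shape
hidden = Dense(32, activation='relu')(inputs)  # call the layer on a tensor
outputs = Dense(1)(hidden)
model = Model(inputs, outputs)                 # wire inputs to outputs explicitly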

As for building skip connections from LSTM layers: it is as easy as building a skip for any other kind of layer. Here is some sample code:

from keras.layers import Input, LSTM, Dense, Add
from keras.models import Model

input_shape = (50, 1)  # (timesteps, features); substitute your own shape

inputs = Input(shape=input_shape)
a = LSTM(32, return_sequences=True)(inputs)

x = LSTM(64, return_sequences=True)(a)  # main1
a = LSTM(64, return_sequences=True)(a)  # skip1: projects the skip path to 64 units

x = LSTM(64, return_sequences=True)(x)  # main1
x = LSTM(64, return_sequences=True)(x)  # main1

b = Add()([a, x])  # main1 + skip1

x = LSTM(128, return_sequences=True)(b)  # main2
b = LSTM(128, return_sequences=True)(b)  # skip2: projects the skip path to 128 units

x = LSTM(128, return_sequences=True)(x)  # main2
x = LSTM(128, return_sequences=True)(x)  # main2

c = Add()([b, x])  # main2 + skip2

x = LSTM(256, return_sequences=False)(c)  # collapse the sequence into one vector

x = Dense(512, activation='relu')(x)
x = Dense(128, activation='relu')(x)

x = Dense(2, activation='softmax')(x)
model = Model(inputs, x)

This code produces a network with two residual blocks; plotting it with plot_model shows each Add layer merging the main path with the skip path.


As you can see, each Add layer receives as inputs the output of the main path plus the tensor from just before the block (a in the first block).

Since Add requires all of its inputs to have the same shape, you must add an extra LSTM on the skip side to make the shapes at the start and the end of each block match (the same concept as the projection shortcuts in the original ResNet).
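
If you would rather concatenate than add (your question mentions both), Concatenate merges along the feature axis and does not need matching unit counts, only matching timestep counts. A minimal sketch, with arbitrary layer sizes:

from keras.layers import Input, LSTM, Concatenate

inputs = Input(shape=(50, 1))  # (timesteps, features); example shape
a = LSTM(32, return_sequences=True)(inputs)
x = LSTM(64, return_sequences=True)(a)
# No projection layer needed: the merged tensor has 32 + 64 = 96 features per timestep.
merged = Concatenate()([a, x])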

Of course, you should experiment with this network, adding different kinds of layers, Dropout, regularizers, activations, or whatever works for your case. This is only a stub network to show skip connections with LSTMs.
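
Applied to your autoencoder, a U-Net-style variant could feed each encoder layer's output sequence into the symmetric decoder layer via Concatenate. This is only a sketch, reusing the layer sizes from your original code and a placeholder n_in; be aware that skips around the bottleneck let the decoder copy information directly from the encoder, which can make the reconstruction task easier than you intend:

from keras.layers import Input, LSTM, Dense, RepeatVector, TimeDistributed, Concatenate
from keras.models import Model

n_in = 100  # sequence length (placeholder value)

inputs = Input(shape=(n_in, 1))

# Encoder: keep references to the intermediate sequences for the skips.
e1 = LSTM(512, activation='selu', return_sequences=True)(inputs)
e2 = LSTM(256, activation='selu', return_sequences=True)(e1)
z = LSTM(20, activation='selu')(e2)  # bottleneck vector

# Decoder: each layer also sees the output of its symmetric encoder layer.
d = RepeatVector(n_in)(z)
d = LSTM(20, activation='selu', return_sequences=True)(d)
d = Concatenate()([d, e2])  # skip from the 256-unit encoder layer
d = LSTM(256, activation='selu', return_sequences=True)(d)
d = Concatenate()([d, e1])  # skip from the 512-unit encoder layer
d = LSTM(512, activation='selu', return_sequences=True)(d)
outputs = TimeDistributed(Dense(1))(d)

model = Model(inputs, outputs)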

The rest is pretty much the same as any other network you have already trained.
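
For example, compiling and fitting works as usual; here X is assumed to be your data array of shape (samples, n_in, 1), and since this is an autoencoder the input and the target are the same:

model.compile(optimizer='adam', loss='mse')
model.fit(X, X, epochs=50, batch_size=32)  # reconstruct the input sequences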
