
I am implementing a simple LSTM network. I would like to include multiple features at the input layer. These features are pre-trained word embeddings and a binary vector that flags a specific word in the given sentence.

For example:

Sentence = "I have a question"
feature_vector_1 = [4, 2, 281, 5201]  # word-to-index, which will be passed to the embedding layer
feature_vector_2 = [0, 1, 0, 0]       # flags the word "have"

final_features = [feature_vector_1 + feature_vector_2]

Suppose that:

embedding dim = 100
index_flag dim = 50
max sentence length = 50

My network code is:

from tensorflow.keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Model

input = Input(shape=(None,))
embedded_layer_input = Embedding(input_dim=embedding_matrix.shape[0],
                                 output_dim=embedding_matrix.shape[1],
                                 input_length=tweet_max_length,
                                 weights=[embedding_matrix],
                                 trainable=False)(input)
lstm_layer = Bidirectional(LSTM(64))(embedded_layer_input)
output_layer = Dense(1, activation='sigmoid')(lstm_layer)

model = Model(input, output_layer)

# Compile and train
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])

# Summarize the model
print(model.summary())

# Fit the model
model.fit(padded_train_x, y_train, epochs=epochs, batch_size=batch_size,
          shuffle=False, verbose=1, validation_data=(padded_dev_x, y_dev))

My question is: how and where do I include the new feature vector? I looked at Concatenate, but I am not sure how to prepare feature_vector_2.

1 Answer

You can add a second input just like the first one and concatenate afterwards:

input = Input(shape=(None,))
flag_in = Input(shape=(None,))  # second input for the flag features
embedded_layer_input = Embedding(input_dim=embedding_matrix.shape[0],
                                 output_dim=embedding_matrix.shape[1],
                                 input_length=tweet_max_length,
                                 weights=[embedding_matrix],
                                 trainable=False)(input)
combined = Concatenate()([embedded_layer_input, flag_in])
lstm_layer = Bidirectional(LSTM(64))(combined)
output_layer = Dense(1, activation='sigmoid')(lstm_layer)
# From now on you pass a list as the input to your model
model = Model([input, flag_in], output_layer)
# ...
model.fit([padded_xtrain, x_flag_inputs], ...)

6 Comments

Thanks a lot, @nuric, for your prompt reply. I got an error at the Concatenate line: ValueError: A 'Concatenate' layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 50, 100), (None, None)], even though I added axis=-1.
I solved the error above by removing input_length from the embedding layer, but I got another error while fitting the model: ValueError: Error when checking input: expected input_2 to have 3 dimensions, but got array with shape (1573, 50), where 1573 is the number of sentences in the training set and 50 is the number of words in each sentence. The shape of padded_xtrain is also (1573, 50), so I am not sure what the problem is here.
You need to expand_dims your numpy array of flags. So it is (1573, 50, 1), which means every input has 50 words each with 1 feature that you can concatenate. Hope that helps.
Thanks a lot, @nuric. I used expand_dims(x_flag_inputs, axis=-1) so the shape of the flags is (1573, 50, 1), but I got another error while fitting the model: ValueError: Error when checking input: expected input_2 to have shape (None, 50) but got array with shape (50, 1).
Oh, you also need to change flag_in = Input(shape=(None, 1)) so the input has the extra dimension.
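Putting the data-side fixes from this thread together, the flag features can be prepared with NumPy as follows (a minimal sketch: the per-sentence flag lists here are made-up example data, and the padding length follows the question's max sentence length of 50). The resulting (batch, 50, 1) array is what matches flag_in = Input(shape=(None, 1)) on the model side:

```python
import numpy as np

max_len = 50  # max sentence length from the question

# Toy stand-in for the per-sentence flag lists; real data would mark
# the target word in each tokenized sentence (hypothetical example).
x_flag_inputs = [[0, 1, 0, 0], [0, 0, 1]]

# Pad every flag vector with zeros to max_len, just like the
# word-index sequences that feed the embedding input.
padded_flags = np.array([flags + [0] * (max_len - len(flags))
                         for flags in x_flag_inputs])
print(padded_flags.shape)  # (2, 50)

# Add a trailing feature dimension so each word carries one flag
# feature, matching flag_in = Input(shape=(None, 1)).
padded_flags = np.expand_dims(padded_flags, axis=-1)
print(padded_flags.shape)  # (2, 50, 1)
```

padded_flags can then be passed alongside the padded word indices, e.g. model.fit([padded_xtrain, padded_flags], ...).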
