import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(1000, 16, input_length=20), 
    tf.keras.layers.Dropout(0.2),                           # <- How does the dropout work?
    tf.keras.layers.Conv1D(64, 5, activation='relu'),
    tf.keras.layers.MaxPooling1D(pool_size=4),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

I can understand dropout when it is applied between Dense layers: it randomly zeroes some of the previous layer's neurons, so they contribute nothing to the forward pass and receive no parameter updates for that step. What I don't understand is how dropout works after an Embedding layer.

Let's say the output shape of the Embedding layer is (batch_size, 20, 16), or simply (20, 16) if we ignore the batch size. How is dropout applied to the Embedding layer's output?

Does it randomly drop rows or columns?

1 Answer

The Dropout layer drops elements of the previous layer's output, whatever its rank: during training, it randomly forces individual entries of the tensor to 0 (and scales the surviving entries by 1/(1 - rate), so the expected sum is unchanged). It does not zero whole rows or columns. In your case, the output of your Embedding layer is a 3D tensor of shape (batch_size, 20, 16).

import tensorflow as tf
import numpy as np

tf.random.set_seed(0)  # make the dropout mask reproducible
layer = tf.keras.layers.Dropout(0.5)
# A toy stand-in for an embedding output: (batch=3, timesteps=3, features=4)
data = np.arange(1, 37).reshape(3, 3, 4).astype(np.float32)
data

Output

array([[[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.]],

       [[13., 14., 15., 16.],
        [17., 18., 19., 20.],
        [21., 22., 23., 24.]],

       [[25., 26., 27., 28.],
        [29., 30., 31., 32.],
        [33., 34., 35., 36.]]], dtype=float32)

Code:

outputs = layer(data, training=True)
outputs

Output:

<tf.Tensor: shape=(3, 3, 4), dtype=float32, numpy=
array([[[ 0.,  0.,  6.,  8.],
        [ 0., 12.,  0., 16.],
        [18.,  0., 22., 24.]],

       [[26.,  0.,  0., 32.],
        [34., 36., 38.,  0.],
        [ 0.,  0., 46., 48.]],

       [[50., 52., 54.,  0.],
        [ 0., 60.,  0.,  0.],
        [ 0.,  0.,  0., 72.]]], dtype=float32)>
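
Notice that the surviving entries are doubled (3. becomes 6., 4. becomes 8., and so on). Keras uses inverted dropout: kept values are scaled by 1/(1 - rate) at training time, so no rescaling is needed at inference, when dropout is a no-op. A quick sanity check, reusing data and outputs from above (the names rate and kept are mine, not part of the original snippet):

rate = 0.5
kept = outputs.numpy() != 0
# Every surviving element equals the corresponding input scaled by 1 / (1 - rate);
# the zeroed elements are exactly the dropped ones.
assert np.allclose(outputs.numpy()[kept], data[kept] / (1 - rate))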

One alternative you should consider is SpatialDropout1D, which drops entire feature channels: in this layout, a whole column is zeroed across all timesteps of a sample.

layer = tf.keras.layers.SpatialDropout1D(0.5)
outputs = layer(data, training=True)
outputs

Output:

<tf.Tensor: shape=(3, 3, 4), dtype=float32, numpy=
array([[[ 2.,  0.,  6.,  8.],
        [10.,  0., 14., 16.],
        [18.,  0., 22., 24.]],

       [[26., 28.,  0., 32.],
        [34., 36.,  0., 40.],
        [42., 44.,  0., 48.]],

       [[ 0.,  0., 54., 56.],
        [ 0.,  0., 62., 64.],
        [ 0.,  0., 70., 72.]]], dtype=float32)>
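
SpatialDropout1D shares one drop decision per channel across all timesteps. If you instead want to drop entire rows (i.e. whole token embeddings), a sketch using Dropout's noise_shape argument would look like the following; a 1 in a dimension means the same mask value is broadcast along that axis, and the shape (None, 3, 1) below is specific to this 3x3x4 toy tensor:

# One mask value per (sample, timestep), broadcast across all 4 features,
# so a dropped timestep loses its whole row at once.
row_dropout = tf.keras.layers.Dropout(0.5, noise_shape=(None, 3, 1))
outputs = row_dropout(data, training=True)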

I hope this clears up your confusion.


Comments

Thanks. I find it not intuitive to imagine the embedding layer's output as neurons. In this case, how many neurons are there before the dropout?
