
I'm new to Keras, and have been struggling to understand the usage of the variable z in the variational autoencoder example on their official GitHub. I don't understand why the decoder is built from a fresh variable latent_inputs instead of z. I ran the code and it seems to work, but I can't tell whether z is being used behind the scenes, and what mechanism in Keras is responsible for that. Here is the relevant code snippet:

# imports used by the full example; `sampling`, input_shape,
# intermediate_dim, latent_dim and original_dim are defined earlier
# in that script
from keras.layers import Lambda, Input, Dense
from keras.models import Model
from keras.utils import plot_model

# VAE model = encoder + decoder
# build encoder model
inputs = Input(shape=input_shape, name='encoder_input')
x = Dense(intermediate_dim, activation='relu')(inputs)
z_mean = Dense(latent_dim, name='z_mean')(x)
z_log_var = Dense(latent_dim, name='z_log_var')(x)

# use reparameterization trick to push the sampling out as input
# note that "output_shape" isn't necessary with the TensorFlow backend
z = Lambda(sampling, output_shape=(latent_dim,), name='z')([z_mean, z_log_var])

# instantiate encoder model
encoder = Model(inputs, [z_mean, z_log_var, z], name='encoder')
encoder.summary()
plot_model(encoder, to_file='vae_mlp_encoder.png', show_shapes=True)

# build decoder model
latent_inputs = Input(shape=(latent_dim,), name='z_sampling')
x = Dense(intermediate_dim, activation='relu')(latent_inputs)
outputs = Dense(original_dim, activation='sigmoid')(x)

# instantiate decoder model
decoder = Model(latent_inputs, outputs, name='decoder')
decoder.summary()
plot_model(decoder, to_file='vae_mlp_decoder.png', show_shapes=True)

# instantiate VAE model
outputs = decoder(encoder(inputs)[2])
vae = Model(inputs, outputs, name='vae_mlp')

2 Answers


Your encoder is defined as a model that takes the tensor inputs and produces the outputs [z_mean, z_log_var, z]. You then define your decoder separately to take some input, here called latent_inputs, and produce outputs. Finally, your overall model is defined in the line:

outputs = decoder(encoder(inputs)[2])

This means you run encoder on your inputs, which yields [z_mean, z_log_var, z], and the third element of that list, encoder(inputs)[2], is passed as the input to decoder. In other words, when the full VAE runs, latent_inputs is bound to the third output of your encoder: [z_mean, z_log_var, z][2] = z. You could view it as:

encoder_outputs = encoder(inputs)      # [z_mean, z_log_var, z]
outputs = decoder(encoder_outputs[2])  # latent_inputs is bound to z
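The binding mechanism can be mimicked in plain Python. This is a toy sketch, not Keras internals: the Input and Model classes below are hypothetical stand-ins, and the `* 2` computation just stands in for the decoder's Dense layers. The point is that an Input is only a named placeholder, and calling a Model on a concrete value substitutes that value for the placeholder.

```python
# Toy stand-ins (NOT the real Keras classes) showing why `latent_inputs`
# never appears in the assembled VAE: it is only a placeholder that gets
# bound to whatever tensor the decoder model is called on.

class Input:
    """Stand-in for keras.Input: a named placeholder."""
    def __init__(self, name):
        self.name = name

class Model:
    """Stand-in for keras.Model: stores a placeholder and a computation."""
    def __init__(self, placeholder, fn):
        self.placeholder = placeholder
        self.fn = fn

    def __call__(self, value):
        # Calling the model binds `value` to the placeholder and
        # evaluates the stored computation on it.
        return self.fn(value)

latent_inputs = Input(name='z_sampling')
decoder = Model(latent_inputs, lambda v: v * 2)  # pretend `* 2` is the Dense stack

z = 21                 # pretend this is the sampled latent tensor
outputs = decoder(z)   # latent_inputs is bound to z here
print(outputs)         # 42
```

The same idea carries over to Keras: latent_inputs only names a slot in the decoder's graph, and decoder(...) fills that slot with whatever tensor you pass in.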

1 Comment

last line outputs = decoder(encoder_outputs[2]) -- the VAE feeds only z to the decoder; z_mean and z_log_var are not passed on (in the full example they are used in the loss instead).

They are defining the encoder and decoder separately so that each can also be used on its own:

  • Given some inputs, encoder computes their latent (lower-dimensional) representations z_mean, z_log_var and z (you could use the encoder by itself, e.g. to store those lower-dimensional representations, or to compare samples more easily).

  • Given such a lower-dimensional representation latent_inputs, decoder returns the decoded information outputs (e.g. if you need to reuse stored lower-dimensional representations).

To train/use the complete VAE, both operations can simply be chained the way they actually do it: outputs = decoder(encoder(inputs)[2]) (the latent_inputs of decoder receives the z output of encoder).
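The chaining can be illustrated with a plain-Python analogue (a toy sketch, not Keras; the function bodies are made up): the encoder returns three values, and only the third, z, is handed to the decoder.

```python
# Toy analogue (plain Python, not Keras) of outputs = decoder(encoder(inputs)[2]):
# the encoder produces three results, and only the third one (z) reaches
# the decoder; z_mean and z_log_var remain available for other uses (the loss).

def encoder(inputs):
    z_mean = sum(inputs) / len(inputs)   # stand-in for the z_mean Dense layer
    z_log_var = 0.0                      # stand-in for the z_log_var Dense layer
    z = z_mean + z_log_var               # stand-in for the reparameterized sample
    return [z_mean, z_log_var, z]

def decoder(latent_inputs):
    return [latent_inputs] * 3           # stand-in for the reconstruction layers

inputs = [1.0, 2.0, 3.0]
outputs = decoder(encoder(inputs)[2])    # only z is passed on
print(outputs)                           # [2.0, 2.0, 2.0]
```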
