
I know what embeddings are and how they are trained. Specifically, while going through TensorFlow's documentation, I came across two different articles, and I wish to know what exactly the difference between them is.

link 1: Tensorflow | Vector Representations of words

In the first tutorial, they explicitly train embeddings on a specific dataset; there is a distinct session run to train those embeddings. I can then later save the learnt embeddings as a NumPy array and use the tf.nn.embedding_lookup() function while training an LSTM network.
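
For reference, a minimal sketch of that workflow as I understand it (the file name word2vec_embeddings.npy, the placeholder shapes, and the trainable flag are my own assumptions, not from the tutorial):

import numpy as np
import tensorflow as tf

# Pretrained embeddings saved earlier as a NumPy array of shape
# [vocabulary_size, embedding_size].
pretrained = np.load("word2vec_embeddings.npy")

# Initialise a variable from the pretrained matrix; set trainable=True
# to fine-tune it while the LSTM is trained.
embeddings = tf.get_variable(
    "embeddings",
    initializer=tf.constant(pretrained, dtype=tf.float32),
    trainable=False)

word_ids = tf.placeholder(tf.int32, shape=[None, None])      # [batch, time]
lstm_inputs = tf.nn.embedding_lookup(embeddings, word_ids)   # [batch, time, embedding_size]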

link 2: Tensorflow | Embeddings

In this second article, however, I couldn't understand what is happening.

import tensorflow as tf

# vocabulary_size, embedding_size and word_ids are assumed to be defined elsewhere
word_embeddings = tf.get_variable("word_embeddings",
                                  [vocabulary_size, embedding_size])
embedded_word_ids = tf.gather(word_embeddings, word_ids)

This is given under the training embeddings section. My question is: does the gather function train the embeddings automatically? I am not sure, since this op ran very fast on my PC.

More generally: what is the right way to convert words into vectors in TensorFlow (link 1 or link 2) for training a seq2seq model? Also, how do I train the embeddings for a seq2seq dataset, given that my data consists of separate sequences, unlike the continuous sequence of words in the link 1 dataset?

  • tf.gather doesn't do anything else beyond giving you the "row" of the word_embeddings variable corresponding to each word id in word_ids. But it will backpropagate the gradients correctly if you use it in a graph during a training session, updating word_embeddings appropriately (see the sketch after these comments). Commented Sep 19, 2017 at 11:08
  • The second snippet does not train the embeddings; it just creates the necessary variables. That link says afterwards: "The variable word_embeddings will be learned and at the end of the training it will contain the embeddings for all words in the vocabulary. The embeddings can be trained in many ways, ..." Commented Sep 19, 2017 at 11:10
  • So, am I correct in saying that the second approach is more general than the first link's, where you first extract all the words from your dataset and explicitly train embeddings on its sequence? Whereas in the tf.gather approach, the embedding matrix is more like a layer that gets trained during the actual training of the LSTM? So, how do you propose I approach a seq2seq model: the first link or the second? Commented Sep 19, 2017 at 11:12
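
To make the comments concrete, here is a toy sketch (my own, not from either tutorial) showing that tf.gather on its own only slices rows, while the same variable does get updated once it appears in a loss minimized during a training session:

import numpy as np
import tensorflow as tf

vocabulary_size, embedding_size = 100, 8

word_embeddings = tf.get_variable("word_embeddings",
                                  [vocabulary_size, embedding_size])
word_ids = tf.constant([3, 7, 42])
embedded = tf.gather(word_embeddings, word_ids)   # just a row lookup, trains nothing by itself

# A toy loss; gradients flow back through tf.gather into word_embeddings.
loss = tf.reduce_sum(tf.square(embedded))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    before = sess.run(word_embeddings)
    sess.run(train_op)
    after = sess.run(word_embeddings)
    # Only rows 3, 7 and 42 change; all other rows are untouched.
    print(np.where(np.any(before != after, axis=1))[0])   # -> [ 3  7 42]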

1 Answer


Alright! Anyway, I have found the answer to this question, and I am posting it so that others might benefit from it.

The first link is more of a tutorial that steps you through exactly how the embeddings are learnt.

In practical cases, such as training seq2seq models or any other encoder-decoder models, we use the second approach, where the embedding matrix gets tuned appropriately while the model is trained.
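
For illustration, a hedged sketch of that second approach in an encoder-decoder setting; the hyper-parameters and the single-LSTM encoder are placeholders of my own, not a full seq2seq implementation:

import tensorflow as tf

vocabulary_size, embedding_size, hidden_size = 10000, 128, 256

# Trainable embedding matrix, tuned jointly with the rest of the model.
word_embeddings = tf.get_variable("word_embeddings",
                                  [vocabulary_size, embedding_size])

encoder_inputs = tf.placeholder(tf.int32, [None, None])   # [batch, time] of word ids
encoder_emb = tf.nn.embedding_lookup(word_embeddings, encoder_inputs)

# A plain LSTM encoder; a decoder would consume encoder_state analogously.
cell = tf.nn.rnn_cell.LSTMCell(hidden_size)
encoder_outputs, encoder_state = tf.nn.dynamic_rnn(cell, encoder_emb,
                                                   dtype=tf.float32)

# Whatever seq2seq loss is defined on top of the decoder, minimizing it
# sends gradients back through embedding_lookup into word_embeddings,
# so the embeddings are learnt as part of training the model.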
