
I am using Hugging Face's Transformers library to build a sequence-to-sequence model based on BART and T5. I have carefully read the documentation and the research papers, but I can't find what the input to the decoder (decoder_input_ids) should be for sequence-to-sequence tasks.

Should the decoder input for both models (BART and T5) be the same as lm_labels (the output of the LM head), or should it be the same as input_ids (the input to the encoder)?

Comment: The decoder_input_ids are the labels (i.e. the target); see the Hugging Face training documentation. (Aug 10, 2020)

1 Answer


The decoder_input_ids argument is optional: it corresponds to labels, and labels are the preferred way to provide decoder inputs. See https://huggingface.co/transformers/glossary.html#decoder-input-ids

This is because, internally, if decoder_input_ids is None, it is derived by shifting labels to the right, so you don't have to do the shifting yourself.
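A minimal sketch of what this looks like in practice, assuming the t5-small checkpoint (the model name and example strings are illustrative, not from the original post). Only labels is passed; the model builds decoder_input_ids itself by shifting labels right:

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # Encoder input: the source sequence (input_ids).
    input_ids = tokenizer(
        "translate English to German: Hello, world!",
        return_tensors="pt",
    ).input_ids

    # Target sequence goes in `labels`; no decoder_input_ids needed.
    labels = tokenizer("Hallo, Welt!", return_tensors="pt").input_ids

    # Internally the model sets decoder_input_ids = shift_right(labels)
    # and trains the LM head to predict `labels` (teacher forcing).
    outputs = model(input_ids=input_ids, labels=labels)
    print(outputs.loss)

So the decoder input is neither the encoder's input_ids nor literally the LM-head output: it is the target sequence shifted right, which the library constructs for you when you supply labels.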
