
I am using Hugging Face's Transformers library to build a sequence-to-sequence model based on BART and T5. I have carefully read the documentation and the research papers, but I can't find what the input to the decoder (decoder_input_ids) should be for sequence-to-sequence tasks.

Should the decoder input for both models (BART and T5) be the same as lm_labels (the output of the LM head), or should it be the same as input_ids (the input to the encoder)?

Comment: The decoder_input_ids are the labels (i.e. the target); see the Hugging Face training documentation. (Aug 10, 2020)

1 Answer


The decoder_input_ids argument is optional: it corresponds to labels, and labels are the preferred way to provide decoder inputs. See https://huggingface.co/transformers/glossary.html#decoder-input-ids

This is because, internally, if decoder_input_ids is None, it is derived by shifting labels to the right, so you don't have to do the shifting yourself.
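A minimal sketch of what this looks like in practice, assuming the t5-small checkpoint (the model name and example strings are illustrative, not from the original post). Only labels is passed; the model builds decoder_input_ids itself by shifting labels right:

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # Encoder input: the source sequence (input_ids).
    input_ids = tokenizer(
        "translate English to German: Hello, world!",
        return_tensors="pt",
    ).input_ids

    # Target sequence goes in `labels`; no decoder_input_ids needed.
    labels = tokenizer("Hallo, Welt!", return_tensors="pt").input_ids

    # Internally the model sets decoder_input_ids = shift_right(labels)
    # and trains the LM head to predict `labels` (teacher forcing).
    outputs = model(input_ids=input_ids, labels=labels)
    print(outputs.loss)

So the decoder input is neither the encoder's input_ids nor literally the LM-head output: it is the target sequence shifted right, which the library constructs for you when you supply labels.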
