
I’m just starting to explore the Hugging Face library and have a question related to Text2Text models.

Suppose I have model1 (a Text2Text model, e.g. BART), pre-trained on a masked language modeling task, where it has learned syntactic structure under the tokenization strategy of tokenizer1.

Now I want to fine-tune model1 on the same style of input text as the masked language modeling task, but decode the outputs into a different format using a separate tokenizer (tokenizer2).

Is this possible? The approach I had in mind involves sequential text generation:

  1. The original model1 generates text.
  2. A fine-tuned model2 continues the generation based on the output of model1.

Apologies if this is something trivial. Any comment or suggestion on specific tutorials is really appreciated!

1 Answer


The output of model1 is plain text, and the input to model2 is also plain text, so this works: decode model1's generation with tokenizer1 into a string, then feed that string to model2, which encodes it with tokenizer2. The two tokenizers never need to match, because they only meet at the text boundary.
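A minimal sketch of this two-stage pipeline. The two `*_generate` functions below are hypothetical stand-ins: with the transformers library, each would wrap its own `AutoModelForSeq2SeqLM` and `AutoTokenizer` pair (encode with the model's own tokenizer, call `model.generate()`, decode back to a string). The point is only that the handoff between the models is a plain string, so each stage is free to use its own tokenizer.

```python
def model1_generate(text: str) -> str:
    # Stand-in for: tokenizer1 encodes `text`, model1.generate() runs,
    # tokenizer1.batch_decode() returns plain text.
    return text.replace("<mask>", "filled")

def model2_generate(text: str) -> str:
    # Stand-in for the fine-tuned second model: tokenizer2 encodes the
    # *string* it receives (never model1's token IDs), so the two
    # tokenizers are fully independent.
    return f"reformatted({text})"

def two_stage_generate(text: str) -> str:
    # Sequential generation: model1's decoded text is model2's raw input.
    intermediate = model1_generate(text)
    return model2_generate(intermediate)

print(two_stage_generate("The <mask> sat on the mat"))
```

Note that because the handoff happens in text space, model2 can be fine-tuned on (model1-style output, desired format) pairs without any reference to tokenizer1.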
