I can't figure out for the life of me how to generate text from the default model while feeding in a prefix.

I have downloaded the model and here is my code:

import gpt_2_simple as gpt2

model_name = "124M"

sess = gpt2.start_tf_sess()

gpt2.generate(sess, model_name=model_name)

gpt2.generate(sess, model_name=model_name, prefix="<|My name is |>")

However, when I run it I get the following error:

tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
  (0) Failed precondition: Attempting to use uninitialized value model/h3/mlp/c_proj/w
    [[{{node model/h3/mlp/c_proj/w/read}}]]
    [[strided_slice/_33]]
  (1) Failed precondition: Attempting to use uninitialized value model/h3/mlp/c_proj/w
    [[{{node model/h3/mlp/c_proj/w/read}}]]

Any idea what I'm doing wrong?

1 Answer

You are trying to generate without loading parameters first.

It seems that the downloaded models serve as the starting point for training ("finetuning"), but they are not loaded automatically for generation.

For generation, the library tries to restore a previously saved TensorFlow model (a "checkpoint" in TF terminology).
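
As a minimal sketch of loading parameters before generating: this assumes a recent gpt-2-simple release in which load_gpt2 accepts a model_name argument (older releases may only restore from the checkpoint directory). The helper names model_is_downloaded and generate_with_prefix are made up for illustration and are not part of the library.

```python
import os


def model_is_downloaded(model_name="124M", model_dir="models"):
    """Return True if the pretrained model folder is already on disk."""
    return os.path.isdir(os.path.join(model_dir, model_name))


def generate_with_prefix(prefix, model_name="124M"):
    """Load the pretrained weights into the session, then generate."""
    # Imported lazily so the helper above works without TensorFlow installed.
    import gpt_2_simple as gpt2

    if not model_is_downloaded(model_name):
        gpt2.download_gpt2(model_name=model_name)

    sess = gpt2.start_tf_sess()
    # Assumption: load_gpt2 takes model_name in your installed version.
    # This load step is what the question's code is missing.
    gpt2.load_gpt2(sess, model_name=model_name)
    gpt2.generate(sess, model_name=model_name, prefix=prefix)
```

Calling generate_with_prefix("My name is") should then print a sample that starts with the prefix, provided the variables were restored successfully.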

Finetuning

You can generate a checkpoint by training the model for a few epochs using your own dataset (or working from the dataset published by the researchers).

Otherwise, gpt-2-simple makes it easy. Get a text file with some text and finetune on it:

gpt_2_simple --sample_every 50 finetune yourtext.txt

Let it run for a while and have a look at the generated samples. A checkpoint will be saved every 100 steps. Once you are happy, hit CTRL+C and it will save a final checkpoint.

You can then generate text using:

gpt_2_simple generate --prefix "Once upon a time" --nsamples 5

The gpt_2_simple tool accepts a -h argument for help, so have a look at the other options. Using the library from code follows much the same workflow as the command-line tool.
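
The same workflow from Python might look like the sketch below. It assumes the finetune, load_gpt2 and generate functions with the keyword arguments shown, as in recent gpt-2-simple releases; the steps and save_every values are illustrative, and run_exists and finetune_or_load_then_generate are hypothetical helper names.

```python
import os


def run_exists(run_name="run1", checkpoint_dir="checkpoint"):
    """Return True if a checkpoint folder for this run is already on disk."""
    return os.path.isdir(os.path.join(checkpoint_dir, run_name))


def finetune_or_load_then_generate(dataset="yourtext.txt", steps=200):
    # Imported lazily so run_exists() works without TensorFlow installed.
    import gpt_2_simple as gpt2

    sess = gpt2.start_tf_sess()
    if run_exists():
        # Restore the weights saved by an earlier finetuning run.
        gpt2.load_gpt2(sess, run_name="run1")
    else:
        # Roughly the library-level counterpart of the finetune command;
        # steps and save_every are illustrative values, not recommendations.
        gpt2.finetune(sess, dataset=dataset, model_name="124M",
                      steps=steps, sample_every=50, save_every=100)
    # Counterpart of: gpt_2_simple generate --prefix "Once upon a time" --nsamples 5
    gpt2.generate(sess, run_name="run1", prefix="Once upon a time", nsamples=5)
```

The point of the branch is that generation only works once the session holds real weights, either from a finetuning run in the same session or from an explicit load.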

Generating without finetuning

In this GitHub question, the author explains how to skip finetuning entirely. Just copy the model to the checkpoint directory (you need to download the model first; have a look at that link):

mkdir -p checkpoint/
cp -r models/345M checkpoint/run1
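
If you would rather do the copy from Python, a sketch along these lines should be equivalent to the two shell commands above. The function name copy_model_to_checkpoint is hypothetical, and skipping an existing run folder is a design choice of this sketch rather than what cp -r does.

```python
import os
import shutil


def copy_model_to_checkpoint(model_name="345M", model_dir="models",
                             checkpoint_dir="checkpoint", run_name="run1"):
    """Copy a downloaded model into the run folder, like the cp -r command."""
    src = os.path.join(model_dir, model_name)
    dst = os.path.join(checkpoint_dir, run_name)
    if not os.path.isdir(src):
        raise FileNotFoundError(f"download the model first: {src} is missing")
    if os.path.exists(dst):
        # Leave an existing run untouched instead of overwriting it.
        return dst
    os.makedirs(checkpoint_dir, exist_ok=True)
    shutil.copytree(src, dst)
    return dst
```

After the copy, calling gpt2.load_gpt2(sess) followed by gpt2.generate(sess, prefix="Once upon a time") should pick up checkpoint/run1, assuming the library's default run name is run1 in your version.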