I have a GPT model

model = BioGptForCausalLM.from_pretrained("microsoft/biogpt").to(device)

When I send my batch to it I can get the logits and the hidden states:

out = model(batch["input_ids"].to(device), output_hidden_states=True, return_dict=True)
print(out.keys())
>>> odict_keys(['logits', 'past_key_values', 'hidden_states'])

The logits have a shape of

torch.Size([2, 1024, 42386])

corresponding to (batch, seq_length, vocab_size).

How can I get the vector embedding of the first (i.e., dim=0) token in the last layer (i.e., after the fully connected layer)? I believe it should be of size [2, 1024, 1024]

From here it seems like it should be under last_hidden_state, but I can't seem to generate it. out.hidden_states seems to be a tuple of length 25, where each element has dimension [2, 1024, 1024]. I'm wondering if the last one is the one I'm looking for, but I'm not sure.

1 Answer

You are right: with output_hidden_states=True, out.hidden_states is what you want to look at. As you mentioned, it is a tuple of length 25. According to the BioGPT paper and the HuggingFace docs, your model contains 24 transformer layers, and the 25 elements of the tuple are the output of the initial embedding layer followed by the outputs of each of the 24 layers. The last element, out.hidden_states[-1], is what last_hidden_state would hold.

The shape of each of these tensors is [B, L, E], where B is the batch size, L is the input length, and E is the embedding dimension. Given the shapes you reported, it seems you are padding your input to 1024 tokens. So the representation of the first token (in the first sentence of the batch) is out.hidden_states[k][0, 0, :], which has shape [1024]. Here, k denotes the layer you want to use, and it is up to you to decide which one depending on what you will do with the embedding.
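To make the indexing concrete, here is a small sketch. The hidden_states tuple is mocked with random tensors of the shapes you reported ([2, 1024, 1024], 25 elements), so the slicing can be checked without downloading the model; with the real model you would index out.hidden_states the same way:

```python
import torch

# Mock the 25-element hidden_states tuple: the embedding-layer output
# plus the outputs of the 24 transformer layers, each [B, L, E].
hidden_states = tuple(torch.randn(2, 1024, 1024) for _ in range(25))

# Output of the last transformer layer for the whole batch
last_layer = hidden_states[-1]           # shape [2, 1024, 1024]

# Vector for the first token of the first sentence in the batch
first_token_vec = last_layer[0, 0, :]    # shape [1024]

# Vector for the first token of every sentence in the batch
first_token_batch = last_layer[:, 0, :]  # shape [2, 1024]

print(first_token_vec.shape)    # torch.Size([1024])
print(first_token_batch.shape)  # torch.Size([2, 1024])
```

Swapping hidden_states[-1] for hidden_states[k] gives you the same slices at any other layer k.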
