Thanks for the info.

This was a great read, very informative.

I think I understand your question, and I was also confused by this for a bit, so I wanted to add some points of clarification. First, I want to point out that I really couldn't find a satisfactory explanation of this particular detail (at least one that I could understand), so I pieced this together myself from looking at the Hugging Face code for GPT-2. I may get some details wrong.

During training, at each step GPT-2 takes in N tokens and outputs N tokens. But the i-th output is computed in such a way that it relies only on information from tokens 1, ..., i, and it is meant to predict the (i+1)-th token from these. I think it's best to think of each output as being computed independently of the others (though this isn't strictly true, since the separate outputs are computed by shared matrices). So for each i, we train the network so that the i-th output produces the correct result given the _input_ tokens 1, ..., i. There is a term in the loss function for each output token, and the total loss is the sum of the losses over all output tokens. The outputs at other positions play no role in the i-th output; only the input tokens 1, ..., i do.
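
To make this concrete, here is a rough sketch (my own, not the actual Hugging Face training loop) of how those per-position loss terms can be computed with the standard "gpt2" checkpoint from the transformers library. Note that Hugging Face averages the per-token losses rather than summing them, which only changes the overall scale.

```python
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

tokens = tokenizer("The quick brown fox", return_tensors="pt").input_ids  # shape (1, N)
logits = model(tokens).logits                                             # shape (1, N, vocab)

# The output at position i is scored against the *input* token at position i+1,
# so drop the last output and the first input token before comparing them.
shift_logits = logits[:, :-1, :]  # predictions made at positions 1, ..., N-1
shift_labels = tokens[:, 1:]      # the tokens those positions should predict

per_position_loss = F.cross_entropy(
    shift_logits.reshape(-1, shift_logits.size(-1)),
    shift_labels.reshape(-1),
    reduction="none",
)  # one loss term per output position
total_loss = per_position_loss.mean()
```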

During inference, given an input of k tokens, we are only concerned with the k-th output (which predicts the token following the first k). GPT-2 also produces predictions for the positions before k, but these are simply ignored, since we already know what those values should be.
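
As a quick illustration (again just a sketch using the "gpt2" checkpoint), greedy next-token prediction only ever reads the logits at the last input position:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

tokens = tokenizer("The quick brown", return_tensors="pt").input_ids  # k input tokens
with torch.no_grad():
    logits = model(tokens).logits       # shape (1, k, vocab): one prediction per position
next_token_id = logits[0, -1].argmax()  # keep only the k-th output; the rest are ignored
print(tokenizer.decode([next_token_id.item()]))
```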

Hey, I'm not finished reading this yet, but I noticed something off about what you said.

> At the end, the final 1600-dimensional vector is multiplied by W's transpose to project back into vocab space.

This isn't quite right. They don't multiply by W's transpose at the end. Rather, there is a completely separate matrix at the end whose shape is the same as that of W's transpose.

You can see this in Hugging Face's code for GPT-2. In the class GPT2LMHeadModel, the final matrix multiplication is performed by the matrix called "lm_head", whereas the matrix you call W, which is used to map 50,257-dimensional vectors into 1600-dimensional space, is called "wte" (found in the GPT2Model class). You can see from the code that wte has shape "vocab size x embed size" while lm_head has shape "embed size x vocab size", so lm_head does have the same shape as W's transpose but doesn't have the same numbers.
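
If it's useful, you can inspect both modules directly. A quick sketch (attribute names match the current Hugging Face implementation and might differ across versions; I'm using the small "gpt2" checkpoint, which has a 768-dimensional embedding, while the 1600-dimensional model is "gpt2-xl"):

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

print(model.transformer.wte)  # Embedding(50257, 768): vocab size x embed size
print(model.lm_head)          # Linear(in_features=768, out_features=50257, bias=False): embed size -> vocab size
```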


Edit: I could be wrong here, though. Maybe lm_head was set to be equal to the transpose of wte? I'm looking through the GPT-2 paper but don't see anything like that mentioned.
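
For what it's worth, one way to check whether the two matrices actually share their numbers would be something like this (again just a sketch; I haven't checked whether older versions of the library behave differently):

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# PyTorch's Linear stores its weight as (out_features, in_features), so both
# stored tensors have shape (50257, 768) and can be compared directly.
print(model.lm_head.weight is model.transformer.wte.weight)
print(model.lm_head.weight.data_ptr() == model.transformer.wte.weight.data_ptr())
```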