Comments

Thanks for your hard work. I wonder why, in the layer 0 attention head, the positions of the query and value are both 1?