In this post, I claim a few things and offer some evidence for these claims. Among these things are:

* Language models have many redundant attention heads for a given task
* In-context learning works through addition of features, which are learnt through Bayesian updates
* The model likely...
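To make the "Bayesian updates" claim concrete, here is a toy sketch of my own (not from the post): in-context learning viewed as a Bayesian update over a few hypothetical candidate mappings y = f(x), where each observed (x, y) pair reweights the hypotheses by their likelihood.

```python
# Toy illustration (my own sketch, not the post's method): in-context learning
# as a Bayesian update over candidate mappings y = f(x). The hypothesis names
# and functions here are made up for illustration.
hypotheses = {
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
    "identity": lambda x: x,
}

def posterior(pairs):
    # Start from a uniform prior over the candidate mappings.
    probs = {h: 1 / len(hypotheses) for h in hypotheses}
    for x, y in pairs:
        for name, f in hypotheses.items():
            # Near-zero (rather than exactly zero) likelihood on a mismatch.
            probs[name] *= 1.0 if f(x) == y else 1e-6
        z = sum(probs.values())
        probs = {h: p / z for h, p in probs.items()}
    return probs

post = posterior([(3, 6), (5, 10)])
print(max(post, key=post.get))  # "double" dominates after two consistent pairs
```

After two pairs consistent only with doubling, almost all posterior mass sits on that hypothesis, which is the sense in which each in-context example acts as an update.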
I agree with this
I've found that the LW wiki doesn't work as a Wikipedia-like resource, at least for me.
How useful is a wiki for alignment? There doesn't seem to be one now.
norm $\in \mathbb{R}$, doesn't matter
I've found the part about applying random search to be among the best takeaways I had from PAIR! Novelty for the sake of novelty is not a terrible idea. Specifically, I've found that even if you don't like the things you produce, producing them makes it much easier to then make progress towards the larger goal.
To set some context, the task I'm going to be modelling is one where we give the model a sequence of pairs in the following format:
`(x, y)\n`
where for each example, [...]. As a concrete...
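A minimal sketch of how such a prompt might be assembled, assuming the format is literally `(x, y)` pairs separated by newlines as described above (the function name and example pairs are my own):

```python
# Hypothetical sketch: build an in-context prompt of "(x, y)" pairs,
# one per line, matching the format described in the excerpt.
def build_prompt(pairs):
    return "".join(f"({x}, {y})\n" for x, y in pairs)

prompt = build_prompt([(1, 2), (3, 6), (5, 10)])
print(prompt)
```

Each line then supplies one in-context example for the model to condition on.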
I think my issue with the LW wiki is that it relies too much on LessWrong? It seems like the expectation is that you click on a tag, which contains / is assigned to a number of LW posts, and then you read through the posts. That is not how other wikis / encyclopedias work!
My gold standard for a technical wiki (other than wikipedia) is the chessprogramming wiki https://www.chessprogramming.org/Main_Page