Selection vs Control is a distinction I always point to when discussing optimization. Yet these are not the two takes on optimization I generally use. My favored ones are internal optimization (which is basically search/selection) and external optimization (optimizing systems, from Alex Flint’s The Ground of Optimization). So I do without control, or at least without Abram’s exact definition of control.
Why? Simply because the internal-structure-vs-behavior distinction mentioned in this post seems more important than the actual definitions (which seem constra...
This post states the problem of gradient hacking. It is valuable in that this problem is far from obvious and, if plausible, very dangerous. On the other hand, the presentation doesn’t go into enough detail, and so leaves gradient hacking open to attacks and confusion. Thus, instead of just reviewing this post, I would like to clarify certain points, while interweaving my criticisms of the way gradient hacking was initially stated, and explaining why I consider this problem so important.
(Caveat: I’m not claiming that any of my objections are unknown to E...
This post proposes 4 ideas to help build gears-level models from papers that have already passed the standard epistemic checks (statistics, incentives):
(The second section, “Zombie Theories”, sounds more like an epistemic check than gears-level ...
How do you review a post that was not written for you? I’m already doing research in AI Alignment, and I don’t plan on creating a group of collaborators for the moment. Still, I found some parts of this useful.
Maybe that’s how you do it: by considering different profiles, and running through the most useful advice the post offers for each one. Let’s do that.
Full-time researcher (no team or MIRIx chapter)
For this profile (which is mine, by the way), the most useful piece of advice from this post comes from the model of transmitters and receivers. I’m convinced...
In “Why Read The Classics?”, Italo Calvino proposes many different definitions of a classic work of literature, including this one:
For me, this captures what makes this sequence and corresponding paper a classic in the AI Alignment literature: it keeps on giving, readthrough after readthrough. That doesn’t mean I agree with everything in it, or that I don’t think it could have been improved in terms of structure. But when pushed to reread it, I found again and again that I had m...