I think he's using sloppy language.
Bengio et al. conflate "the policy/AI/agent is trained with RL and gets a high (maximal?) score on the training distribution" with "the policy/AI/agent is trained such that it wants to maximize reward (or some correlate of it) even outside of training".
Wait, does the friend elsewhere add "... and the author is right" or "and the sloppiness isn't that bad"? My read is that the quote you've provided is a critique, not an excuse for the sloppiness.
Ironically, given that it's currently June 11th (two days after my last tweet was posted), my final tweet provides two examples of the planning fallacy.
"Hopefully" is not a prediction!
Sort of? This is indeed more accessible to smaller groups than training big models, but small groups also don't have access to the biggest models, and you can still do plenty of non-mech-interp things with the models you do have, so the effect isn't overwhelming.
(In case this isn't a joke, Mars Hill church was named after Mars Hill / the Areopagus / Hill of Ares, which in the New Testament is where the apostle Paul gives a speech to a bunch of pagans about Jesus. That hill is named after the Greek god. The church was located on Earth, in particular in Seattle.)
Note that this post does not encourage people to refrain from being politically active, or to avoid making political donations entirely.
I don't know. I would imagine it's more like "it's bad to donate to the 'wrong' party" than "it's good to donate to the 'right' party".
Below is a list of some of the advisors we used for mentor selection. Notes:
With that out of the way, here are some advisors who helped us for the Winter 2024-25 cohort:
The people above also advised us for the Summer 2025 cohort. For that cohort, we added the following advisors:
fun fact: more people die of heat in Europe each year than Americans die of guns.