DanielFilan

Fun fact: more people die of heat in Europe each year than Americans die of guns.

I think he's using sloppy language.

Bengio et al. mix up "the policy/AI/agent is trained with RL and gets a high (maximal?) score on the training distribution" and "the policy/AI/agent is trained such that it wants to maximize reward (or some correlates) even outside of training".

Wait, does the friend elsewhere add "... and the author is right" or "and sloppiness isn't that bad"? My read of the quote you've provided is a critique and isn't excusing the sloppiness.

Ironically, given that it's currently June 11th (two days after my last tweet was posted), my final tweet provides two examples of the planning fallacy.

"Hopefully" is not a prediction!

Sort of? Indeed this is more accessible to smaller groups than training big models, but small groups don't have access to the biggest models, and you can still do a bunch of non-mechinterpy things with the models you do have, so the effect isn't super overwhelming.

(In case this isn't a joke, Mars Hill church was named after Mars Hill / the Areopagus / Hill of Ares, which in the New Testament is where the apostle Paul gives a speech to a bunch of pagans about Jesus. That hill is named after the Greek god. The church was located on Earth, in particular in Seattle.)

Note that this post does not encourage people to refrain from being politically active, or to totally abstain from making political donations.

I don't know. I would imagine it's more like "it's bad to donate to the 'wrong' party" than "it's good to donate to the 'right' party".

In this comment we list the names of some of our advisors.

Below is a list of some of the advisors we used for mentor selection. Notes:

  • Two advisors asked not to be named and do not appear here.
  • Advisors by and large focussed their efforts on areas they had some expertise in.
  • Advisors had to flag conflicts of interest, meaning that (for example) we did not take their ratings of themselves into account.

With that out of the way, here are some advisors who helped us for the Winter 2024-25 cohort:

  • Adam Gleave
  • Alex Lawsen
  • Buck Shlegeris
  • Ethan Perez
  • Lawrence Chan
  • Lee Sharkey
  • Lewis Hammond
  • Marius Hobbhahn
  • Michael Aird
  • Neel Nanda

The above people also advised us for the Summer 2025 cohort. We also added the following advisors for that cohort:

  • Alexander Gietelink Oldenziel
  • Ben Garfinkel
  • Caspar Oesterheld
  • Jesse Clifton
  • Nate Thomas