DanielFilan

Fun fact: more people die of heat in Europe each year than Americans die of guns.

I think he's using sloppy language.

Bengio et al. mix up "the policy/AI/agent is trained with RL and gets a high (maximal?) score on the training distribution" and "the policy/AI/agent is trained such that it wants to maximize reward (or some correlates) even outside of training".

Wait, does the friend elsewhere add "... and the author is right" or "and sloppiness isn't that bad"? My read of the quote you've provided is a critique and isn't excusing the sloppiness.

Ironically, given that it's currently June 11th (two days after my last tweet was posted), my final tweet provides two examples of the planning fallacy.

"Hopefully" is not a prediction!

Sort of? Indeed this is more accessible to smaller groups than training big models, but small groups don't have access to the biggest models, and you can still do a bunch of non-mechinterpy things with the models you do have, so the effect isn't super overwhelming.

(In case this isn't a joke, Mars Hill church was named after Mars Hill / the Areopagus / Hill of Ares, which in the New Testament is where the apostle Paul gives a speech to a bunch of pagans about Jesus. That hill is named after the Greek god. The church was located on Earth, in particular in Seattle.)

Note that this post does not encourage people to refrain from being politically active, or to totally abstain from making political donations.

I don't know. I would imagine it's more like "it's bad to donate to the 'wrong' party" than "it's good to donate to the 'right' party".

In this comment we list the names of some of our advisors.

Below is a list of some of the advisors we used for mentor selection. Notes:

  • Two advisors asked not to be named and do not appear here.
  • Advisors by and large focussed their efforts on areas they had some expertise in.
  • Advisors had to flag conflicts of interest, meaning that (for example) we did not take their ratings of themselves into account.

With that out of the way, here are some advisors who helped us for the Winter 2024-25 cohort:

  • Adam Gleave
  • Alex Lawsen
  • Buck Shlegeris
  • Ethan Perez
  • Lawrence Chan
  • Lee Sharkey
  • Lewis Hammond
  • Marius Hobbhahn
  • Michael Aird
  • Neel Nanda

The above people also advised us for the Summer 2025 cohort. We also added the following advisors for that cohort:

  • Alexander Gietelink Oldenziel
  • Ben Garfinkel
  • Caspar Oesterheld
  • Jesse Clifton
  • Nate Thomas