Gyrodiot

I'm Jérémy Perret. Based in France. PhD in AI (NLP). AI Safety & EA meetup organizer. Information sponge. Mostly lurking since 2014. Seeking more experience, and eventually a position, in AI safety/governance.

Extremely annoyed by the lack of an explorable framework for AI risk/benefits. Working on that.

Sequences

XiXiDu's AI Risk Interview Series

Comments

Hi! Nearly all the statements in your question would benefit from some unpacking. Could you expand on what would count as coddling? My intuition says you're gesturing at a whole heap of trade-offs; it might help to pick one in particular and study it further. Any proper answer to the question, as stated, is the stuff of entire books.

The post makes clear that two very different models of the world lead to very different action steps, and that the "average" of those steps isn't what follows from the average of the probabilities. See how that previous sentence felt awkward and technical compared to the story? Sure, the story is much longer, but the point gets across better; that's its value. I have added this story to my collection of useful parables.
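
A minimal sketch of that point, with made-up numbers (the scenario and payoffs are mine, not the post's): under each individual world model the expected-utility maximizer is a different extreme action, while under the averaged probability it is a third action that neither model alone would pick.

```python
# Two world models disagree sharply about P(disaster).
p_model_a = 0.9  # model A: disaster very likely
p_model_b = 0.1  # model B: disaster very unlikely

# Payoffs per action: (payoff if disaster, payoff if no disaster).
actions = {
    "evacuate": (-20, -20),  # fixed cost, paid either way
    "stay":     (-100, 0),   # free, unless disaster strikes
    "fortify":  (-29, -9),   # moderate cost that dampens the loss
}

def expected_utility(p_disaster, payoffs):
    on_hit, on_miss = payoffs
    return p_disaster * on_hit + (1 - p_disaster) * on_miss

def best_action(p_disaster):
    return max(actions, key=lambda a: expected_utility(p_disaster, actions[a]))

print(best_action(p_model_a))                    # -> evacuate
print(best_action(p_model_b))                    # -> stay
print(best_action((p_model_a + p_model_b) / 2))  # -> fortify
```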

Re-reading it, I find the language remains technical; one needs to understand a bit more probability theory to get the later parts. I would like to see a retelling of the story, same points, different style, to test whether it speaks to a different audience.

I filled out the survey. Thank you so much for running this!

Oh, glad I scrolled to find this comment. Adding a request for France, which does have charity tax deductions... but needs an appropriate receipt.

Best weekend of the year. Attended in 2017, 2018, 2019, and 2023; will be delighted to attend again. A consistent source of excellent discussions, assorted activities, fun, and snacks. Does indeed feel like home.

My raw and mostly confused/snarky comments as I was going through the paper can be found here (third section).

Cleaner version: this is not a technical agenda, and not something that would elicit interesting research questions from a technical alignment researcher. There are, however, interesting claims about:

  • what a safe system ought to be like (the paper proposes three scales describing its reliability);
  • how far up those scales we should aim, at minimum;
  • how low currently deployed large models sit on those scales.

While it positions a variety of technical agendas (mainly those of the co-authors) on the scales, the paper does not advocate for a particular approach, only for the broad direction of "here are the properties we would like to have". Uncharitably, it's a reformulation of the problem.

The scales can be useful for comparing agendas that belong to the "let's prove that the system adheres to this specification" family. The paper makes no claims about what the specification should entail, nor about the failure modes of various (combinations of) levels.

I appreciate this paper as a gateway to the related agendas and relevant literature, but I'm not enthusiastic about it.
