Gyrodiot

I'm Jérémy Perret. Based in France. PhD in AI (NLP). AI Safety & EA meetup organizer. Information sponge. Mostly lurking since 2014. Seeking more experience, and eventually a position, in AI safety/governance.

Extremely annoyed by the lack of an explorable framework for AI risk/benefits. Working on that.

Sequences

Methods of Psychomagic
XiXiDu's AI Risk Interview Series

Comments

Related recent post: Intelligence Is Not Magic, But Your Threshold For "Magic" Is Pretty Low (similar point, focused on human short-timeframe feats rather than technological achievements).

Format note: your list is missing a number 3.

Two separate points:

  • compared to physics, the field of alignment has a slow-changing set of questions (e.g. corrigibility, interpretability, control, goal robustness) but a fast-evolving subject matter, as capabilities progress. I use the analogy of a biologist suddenly working in a place where evolution runs 1000x faster: some insights get stale very fast, and it's hard to know in advance which ones. Keeping up with the frontier, then, serves to check whether one's work still seems relevant (or where to send newcomers). Agent foundations, as a class of research agendas, was the answer to this volatility, but progress is slow and the ground keeps shifting.
  • there is some effort to unify alignment research, or at least to provide a textbook that gets one to the frontier. My prime example is the AI Safety Atlas; I would also consider the BlueDot courses as structure-building, and AIsafety.info as giving some initial directions. There is also a host of papers attempting to categorize the sub-problems, but they are not focused on tentative answers.

Ah. Thank you for your attempt to get through to this community anyway, in the face of such incompatibility. Enjoy your freedom then; I hope you'll do better than us.

Alright. The first chunk of my frowning came from your claims about Rationality as a generic concept (and my immediate reaction to them). The second: I am puzzled by a few of your sentences.

Likewise, I consistently see Rationalists have no awareness or care of goals in the first place. Every human acts for a goal. If you don't set an external one, then your default one becomes the goals motivated by human motivations systems.

What do you make of Goal Factoring, one of the techniques designed to patch that class of behaviors? If I see a self-identified rationalist unaware of their own goals, and there are a bunch of them, goal factoring would be my first suggestion. I would expect them to be curious about it.

If improving your ability to think by going through the uncomfortable process of utilizing a system of the brain that you are unfamiliar with is not something that interests you, then this document is not for you.

Mostly unnecessary caveat; one of the main draws of this website is to study the flaws of our own lenses.

Please be undeterred by the negative karma; it's only a signal that this particular post may fail at its intended purpose. Namely:

I say all this to bring context to this document's demand that the reader does not ask for external justifications of claims. Instead, this document requires that readers test the concepts explored in this document in the real-world. It demands that the readers do not use validity-based reasoning to understand it.

...where is this document? Here I see a warning about the document, a surface clash of concepts, another warning about ignoring advice from other groups, and a bullet-point list with too little guidance on how to get those heuristics understood.

Listing the virtues is a starting point, but one does not simply say "go forth and learn for yourself what Good Strategy is" and see that done without a lot of nudging; otherwise one might stay in the comfort of "validity-based reasoning" and call it a day. Which I would find disappointing.

"Internal betting markets" may be a reference to the Logical Induction paper? Unsure it ties strongly to stop-button/corrigibility.

Hi! Nearly all the statements in your question would benefit from some unpacking. Could you expand on what would count as coddling? My intuition says you're gesturing at a whole heap of trade-offs, it might help to pick one in particular and study it further. Any proper answer to the question, as stated, is the stuff of entire books.

The post makes clear that two very different models of the world will lead to very different action steps, and the "average" of those steps isn't what follows from the average of the probabilities. See how awkward and technical that previous sentence felt, compared to the story? Sure, the story is much longer, but the point gets across better; that's the value. I have added this story to my collection of useful parables.

Re-reading it, the language remains technical; one needs to understand a bit more probability theory to get the later parts. I would like to see a retelling of the story, same points, different style, to test whether it speaks to a different audience.
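To make the averaging point concrete, here is a minimal sketch in Python (my own illustration, not from the post; the payoff numbers are assumptions picked for the example): each world-model recommends a different action, and the action chosen under the averaged probability is a single plan, not a blend of the two.

    # Minimal sketch (mine, not from the post); payoffs are illustrative assumptions.
    # Two models of the world disagree on the probability that a venture succeeds.
    p_optimist = 0.9
    p_pessimist = 0.1
    p_average = (p_optimist + p_pessimist) / 2  # 0.5

    # Expected value of each action as a function of the success probability p.
    actions = {
        "invest": lambda p: p * 100 + (1 - p) * -100,  # big win or big loss
        "hedge": lambda p: 10,                          # small sure gain
    }

    def best_action(p):
        return max(actions, key=lambda a: actions[a](p))

    print(best_action(p_optimist))   # invest (EV +80)
    print(best_action(p_pessimist))  # hedge  (EV +10 beats -80)
    print(best_action(p_average))    # hedge  (EV +10 beats 0)

The two models recommend incompatible plans, and the plan chosen under the averaged belief is just one of them (or a third one entirely, under other payoffs); there is no meaningful "average" of invest and hedge.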

I filled out the survey. Thank you so much for running this!
