Gyrodiot

Review of "Learning Normativity: A Research Agenda"

Introduction: We (Adam Shimi, Joe Collman, and I) are trying to emulate peer review feedback for Alignment Forum posts. This is the second review in the series. The first’s introduction sums up our motivation and approach rather well, so we will not duplicate it here. Instead, let’s dive into today’s reviewed...

Jun 6, 2021 · 37

Review of "Fun with +12 OOMs of Compute"

Introduction: This review is part of a project with Joe Collman and Jérémy Perret to try to get as close as possible to peer review when giving feedback on the Alignment Forum. Our reasons behind this endeavor are detailed in our original post asking for suggestions of works to review;...

Mar 28, 2021 · 65

Learning from counterfactuals

Alternate title: learning from fictional evidence. I've seen echoes of this idea elsewhere but couldn't find a description that suits me. My main idea is: you can update from your observed reaction to fiction and/or counterfactuals. The fallacy of generalizing from fictional evidence happens when you treat events having happened...

Nov 25, 2020 · 11

Mapping Out Alignment

This week, in the key alignment group, we answered two questions, 5-minute timer style: 1. Map out all of alignment (25 minutes) 2. Create an image/table representing alignment (10 minutes) You are free to stop here and actually try to answer the questions yourself. Here is a link for a...

Aug 15, 2020 · 43

Resources for AI Alignment Cartography

I want to make an actionable map of AI alignment. After years of reading papers, blog posts, online exchanges, books, and occasionally hidden documents about AI alignment and AI risk, and having extremely interesting conversations about it, most arguments I encounter now feel familiar at best, rehashed at worst. This...

Apr 4, 2020 · 46

Layers of Expertise and the Curse of Curiosity

Epistemic status: oversimplification of a process I'm confident about; meant as proof of concept. Related to: Double-Dipping in Dunning-Kruger. Expertise comes in different, mostly independent layers. To illustrate them, I will describe the rough process of a curious mind discovering a field of study. Discovery: In the beginning, the Rookie...

Feb 12, 2019 · 19

Willpower duality

Rationality is designed to make you win, to help you attain your objectives. One of the most prominent phenomena getting in the way is akrasia, the lack of willpower that prevents us from performing whatever action we want to do. So, wait, do we want to act or not? There are...

Jan 20, 2017 · 10
