Two ideas for alignment: perpetual mutual distrust and induction
Two ideas I have for alignment (they may already exist, or may not be great; I am not exhaustively read on the topic).

Idea 1: Two agents in mutual distrust of each other. Intuitively, alignment is a difficult problem because it is hard to know what an AI ostensibly less capable...
May 25, 2023