One thing I'm interested in, but don't know where to start looking for, is people working in the reverse direction instead: mathematical approaches which show aligned AI is impossible or unlikely. By this I mean formal work that suggests something like "almost all AGIs are unsafe", in the same way that the chance of picking a rational number at random from [0,1] is zero because almost all real numbers are irrational.
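(To make the analogy concrete, here's a minimal sketch of the standard measure-zero argument, assuming X is drawn uniformly from [0,1]:)

```latex
% Q \cap [0,1] is countable: enumerate it as q_1, q_2, ...
% Cover each q_n by an open interval of length \varepsilon / 2^n.
\[
  \lambda\bigl(\mathbb{Q}\cap[0,1]\bigr)
    \le \sum_{n=1}^{\infty}\frac{\varepsilon}{2^{n}}
    = \varepsilon
  \quad\text{for every }\varepsilon>0,
  \qquad\text{so}\qquad
  \lambda\bigl(\mathbb{Q}\cap[0,1]\bigr)=0 .
\]
% For X drawn uniformly from [0,1], the probability of landing in a set
% equals its Lebesgue measure:
\[
  \Pr\bigl[X\in\mathbb{Q}\bigr]
    = \lambda\bigl(\mathbb{Q}\cap[0,1]\bigr)
    = 0 .
\]
```

The analogous claim would be that safe AGIs occupy a measure-zero (or vanishingly small) subset under some natural parameterization of the space of possible AGIs.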
I don't say this to be a downer! I mean it in the sense of a mathematician who spent 7 years attempting to prove X exists, ...
I have been working on an argument from that angle.
I've been developing it independently, drawing on my background in autonomous safety-critical hardware/software systems, but I recently discovered that it's very similar to Drexler's CAIS from 2019, except with more focus on the low-level evidence and rationale for why certain claims are justified.
It isn't so much a pure mathematical approach as it is a systems engineering or systems safety perspective on all of the problems[1][2] that would remain even if someone showed up tomorrow and dropped a formally verified...
No known solutions can solve our hardest problems—that’s why they’re the hardest ones.
I like the energy, but I have to register a note of dissent here.
Quite a few of our hardest problems do have known solutions - it's just that those known solutions are, or appear, too hard to implement.
I think your trust-o-meter is looking for people who have an unusually low level of self-deception. The energy is "Great if you share my axioms or moral judgments, but for Pete's sake, at least be consistent with your own."
What suggests this to me is the Breaking Bad example, because Walter White really does move on a slow gradient from more to less self-deceived throughout the show, in my read of his character - it just so happens that the less self-deceived he is, the more at home he becomes with perpetrating monstrous acts as a result of the previous history...