I'm giving a talk to the Boulder Future Salon in Boulder, Colorado in a few weeks on the Intelligence Explosion hypothesis. I've given it once before in Korea but I think the crowd I'm addressing will be more savvy than the last one (many of them have met Eliezer personally). It could end up being important, so I was wondering if anyone considers themselves especially capable of playing Devil's Advocate so I could shape up a bit before my talk? I'd like there to be no real surprises.
I'd be up for just messaging back and forth or skyping, whatever is convenient.
When somebody says they are doing A for reason X, then reason X is criticized and they claim they have actually been doing A for reason Y all along, I tend to be wary.
In this case A is "research on mathematical logic and formal proof systems",
X is "self-improving AI is unboxable and untestable; we need to get it provably right on the first try",
and Y is "Our research tends to be focused on mathematical logic and proof systems these days because those are expressive frameworks with which to build toy models that can give researchers some general insight into the shape of the novel problems of AGI control".
If Y is better than X, as it seems to me in this case, this is indeed an improvement, but when you modify your reasons and somehow conclude that your previously chosen course of action is still optimal, then I doubt your judgment.
Well... (trigger warning...)
"And if Novamente should ever cross the finish line, we all die. That is what I believe or I would be working for Ben this instant."
"I intend to plunge into the decision theory of self-modifying decision systems and never look back. (And finish the decision theory and implement it and run the AI, at which point, if all goes well, we Win.)"
"Take metaethics, a solved problem: what are the odds that someone who still thought metaethics was a Deep Mystery could write an AI algorithm that could come up with a correct metaethics? I tried that, you know, and in retrospect it didn’t work."
"Find whatever you’re best at; if that thing that you’re best at is inventing new math[s] of artificial intelligence, then come work for the Singularity Institute. [ ... ] Aside from that, though, I think that saving the human species eventually comes down to, metaphorically speaking, nine people and a brain in a box in a basement, and everything else feeds into that."
But where did somebody from MIRI say "we need to get it provably right on the first try"? Also, what would that even mean? You can't write a formal specification that includes the entire universe and then formally verify an AI against it. I couldn't find any Yudkowsky quotes about "getting it provably right on the first try" at the link you provided.