You admit that friendliness is guaranteed.
Typo?
In order to get that provably friendly thing to work
Again, I think "provably friendly thing" mischaracterizes what MIRI thinks will be possible.
I'm not sure exactly what you're saying in the rest of your comment. Have you read the section on indirect normativity in Superintelligence? I'd start there.
Typo?
Fixed.
Again, I think "provably friendly thing" mischaracterizes what MIRI thinks will be possible.
From what I can gather, there's still supposed to be some kind of proof, even if it's just the mathematical kind where you're not really certain because there might be an error in it. The intent is to have some sort of program that maximizes a utility function U, and then to explicitly write that utility function as something along the lines of "do what I mean".
...Have you read the section on indirect normativity in Superintelligence?
I'm giving a talk to the Boulder Future Salon in Boulder, Colorado, in a few weeks on the Intelligence Explosion hypothesis. I've given it once before in Korea, but I think the crowd I'm addressing will be more savvy than the last one (many of them have met Eliezer personally). It could end up being important, so I was wondering whether anyone considers themselves especially capable of playing devil's advocate, so that I could shape up a bit before my talk. I'd like there to be no real surprises.
I'd be up for just messaging back and forth or Skyping, whatever is convenient.