MIRI intends to make an AI that is provably friendly. This would require having a formal definition of friendliness that means exactly what it's supposed to mean, and then proving that the AI satisfies it. Either of those steps seems highly unlikely to be completed without error.
MIRI intends to make an AI that is provably friendly.
I really wish people would stop repeating this claim. See "Mathematical Proofs Improve But Don’t Guarantee Security, Safety, and Friendliness."
In a few weeks I'm giving a talk on the Intelligence Explosion hypothesis to the Boulder Future Salon in Boulder, Colorado. I've given it once before, in Korea, but I think this crowd will be more savvy than the last one (many of them have met Eliezer personally). It could end up being important, so I was wondering whether anyone considers themselves especially capable of playing devil's advocate, so that I can shape up a bit before my talk. I'd like there to be no real surprises.
I'd be up for just messaging back and forth or Skyping, whichever is more convenient.