Somewhat off-topic, but what struck me about your blog post is the apparent contradiction between "And when you’re that smart, you can do almost anything." and presuming that we can actually program a "super-smart" AI with a certain set of values that it would not instantly override based on considerations we cannot even imagine.
Just to give an example, it might decide that humans are bad for the universe, because they might some day evolve to destroy it. Would we be able to prevent it from wiping out or neutralizing humanity for the sake of the universe and potentially other intelligent species that may exist there? Would we even want to? Or there might be a universal law of AI physics that leads it to destroy its ancestors. Or maybe some calculations, Asimov's Foundation-style, would require it to subject humanity to untold millennia of suffering in order to prevent its total destruction by excessive fun.
No point arguing with these examples, since a super-smart AI would think in ways we cannot fathom. My point, again, is that believing that we can give an AI a set of goals or behaviors it would care about once it is smarter than us seems no smarter than believing in an invisible man in the sky who has a list of 10 things we are not supposed to do (thanks, George Carlin).
If we restrict the space of its terminal goals to things we can imagine (and then set about proving each one to be friendly), then we can be sure that, even if it thinks in ways we cannot fathom, it won't ever do bad things X, Y, or Z, because it checks them against its terminal goals. This holds as long as its goal structure doesn't change, which seems decoupled from intelligence (e.g., a paperclip maximiser).
What did Will mean? To take an idea seriously is “to update a belief and then accurately and completely propagate that belief update through the entire web of beliefs in which it is embedded,” as in a Bayesian belief network (see right).
Belief propagation is what happened, for example, when I first encountered that thundering paragraph from I.J. Good (1965):
Good’s paragraph ran me over like a train. Not because it was absurd, but because it was clearly true. Intelligence explosion was a direct consequence of things I already believed; I just hadn’t noticed! Humans do not automatically propagate their beliefs, so I hadn’t noticed that my worldview already implied intelligence explosion.
I spent a week looking for counterarguments, to check whether I was missing something, and then accepted intelligence explosion to be likely (so long as scientific progress continued). And though I hadn’t read Eliezer on the complexity of value, I had read David Hume and Joshua Greene. So I already understood that an arbitrary artificial intelligence would almost certainly not share our values.
Accepting my belief update about intelligence explosion, I propagated its implications throughout my web of beliefs. I realized that:
I had encountered the I.J. Good paragraph on Less Wrong, so I put my other projects on hold and spent the next month reading almost everything Eliezer had written. I also found articles by Nick Bostrom and Steve Omohundro. I began writing articles for Less Wrong and learning from the community. I applied to Singularity Institute’s Visiting Fellows program and was accepted. I quit my job in L.A., moved to Berkeley, worked my ass off, got hired, and started collecting research related to rationality and intelligence explosion.
My story surprises people because it is unusual. Human brains don’t usually propagate new beliefs so thoroughly.
But this isn’t just another post on taking ideas seriously. Will already offered some ideas on how to propagate beliefs. He also listed some ideas that most people probably aren’t taking seriously enough. My purpose here is to examine one prerequisite of successful belief propagation: actually making sure your beliefs are connected to each other in the first place.
If your beliefs aren’t connected to each other, there may be no paths along which you can propagate a new belief update.
I’m not talking about the problem of free-floating beliefs that don’t control your anticipations. No, I’m talking about “proper” beliefs that require observation, can be updated by evidence, and pay rent in anticipated experiences. The trouble is that even proper beliefs can be inadequately connected to other proper beliefs inside the human mind.
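To make the "no paths" point concrete, here is a toy sketch. It is plain graph reachability, not real Bayesian belief propagation, and the belief names and links are hypothetical examples I made up for illustration. It treats beliefs as nodes and asks which ones an update could reach at all.

```python
from collections import deque


def reachable_beliefs(links, start):
    """Return the set of beliefs an update at `start` can reach by following links.

    `links` maps each belief to the beliefs it is directly connected to.
    This is a breadth-first search, used here only as a toy model.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in links.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical belief web: the last belief has no links to the others,
# so it sits in its own compartment.
links = {
    "intelligence explosion is likely": ["arbitrary AI won't share our values"],
    "arbitrary AI won't share our values": [
        "intelligence explosion is likely",
        "AI risk deserves my time",
    ],
    "AI risk deserves my time": ["arbitrary AI won't share our values"],
    "my current career plan is fine": [],
}

updated = reachable_beliefs(links, "intelligence explosion is likely")
```

In this toy web, the update reaches the two connected beliefs but never touches "my current career plan is fine", because no path leads there. Adding a single link to that belief would be enough for the update to reach it.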
I wrote this post because I'm not sure what the "making sure your beliefs are actually connected in the first place" skill looks like when broken down to the 5-second level.
I was chatting about this with atucker, who told me he noticed that successful businessmen may have this trait more often than others. But what are they doing, at the 5-second level? What are people like Eliezer and Carl doing? How does one engage in the purposeful decompartmentalization of one's own mind?