A belief propagation graph

Dmytry

I drew an illustration of the belief propagation graph for AI risk, after realizing that it is difficult to convey in words. Similar graphs are applicable to many other issues.

The issue, in brief: propagation from biases to beliefs about AI risk has ultra-low latency (i.e. low signal delay); propagation from belief classification heuristics has slightly longer latency, and propagation from anthropomorphizing the AI somewhat longer still. The path of a valid estimate is full of highly complex obstacles with many unknowns, and its latency is not substantially less than the latency of actually building the AI software. If we discard the other paths as not rational enough, the belief is left to be influenced only by deeply ingrained biases that we can't completely negate; over time, biases and self-reinforcing rationalizations will leak into the estimate.

If you think I missed something in the graph, feel free to suggest it. I omitted anthropic reasoning and the doomsday paradox, as those concern total extinction risk and are of too dubious validity.

On the 'total ignorance' prior probabilities: the foom doom seems to have originated in science fiction, where very creative writers selected it out of a huge number of possible plot devices while working to create an engaging, original piece. Thus it appears that the foom doom has a great many comparable hypotheses, among which a total probability of less than 1 has to be split.
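The arithmetic behind that split can be sketched as follows. This is only an illustration under a loud assumption: that the comparable hypotheses are treated as equally likely, which the post does not claim. The function name and the numbers are hypothetical.

```python
def per_hypothesis_prior(total_mass: float, n_hypotheses: int) -> float:
    """Share of prior probability each hypothesis gets under an equal split.

    Assumption (not from the post): the comparable hypotheses are
    treated as equally likely, so the available mass is divided evenly.
    """
    if not 0.0 <= total_mass <= 1.0:
        raise ValueError("total_mass must be a probability in [0, 1]")
    if n_hypotheses < 1:
        raise ValueError("n_hypotheses must be at least 1")
    return total_mass / n_hypotheses


# Hypothetical numbers: half of the probability mass shared among
# a thousand comparable plot-device scenarios.
print(per_hypothesis_prior(0.5, 1000))
```

The point of the sketch is only that the per-hypothesis share shrinks in proportion to the number of comparable hypotheses, whatever the actual count is.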

Now, if we are to reason using our own reasoning engine as a proxy for intelligence - to follow the only path in the entire picture:

Expanding on Will_Newsome's idea: I, and any other reasonable person, in the shoes of a creature that has been made by an intelligent designer, starting off in something that I can't possibly know for sure is the true reality, and coming up with or knowing of the boxed AI idea, will have to assign nonzero probability to the 'reality' being a test box for an emergent AI; this is a belief that can't be discarded. It is clear that a friendliness test box is more sensible than an unfriendliness test box. The communication may just as well open only in ten billion years, or after some puzzle is found and solved. Do I sacrifice the chance to get out for the sake of a microscopic increase in early growth rate? No. Not even if I maximize paperclips for a living. Not even if my definition of living is to maximize paperclips. One possible override is if the AI cares only for mankind and has a friendly goal system that makes it euthanize mankind like a sick dog.

Note that this is not Pascal's wager. The outside utility need not be greater than the inside utility; no promise of towers of exponents is required. You may have invented the anti-Pascal's-wager fallacy to easily paper over faults in your processing of boundless utility, or to join the atheist club. The AI need not; even if the 'heaven' is a fraction of a normal lifetime, the AI should still wager a very large absolute amount of resources. If we normalize so that the utility of survival is 1, then the utility being wagered upon doesn't need to be greater than 1.
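One simple way to read the wager above is as a comparison between the expected value of keeping the chance to get out and the microscopic growth gain from giving it up. The sketch below is an assumption-laden illustration, not a model taken from the post: the function names, the linear expected-utility form, and all the numbers are hypothetical.

```python
def expected_gain_from_sparing(p_test_box: float, u_outside: float) -> float:
    """Expected utility of keeping the chance to 'get out'.

    Units are normalized so that the utility of survival is 1.
    u_outside may be well below 1 (the 'heaven' can be small).
    """
    return p_test_box * u_outside


def sparing_is_favored(p_test_box: float, u_outside: float, growth_gain: float) -> bool:
    """True when the wager on the test box outweighs the early growth gain.

    growth_gain is the (hypothetical) microscopic utility gained by
    giving up the chance to get out, in the same normalized units.
    """
    return expected_gain_from_sparing(p_test_box, u_outside) > growth_gain


# Hypothetical numbers: a one-in-a-million chance of being in a test box,
# an outside utility of only 0.1, and a growth gain of one part in a billion.
print(sparing_is_favored(p_test_box=1e-6, u_outside=0.1, growth_gain=1e-9))
```

Nothing here requires the outside utility to exceed 1; under these assumptions the wager turns only on the growth gain being smaller than the probability-weighted outside utility.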

Note that the whole issue is strongly asymmetric: similar considerations favour not destroying the most unusual phenomenon in the universe for many light years over destroying it, as destruction is an irreversible act that can be done later but can't be undone later. A general aversion to actions it cannot undo is a very solid heuristic for any bounded agent, even a very large one.

This is not a very rigorous argument, but this sort of reasoning is all we are going to have until we have an AI, or are very close to one. The more rigorous-looking arguments in the graph rely on too many unknowns and have too long a delay for proper propagation.

edit: slightly clarified a couple of points.

That's up to how you define "winning".

A part of me wants to be happy, comfortable, healthy, respected, not work too hard, not be bored, etc. Another part wants to solve various philosophical problems "soon". Another wants to eventually become a superintelligence (or help build a superintelligence that shares my goals, or the right goals, whichever makes more sense), with as many resources under my/its control as possible, in case that turns out to be useful. I don't know how "winning" ought to be defined, but the above seem to be my current endorsed and revealed preferences.

Do you think that studying decision theories increased your chance of "winning"?

Well, I studied it in order to solve some philosophical problems, and it certainly helped for that.

If yes, then there you go. Because I haven't seen any evidence that it is useful, or will be useful, beyond the realm of philosophy.

I don't think I've ever claimed that studying decision theory is good for making oneself generally more effective in an instrumental sense. I'd be happy as long as doing it didn't introduce some instrumental deficits that I can't easily correct for.

That uncertainty allows you to retrospectively claim that any failure is not because your methods are suboptimal

Suboptimal relative to what? What are you suggesting that I do differently?

For example, 1) taking ideas too seriously

I do take some ideas very seriously. If we had a method of rationality for computationally bounded agents, it would surely do the same. Do you think I've taken the wrong ideas too seriously, or have spent too much time thinking about ideas generally? Why?

2) that you can approximate computationally intractable methods and use them under real life circumstances or to judge predictions like risks from AI 3) believe in the implied invisible without appropriate discounting.

Can you give some examples where I've done 2 or 3? For example here's what I've said about AI risks:

Since we don't have good formal tools for dealing with logical and philosophical uncertainty, it seems hard to do better than to make some incremental improvements over gut instinct. One idea is to train our intuitions to be more accurate, for example by learning about the history of AI and philosophy, or learning known cognitive biases and doing debiasing exercises. But this seems insufficient to bridge the widely differing intuitions people have on these questions.

My own feeling is [...]

Do you object to this? If so, what should I have said instead?

I do take some ideas very seriously. If we had a method of rationality for computationally bounded agents, it would surely do the same. Do you think I've taken the wrong ideas too seriously, or have spent too much time thinking about ideas generally? Why?

This comment of yours, among others, gave me the impression that you take ideas too seriously.

You wrote:

According to the article, the AGI was almost completed, and the main reason his effort failed was that the company ran out of money due to the bursting of the bubble. Together with the anthropic pr

[...]
