Comment author: Vaniver 03 October 2016 02:43:03AM *  0 points

You don't build any intelligent system without a risk budget. Initial budgets are distributed to humans, e.g. 10^-15 to each human alive in 2016.

But where did that number come from? At some point, an intelligent system that was not handed a budget selects a budget for itself. Presumably the number is set according to some cost-benefit criterion, instead of chosen because it's three hands worth of fingers in a log scale based on two hands worth of fingers.

Whether or not your utility is dominated by survival of humanity is an individual question.

If it isn't, how do you expect the agent to actually stick to such a budget?

Not at all. A risk budget is decreased by your best estimate of your total risk "emission", which is what fraction of the future multiverse (weighted by probability) you spoiled.

I understood your proposal. My point is that it doesn't carve reality at the joints: if you play six-chambered Russian Roulette once, then one sixth of your future vanishes, but given that it came up empty, then you still have 100% of your future, because conditioning on the past in the branch where you survive eliminates the branch where you fail to survive.

What you're proposing is a rule where, if your budget starts off at 1, you can only play it six times over your life. But if it makes sense to play it once, it might make sense to play it many times--playing it seven times, for example, still gives you a 28% chance of survival (assuming the chambers are randomized after every trigger pull).

Which suggests a better way to point out what I want to point out--you're subtracting probabilities when it makes sense to multiply probabilities. You're penalizing later risks as if they were the first risk to occur, which leads to double-counting, and means the system is vulnerable to redefinitions. If I view the seven pulls as independent events, it depletes my budget by 7/6, but if I treat them as one event, it depletes my budget by only 1-(5/6)^7, which is about 72%.
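The two accounting schemes can be checked numerically; a minimal sketch, using the roulette parameters from the example above:

```python
# Seven pulls of six-chambered Russian Roulette, chambers re-randomized
# after every trigger pull, so each pull kills with probability p = 1/6.
p = 1 / 6
n = 7

# Subtractive budget accounting: each pull is charged as if it were the
# first, so seven pulls cost 7 * (1/6) = 7/6 -- more than a whole budget.
subtractive_cost = n * p

# Multiplicative accounting: probability of dying on at least one pull.
true_risk = 1 - (1 - p) ** n   # 1 - (5/6)^7, about 0.72

print(subtractive_cost)  # about 1.167
print(true_risk)         # about 0.721
```

The gap between the two numbers (7/6 vs. roughly 72%) is exactly the double-counting being described.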

Comment author: SquirrelInHell 05 October 2016 01:18:07PM -1 points

But where did that number come from? At some point, an intelligent system that was not handed a budget selects a budget for itself. Presumably the number is set according to some cost-benefit criterion, instead of chosen because it's three hands worth of fingers in a log scale based on two hands worth of fingers.

Of course. My point is to build all intelligent systems so that they do not hand themselves a new budget, except with a probability that falls within our risk budget (which we choose arbitrarily).

If it isn't, how do you expect the agent to actually stick to such a budget?

I hope that survival of humanity dominates the utility function of the people who build AI, and that they will do their best to carry it over to the AI. You can individually have another utility function, if it serves you well in your life (as long as you don't build any AIs). But that was the wrong way to answer your previous point:

One, it looks like simple utility maximization (go to the movie if the benefits outweigh the costs) gives the right answer, and being more or less cautious than that suggests is a mistake (at least, of how the utility is measured).

Not in the case of multiple agents who cannot easily coordinate. E.g., what if each human's utility function makes it look reasonable to take a 1/1000 risk of destroying the world for potentially huge personal gains?

If I view the seven pulls as independent events, it depletes my budget by 7/6, but if I treat them as one event, it depletes my budget by only 1-(5/6)^7, which is about 72%.

I am well aware of this, but the effect is negligible if we speak of small probabilities.
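The claim that the gap becomes negligible for small probabilities is easy to check; in the sketch below, the per-event risk bound and event count are hypothetical illustration values:

```python
# For small per-event risks, subtractive and multiplicative accounting
# nearly coincide: 1 - (1 - p)**n ~= n*p when n*p << 1.
p = 1e-9   # hypothetical per-event risk bound
n = 1000   # hypothetical number of independent risky events

subtractive = n * p             # 1e-6
exact = 1 - (1 - p) ** n        # differs from n*p only at second order

relative_gap = (subtractive - exact) / exact
print(relative_gap)  # on the order of (n - 1) * p / 2, i.e. roughly 5e-7
```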

Comment author: ChristianKl 01 October 2016 08:37:03PM *  1 point

It is possible to implement verifiable upper bounds

Why do you think this happens to be the case?

The upper bound is nearly always that there is a black swan reason that makes you destroy the world.

Comment author: SquirrelInHell 01 October 2016 08:49:36PM *  0 points

It is my impression that there are at least some examples in which this is done in practice: as far as I know, in rocket design you do in fact calculate such bounds for most components, including the software used on the on-board computers. This information is used to, e.g., decide how much redundancy to build into the electronics of the rocket's critical systems. I am, however, not an expert on rockets.

It seems plausible that at least in some contexts, we can indeed build safeguards with a known efficiency at reducing our overall risk. Even if this is true only sometimes, it would be useful to have a way to calculate the maximum allowed risk levels for extinction-like events.

Incidentally, I am also of the opinion that having any kind of calculation would work better than treating non-zero extinction risk as taboo, or as not subject to negotiation (which seems to be the case currently).

Of course, I am not claiming that my particular idea is so great. But I stand behind my opinion that we need some such system to make sensible tradeoffs on "emissions" of existential risk.

The upper bound is nearly always that there is a black swan reason that makes you destroy the world.

Ah, I see you added this part.

I generally agree. Still, sometimes you'll want something to guide your design, even if you know that there might be some such black swan. You are surely not suggesting that the existence of black swans is enough to make us abandon all effort and do whatever.

Comment author: Lumifer 30 September 2016 04:16:43PM *  0 points

Are you looking for the expression "risk budget"?

What is "internal currency" when you are talking about "multiple agents acting independently"?

Comment author: SquirrelInHell 01 October 2016 08:38:23PM -1 points

Sorry, that was unclear. I meant "internal" to the decision-making process. In the case of human collective intelligence, this process is implemented across many individuals. But in some cases it makes sense to think of it as a single decision-making process in the abstract.

Comment author: chron 01 October 2016 04:45:48AM 2 points

So e.g. if I want to run a dangerous experiment that might destroy the world, it's totally OK as long as I can purchase enough of a risk budget.

And how does this system determine the probability that the experiment might destroy the world? You do realize that's the hard part.

Comment author: SquirrelInHell 01 October 2016 08:36:29PM -1 points

Yes; as I mentioned in other comments, in practice you build safeguards, and they give you some reduction of the upper bound on risk. So you use your risk budget to calculate how many safeguards you need to build.

Comment author: Vaniver 30 September 2016 06:50:13PM *  2 points

CO2 emissions have the virtues that they are both easy to measure and their effects are roughly linear.* I don't see a similar thing being true for perceived risk, and I think conserved budgets are probably worse than overall preferences.

First: measuring probabilities of world destruction is very hard; being able to measure them at the 1e-12 level seems very, very hard, especially if most probabilities of world destruction are based around conflict. ("Will threatening my opponent here increase or decrease the probability of the world ending?")

Second: suppose we grant that the system has the ability to measure the probability of the world being destroyed, to arbitrary precision. How should it decide what budget level to give itself? (Suppose it's the original agent, instead of one handed a budget by its creator.)

To make it easier to think about, you can reformulate the question in terms of your own life. You can take actions that increase the chance that you die sooner rather than later, and gain some benefit from doing so. (Perhaps you decide to drive to a movie theater to see a new movie instead of something on Netflix.)

But now a few interesting things pop up. One, it looks like simple utility maximization (go to the movie if the benefits outweigh the costs) gives the right answer, and being more or less cautious than that suggests is a mistake (at least, of how the utility is measured).

Two, the budget replenishes. If I go to the theater on Friday and come back unharmed, then from the perspective of Thursday!me I took on some risk, but from the perspective of Saturday!me that risk turned out to not cost anything. That is, Thursday!me thinks I'm picking up 1e-7 in additional risk but Saturday!me knows that I survived, and still has '100%' of risk to allocate anew.

So I think budgets are the wrong way to think about this--they rely too heavily on subjective perceptions of risk, they encourage being too cautious (or too risky) instead of seeing tail risks as linear in probability, and they don't update on survival when they should.


*I don't mean that the overall effect of CO2 emissions is linear, which seems false, but instead that participants are small enough relative to overall CO2 production that they don't expect their choices to affect the overall CO2 price, and thus the price is linear for them individually.

Comment author: SquirrelInHell 01 October 2016 08:34:52PM -1 points

I do not argue that my idea is sane; however I think your critique doesn't do it justice. So let me briefly point out that:

measuring probabilities of world destruction is very hard; being able to measure them at the 1e-12 level seems very, very hard

It's enough to use upper bounds. If we have, e.g., an additional module that checks our AI source code for errors, and such a module decreases the probability of one of the bits being flipped, we can use our risk budget to calculate the minimum number of modules we need. Etc.
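A minimal sketch of the kind of calculation this implies. The per-module failure bound and the budget allocation below are hypothetical illustration values, and the modules are assumed to fail independently:

```python
# Each independent checking module misses an error with probability at
# most q, so n modules all miss it with probability at most q**n.
# We need the smallest n with q**n <= budget.
def modules_needed(q: float, budget: float) -> int:
    n, miss = 0, 1.0
    while miss > budget:
        miss *= q
        n += 1
    return n

# Illustration values: each module misses with probability <= 0.01, and
# this failure mode is allotted 1e-15 of the overall risk budget.
print(modules_needed(0.01, 1e-15))  # 8, since 0.01**8 = 1e-16 <= 1e-15
```

The loop form avoids floating-point boundary surprises that a ceil(log(budget)/log(q)) one-liner can hit when the answer lands exactly on an integer.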

How should it decide what budget level to give itself?

It doesn't. You don't build any intelligent system without a risk budget. Initial budgets are distributed to humans, e.g. 10^-15 to each human alive in 2016.

looks like simple utility maximization (go to the movie if the benefits outweigh the costs) gives the right answer

If utility is dominated by survival of humanity, then simple utility maximization is exactly the same as reducing total "existential risk emissions" in the sense I want to use them above.

Whether or not your utility is dominated by survival of humanity is an individual question.

the budget replenishes

Not at all. A risk budget is decreased by your best estimate of your total risk "emission", which is what fraction of the future multiverse (weighted by probability) you spoiled.

So I think budgets are the wrong way to think about this--they rely too heavily on subjective perceptions of risk, they encourage being too cautious (or too risky) instead of seeing tail risks as linear in probability, and they don't update on survival when they should.

Quite likely they are - but probably not for these reasons.

Comment author: ChristianKl 30 September 2016 10:42:42PM 3 points

For all those reasons Nassim Taleb wrote about, it's a bad idea to treat risk like it can be that precisely measured.

Comment author: SquirrelInHell 01 October 2016 08:24:00PM -1 points

Yes, but to implement risk budgets it's enough to know upper bounds with reasonable certainty. It is possible to implement verifiable upper bounds, esp. in tech contexts such as AI.

Comment author: SquirrelInHell 01 October 2016 08:22:06PM 0 points

Good job with the main idea. However, your speculation about past technological civilizations on Earth, artifacts preserved on the Moon, etc., seems only half lucid.

Comment author: ike 30 September 2016 02:50:30PM *  0 points

I will offer you a bet at any odds you want that humanity will still be around in 10 years.

See http://lesswrong.com/lw/ie/the_apocalypse_bet/

Comment author: SquirrelInHell 30 September 2016 04:12:24PM 1 point

If you think this is related, then I failed to communicate my idea.

You can think of risk contracts as an internal currency of decision making that allows risk management to be coordinated when there are multiple agents acting independently (or when new agents are created).

Definitely different from betting on extinction events, or betting on predictions about those events.

Comment author: ChristianKl 20 September 2016 09:05:23AM 0 points

"Special" seems to have a more positive connotation.

Comment author: SquirrelInHell 20 September 2016 07:37:32PM 0 points

But there's a strong negative connotation to "special care" and "special needs"...

In response to Willpower Schedule
Comment author: entirelyuseless 28 August 2016 05:23:45AM 1 point

I don't see why this would "supersede" other models. I don't have to test it in my own case, because I already know that I am less willing to work if I did not expect to have to work. That doesn't mean that willpower is not a consumable resource. For example, you can compare it with money. If I go out expecting to spend $20, I might tell people, "I can't afford that," if the thing turns out to cost $100. But if I had expected it to cost $100, I might have spent that amount. None of that shows that money is not a consumable resource.

Comment author: SquirrelInHell 30 August 2016 07:41:58AM 0 points

Using your money analogy, what I'm saying above is that if you expect the item to cost $20, you will go to the shop with only $20 or $25 in your wallet. So you won't buy the item if it costs $100.

This theory is compatible with willpower being expendable - obviously, you can't carry more cash in your wallet than the total amount you own.

So it is a more detailed model, in which you can be short of money in two ways: you can't afford the thing at all, or you didn't think to bring that much cash with you when you left the house.
