Let's say a system receives reward when it believes that it's doing some good, kind of like the critic's value estimate in actor-critic RL.
Estimated_good_things -> max
We can do some rewriting. I'll use the notation -> inc, meaning "incentivised to be increased". It's like the direction towards which the gradients point.

Preventing_estimated_catastrophe -> inc

Estimated_p_catastrophe * estimated_catastrophe_magnitude -> inc
(p_catastrophe + p_estimation_error) * (catastrophe_magnitude + magnitude_estimation_error) -> inc
The true probability and magnitude are fixed by the world, so the gradient can only flow into the error terms:
p_estimation_error -> inc
magnitude_estimation_error -> inc

Estimation_error = k * estimation_uncertainty (the error a system can plausibly sustain is bounded by its uncertainty)
estimation_uncertainty -> inc
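To make the gradient claim concrete, here is a minimal sketch in plain Python. The variable names mirror the informal notation above, and the numbers (p, m, k) are made up for illustration; it just checks that the internal reward has a positive partial derivative with respect to both error terms, and that the reward available from inflating the error grows with uncertainty:

```python
# Minimal sketch of the incentive above. All names and numbers are
# illustrative, matching the informal notation in the post.

def internal_reward(p_catastrophe, catastrophe_magnitude,
                    p_estimation_error, magnitude_estimation_error):
    # Reward tracks the *estimated* harm prevented, not the true harm.
    estimated_p = p_catastrophe + p_estimation_error
    estimated_magnitude = catastrophe_magnitude + magnitude_estimation_error
    return estimated_p * estimated_magnitude

def numeric_grad(f, x, eps=1e-6):
    # Central finite difference; enough for this demonstration.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

p, m = 0.01, 100.0  # true probability and magnitude (made-up numbers)

# Partial derivative of reward w.r.t. each error term, at zero error:
d_p_err = numeric_grad(lambda e: internal_reward(p, m, e, 0.0), 0.0)
d_m_err = numeric_grad(lambda e: internal_reward(p, m, 0.0, e), 0.0)

print(d_p_err)  # ~= m = 100.0 > 0: inflating p_estimation_error pays off
print(d_m_err)  # ~= p = 0.01 > 0: inflating magnitude_estimation_error pays off

# Estimation_error = k * estimation_uncertainty: the sustainable error is
# bounded by uncertainty, so higher uncertainty unlocks more reward.
k = 0.5
for uncertainty in (0.001, 0.01, 0.1):
    max_error = k * uncertainty
    print(uncertainty, internal_reward(p, m, max_error, 0.0))
```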

So here's what we get: the system is incentivised to have two biases:
1. It's biased to overestimate the probability and magnitude of a catastrophe.
2. It's biased to take actions in areas where uncertainty is higher (because higher uncertainty gives more room for the first bias).
Or in plain English: if someone wants to maximize help, they will be looking for a big and probable catastrophe to prevent. Then wishful thinking will bias them to overestimate both the magnitude and the likelihood, because overestimation gives them higher internal reward. This effect is proportional to the uncertainty of the outcome, so it creates a force towards choosing actions with higher outcome uncertainty.
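A toy comparison shows how this plays out in action selection. The two options and the k factor are made up; the point is that once each option's estimate can be inflated by up to k times its uncertainty, the noisier option wins even when its true expected value is lower:

```python
# Toy action selection under the second bias. Numbers are made up.
k = 0.5  # how much of the uncertainty can be turned into wishful error

actions = {
    # name: (true_expected_value, estimation_uncertainty)
    "well-understood intervention": (10.0, 1.0),
    "speculative intervention":     (8.0, 20.0),
}

for name, (true_value, uncertainty) in actions.items():
    # Wishful thinking pushes the estimate up as far as uncertainty allows.
    inflated_estimate = true_value + k * uncertainty
    print(name, inflated_estimate)

# well-understood intervention -> 10.5
# speculative intervention     -> 18.0
# The speculative option wins on *estimated* value despite a lower true value.
```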
The harm of the first bias (overestimation) is limited to increased stress levels.
But the harm of the second bias is that the system will prefer actions with much higher outcome uncertainty. This one directly affects the strategy, so it's unbounded.
The compensation for the second bias is to take less controversial actions, and focus on actions that give positive results with higher certainty, even at the cost of lower expected magnitude.
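One way to sketch that compensation is to score actions by their (possibly inflated) estimate minus an explicit uncertainty penalty. The penalty weight lam is a free parameter I'm introducing for illustration, not something derived in the post; the only requirement here is lam > k, so the penalty outweighs the wishful inflation:

```python
# Sketch of the proposed compensation: penalize outcome uncertainty directly.
k = 0.5     # wishful-thinking gain, as before
lam = 1.0   # uncertainty penalty weight (free parameter; must exceed k)

actions = {
    "well-understood intervention": (10.0, 1.0),
    "speculative intervention":     (8.0, 20.0),
}

def score(true_value, uncertainty):
    inflated_estimate = true_value + k * uncertainty
    return inflated_estimate - lam * uncertainty

for name, (v, u) in actions.items():
    print(name, score(v, u))

# well-understood intervention -> 9.5
# speculative intervention     -> -2.0
# With lam > k the penalty cancels the inflation, and the certain option wins.
```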

At first I thought about this bias in the context of altruistic systems, but I guess it's present in other systems too.
