Value Deathism

26 Vladimir_Nesov 30 October 2010 06:20PM

Ben Goertzel:

I doubt human value is particularly fragile. Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology. I think it's fairly robust.

Robin Hanson:

Like Ben, I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors. The risks of attempting a world government anytime soon to prevent this outcome seem worse overall.

We all know the problem with deathism: a strong belief that death is almost impossible to avoid, clashing with the undesirability of the outcome, leads people to rationalize either the illusory nature of death (afterlife memes) or the desirability of death (deathism proper). But of course the claims are separate, and shouldn't influence each other.

Change in the values of future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, and won't be anywhere near as good as it could've been otherwise. It's easier to see a sudden change as morally relevant, and easier to rationalize gradual development as morally "business as usual", but if we look at the end result, the risks of value drift are the same. And it is difficult to make it so that the future is optimized: to stop the uncontrolled "evolution" of value (value drift), or to recover more of the astronomical waste.

Regardless of the difficulty of the challenge, it's NOT OK to lose the Future. The loss might prove impossible to avert, but still it's not OK; the value judgment cares not for the feasibility of its desire. Let's not succumb to the deathist pattern and lose the battle before it's done. Have the courage and rationality to admit that the loss is real, even if it's too great for mere human emotions to express.

Recommended Reading for Friendly AI Research

26 Vladimir_Nesov 09 October 2010 01:46PM

This post enumerates texts that I consider (potentially) useful training for making progress on Friendly AI/decision theory/metaethics.

continue reading »

Notion of Preference in Ambient Control

14 Vladimir_Nesov 07 October 2010 09:21PM

This post considers ambient control in a more abstract setting, where controlled structures are not restricted to being programs. It then introduces a notion of preference as an axiomatic definition of constant (actual) utility. The notion of preference subsumes the possible worlds and utility functions traditionally considered in decision theory.

Followup to: Controlling Constant Programs.

In the previous post I described the sense in which one program without parameters (the agent) can control the output of another program without parameters (the world program). These programs define (compute) constant values: respectively, the actual action and the actual outcome. The agent decides on its action by trying to prove statements of a certain form, the moral arguments, such as [agent()=1 => world()=1000000]. When the time is up, the agent performs the action associated with the moral argument that promises the best outcome, thus making that outcome actual.
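
A minimal runnable sketch of this setup (my own illustration, not the post's construction: the proof search for moral arguments is approximated here by evaluating the world with the agent's output counterfactually fixed, and the helper world_given is a made-up name):

  # world_given(a) stands in for the moral argument [agent()=a => world()=u]:
  # it reports the outcome u promised for each candidate action a.
  def world_given(action):
      return 1000000 if action == 1 else 1000

  def agent():
      # Perform the action whose moral argument promises the best outcome.
      return max([1, 2], key=world_given)

  def world():
      return world_given(agent())

  print(world())   # 1000000: the agent makes the best outcome actual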

Let's now move this construction into a more rigorous setting. Consider a first-order language and a theory in that language (defining the way the agent reasons, the kinds of concepts it can understand and the kinds of statements it can prove). This could be a set theory such as ZFC or a theory of arithmetic such as PA. The theory should provide sufficient tools to define recursive functions and/or other necessary concepts. Now, extend that theory with definitions of two constant symbols: A (the actual action) and O (the actual outcome). (The new symbols extend the language, while their definitions, obtained from the agent and world programs respectively by standard methods of defining recursively enumerable functions, extend the theory.) With the new definitions, moral arguments don't have to explicitly cite the code of the corresponding programs, and look like this: [A=1 => O=10000].
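
Schematically, in the notation just introduced (a summary for orientation, adding nothing beyond the text above):

  T'  =  T + { definition of A, definition of O }
  moral arguments:  T'-proofs of sentences of the form [A=a => O=u]
  decision:  perform the action a whose best provable outcome u is greatest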

continue reading »

Controlling Constant Programs

25 Vladimir_Nesov 05 September 2010 01:45PM

This post explains the sense in which UDT and its descendants can control programs with no parameters, without using explicit control variables.

Related to: Towards a New Decision Theory, What a reduction of "could" could look like.

Usually, a control problem is given by an explicit (functional) dependence of the outcome on control variables (together with a cost function over the possible outcomes). The solution then consists in finding the values of the control variables that lead to the optimal outcome. On the face of it, if we are given no control variables, or no explicit dependence of the outcome on control variables, then the problem is meaningless and cannot be solved.
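
For contrast, here is a minimal sketch of this usual setup (an illustrative toy with a made-up objective function): the outcome depends explicitly on a control variable x, and solving means searching over its values.

  # Classical control: outcome is an explicit function of the control variable x.
  def outcome(x):
      return -(x - 3) ** 2   # toy objective, maximized at x == 3

  best_x = max(range(10), key=outcome)
  print(best_x)   # 3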

Consider what is being controlled in UDT and in the model of control described by Vladimir Slepnev. It might be counterintuitive, but in both cases the agent controls constant programs, in other words programs without explicit parameters. And the output of a constant program is completely determined by its code, nothing else.

Let's take, for example, Vladimir Slepnev's model of Newcomb's problem, written as follows:

def world():
  box1 = 1000
  box2 = 0 if agent() == 2 else 1000000
  return box2 + (box1 if agent() == 2 else 0)

The control problem that the agent faces is to optimize the output of the program world(), which has no parameters. It might be tempting to say that there is a parameter, namely the sites where agent() is included in the program, but it's not really so: all these entries can be substituted with the code of the program agent() (which is also a constant program), at which point no element remains in the program world() that could be called a control variable.
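
To see the point concretely, here is a hedged illustration (the agent body shown is an arbitrary stand-in): once agent's code is inlined, world() is a closed program with nothing left to vary.

  # Suppose the agent's code happens to be:
  def agent():
      return 1

  # Substituting that code into world() yields a constant program:
  def world():
      box1 = 1000
      box2 = 0 if 1 == 2 else 1000000    # agent's code inlined
      return box2 + (box1 if 1 == 2 else 0)

  print(world())   # 1000000, fully determined by the code alone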

continue reading »

Restraint Bias

16 Vladimir_Nesov 10 November 2009 05:23PM

Ed Yong over at Not Exactly Rocket Science has an article on a study demonstrating "restraint bias" (reference), which seems like an important thing to be aware of in fighting akrasia:

People who think they are more restrained are more likely to succumb to temptation

In a series of four experiments, Loran Nordgren from Northwestern University showed that people suffer from a "restraint bias", where they overestimate their ability to control their own impulses. Those who fall prey to this fallacy most strongly are more likely to dive into tempting situations. Smokers, for example, who are trying to quit, are more likely to put themselves in tempting situations if they think they're invulnerable to temptation. As a result, they're more likely to relapse.

Thus, not only do people overestimate their ability to carry out non-immediate plans (far-mode thinking, as in the planning fallacy), but the more confident ones turn out to be the least able. This might have something to do with how public commitment can be counterproductive: once you've effectively signaled your intentions, the pressure to actually implement them fades away. Once you believe yourself to have asserted the self-image of a person with good self-control, maintaining actual self-control loses priority.

See also: Akrasia, Planning fallacy, Near/far thinking.

Related to: Image vs. Impact: Can public commitment be counterproductive for achievement?

Circular Altruism vs. Personal Preference

7 Vladimir_Nesov 26 October 2009 01:43AM

Suppose there is a diagnostic procedure that allows one to catch a relatively rare disease with absolute precision. If left untreated, the disease is fatal, but when diagnosed it's easily treatable (I suppose there are some real-world approximations). The diagnostic involves an uncomfortable procedure and an inevitable loss of time. At what a priori probability of having the disease would you decide not to take the test, leaving the outcome to chance? Say, you decide it's 0.0001%.

Enter timeless decision theory. Your decision to take or not take the test may just as well be considered a decision for the whole population (let's also assume you are typical and everyone is similar in this decision). By deciding to personally not take the test, you've decided that most people won't take the test, and thus, for example, with 0.00005% of the population having the condition, about 3000 people will die. While the personal tradeoff is fixed, this number obviously depends on the size of the population.
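
The arithmetic behind the 3000 figure (a sketch of the stated numbers; the population of 6 billion is the one used two paragraphs below):

  population = 6 * 10**9
  incidence = 0.00005 / 100         # 0.00005% expressed as a fraction
  deaths = population * incidence   # untreated, hence fatal, cases
  print(int(deaths))                # 3000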

It seems like a horrible thing to do, making a decision that results in 3000 deaths. Thus, taking the test seems like a small personal sacrifice for this gift to others. Yet this reasoning is circular: everyone would be thinking the same, each reversing their decision solely to help others while not benefiting personally. Nobody benefits.

Obviously, together with the 3000 lives saved, there is the harm to 6 billion people of accepting the test, and that harm is also part of the outcome chosen by the decision. If everyone personally prefers not to take the test, then inflicting the opposite on the whole population is only so much worse.

Or is it?

continue reading »

Counterfactual Mugging and Logical Uncertainty

6 Vladimir_Nesov 05 September 2009 10:31PM

Followup to: Counterfactual Mugging.

Let's see what happens with Counterfactual Mugging if we replace the uncertainty about an external fact, such as how a coin lands, with logical uncertainty, for example about what the n-th digit in the decimal expansion of pi is.

The original thought experiment is as follows:

Omega appears and says that it has just tossed a fair coin, and given that the coin came up tails, it decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don't want to give up your $100. But Omega also tells you that if the coin came up heads instead of tails, it'd give you $10000, but only if you'd agree to give it $100 if the coin came up tails.
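
A quick expected-value check of the two policies under the stated payoffs (a sketch; it assumes a fair coin and a truthful Omega):

  p_heads = 0.5
  ev_committed = p_heads * 10000 + (1 - p_heads) * (-100)   # 4950.0
  ev_refusing = 0.0    # nothing happens in either branch
  print(ev_committed > ev_refusing)  # True

Ex ante, being the kind of agent who pays comes out ahead; the puzzle is whether that reasoning still binds once you already know the coin came up tails.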

Let's change "coin came up tails" to "10000th digit of pi is even", and correspondingly for heads. This gives Logical Counterfactual Mugging:

Omega appears and says that it has just found out that the 10000th decimal digit of pi is 8, and given that it is even, it decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don't want to give up your $100. But Omega also tells you that if the 10000th digit of pi had turned out to be odd instead, it'd give you $10000, but only if you'd agree to give it $100 given that the 10000th digit is even.

This form of Counterfactual Mugging may be instructive, as it slaughters the following false intuition, or equivalently the following conceptualization of "could": "the coin could land either way, but a logical truth couldn't be either way".

continue reading »

Bloggingheads: Yudkowsky and Aaronson talk about AI and Many-worlds

18 Vladimir_Nesov 16 August 2009 04:06PM

Eliezer Yudkowsky and Scott Aaronson - Percontations: Artificial Intelligence and Quantum Mechanics

Sections of the diavlog:

  • When will we build the first superintelligence?
  • Why quantum computing isn’t a recipe for robot apocalypse
  • How to guilt-trip a machine
  • The evolutionary psychology of artificial intelligence
  • Eliezer contends many-worlds is obviously correct
  • Scott contends many-worlds is ridiculous (but might still be true)


Sense, Denotation and Semantics

9 Vladimir_Nesov 11 August 2009 12:47PM

J. Y. Girard, et al. (1989). Proofs and types. Cambridge University Press, New York, NY, USA. (PDF)

I found the introductory description of a number of ideas given at the beginning of this book very intuitively clear, and these ideas should be relevant to our discussion, preoccupied as we are with the meaning of meaning. Though the book itself is quite technical, the first chapter should be accessible to many readers.

From the beginning of the chapter:

Let us start with an example. There is a standard procedure for multiplication, which yields for the inputs 27 and 37 the result 999. What can we say about that?

A first attempt is to say that we have an equality

27 × 37 = 999

This equality makes sense in the mainstream of mathematics by saying that the two sides denote the same integer and that × is a function in the Cantorian sense of a graph.

This is the denotational aspect, which is undoubtedly correct, but it misses the essential point:

There is a finite computation process which shows that the denotations are equal. It is an abuse (and this is not cheap philosophy — it is a concrete question) to say that 27 × 37 equals 999, since if the two things we have were the same then we would never feel the need to state their equality. Concretely we ask a question, 27 × 37, and get an answer, 999. The two expressions have different senses and we must do something (make a proof or a calculation, or at least look in an encyclopedia) to show that these two senses have the same denotation.
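
An illustrative sketch of the distinction (my gloss, not from the book): the sense of "27 × 37" is a finite procedure, while its denotation is the value that procedure reaches.

  # The "sense": a standard finite multiplication procedure.
  def long_multiply(a, b):
      total = 0
      for i, digit in enumerate(reversed(str(b))):
          total += a * int(digit) * 10 ** i
      return total

  # Two expressions with different senses, one denotation:
  assert long_multiply(27, 37) == 999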

continue reading »

Rationality Quotes - August 2009

6 Vladimir_Nesov 06 August 2009 01:58AM

A monthly thread for posting any interesting rationality-related quotes you've seen recently on the Internet, or had stored in your quotesfile for ages.

  • Please post all quotes separately (so that they can be voted up/down separately) unless they are strongly related/ordered.
  • Do not quote yourself.
  • Do not quote comments/posts on LW/OB - if we do this, there should be a separate thread for it.
  • No more than 5 quotes per person per monthly thread, please.
