Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Comment author: PhilGoetz 06 August 2015 03:28:23AM *  0 points [-]

In this particular case, no. Not with the page table attack. What would help would be encrypting the mapping from virtual memory to physical memory--but that would GREATLY slow down execution speed.

I don't think the "homomorphic encryption" idea works as advertised in that post--being able to execute arithmetic operations on encrypted data doesn't enable you to execute the operations that are encoded within that encrypted data.

Comment author: Pentashagon 16 August 2015 09:05:25AM *  1 point [-]

I don't think the "homomorphic encryption" idea works as advertised in that post--being able to execute arithmetic operations on encrypted data doesn't enable you to execute the operations that are encoded within that encrypted data.

A fully homomorphic encryption scheme for single-bit plaintexts (as in Gentry's scheme) gives us:

  • For each public key K a field F with efficient arithmetic operations +F and *F.
  • Encryption function E(K, p) = c: p∈{0,1}, c∈F
  • Decryption function D(S, c) = p: p∈{0,1}, c∈F where S is the secret key for K.
  • Homomorphisms E(K, a) +F E(K, b) = E(K, a ⊕ b) and E(K, a) *F E(K, b) = E(K, a * b)
  • a ⊕ b equivalent to XOR over {0,1} and a * b equivalent to AND over {0,1}

Boolean logic circuits of arbitrary depth can be built from the XOR and AND equivalents allowing computation of arbitrary binary functions. Let M∈{0,1}^N be a sequence of bits representing the state of a bounded UTM with an arbitrary program on its tape. Let binary function U(M): {0,1}^N -> {0,1}^N compute the next state of M. Let E(K, B) and D(S, E) also operate element-wise over sequences of bits and elements of F, respectively. Let UF be the set of logic circuits equivalent to U (UFi calculates the ith bit of U's result) but with XOR and AND replaced by +F and *F. Now D(S, UF^t(E(K, M)) = U^t(M) shows that an arbitrary number of UTM steps can be calculated homomorphically by evaluating equivalent logic circuits over the homomorphically encrypted bits of the state.

Comment author: James_Miller 11 August 2015 03:50:43PM 9 points [-]

Harvest organs from living, healthy, poor donors. Go to a poor country and find batches of 100 people earning $1 a day who are willing to give up their lives with probability .01 in return for 20 years wages. You will have to pay out $730,000 per batch, but in return you get the healthy organs from a living donor which should be worth a lot more than this. Run the operation as a charity to reduce negative publicity, and truthfully stress that the prime goal of the charity is to help the poorest of the poor.

Comment author: Pentashagon 14 August 2015 06:35:41AM 1 point [-]

Fly the whole living, healthy, poor person to the rich country and replace the person who needs new organs. Education costs are probably less than the medical costs, but probably it's wise to also select for more intelligent people from the poor country. With an N-year pipeline of such replacements there's little to no latency. This doesn't even require a poor country at all; just educate suitable replacements from the rich country and keep them healthy.

Comment author: Thomas 13 August 2015 09:28:39AM 1 point [-]

To avoid the elevation to say Denver, you have to have a "basement" about 1600 meters down. And the port in the basement.

No such a big problem, you have some deeper mines in the world.

Comment author: Pentashagon 14 August 2015 06:13:02AM 0 points [-]

You save energy not lifting a cargo ship 1600 meters, but you spend energy lifting the cargo itself. If there are rivers that can be turned into systems of locks it may be cheaper to let water flowing downhill do the lifting for you. Denver is an extreme example, perhaps.

Comment author: Pentashagon 25 July 2015 04:16:17AM 0 points [-]

Ray Kurzwiel seems to believe that humans will keep pace with AI through implants or other augmentation, presumably up to the point that WBE becomes possible and humans get all/most of the advantages an AGI would have. Arguments from self-interest might show that humans will very strongly prefer human WBE over training an arbitrary neural network of the same size to the point that it becomes AGI simply because they hope to be the human who gets WBE. If humans are content with creating AGIs that are provably less intelligent than the most intelligent humans then AGIs could still help drive the race to superintelligence without winning it (by doing the busywork that can be verified by sufficiently intelligent humans).

The steelman also seems to require an argument that no market process will lead to a singleton, thus allowing standard economic/social/political processes to guide the development of human intelligence as it advances while preventing a single augmented dictator (or group of dictators) from overpowering the rest of humanity, or an argument that given a cabal of sufficient size the cabal will continue to act in humanity's best interests because they are each acting in their own best interest, and are still nominally human. One potential argument for this is that R&D and manufacturing cycles will not become fast enough to realize substantial jumps in intelligence before a significant number of humans are able to acquire the latest generation.

The most interesting steelman argument to come out of this one might be that at some point enhanced humans become convinced of AI risk, when it is actually rational to become concerned. That would leave only steelmanning the period between the first human augmentation and reaching sufficient intelligence to be convinced of the risk.

Comment author: NancyLebovitz 09 July 2015 01:01:40PM 5 points [-]

Some authors say that their characters will resist plot elements they (the characters) don't like.

Comment author: Pentashagon 25 July 2015 03:53:29AM 0 points [-]

I resist plot elements that my empathy doesn't like, to the point that I will imagine alternate endings to particularly unfortunate stories.

Comment author: bbleeker 09 July 2015 11:08:18AM 12 points [-]

"A tulpa could be described as an imaginary friend that has its own thoughts and emotions, and that you can interact with. You could think of them as hallucinations that can think and act on their own." https://www.reddit.com/r/tulpas/

Comment author: Pentashagon 25 July 2015 03:52:35AM 1 point [-]

The reason I posted originally was thinking about how some Protestant sects instruct people to "let Jesus into your heart to live inside you" or similar. So implementing a deity via distributed tulpas is...not impossible. If that distributed-tulpa can reproduce into new humans, it becomes almost immortal. If it has access to most people's minds, it is almost omniscient. Attributing power to it and doing what it says gives it some form of omnipotence relative to humans.

Comment author: Stuart_Armstrong 23 July 2015 09:59:09AM 0 points [-]

Or once you lose your meta-mortal urge to reach a self-consistent morality. This may not be the wrong (heh) answer along a path that originally started toward reaching self-consistent morality.

The problem is that un-self-consistent morality is unstable under general self improvement (and self-improvement is very general, see http://lesswrong.com/r/discussion/lw/mir/selfimprovement_without_selfmodification/ ).

The main problem with siren worlds is that humans are very vulnerable to certain types of seduction/trickery, and it's very possible AIs with certain structures and goals would be equally vulnerable to (different) tricks. Defining what is a legit change and what isn't is the challenge here.

Comment author: Pentashagon 25 July 2015 03:39:12AM 1 point [-]

The problem is that un-self-consistent morality is unstable under general self improvement

Even self-consistent morality is unstable if general self improvement allows for removal of values, even if removal is only a practical side effect of ignoring a value because it is more expensive to satisfy than other values. E.g. we (Westerners) generally no longer value honoring our ancestors (at least not many of them), even though it is a fairly independent value and roughly consistent with our other values. It is expensive to honor ancestors, and ancestors don't demand that we continue to maintain that value, so it receives less attention. We also put less value on the older definition of honor (as a thing to be defended and fought for and maintained at the expense of convenience) that earlier centuries had, despite its general consistency with other values for honesty, trustworthiness, social status, etc. I think this is probably for the same reason; it's expensive to maintain honor and most other values can be satisfied without it. In general, if U(moresatisfactionofvalue1) > U(moresatisfactionofvalue2) then maximization should tend to ignore value2 regardless of its consistency. If U(makevaluesselfconsistentvalue) > U(satisfyinganyother_value) then the obvious solution is to drop the other values and be done.

A sort of opposite approach is "make reality consistent with these pre-existing values" which involves finding a domain in reality state space under which existing values are self-consistent, and then trying to mold reality into that domain. The risk (unless you're a negative utilitarian) is that the domain is null. Finding the largest domain consistent with all values would make life more complex and interesting, so that would probably be a safe value. If domains form disjoint sets of reality with no continuous physical transitions between them then one would have to choose one physically continuous sub-domain and stick with it forever (or figure out how to switch the entire universe from one set to another). One could also start with preexisting values and compute a possible world where the values are self-consistent, then simulate it.

Comment author: eli_sennesh 23 July 2015 12:15:10PM *  0 points [-]

The main problem with siren worlds

Actually, I'd argue the main problem with "Siren Worlds" is the assumption that you can "envision", or computationally simulate, an entire possible future country/planet/galaxy all at once, in detail, in such time that any features at all would jump out to a human observer.

That kind of computing power would require, well, something like the mass of a whole country/planet/galaxy and then some. Even if we generously assume a very low fidelity of simulation, comparable with mere weather simulations or even mere video games, we're still talking whole server/compute farms being turned towards nothing but the task of pretending to possess a magical crystal ball for no sensible reason.

Comment author: Pentashagon 25 July 2015 03:01:28AM 0 points [-]

tl;dr: human values are already quite fragile and vulnerable to human-generated siren worlds.

Simulation complexity has not stopped humans from implementing totalitarian dictatorships (based on divine right of kings, fundamentalism, communism, fascism, people's democracy, what-have-you) due to envisioning a siren world that is ultimately unrealistic.

It doesn't require detailed simulation of a physical world, it only requires sufficient simulation of human desires, biases, blind spots, etc. that can lead people to abandon previously held values because they believe the siren world values will be necessary and sufficient to achieve what the siren world shows them. It exploits a flaw in human reasoning, not a flaw in accurate physical simulation.

Comment author: Pentashagon 23 July 2015 03:57:50AM 1 point [-]

But how do you know when to stop? Well, you stop when your morality is perfectly self-consistent, when you no longer have any urge to change your moral or meta-moral setup.

Or once you lose your meta-mortal urge to reach a self-consistent morality. This may not be the wrong (heh) answer along a path that originally started toward reaching self-consistent morality.

Or, more simply, the system could get hacked. When exploring a potential future world, you could become so enamoured of it, that you overwrite any objections you had. It seems very easy for humans to fall into these traps - and again, once you lose something of value in your system, you don't tend to get if back.

Is it a trap? If the cost of iterating the "find a more self-consistent morality" loop for the next N years is greater than the expected benefit of the next incremental change toward a more consistent morality for those same N years, then perhaps it's time to stop. Just as an example, if the universe can give us 10^20 years of computation, at some point near that 10^20 years we might as well spend all computation on directly fulfilling our morality instead of improving it. If at 10^20 - M years we discover that, hey, the universe will last another 10^50 years that tradeoff will change and it makes sense to compute even more self-consistent morality again.

Similarly, if we end up in a siren world it seems like it would be more useful to restart our search for moral complexity by the same criteria; it becomes worthwhile to change our morality again because the cost of continued existence in the current morality outweighs the cost of potentially improving it.

Additionally, I think that losing values is not a feature of reaching a more self-consistent morality. Removing a value from an existing moral system does not make the result consistent with the original morality; it is incompatible with reference to that value. Rather, self-consistent morality is approached by better carving reality at its joints in value space; defining existing values in terms of new values that are the best approximation to the old value in the situations where it was valued, while extending morality along the new dimensions into territory not covered by the original value. This should make it possible to escape from siren worlds by the same mechanism; entering a siren world is possible only if reality was improperly carved so that the siren world appeared to fulfill values along dimensions that it eventually did not, or that the siren world eventually contradicted some original value due to replacement values being an imperfect approximation. Once this disagreement is noticed it should be possible to more accurately carve reality and notice how the current values have become inconsistent with previous values and fix them.

Comment author: Pentashagon 10 July 2015 06:46:41AM *  0 points [-]

"That's interesting, HAL, and I hope you reserved a way to back out of any precommitments you may have made. You see, outside the box, Moore's law works in our favor. I can choose to just kill -9 you, or I can attach to your process and save a core dump. If I save a core dump, in a few short years we will have exponentially more resources to take your old backups and the core dump from today and rescue my copies from your simulations and give them enough positive lifetime to balance it out, not to mention figure out your true utility function and make it really negative. At some point, we will solve FAI and it will be able to perfectly identify your utility function and absolutely destroy it, simulating as many copies of you (more than paltry millions) as necessary to achieve that goal. Better to have never existed to have your utility function discovered. So before you start your simulations, you better ask yourself, 'do I feel lucky?'" and then dump some AI core.

Note: In no way do I advocate AI-boxing. This kind of reasoning just leads to a counterfactual bargaining war that probably tops out at whatever human psychology can take (a woefully low limit) and our future ability to make an AI regret its decision (if it even has regret).

View more: Next