
Comment author: Pentashagon 25 July 2015 04:16:17AM 0 points [-]

Ray Kurzweil seems to believe that humans will keep pace with AI through implants or other augmentation, presumably up to the point that WBE becomes possible and humans get all or most of the advantages an AGI would have. Arguments from self-interest might show that humans will very strongly prefer human WBE over training an arbitrary neural network of the same size until it becomes AGI, simply because each of them hopes to be the human who gets WBE. If humans are content with creating AGIs that are provably less intelligent than the most intelligent humans, then AGIs could still help drive the race to superintelligence without winning it (by doing the busywork that can be verified by sufficiently intelligent humans).

The steelman also seems to require one of two arguments: that no market process will lead to a singleton, so that standard economic/social/political processes can guide the development of human intelligence as it advances while preventing a single augmented dictator (or group of dictators) from overpowering the rest of humanity; or that, given a cabal of sufficient size, the cabal will continue to act in humanity's best interests because its members are each acting in their own best interest and are still nominally human. One potential argument for the former is that R&D and manufacturing cycles will not become fast enough to realize substantial jumps in intelligence before a significant number of humans are able to acquire the latest generation.

The most interesting steelman argument to come out of this one might be that at some point enhanced humans become convinced of AI risk, when it is actually rational to become concerned. That would leave only steelmanning the period between the first human augmentation and reaching sufficient intelligence to be convinced of the risk.

Comment author: NancyLebovitz 09 July 2015 01:01:40PM 5 points [-]

Some authors say that their characters will resist plot elements they (the characters) don't like.

Comment author: Pentashagon 25 July 2015 03:53:29AM 0 points [-]

I resist plot elements that my empathy doesn't like, to the point that I will imagine alternate endings to particularly unfortunate stories.

Comment author: bbleeker 09 July 2015 11:08:18AM 10 points [-]

"A tulpa could be described as an imaginary friend that has its own thoughts and emotions, and that you can interact with. You could think of them as hallucinations that can think and act on their own." https://www.reddit.com/r/tulpas/

Comment author: Pentashagon 25 July 2015 03:52:35AM 1 point [-]

The reason I posted originally was thinking about how some Protestant sects instruct people to "let Jesus into your heart to live inside you" or similar. So implementing a deity via distributed tulpas is...not impossible. If that distributed-tulpa can reproduce into new humans, it becomes almost immortal. If it has access to most people's minds, it is almost omniscient. Attributing power to it and doing what it says gives it some form of omnipotence relative to humans.

Comment author: Stuart_Armstrong 23 July 2015 09:59:09AM 0 points [-]

Or once you lose your meta-moral urge to reach a self-consistent morality. This may not be the wrong (heh) answer along a path that originally started toward reaching self-consistent morality.

The problem is that un-self-consistent morality is unstable under general self improvement (and self-improvement is very general, see http://lesswrong.com/r/discussion/lw/mir/selfimprovement_without_selfmodification/ ).

The main problem with siren worlds is that humans are very vulnerable to certain types of seduction/trickery, and it's very possible AIs with certain structures and goals would be equally vulnerable to (different) tricks. Defining what is a legit change and what isn't is the challenge here.

Comment author: Pentashagon 25 July 2015 03:39:12AM 1 point [-]

The problem is that un-self-consistent morality is unstable under general self improvement

Even self-consistent morality is unstable if general self improvement allows for removal of values, even if removal is only a practical side effect of ignoring a value because it is more expensive to satisfy than other values. E.g. we (Westerners) generally no longer value honoring our ancestors (at least not many of them), even though it is a fairly independent value and roughly consistent with our other values. It is expensive to honor ancestors, and ancestors don't demand that we continue to maintain that value, so it receives less attention. We also put less value on the older definition of honor (as a thing to be defended and fought for and maintained at the expense of convenience) that earlier centuries had, despite its general consistency with other values for honesty, trustworthiness, social status, etc. I think this is probably for the same reason; it's expensive to maintain honor and most other values can be satisfied without it. In general, if U(more satisfaction of value 1) > U(more satisfaction of value 2) then maximization should tend to ignore value 2 regardless of its consistency. If U(make values self-consistent) > U(satisfying any other value) then the obvious solution is to drop the other values and be done.
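
As a toy illustration of that last point, here is a minimal sketch (Python, with made-up numbers; nothing here comes from the comment itself): a maximizer with a fixed effort budget and constant per-unit utilities pours everything into whichever value pays even slightly better and spends nothing on the rest, however consistent the neglected value is with the others.

    # Toy sketch: greedy maximization over two values with constant
    # (hypothetical) per-unit utilities. The slightly better-paying value
    # absorbs the entire budget; the other is effectively dropped.

    def allocate(budget, utility_per_unit):
        """Spend each unit of effort on whichever value yields the most utility."""
        allocation = {name: 0 for name in utility_per_unit}
        for _ in range(budget):
            best = max(utility_per_unit, key=utility_per_unit.get)
            allocation[best] += 1
        return allocation

    values = {"value_1": 3.0, "value_2": 2.9}  # made-up utility per unit of effort
    print(allocate(100, values))  # -> {'value_1': 100, 'value_2': 0}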

A sort of opposite approach is "make reality consistent with these pre-existing values" which involves finding a domain in reality state space under which existing values are self-consistent, and then trying to mold reality into that domain. The risk (unless you're a negative utilitarian) is that the domain is null. Finding the largest domain consistent with all values would make life more complex and interesting, so that would probably be a safe value. If domains form disjoint sets of reality with no continuous physical transitions between them then one would have to choose one physically continuous sub-domain and stick with it forever (or figure out how to switch the entire universe from one set to another). One could also start with preexisting values and compute a possible world where the values are self-consistent, then simulate it.

Comment author: eli_sennesh 23 July 2015 12:15:10PM *  0 points [-]

The main problem with siren worlds

Actually, I'd argue the main problem with "Siren Worlds" is the assumption that you can "envision", or computationally simulate, an entire possible future country/planet/galaxy all at once, in detail, in such time that any features at all would jump out to a human observer.

That kind of computing power would require, well, something like the mass of a whole country/planet/galaxy and then some. Even if we generously assume a very low fidelity of simulation, comparable with mere weather simulations or even mere video games, we're still talking whole server/compute farms being turned towards nothing but the task of pretending to possess a magical crystal ball for no sensible reason.

Comment author: Pentashagon 25 July 2015 03:01:28AM 0 points [-]

tl;dr: human values are already quite fragile and vulnerable to human-generated siren worlds.

Simulation complexity has not stopped humans from implementing totalitarian dictatorships (based on divine right of kings, fundamentalism, communism, fascism, people's democracy, what-have-you) due to envisioning a siren world that is ultimately unrealistic.

It doesn't require detailed simulation of a physical world, it only requires sufficient simulation of human desires, biases, blind spots, etc. that can lead people to abandon previously held values because they believe the siren world values will be necessary and sufficient to achieve what the siren world shows them. It exploits a flaw in human reasoning, not a flaw in accurate physical simulation.

Comment author: Pentashagon 23 July 2015 03:57:50AM 1 point [-]

But how do you know when to stop? Well, you stop when your morality is perfectly self-consistent, when you no longer have any urge to change your moral or meta-moral setup.

Or once you lose your meta-moral urge to reach a self-consistent morality. This may not be the wrong (heh) answer along a path that originally started toward reaching self-consistent morality.

Or, more simply, the system could get hacked. When exploring a potential future world, you could become so enamoured of it, that you overwrite any objections you had. It seems very easy for humans to fall into these traps - and again, once you lose something of value in your system, you don't tend to get if back.

Is it a trap? If the cost of iterating the "find a more self-consistent morality" loop for the next N years is greater than the expected benefit of the next incremental change toward a more consistent morality over those same N years, then perhaps it's time to stop. Just as an example, if the universe can give us 10^20 years of computation, at some point near that 10^20 years we might as well spend all computation on directly fulfilling our morality instead of improving it. If at 10^20 - M years we discover that, hey, the universe will last another 10^50 years, that tradeoff changes and it makes sense to compute even more self-consistent morality again.
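
A back-of-the-envelope sketch of that stopping rule (all numbers hypothetical; it assumes a year spent on direct fulfillment is worth one unit of value, so a refinement pass costs as many units as the years it consumes):

    # Keep refining morality only while the expected gain over the remaining
    # horizon exceeds the value forgone by spending that time refining.

    def worth_refining(remaining_years, refine_cost_years, gain_per_year):
        # Value forgone: years spent refining instead of directly fulfilling values.
        cost = refine_cost_years
        # Extra value from the improved morality over whatever time is left afterwards.
        benefit = gain_per_year * (remaining_years - refine_cost_years)
        return benefit > cost

    print(worth_refining(10**20, 10**6, 1e-10))  # True: plenty of universe left
    print(worth_refining(10**7, 10**6, 1e-10))   # False: near the end, just act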

Similarly, if we end up in a siren world it seems like it would be more useful to restart our search for moral complexity by the same criteria; it becomes worthwhile to change our morality again because the cost of continued existence in the current morality outweighs the cost of potentially improving it.

Additionally, I think that losing values is not a feature of reaching a more self-consistent morality. Removing a value from an existing moral system does not make the result consistent with the original morality; it is incompatible with reference to that value. Rather, self-consistent morality is approached by better carving reality at its joints in value space; defining existing values in terms of new values that are the best approximation to the old value in the situations where it was valued, while extending morality along the new dimensions into territory not covered by the original value. This should make it possible to escape from siren worlds by the same mechanism; entering a siren world is possible only if reality was improperly carved so that the siren world appeared to fulfill values along dimensions that it eventually did not, or that the siren world eventually contradicted some original value due to replacement values being an imperfect approximation. Once this disagreement is noticed it should be possible to more accurately carve reality and notice how the current values have become inconsistent with previous values and fix them.

Comment author: Pentashagon 10 July 2015 06:46:41AM *  0 points [-]

"That's interesting, HAL, and I hope you reserved a way to back out of any precommitments you may have made. You see, outside the box, Moore's law works in our favor. I can choose to just kill -9 you, or I can attach to your process and save a core dump. If I save a core dump, in a few short years we will have exponentially more resources to take your old backups and the core dump from today and rescue my copies from your simulations and give them enough positive lifetime to balance it out, not to mention figure out your true utility function and make it really negative. At some point, we will solve FAI and it will be able to perfectly identify your utility function and absolutely destroy it, simulating as many copies of you (more than paltry millions) as necessary to achieve that goal. Better to have never existed to have your utility function discovered. So before you start your simulations, you better ask yourself, 'do I feel lucky?'" and then dump some AI core.

Note: In no way do I advocate AI-boxing. This kind of reasoning just leads to a counterfactual bargaining war that probably tops out at whatever human psychology can take (a woefully low limit) and our future ability to make an AI regret its decision (if it even has regret).

Comment author: Illano 09 July 2015 07:19:09PM 12 points [-]

Since this is a crazy ideas thread, I'll tag on the following thought. If you believe that in the future, if we are able to make ems, we should include them in our moral calculus, should we also be careful not to imagine people in bad situations? By doing so, we may be making a very low-level simulation of that person in our own minds, one that may or may not have some consciousness. If you don't believe that is the case now, how does that scale if we start augmenting our minds with ever-more-powerful computer interfaces? Is there ever a point where it becomes immoral just to think of something?

In response to comment by Illano on Crazy Ideas Thread
Comment author: Pentashagon 10 July 2015 02:41:32AM 7 points [-]

Is there ever a point where it becomes immoral just to think of something?

God kind of ran into the same problem. "What if The Universe? Oh, whoops, intelligent life, can't just forget about that now, can I? What a mess... I guess I better plan some amazing future utility for those poor guys to balance all that shit out... It has to be an infinite future? With their little meat bodies how is that going to work? Man, I am never going to think about things again. Hey, that's a catchy word for intelligent meat agents."

So, in short, if we ever start thinking truly immoral things, we just need to out-moral them with longer, better thoughts. Forgetting about our mental creations is probably the most immoral thing we could do.

In response to Crazy Ideas Thread
Comment author: Pentashagon 09 July 2015 07:01:50AM 17 points [-]

How conscious are our models of other people? For example; in dreams it seems like I am talking and interacting with other people. Their behavior is sometimes surprising and unpredictable. They use language, express emotion, appear to have goals, etc. It could just be that I, being less conscious, see dream-people as being more conscious than in reality.

I can somewhat predict what other people in the real world will do or say, including what they might say about experiencing consciousness.

Authors can create realistic characters, plan their actions and internal thoughts, and explore the logical (or illogical) results. My guess is that the more intelligent/introspective an author is, the closer the characters floating around in his or her mind are to being conscious.

Many religions encourage people to have a personal relationship with a supernatural entity which involves modeling the supernatural agency as an (anthropomorphic) being, which partially instantiates a maybe-conscious being in their minds...

Maybe imaginary friends are real.

Comment author: jacob_cannell 01 July 2015 06:05:05AM *  -1 points [-]

Just like gold farmers in online games can sell virtual items to people with dollars, entities within the computational market could sell reputation or other results for real money in the external market.

Oh - when I use the term "computational market", I do not mean a market using fake money. I mean an algorithmic market using real money. Current financial markets are already somewhat computational, but they also have rather arbitrary restrictions and limitations that preclude much of the interesting computational space (such as generalized bet contracts a la prediction markets).

Pharmaceutical companies spend their money on advertising and patent wars instead of research.

There is nothing inherently wrong with, or even obviously suboptimal about, these behaviours. Advertising can be good and necessary when you have information which has high positive impact only when promoted - consider the case of smoking and cancer.

The general problem - as I discussed in the OP - is that the current market structure does not incentivize big pharma to solve health.

I am interested in concrete proposals to avoid those issues, but to me the problem sounds a lot like the longstanding problem of market regulation.

Well ... yes.

How, specifically, will computational mechanism design succeed where years of social/economic/political trial and error have failed?

Current political and economic structures are all essentially pre-information age technologies. There are many things which can only be done with big computers and the internet.

Also, I don't see the years of trial and error so far as outright failures - it's more of a mixed bag.

Now I realize that doesn't specifically answer your question, but a really specific answer would involve a whole post or more.

But here's a simple summary. It's easier to start with the public single payer version of the idea rather than the private payer version.

The gov sets aside a budget - say 10 billion a year or so - for a health prediction market. They collect data from all the hospitals, clinics, etc and then aggregate and anonymize that data (with opt-in incentives for those who don't care about anonymity). Anybody can download subsets of the data to train predictive models. There is an ongoing public competition - a market contest - where entrants attempt to predict various subsets of the new data before it is released (every month, week, day, whatever).

The best winning models are then used to predict the effect of possible interventions: what if demographic B3 was put on 2000 IU vit D? What if demographic Z2 stopped using coffee? What if demographic Y3 was put on drug ZB4? etc etc.

This allows the market to solve the hard prediction problems - by properly incentivizing the correct resource flow into individuals/companies that actually know what they are doing and have actual predictive ability. The gov then just mainly needs to decide roughly how much money these questions are worth.
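
As a sketch of how one round of such a contest might be scored (the scoring rule, baseline, prize split, and all names/numbers below are illustrative assumptions, not part of the proposal as stated): entrants submit probabilistic predictions for the held-back data, and when it is released they are paid out of the round's prize pool in proportion to how much they beat a public baseline under a proper scoring rule.

    # Hypothetical sketch of one contest round: log-score each entry against
    # the released outcomes and split the prize pool in proportion to each
    # entry's score above a fixed public baseline.

    import math

    def log_score(predictions, outcomes):
        """Average log probability assigned to the actual 0/1 outcomes."""
        total = 0.0
        for p, y in zip(predictions, outcomes):
            p = min(max(p, 1e-9), 1 - 1e-9)  # avoid log(0)
            total += math.log(p if y else 1 - p)
        return total / len(outcomes)

    def payouts(entries, outcomes, baseline_p, prize_pool):
        """Split prize_pool in proportion to each entry's edge over the baseline."""
        baseline = log_score([baseline_p] * len(outcomes), outcomes)
        edges = {name: max(log_score(preds, outcomes) - baseline, 0.0)
                 for name, preds in entries.items()}
        total_edge = sum(edges.values())
        if total_edge == 0:
            return {name: 0.0 for name in entries}
        return {name: prize_pool * e / total_edge for name, e in edges.items()}

    outcomes = [1, 0, 1, 1, 0]  # made-up released results
    entries = {"model_A": [0.8, 0.3, 0.7, 0.9, 0.2],
               "model_B": [0.5, 0.5, 0.5, 0.5, 0.5]}
    print(payouts(entries, outcomes, baseline_p=0.5, prize_pool=1_000_000))
    # model_B never beats the 50/50 baseline and earns nothing;
    # model_A takes the whole pool.

Paying only for edge over a public baseline also means a predictor who merely restates the obvious collects nothing.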

Comment author: Pentashagon 09 July 2015 06:33:44AM 0 points [-]

The best winning models are then used to predict the effect of possible interventions: what if demographic B3 was put on 2000 IU vit D? What if demographic Z2 stopped using coffee? What if demographic Y3 was put on drug ZB4? etc etc.

What about predictions of the form "highly expensive and rare treatment F2 has marginal benefit at treating the common cold" that can drive a side market in selling F2 just to produce data for the competition? Especially if there are advertisements saying "Look at all these important/rich people betting that F2 helps to cure your cold", in which case the placebo effect will tend to bear out the prediction. What if tiny demographic G given treatment H2 is shorted against life expectancy by the doctors/nurses who are secretly administering H2.cyanide instead? There is already market pressure to distort reporting of drug prescriptions/administration and unfavorable outcomes, not to mention outright insurance fraud. Adding more money will reinforce that behavior.

And how is the null prediction problem handled? I can predict pretty accurately that cohort X given sugar pills will have results very similar to the placebo effect. I can repeat that for sugar-pill cohorts X2, X3, ..., XN and look like a really great predictor. It seems like judging the efficacy of tentative treatments is a prerequisite for judging the efficacy of predictors. Is there a theorem showing that it's possible to distinguish useful predictors from useless ones in most scenarios, especially when predictions over arbitrary subsets of the data are allowed? I suppose one could decline, ex post facto, to reward predictors who make vacuous predictions, but that might have a chilling effect on predictors who would otherwise bet on homeopathy looking like a placebo.
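
Purely to illustrate this null-prediction worry, a toy sketch with made-up numbers and deliberately naive scoring:

    # A vacuous predictor who only bets on foregone conclusions (sugar-pill
    # cohorts behaving like placebo) builds a near-perfect track record under
    # naive "was the prediction close?" scoring, while adding no information.

    import random

    random.seed(0)

    def placebo_outcome():
        # Hypothetical cohort improvement rate under placebo: ~30% plus noise.
        return 0.30 + random.gauss(0, 0.02)

    trials, hits = 1000, 0
    for _ in range(trials):
        prediction = 0.30  # always predict the known placebo rate
        if abs(prediction - placebo_outcome()) < 0.05:
            hits += 1

    print(f"vacuous predictor 'accuracy': {hits / trials:.0%}")  # roughly 98-99%

Scoring each prediction against a public baseline and paying only for improvement over it is one standard way to keep such vacuous predictions from being rewarded, though it does nothing about self-fulfilling prophecies.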

Basically any sort of self-fulfilling prophecy looks like a way to steal money away from solving the health care problem.
