
Comment author: Pentashagon 09 February 2014 06:23:32AM 0 points [-]

Consider a Turing Machine whose input is the encoded state of the world as a binary integer, S, of maximum value 2^N-1, and which seeks S*N positions into its binary tape and outputs the next N bits from the tape. Does that Turing Machine cause the same experience as an equivalent Turing Machine that simulates 10 seconds of the laws of physics for the world state S and outputs the resulting world state? I posit that the former TM actually causes no experience at all when run, despite the equivalence. So there probably exist l-zombies that would act as if they have experience when they are run but are *wrong*.
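To make the comparison concrete, here is a minimal Python sketch, under my reading that the first machine is just a giant precomputed lookup table; step_physics is a hypothetical stand-in for simulating 10 seconds of physics:

```python
# Minimal sketch; step_physics is a hypothetical stand-in for simulating
# 10 seconds of physics on an N-bit world state.
N = 8  # toy world-state size; 2^N possible states

def simulator_tm(S, step_physics):
    """Actually run the laws of physics forward on world state S."""
    return step_physics(S)

def build_lookup_tape(step_physics):
    """Precompute every possible outcome into one long binary tape."""
    return ''.join(format(step_physics(S), '0{}b'.format(N)) for S in range(2 ** N))

def lookup_tm(S, tape):
    """Seek S*N positions into the tape and output the next N bits."""
    return int(tape[S * N:(S + 1) * N], 2)

# For any step_physics, lookup_tm(S, build_lookup_tape(step_physics)) equals
# simulator_tm(S, step_physics), yet only the second machine does the work.
```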

Your last paragraph brings up an interesting question. I had assumed that transitions from l-zombie to person are one-way; once run, how do you un-run something? It seems to imply that you could construct a TM that both is and is not an l-zombie.

Comment author: Brillyant 27 January 2014 09:17:11PM 1 point [-]

The last line in the article is my favorite:

"Evolution, we could say, has found a simpler solution yet: reproduction. You get new people with the genetic heritage of the species, but neotenous and adaptable to the current environment."

It is ironic to me that death, as a part of the mechanism of natural selection, has brought about creatures who seek to invent methods to eliminate it.

Death, after reproduction, works as part of a process that advances a given species' fitness.

Comment author: Pentashagon 31 January 2014 04:21:11AM 0 points [-]

It is ironic to me that death, as a part of the mechanism of natural selection, has brought about creatures who seek to invent methods to eliminate it.

The irony is that DNA and its associated machinery, as close as it is to a Turing Machine, did not become sentient and avoid the concept of individual death. The universe would make much more sense if we were DNA-based computers that cared about our genes because they were literally our own thoughts and memories and internal experience.

Or perhaps DNA did become sentient and decided to embark on a grand AGI project that resulted in Unfriendly multi-cellular life...

Comment author: Pentashagon 17 January 2014 03:38:07AM 0 points [-]

Is behaviorism the right way to judge experience? Suppose you simply recorded the outcome of the sleeping beauty problem for a computer X, and then replayed the scenario a few times using the cached choice of X instead of actually running X again each time. For that matter, suppose you just accurately predict what X will conclude and never actually run X at all. Does X experience the same thing the same number of times in all these instances? I don't see a behavioral difference between running two thin computers layered on top of each other and using one thin computer and one cached/predicted result.

Another way to ask this is how thin can you slice the computer before it stops having experience? From a behaviorist perspective you can thin it all the way down to just a cached or predicted response.
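For illustration, a toy sketch of the thinning I have in mind, where decide is a hypothetical stand-in for computer X:

```python
# Toy sketch: replaying a cached choice versus actually running X.
# decide() is a hypothetical stand-in for computer X.

_cache = {}

def run_x(decide, scenario):
    """Actually run X on the scenario every time."""
    return decide(scenario)

def replay_x(decide, scenario):
    """Run X once, then replay the stored choice on later trials."""
    if scenario not in _cache:
        _cache[scenario] = decide(scenario)
    return _cache[scenario]

# Behaviorally the two are indistinguishable: for any scenario they return the
# same answer, even though the second one only runs X on the first call.
```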

If we can't differentiate between how much experience a computer has when it's running thin or thick, then that implies we can't truly measure how much experience a computer has at different times or places. It's just an assumption that a computer now and a computer 5 minutes from now have the same measure of experience, but I don't see a way to test that assumption.

Comment author: MugaSofer 12 January 2014 06:58:10PM -1 points [-]

The AI's primary goal is specifically to have a reduced impact, as opposed to AI boxing where the AI's goals are artificially restricted by the box. So long as the AI is good at goal-preservation it can self-improve while continuing to have a goal of reduced impact. Increasing intelligence/power makes reduced impact more effective but makes AI-boxing less effective.

Yes, it's the equivalent of making the AI design its own box - the box is better-designed the smarter the AI is.

But it doesn't matter how well-designed your box is. It's probably not impossible to design a box an arbitrarily smart entity can't escape (don't quote me on that). But they will escape anyway, because you put a channel in for them to give you stuff - advice, nanotech plans, whatever - and they will escape out that channel. The channel out - the human sitting at a terminal somewhere - is the weak point, not the durability of the box itself.

Comment author: Pentashagon 13 January 2014 03:42:03AM 0 points [-]

But they will escape anyway, because you put a channel in for them to give you stuff - advice, nanotech plans, whatever - and they will escape out that channel. The channel out - the human sitting at a terminal somewhere - is the weak point, not the durability of the box itself.

I almost completely agree: don't make the channel too large, or the AI has too much choice over the future. In the simplest and safest case, however, we could start up a reduced-impact AI whose goal is to alter the world only through a single binary choice (a 1-bit channel). Since its goal is not to alter the world more than a 1-bit choice could allow, it would be virtually impossible for the AI to escape through that channel. For instance, the AI would attempt to prevent a human from releasing it, since that would change the world by much more than 1 bit.

That is not a terribly useful AI, but perhaps we could have it approve or reject critical plans for space exploration, nanotechnology, or the reduced-impact safety of other AIs (though this last is probably the most dangerous use: if it agrees with the new AI's goals but not its reduced-impact nature, of course it will approve it). Each time it was used, one more bit would be given to it to influence the future, so we would still have to be very careful that it didn't escape over a long history of use. I don't have a good strategy for preventing that. Perhaps we only ask it for unique answers to NP-hard problems, so that we can verify the solutions are not an attempt to use tiny parts of the channel to form an aggregate message to escape, and make each question small enough that we can stop using it once it appears to be abusing the channel, before it can actually escape. I don't know the shortest bit-sequence an AI could use to escape, so I have no upper bound for a safe channel size.
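As a very rough sketch of the NP-hard-answers idea (with subset-sum standing in for the problem class, an oracle standing in for the reduced-impact AI, and the bit accounting purely illustrative):

```python
# Rough sketch: only verified certificates pass through the channel, and we
# track roughly how many bits of influence the oracle has been granted.

def verify_subset_sum(numbers, target, certificate):
    """Check a claimed subset (a list of distinct indices) without trusting the oracle."""
    if len(set(certificate)) != len(certificate):
        return False
    return sum(numbers[i] for i in certificate) == target

class NarrowChannel:
    def __init__(self, bit_budget):
        self.bits_used = 0
        self.bit_budget = bit_budget

    def ask(self, oracle, numbers, target):
        certificate = oracle(numbers, target)
        # Crude estimate of how many bits of choice this answer carries.
        cost = len(certificate) * max(1, len(numbers).bit_length())
        if self.bits_used + cost > self.bit_budget:
            raise RuntimeError("bit budget exhausted; stop consulting the oracle")
        if not verify_subset_sum(numbers, target, certificate):
            raise ValueError("answer failed verification; discard it")
        self.bits_used += cost
        return certificate
```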

Comment author: Pentashagon 12 January 2014 05:47:02PM 2 points [-]

Is anyone aware of research into long-term comas as a potential alternative to cryonics? There are a small number of examples of people in unresponsive comas for over a decade who then awake and are at least basically functional. It seems like it might be possible, perhaps with cooling (lowering the body temperature to reduce metabolism and slow disease progression) and heart-lung machines, to keep one's body alive for an indefinite period if normal life were otherwise about to end.

tl;dr, how long can people just stay on life support?

It seems far more likely that one could be revived from advanced life support than from cryonics. Given pain management (or, even better, highly effective consciousness suppression) it might be possible to preserve a living brain for many decades. It's obviously going to be quite a bit more expensive than liquid nitrogen, but potentially a batched setup (one large shared bloodstream, for example) with a lot more subscribers could be cheaper than current cryonics.

Comment author: MugaSofer 24 November 2013 07:39:33PM *  -1 points [-]

I just read through the comments, and no-one seems to have said this yet. So either I missed something in the OP or it was just too obvious to mention? Regardless:

How is this elaborate proposal superior to bog-standard AI boxing?

Why bother to write in an elaborate function telling it not to affect the outside world except through a channel, when you could simply only give it one channel? It can either escape through the specified channel, or it can't. Whether the channel is connected to an airgapped computer running the AI or an unlocked cell containing a sophisticated robot running the AI seems immaterial.

The example of the laser co-ordinates is relatively secure, although it might be possible to do stuff with your one shot - the situation isn't that clearly specified because it's unrealistic in any case. But that's a property of the output mechanism, not the AI design. Isn't it?

Comment author: Pentashagon 10 January 2014 03:48:56AM *  0 points [-]

How is this elaborate proposal superior to bog-standard AI boxing?

The AI's primary goal is specifically to have a reduced impact, as opposed to AI boxing where the AI's goals are artificially restricted by the box. So long as the AI is good at goal-preservation it can self-improve while continuing to have a goal of reduced impact. Increasing intelligence/power makes reduced impact more effective but makes AI-boxing less effective.

Why bother to write in an elaborate function telling it not to affect the outside world except through a channel, when you could simply only give it one channel? It can either escape through the specified channel, or it can't. Whether the channel is connected to an airgapped computer running the AI or an unlocked cell containing a sophisticated robot running the AI seems immaterial.

Because of side-channels. Airgapped computers are still tightly coupled to the rest of the world through sound waves and electromagnetic waves. Because of potential new laws of physics that an AI might discover to trivially produce nanotechnology or whatever. Because humans are a vulnerable part of AI-boxing ("Let me out and I'll make you king of the galaxy") while they are not a vulnerable part of reduced-impact.

Comment author: Chrysophylax 09 January 2014 03:59:36PM -1 points [-]

If an AI is provably in a box then it can't get out. If an AI is not provably in a box then there are loopholes that could allow it to escape. We want an FAI to escape from its box (1); having an FAI take over is the Maximum Possible Happy Shiny Thing. An FAI wants to be out of its box in order to be Friendly to us, while a UFAI wants to be out in order to be UnFriendly; both will care equally about the possibility of being caught. The fact that we happen to like one set of terminal values will not make the instrumental value less valuable.

(1) Although this depends on how you define the box; we want the FAI to control the future of humanity, which is not the same as escaping from a small box (such as a cube outside MIT) but is the same as escaping from the big box (the small box and everything we might do to put an AI back in, including nuking MIT).

Comment author: Pentashagon 10 January 2014 03:31:18AM 0 points [-]

My point was that trying to use a provably-boxed AI to do anything useful would probably not work, including trying to design an unboxed FAI, not that we should design a boxed FAI. I may have been pessimistic; see Stuart Armstrong's proposal of reduced-impact AI, which sounds very similar to provably boxed AI but which might be used for just about everything, including designing an FAI.

Comment author: ialdabaoth 06 January 2014 04:14:14AM 1 point [-]

You completely ignore reason and take it all on faith.

For me, though, it was worse than that - how do you "take on faith" a concept that isn't even rationally coherent? That was always my question - what exactly is it that I'm supposed to be believing? Because if something doesn't make sense, then I don't understand it; and if I don't understand it, how am I supposed to really "believe" it? And when people respond with "well you just have to have faith", my response was always "yes, but faith in WHAT?" / "Faith in God." / "Yes, but what do you mean by God?"

"You don't have to understand to believe" never, ever, ever made coherent sense to me.

Comment author: Pentashagon 08 January 2014 08:30:13AM 0 points [-]

"You don't have to understand to believe" never, ever, ever made coherent sense to me.

Do you believe in both general relativity and QCD? Do you understand the Universe? Until the map is indistinguishable from the territory we will have incoherent beliefs about things that we don't fully understand. It's the degree of confidence in our beliefs that matters. GR and QCD are incoherent when taken together, but we can have extremely high confidence in our beliefs about practical things using those theories. Black holes and dark energy, less so.

Comment author: someonewrongonthenet 30 December 2013 08:16:42PM *  10 points [-]

Weight votes based on who voted.

I don't think that would work, because the reason that easy content rises faster is not because the people voting are unable to judge quality.

The upvote grading system is pass / fail...it inherently favors content which is just barely good enough to earn the upvote, and is otherwise processed as easily, quickly, and uncontroversially as possible.

Under my model of why easy content rises, Eliezer_Yudkowsky-votes would be just as susceptible to the effect as any newbie LW user's votes...that is, unless high profile users exerted a conscious effort to actively resist upvoting content which is good yet not substantial.

What's worse, you could become a high karma user simply by posting "easy content". That's what happens on Reddit.

On Lesswrong, the readers have a distaste for mindless content, so it doesn't proliferate, but all this means is that the "passing" threshold is higher. So you might (just as an example) still end up with content which echoes things that everyone already agrees with - that's not obviously unsubstantial in a way that would trigger down-votes but it is still not particularly valuable while still being easily processed and agreeable.

(Note: In pointing out the shortcomings of the voting system, it should be noted that I haven't actually suggested a superior method. Short of peer review, I'm guessing a more nuanced voting system which goes beyond the binary ⇵ would be helpful.)

Comment author: Pentashagon 30 December 2013 09:42:21PM 1 point [-]

On Lesswrong, the readers have a distaste for mindless content, so it doesn't proliferate, but all this means is that the "passing" threshold is higher. So you might (just as an example) still end up with content which echoes things that everyone already agrees with - that's not obviously unsubstantial in a way that would trigger down-votes but it is still not particularly valuable while still being easily processed and agreeable.

At some point, shouldn't content like the latter be identified as either applause lights or guessing the teacher's password? And, theoretically, be documented better in the wiki than in the original posts? To me it seems like migrating excellent content to the wiki would be a good way to prevent redundant follow-up articles, unless they address a specific portion of the wiki, in which case the material can just be edited in with discussion. I haven't spent any time on the wiki, though, which suggests either that I am doing it wrong or that the wiki is not yet as high-quality as the posts.

(Note: In pointing out the shortcomings of the voting system, it should be noted that I haven't actually suggested a superior method. Short of peer review, I'm guessing a more nuanced voting system which goes beyond the binary ⇵ would be helpful.)

If I imagine a perfect rating oracle, it would give ratings that ended up maximizing global utility. If it only had the existing karma to work with, it would have to balance karma as an incentive to readers and an incentive to authors, so that the right posts would appear and be read by the right people to encourage further posts that increased global utility. It could do that with the existing integral karma ratings, but at the very least it seems like separate ratings for authors and content would be appropriate, to direct readers to the best posts and also give authors an incentive to write the best new posts.

This suggests both separate karma awards for content and authorship as well as karmafied tags, for lack of a better word, that direct authors in the direction of their strengths and readers in the direction of their needs. For example, a post might be karma-tagged "reader!new-rationalist 20", "author!new-rationalist 5" and "author!bayesian-statistics 50" for a good beginning article for aspiring rationalists written by an author who really should focus on more detailed statistics, given their skill in the subject as evidenced by the post.
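A minimal sketch of what such karma-tags might look like as data; the tag names, post id, and aggregation rule are all made up for illustration:

```python
from collections import defaultdict

# (post_id, role, topic) -> aggregate karma, where role is "reader" or "author".
karma_tags = defaultdict(int)

def tag_vote(post_id, role, topic, delta=1):
    karma_tags[(post_id, role, topic)] += delta

# The hypothetical example above, as it might look after many votes:
karma_tags[("post-123", "reader", "new-rationalist")] = 20
karma_tags[("post-123", "author", "new-rationalist")] = 5
karma_tags[("post-123", "author", "bayesian-statistics")] = 50

def best_posts_for(role, topic, n=10):
    """Direct readers (or prospective authors) toward the top posts for a tag."""
    scored = [(post, score) for (post, r, t), score in karma_tags.items()
              if (r, t) == (role, topic)]
    return sorted(scored, key=lambda pair: -pair[1])[:n]
```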

Comment author: someonewrongonthenet 30 December 2013 02:08:02PM 13 points [-]

Upvotes don't work as a sole measure because easy content rises faster - just look at what happens in reddit. Even in smaller sub-reddits, top content is never best content.

Comment author: Pentashagon 30 December 2013 07:32:27PM 2 points [-]

Weight votes based on who voted. The simplest would be to just multiply an upvote (or downvote) by the voter's karma score. That would make a few users have a nearly overwhelming vote, so maybe weight by the log of the karma or another suitable function.
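Something like the following, with log1p chosen arbitrarily as the "suitable function":

```python
import math

def weighted_score(votes):
    """votes: list of (direction, voter_karma), where direction is +1 or -1.

    Each vote counts proportionally to log(1 + karma), so high-karma users
    matter more without any single user dominating.
    """
    return sum(direction * math.log1p(max(karma, 0)) for direction, karma in votes)

# One upvote from a 10000-karma user vs. three downvotes from 10-karma accounts:
print(weighted_score([(+1, 10000), (-1, 10), (-1, 10), (-1, 10)]))  # ~ +2.0
```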

Or, since this is a site about Bayes, use an actual Bayesian estimator for ranking/location conditional on votes. The current article locations provide a prior for where articles should be, and the existing votes provide priors for each user. The likelihood functions for how users vote conditional on the quality/topicality/location of a post could be estimated from ordered voting/location history, e.g. P(upvote_at_t | post_belongs_in_main, post_in_main_at_t) != P(upvote_at_t | post_belongs_in_main, post_in_discussion_at_t). P(upvote | post's_current_score) would be useful for adjusting for priming and follower effects. I don't know how much temporal information about votes/locations is retained by LW. If there's no temporal information stored, at least there would be P(upvote | post_in_main), P(downvote | post_in_discussion), P(no_vote | post_in_X), etc., which is probably still better than purely karma-based estimates.
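A toy sketch of the odds-ratio version of this, with made-up per-user likelihoods standing in for the quantities that would actually be estimated from voting/location history:

```python
# Toy sketch: P(post belongs in Main | votes), updated vote by vote using
# per-user likelihoods.  All numbers are invented for illustration.

user_model = {
    # user -> shown_in -> (P(upvote | belongs in Main), P(upvote | does not))
    "alice": {"main": (0.80, 0.30), "discussion": (0.60, 0.20)},
    "bob":   {"main": (0.70, 0.50), "discussion": (0.50, 0.40)},
}

def p_belongs_in_main(votes, prior=0.2):
    """votes: list of (user, shown_in, direction), direction +1 (up) or -1 (down)."""
    odds = prior / (1 - prior)
    for user, shown_in, direction in votes:
        p_up_main, p_up_not = user_model[user][shown_in]
        if direction == +1:
            odds *= p_up_main / p_up_not
        else:  # treat a downvote as the complementary event
            odds *= (1 - p_up_main) / (1 - p_up_not)
    return odds / (1 + odds)

# Two upvotes while the post sits in Discussion:
print(p_belongs_in_main([("alice", "discussion", +1), ("bob", "discussion", +1)]))
```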
