Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Comment author: MaryCh 05 January 2018 03:23:17PM 0 points [-]

Well, in my life I can recall two instances off-hand. There have probably been more of them, but at the very least, they seem to be completely unrelated to attempts to raise well-being levels...

Comment author: RedMan 05 January 2018 01:32:59PM 1 point [-]

Thank you for the detailed response!

I agree that the argument you advance here is the sane one, but I have trouble reconciling it with my interpretation of Effective Altruism: 'effort should be made to expend resources on preventing suffering, maximize the ratio of suffering avoided to cost expended'

I interpret your paper as rejecting the argument advanced by prof Hansen that if of all future variants of you, the number enjoying 'heaven' vastly outnumber the copies suffering 'hell', on balance, uploading is a good. Based on your paper's citation of Omelas, I assert that you would weight 'all future heaven copies' in aggregate, and all future hell copies individually.

So if the probability of one or more hell copies of an upload coming into existence for as long as any heaven copy exceeds the probability of a single heaven copy existing long enough to outlast all the hell copies, that person's future suffering will eventually exceed all suffering previously experienced by biological humans. Under the EA philosophy described above, this creates a moral imperative to prevent that scenario, possibly with a blender.

If uploading tech takes the form of common connection and uploading to an 'overmind', this can go away--if everyone is Borg, there's no way for a non-Borg to put Borg into a hell copy, only Borg can do that to itself, which is, at least from an EA standpoint, probably an acceptable risk.

At the end of the day, I was hoping to adjust my understanding of EA axioms, not be talked down from chasing my friends around with a blender, but that isn't how things went down.

SF is a tolerant place, and EAs are sincere about having consistent beliefs, but I don't think my talk title "You helped someone avoid starvation with EA and a large grant. I prevented infinity genocides with a blender" would be accepted at the next convention.

Comment author: Kaj_Sotala 05 January 2018 06:55:15AM 1 point [-]

Awesome paper.

Thank you very much!

Curious about your take on my question here: http://lesswrong.com/lw/os7/unethical_human_behavior_incentivised_by/

So, I agree that mind uploads being tortured indefinitely is a very scary possibility. And it seems very plausible that some of that is going to happen in a world with mind uploads, especially since it's going to be impossible to detect from the outside, unless you are going to check all the computations that anyone is running.

On the other hand, we don't know for sure what that world is going to be like. Maybe there will be some kind of AI in charge that does check everyone's computations, maybe all the hardware that gets sold is equipped with built-in suffering-detectors that disallow people from running torture simulations, or something. I'll admit that both of these seem somewhat unlikely or even far-fetched, but then again, someone might come up with a really clever solution that I just haven't thought of.

Your argument also seemed to me to have some flaws:

Over a long enough timeline, the probability of a copy of any given uploaded mind falling into the power of a sadistic jerk approaches unity. Once an uploaded mind has fallen under the power of a sadistic jerk, there is no guarantee that it will ever be 'free',

You can certainly make the argument that, for any event with non-zero probability, then over a sufficiently long lifetime that event will happen at some point. But if you are using that to argue that an upload will be captured by someone sadistic eventually, shouldn't you also hold that they will also escape eventually?

This argument also doesn't seem to be unique to mind uploading. Suppose that we achieved biological immortality and never uploaded. You could also make the argument that, now that people can live until the heat-death of the universe (or at least until our sun goes out), then their lifetimes are sufficiently long that at some point in their lives they are going to be kidnapped and tortured indefinitely by someone sadistic, so therefore we should kill everyone before we get radical life extension.

But for biological people, this argument doesn't feel anywhere near as compelling. In particular, this scenario highlights the fact that even though there might be a non-zero probability for any given person to be kidnapped and tortured during their lifetimes, that probability can be low enough that it's still unlikely to happen even during a very long lifetime.
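To put rough numbers on this, here is a minimal sketch. It assumes a constant, independent per-year probability of capture, which is a simplification introduced here for illustration and not a claim about any real society; the function names and figures are hypothetical.

```python
import math


def cumulative_risk(p_per_year: float, years: float) -> float:
    """Probability of at least one capture over `years`.

    Assumes each year carries the same independent risk `p_per_year`,
    so the result is 1 - (1 - p)^years. log1p/expm1 keep the
    computation accurate when p is tiny.
    """
    return -math.expm1(years * math.log1p(-p_per_year))


def years_until_risk(p_per_year: float, target: float) -> float:
    """Number of years after which the cumulative risk reaches `target`."""
    return math.log1p(-target) / math.log1p(-p_per_year)


if __name__ == "__main__":
    # Illustrative values only: a one-in-a-billion yearly risk over a
    # million-year lifetime versus a one-in-a-thousand yearly risk.
    for p in (1e-9, 1e-3):
        print(p, cumulative_risk(p, 1_000_000))
```

Under these assumptions, a one-in-a-billion yearly risk stays below 0.1% even over a million years, while a one-in-a-thousand yearly risk makes capture a near certainty. Whether the event "eventually happens" depends on how low the per-period risk is, not just on the lifetime being long.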

You could reasonably argue that for uploads, it's different, since it's easier to make a copy of an upload undetected etc., so the probability of being captured during one's lifetime is larger. But note that there have been times in history that there actually was a reasonable chance for a biological human to be captured and enslaved during their lifetime! Back during the era of tribal warfare, for example. But we've come a long way from those times, and in large parts of the world, society has developed in such a way to almost eliminate that risk.

That, in turn, highlights the point that it's too simple to just look at whether we are biological or uploads. It all depends on how exactly society is set up, and how strong are the defenses and protections that society provides to the common person. Given that we've developed to the point where biological persons have pretty good defenses against being kidnapped and enslaved, to the point where we don't think that even a very long lifetime would be likely to lead to such a fate, shouldn't we also assume that upload societies could develop similar defenses and reduce the risk to be similarly small?

Comment author: RedMan 05 January 2018 12:15:25AM 1 point [-]

Curious about your take on my question here: http://lesswrong.com/lw/os7/unethical_human_behavior_incentivised_by/ Awesome paper.

Comment author: RedMan 04 January 2018 07:17:21PM *  0 points [-]

I hadn't thought about it that way.

I do think that either compile-time flags for the AI system or a second 'monitor' system chained to the AI system in order to enforce the named rules would probably limit the damage.

The broader point is that probabilistic AI safety is probably a much more tractable problem than absolute AI safety, for a lot of reasons. To further the nuclear analogy, emergency shutdown is probably a viable safety measure for a lot of the plausible 'paperclip maximizer turns us into paperclips' scenarios.

"I need to disconnect the AI safety monitoring robot from my AI-enabled nanotoaster robot prototype because it keeps deactivating it" might still be the last words a human ever speaks, but hey, we tried.

Comment author: Lumifer 04 January 2018 03:47:35PM 0 points [-]

Are you reinventing Asimov's Three Laws of Robotics?

Comment author: thefishinthetank 04 January 2018 07:27:36AM 1 point [-]

OTOH, joy is very different. It kind of just happens, unasked-for.

This is the happiness we are really searching for. The other kind is better described as pleasure.

Comment author: thefishinthetank 04 January 2018 07:23:57AM *  2 points [-]

Interesting post. I can definitely identify with the journey of exercise, supplementation, and spiritual exploration.

I would like to caution you that your connection between the calming effects of vasodilation and enlightenment might be a bit superficial. It seems you have discovered what it is like to be calm, or have equanimity. While being calm is both a prerequisite and downstream effect of enlightenment, it is not to be confused with the deep knowledge of truth (enlightenment). Enlightenment is a deep subconscious insight that becomes more likely to happen when the mind is calm, clear, and alert.

Enlightenment is also not state dependent. It's often thought of as something you realize and don't forget. It also induces perceptual changes, like those described by Jeffrey Martin. Entering states where your mind is finding profound connections is not enlightenment, but it is a step closer to realizing that insight.

I'm posting this not to tear down your experience, but to urge you on. I'm suggesting that you may think you've sailed the seven seas, where in reality you only saw a picture of a boat. Thinking you've found enlightenment and that it's not great is likely to steer you away from this path, which in my opinion, would be unfortunate.

And how can I be so sure that you didn't find enlightenment? Those who find it don't discredit it. ;)

Comment author: RedMan 04 January 2018 02:00:03AM *  0 points [-]

Rules for an AI:

If an action it takes results in more than N logs of $ worth of damage to humans, or kills more than N logs of humans, transfer control of all systems it can provide control inputs to over to a designated backup (human, formally proven safe algorithmic system, etc.), then power down.

When choosing among actions which affect a system external to it, calculate the probable effect on human lives. If the probability of exceeding the N assigned in rule 1 is greater than some threshold Z, ignore that option; if no options are available, loop.

Most systems would be set to N = 1, Z = 1/10000, giving us four 9s of certainty that the AI won't kill anyone. Some systems (weapons, climate management, emergency management dispatch systems) will need higher N scores and lower Z scores to maintain effectiveness.
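A rough sketch of how the two rules might look in code. Everything here is an assumption for illustration: "N logs" is read as 10**N, and the `Option` fields, including the `value` score used to pick among permitted options, are hypothetical and not part of the rules as stated.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Option:
    name: str
    p_exceed: float  # estimated probability that harm exceeds the N threshold
    value: float     # hypothetical usefulness score (not in the original rules)


def must_hand_off(observed_harm: float, n: float) -> bool:
    """Rule 1: hand control to the backup and power down once harm exceeds N logs.

    'N logs' is interpreted here as 10**N, which is an assumption.
    """
    return observed_harm > 0 and math.log10(observed_harm) > n


def choose(options: Sequence[Option], z: float) -> Optional[Option]:
    """Rule 2: ignore any option with p_exceed > z.

    Returns None when nothing is permitted, which the caller would
    treat as 'loop' (keep waiting for an acceptable option).
    """
    safe = [o for o in options if o.p_exceed <= z]
    return max(safe, key=lambda o: o.value) if safe else None
```

The hard part, of course, is producing a trustworthy `p_exceed` estimate in the first place; the sketch only shows the bookkeeping around it.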

JFK had an N of like 9 and a Z score of 'something kind of high', and passed control to Lyndon B Johnson of 'I have a minibar and a shotgun in the car I keep on my farm so I can drive and shoot while intoxicated' fame. We survived that, we will be fine.

Are we done?

Comment author: RedMan 04 January 2018 01:45:11AM 0 points [-]

That's great to hear, stay safe.

This sort of data was a contributor to my choice of sport for general well being: https://graphiq-stories.graphiq.com/stories/11438/sports-cause-injuries-high-school#Intro

There is truth to it: https://www.westside-barbell.com/blogs/2003-articles/extra-workouts-2

Really grateful for the info, I never could put my finger on what exactly I was not liking about CM when I wasn't pushing myself, stuff is amazing for preventing exercise soreness though.

Comment author: Elo 03 January 2018 03:35:02PM 1 point [-]

Interesting you say that about bad when you are not lifting. There just wasn't any warning from anyone (there probably was but I took no notice).

I have been back to doctors, and I do run several times a week these days. It's not set wrong or else I couldn't run, and I never got an x-ray.

I went back to trampoline 6 months later and I injured myself trying to do something that I didn't have the muscles for any more. It strikes me as more dangerous than I was willing to admit. It's exercise that really pushes your body and I'm not sure I am comfortable with it compared to things that are more within a body's limit.

For example, rock climbing: you are limited by what your body lets you do. You only lift your own weight. And that's a lot closer to the safe limit than trampolines, which interact with an external contraption and do things like compress your spine and cause unnatural brain shaking.

Weakest link theory was a bit of a joke but I am sure there is some truth to it.

Comment author: RedMan 03 January 2018 12:50:47PM 0 points [-]

You back to trampolining yet?

Way to eat a broken bone and not seek medical attention for it. Someone I knew did about what you did and ended up having to have a doctor re-break and set the bone to fix things. Lots of 'newly fit' people, particularly teenagers, have your 'injury from stupidity' behavior pattern. This is one of the reasons professional athletes are banned from amateur sports by their contracts.

The great coach Louis Simmons is worth reading, he will expand your mind on your weakest link theory.

My own conclusion on your magic enlightenment pill, based on my lived experience: super awesome when you're lifting, Fs you up a bit when you're not. Use it around intense exercise, otherwise avoid.

Comment author: 333kenshin 03 January 2018 09:07:21AM *  0 points [-]

As a Christian turned atheist, I can attest to the fact that church rituals do in fact encompass quite a few valid and effective techniques.

Consider the following practices which researchers have fairly well established contribute to mental wellness (all links are to Psychology Today):

Nothing surprising or new, right?

But the weird thing is when you realize each of the above practices is embedded in weekly church attendance:

  • confidant => confession
  • gratitude => grace
  • recitation => lord's prayer
  • singing => hymns
  • water => baptism (traditionally carried out down by the river)

In other words, church attendance provides a concentrated bundle of mental health benefit.

And I think this should jibe in terms of explaining why so many people continue to adhere to religion despite its obvious downsides. The usual explanation is that they must be dumb or irrational. But now we have a simpler explanation: that these mental health upsides offset the downsides. It doesn't require an assumption of extreme stupidity and/or irrationality (of course, it holds up just as well if they do happen to be so). As Bayesians, what is more probable: that we are all that much smarter and more rational than each and every one of them? Or that they simply value happiness more than they value logic?

And yes, I know I'm presenting a false dichotomy by implying that happiness and logic are an either/or proposition. But presently, access to many of these practices is limited outside of church. For example, the only socially acceptable venues for non-professionals to sing in are the shower and karaoke bars. Likewise, therapy costs an arm and a leg, and the prospects of finding someone else to confide in are spotty at best.

Which suggests what our next step should be as a community: to show that it's possible to be happy and logical. I suggest incorporating these practices into our own meetups as widely as possible, e.g. conducting meetups at park fountains or with a rock band set. Only when we break this perceived monopoly of religion on mental well-being will people in large numbers entertain leaving the church.

Comment author: Torello 02 January 2018 06:17:50PM 0 points [-]
Comment author: Kallandras 01 January 2018 09:22:10PM 0 points [-]

I've recently begun listening to a few bands that are new to me - Parov Stelar, Tape Five, Caravan Palace, and Goldfish. I have found the upbeat tempo of electro-swing to be helpful when I want to improve my mood.

Comment author: gwern 01 January 2018 03:49:54AM 0 points [-]
Comment author: James_Miller 01 January 2018 03:45:39AM 1 point [-]

I've started creating a series of YouTube videos on the dangers of artificial general intelligence.

Comment author: ArisKatsaris 01 January 2018 02:12:48AM 0 points [-]

Short Online Texts Thread

Comment author: ArisKatsaris 01 January 2018 02:12:40AM 0 points [-]

Online Videos Thread

Comment author: ArisKatsaris 01 January 2018 02:12:36AM 0 points [-]

Fanfiction Thread

Comment author: ArisKatsaris 01 January 2018 02:12:32AM 0 points [-]

Nonfiction Books Thread

Comment author: ArisKatsaris 01 January 2018 02:12:28AM 0 points [-]

Fiction Books Thread

Comment author: ArisKatsaris 01 January 2018 02:12:22AM 0 points [-]

TV and Movies (Animation) Thread

Comment author: ArisKatsaris 01 January 2018 02:12:19AM 0 points [-]

TV and Movies (Live Action) Thread

Comment author: ArisKatsaris 01 January 2018 02:12:15AM 0 points [-]

Games Thread

Comment author: ArisKatsaris 01 January 2018 02:12:12AM 0 points [-]

Music Thread

Comment author: ArisKatsaris 01 January 2018 02:12:06AM 0 points [-]

Podcasts Thread

Comment author: ArisKatsaris 01 January 2018 02:12:02AM 0 points [-]

Other Media Thread

Comment author: ArisKatsaris 01 January 2018 02:11:56AM 0 points [-]

Meta Thread

Comment author: ChristianKl 29 December 2017 06:34:07PM 0 points [-]

Oliver Habryka (who works on programming LW2 at the moment) taught rationality to other students at his school a while back based on CFAR style ideas which at the time meant a lot of calibration and Fermi estimates.

The same would also make sense with the more recent CFAR material for anyone who took the CFAR course.

Comment author: Lu_Tong 27 December 2017 03:43:36AM *  0 points [-]

Thanks, I'll ask a couple more. Do you think UDT is a solution to anthropics? What is your ethical view (roughly, even given large uncertainty) and what actions do you think this prescribes? How have you changed your decisions based on the knowledge that multiple universes probably exist (AKA, what is the value of that information)?

Comment author: Luke_A_Somers 26 December 2017 12:27:10AM 0 points [-]

If you find an Omega, then you are in an environment where Omega is possible. Perhaps we are all simulated and QM is optional. Maybe we have easily enough determinism in our brains that Omega can make predictions, much as quantum mechanics ought to in some sense prevent predicting where a cannonball will fly but in practice does not. Perhaps it's a hypothetical where we're AI to begin with so deterministic behavior is just to be expected.

Comment author: Luke_A_Somers 26 December 2017 12:11:58AM 0 points [-]

I think the more relevant case is when the random noise is imperceptibly small. Of course you two-box if it's basically random.

Comment author: NerdyAesthete 24 December 2017 11:03:04PM *  1 point [-]

Sometimes, it almost seems like I am truly happy only when I "escape" or "triumph" over something that almost "ate me up": my husband's household, the Department that I had gone to for a PhD thesis... the genuinely nice psychiatrist who soothes my Mother's fears... Like "I am happy when I have proved that I haven't changed, because change is corruption".

I'd say that's a relief from a precarious situation, which does provide happiness, but is only temporary and not sustainable.

However, contentment (or a relaxed sense of well-being) is a form of happiness that can be sustained until something distressful occurs. Sustaining contentment may require life changes, for I feel many people's lives are incompatible with this sustained level of contentment; the lack of freedom imposed by obligations tends to be more stressful than not.

Also, exhilaration is another form of happiness (similar to anxiety, but the difference is certainly noticeable) that is desirable but tricky to activate. I believe your joy is similar to my exhilaration, or maybe a gradation between contentment and exhilaration.

Comment author: Wei_Dai 24 December 2017 01:26:37AM 0 points [-]

I talked a bit about why I think multiple universes exist in this post. Aside from what I said there, I was convinced by Tegmark's writings on the Mathematical Universe Hypothesis. I can't really think of other views that are particularly worth mentioning (or haven't been talked about already in my posts), but I can answer more questions if you have them?

Comment author: morganism 23 December 2017 10:52:21PM 0 points [-]

"Destroyed Worlds" --Cause Star's Strange Dimming (VIDEO)

"A team of U.S. astronomers studying the star RZ Piscium has found evidence suggesting its strange, unpredictable dimming episodes may be caused by vast orbiting clouds of gas and dust, the remains of one or more destroyed planets."


Comment author: morganism 23 December 2017 10:44:44PM 0 points [-]
In response to Happiness Is a Chore
Comment author: MaryCh 23 December 2017 06:02:54PM 1 point [-]

I feel so much freer when I don't have to demonstrate that I am happy.

Sometimes, it almost seems like I am truly happy only when I "escape" or "triumph" over something that almost "ate me up": my husband's household, the Department that I had gone to for a PhD thesis... the genuinely nice psychiatrist who soothes my Mother's fears... Like "I am happy when I have proved that I haven't changed, because change is corruption". So yes, [feeling happy] is one of the necessary chores of self-maintenance. I don't get why I should want it more than, say, a chance to sleep in.

OTOH, joy is very different. It kind of just happens, unasked-for.

Comment author: Lu_Tong 22 December 2017 09:52:56PM 1 point [-]

Which philosophical views are you most certain of, and why? e.g. why do you think that multiple universes exist (and can you link or give the strongest argument for this?)

Comment author: Vaniver 22 December 2017 07:01:06PM 0 points [-]

I don't think that what you need has any bearing on what reality has actually given you.

As far as I can tell, I would pay Parfit's Hitchhiker because of intuitions that were rewarded by natural selection. It would be nice to have a formalization that agrees with those intuitions.

or by sneaking in different metaphysics

This seems wrong to me, if you're explicitly declaring different metaphysics (if you mean the thing by metaphysics that I think you mean). If I view myself as a function that generates an output based on inputs, and my decision-making procedure being the search for the best such function (for maximizing utility), then this could be considered as different metaphysics from trying to cause the most increase in utility for myself by making decisions, but it's not obvious that the latter leads to better decisions.

Comment author: bestazy 22 December 2017 06:19:40PM *  0 points [-]

I may be going too far afield here, but did anyone else notice the part where the author says that AI can't recognize uncertainty, so it ignores it? That brought to mind the recent self-driving crashes where an unexpected event causes a crash. A human driver says, "Whoa, uncertainty, I'm slowing down while I try to figure out what this other driver is up to," while the AI at this point says, "I don't know what it is, so it doesn't exist." There seem to have been some postings recently stating that algos only know what they're told, and that this is a big hurdle for the aforementioned masters of the tech universe. bestazy

Comment author: turchin 22 December 2017 05:13:33PM 0 points [-]

It should be able to understand human language or it is (most likely) not dangerous.

Comment author: RedMan 21 December 2017 07:54:35PM *  0 points [-]

Correct. I have found that the works written at the time when the relevant technical work had just recently been completed, by the people who made those breakthroughs, are often vastly superior to summary work written decades after the field's last major breakthrough.

If I remember correctly, Elon Musk cited some older texts on rocketry as his 'tree trunk' of knowledge about the subject.

This advice only applies to mature fields; in places where fundamental breakthroughs are happening regularly, this advice is downright awful.

Comment author: RedMan 21 December 2017 07:49:15PM 0 points [-]

Assertion: Any fooming non-human AI incompatible with uplifting technology would be too alien to communicate with in any way. If you happen to see one of those, probably a good idea to just destroy it on sight.

Comment author: Lumifer 20 December 2017 08:36:08PM *  0 points [-]

I suspect the solution is this.

In response to Happiness Is a Chore
Comment author: NancyLebovitz 20 December 2017 04:10:57PM 0 points [-]

I'm somewhat annoyed that this claims there's a solution to becoming happier, goes on at some length, and doesn't include the solution.

Comment author: Lumifer 20 December 2017 03:22:49PM 0 points [-]


Comment author: entirelyuseless 20 December 2017 03:43:01AM 0 points [-]

I do care about tomorrow, which is not the long run.

I don't think we should assume that AIs will have any goals at all, and I rather suspect they will not, in the same way that humans do not, only more so.

Comment author: UNVRSLWSDM 19 December 2017 08:31:31PM 2 points [-]

lovely... Catch Me If You Can, AI style... How about adding a box for "too much power in wrong human hands" (if not already there somehow)? Yes, AI power too.

Because this is by far the greatest problem of human civilization. Everything (just staying in the tech domain) is created by a very small fraction of the population (who possess that superpower called INTELLIGENCE), but the society is ruled by a completely different tiny group that possesses various predatory superpowers to assume control and power (over the herd of mental herbivores they "consume").

This is not going to end up well, such a system does not scale, has imbalances and systemic risks practically everywhere. Remember, already nuclear weapons were "too much" and we are only lucky that bio$hit is not really effective as a weapon (already gas was nothing much).

We are simply too much animals and too little on that "intelligent" side we often worship. And the coming nanobots will be far worse than nukes, and far less controllable.

Comment author: Lumifer 19 December 2017 04:40:27PM 0 points [-]


That's not conventionally considered to be "in the long run".

We don't have any theory that would stop AI from doing that

The primary reason is that we don't have any theory about what a post-singularity AI might or might not do. Doing some pretty basic decision theory focused on the corner cases is not "progress".

Comment author: Lumifer 19 December 2017 04:38:02PM 0 points [-]

It seems weird that you'd deterministically two-box against such an Omega

Even in the case when the random noise dominates and the signal is imperceptibly small?

Comment author: Lumifer 19 December 2017 04:35:13PM *  1 point [-]

So the source-code of your brain just needs to decide whether it'll be a source-code that will be one-boxing or not.

First, in the classic Newcomb, when you meet Omega that's a surprise to you. You don't get to precommit to deciding one way or the other because you had no idea such a situation would arise: you just get to decide now.

You can decide however whether you're the sort of person who accepts their decisions can be deterministically predicted in advance with sufficient certainty, or whether you'll be claiming that other people predicting your choice must be a violation of causality (it's not).

Why would you make such a decision if you don't expect to meet Omega and don't care much about philosophical head-scratchers?

And, by the way, predicting your choice is not a violation of causality, but believing that your choice (of the boxes, not of the source code) affects what's in the boxes is.

Second, you are assuming that the brain is free to reconfigure and rewrite its software which is clearly not true for humans and all existing agents.

Comment author: Lumifer 19 December 2017 04:30:54PM 1 point [-]

Old and tired, maybe, but clearly there is not much consensus yet (even if, ahem, some people consider it to be as clear as day).

Note that who makes the decision is a matter of control and has nothing to do with freedom. A calculator controls its display and so the "decision" to output 4 in response to 2+2 is its own, in a way. But applying decision theory to a calculator is nonsensical and there is no free choice involved.

Comment author: cousin_it 19 December 2017 03:51:47PM *  0 points [-]

I hope at least you care if everyone on Earth dies painfully tomorrow. We don't have any theory that would stop AI from doing that, and any progress toward such a theory would be on topic for the contest.

Sorry, I'm feeling a bit frustrated. It's as if the decade of LW never happened, and people snap back out of rationality once they go off the dose of Eliezer's writing. And the mode they snap back to is so painfully boring.

Comment author: entirelyuseless 19 December 2017 03:38:06PM 0 points [-]

Not really. I don't care if that happens in the long run, and many people wouldn't.

Comment author: cousin_it 19 December 2017 12:51:00PM *  0 points [-]

Let's say I build my Omega by using a perfect predictor plus a source of noise that's uncorrelated with the prediction. It seems weird that you'd deterministically two-box against such an Omega, even though you deterministically one-box against a perfect predictor. Are you sure you did the math right?

Comment author: cousin_it 19 December 2017 12:43:02PM *  0 points [-]

For example, not turning the universe into paperclips is a goal of humanity.

Comment author: JenniferRM 19 December 2017 06:34:49AM 0 points [-]

So, at one point in my misspent youth I played with the idea of building an experimental Omega and looked into the subject in some detail.

In Martin Gardner's writeup on this back in 1973, reprinted in The Night Is Large, the essay explained that the core idea still works if Omega can just predict with 90% accuracy.

Your choice of ONE box pays nothing if you're predicted (incorrectly) to two-box, and pays $1M if predicted correctly at 90%, for a total EV of $900,000 (= 0.1 × $0 + 0.9 × $1,000,000).

Your choice of TWO boxes pays $1k if you're predicted (correctly) to two-box, and pays $1,001,000 if you're predicted (incorrectly) to one-box, for a total EV of $101k (= 0.9 × $1,000 + 0.1 × $1,001,000 = $900 + $100,100).

So the expected profit from one boxing in a normal game, with Omega accuracy of 90% would be $799k.

Also, by adjusting the game's payouts we could hypothetically make any amount of genuine human predictability (even just a reliable 51% accuracy) be enough to motivate one boxing.
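The arithmetic above can be checked with a short sketch. The function names are made up for illustration, and the default payouts are the standard $1,000,000 and $1,000; `accuracy` is the probability that Omega predicts your actual choice.

```python
def ev_one_box(accuracy: float, big: float = 1_000_000) -> float:
    """Expected payout of one-boxing: the big box is full only if Omega predicted it."""
    return accuracy * big


def ev_two_box(accuracy: float, big: float = 1_000_000, small: float = 1_000) -> float:
    """Expected payout of two-boxing.

    With probability `accuracy` Omega foresaw it and you get only `small`;
    otherwise Omega was wrong and you get `big + small`.
    """
    return accuracy * small + (1 - accuracy) * (big + small)


def break_even_accuracy(big: float = 1_000_000, small: float = 1_000) -> float:
    """Accuracy at which both choices have equal EV: 0.5 + small / (2 * big)."""
    return 0.5 + small / (2 * big)


if __name__ == "__main__":
    p = 0.9
    print(ev_one_box(p), ev_two_box(p), ev_one_box(p) - ev_two_box(p))
    print(break_even_accuracy())
```

At 90% accuracy this reproduces the $900,000, $101,000 and $799,000 figures, up to floating-point rounding. Setting the two EVs equal gives a break-even accuracy of 0.5 + small / (2 × big); shrinking the small payout relative to the big one pushes that threshold toward 50%, which is the sense in which the payouts can be tuned to any level of predictability.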

The super simplistic conceptual question here is the distinction between two kinds of sincerity. One kind of sincerity is assessed at the time of the promise. The other kind of sincerity is assessed retrospectively by seeing whether the promise was upheld.

Then the standard version of the game tries to put a wedge between these concepts by supposing that maybe an initially sincere promise might be violated by the intervention of something like "free will", and it tries to make this seem slightly more magical (more of a far mode question?) by imagining that the promise was never even uttered, but rather the promise was stolen from the person by the magical mind reading "Omega" entity before the promise was ever even imagined by the person as being possible to make.

One thing that seems clear to me is that if one boxing is profitable but not certain then you might wish you could have done something in the past that would make it clear that you'll one box, so that you land in the part of Omega's calculations where the prediction is easy, rather than being one of the edge cases where Omega really has to work for its Brier score.

On the other hand, the setup is also (probably purposefully) quite fishy. The promise that "you made" is originally implicit, and depending on your understanding of the game maybe extremely abstract. Omega doesn't just tell you what it predicted. If you get one box and get nothing and complain then Omega will probably try to twist it around and blame you for its failed prediction. If it all works then you seem to be getting free money, and why is anyone handing out free money?

The whole thing just "feels like the setup for a scam". Like you one box, get a million, then in your glow of positive trust you give some money to their charitable cause. Then it turns out the charitable cause was fake. Then it turns out the million dollars was counterfeit but your donation was real. Sucker!

And yet... you know, parents actually are pretty good at knowing when their kids are telling the truth or lying. And parents really do give their kids a free lunch. And it isn't really a scam, it is just normal life as a mortal human being.

But also in the end, for someone to look their parents in the eyes and promise to be home before 10PM and really mean it for reals at the time of the promise, and then be given the car keys, and then come home at 1AM... that also happens. And wouldn't it be great to just blame that on "free will" and "the 10% of the time that Omega's predictions fail"?

Looping this back around to the larger AGI question, it seems like what we're basically hoping for is to learn how to become a flawless Omega (or at least build some software that can do this job) at least for the restricted case of an AGI that we can give the car keys without fear that after it has the car keys it will play the "free will" card and grind us all up into fuel paste after promising not to.

Comment author: Snorky 18 December 2017 10:40:28PM 0 points [-]

I was wondering about the same at my University in Belarus. But somehow it feels like it won't work :D The CIS educational system is not even trying to produce smart and well-developed graduates.

P.S. - Let me know if you get a good answer: snorky_arc@mail.ru

Comment author: JonsG 18 December 2017 03:14:36PM *  0 points [-]

Ah yes. My main thought after the 2008 crisis was surely, “I can’t wait until next time, when super-intelligent machines have overleveraged the entire economy faster and more efficiently than those Goldman and AIG guys ever could.”

Comment author: ike 18 December 2017 12:33:37AM 0 points [-]

It's not just the one post, it's the whole sequence of related posts.

It's hard for me to summarize it all and do it justice, but it disagrees with the way you're framing this. I would suggest you read some of that sequence and/or some of the decision theory papers for a defense of "should" notions being used even when believing in a deterministic world, which you reject. I don't really want to argue the whole thing from scratch, but that is where our disagreement would lie.

Comment author: ArisKatsaris 17 December 2017 08:15:21PM 3 points [-]

You're using words like "reputation", and understand how having a reputation for one-boxing is preferable, when we're discussing the level where Omega has access to the source code of your brain and can just tell whether you'll one-box or not, as a matter of calculation.

So the source-code of your brain just needs to decide whether it'll be a source-code that will be one-boxing or not. This isn't really about "precommitment" for that one specific scenario. Omega doesn't need to know whether you have precommitted or not; Omega isn't putting money in the boxes based on whether you have precommitted or not. It's putting money based on the decision you'll arrive at, even if you yourself don't know the decision yet.

You can't make the decision in advance, because you may not know the exact parameters of the decision you'll be asked to make (one-boxing & two-boxing are just examples of one particular type of decision). You can decide however whether you're the sort of person who accepts their decisions can be deterministically predicted in advance with sufficient certainty, or whether you'll be claiming that other people predicting your choice must be a violation of causality (it's not).

Comment author: ArisKatsaris 17 December 2017 08:07:13PM 0 points [-]

It is not compatible to believe your actions follow deterministically, and still talk about decision theory from a first-person point of view,

So it's the pronouns that matter? If I keep using "Aris Katsaris" rather than "I" that makes a difference to whether the person I'm talking about makes decisions that can be deterministically predicted?

Whether someone can predict your decisions has ZERO relevance to whether you are the one making the decisions or not. This sort of confusion, where people think that "free will" means "being unpredictable", is nonsensical - it's the very opposite. For the decisions to be yours, they must be theoretically predictable, arising from the contents of your brain. Adding in randomness and unpredictability, e.g. by using dice or coinflips, reduces the ownership of the decisions and hence the free will.

This is old and tired territory.

Comment author: PhilGoetz 17 December 2017 06:26:40PM *  0 points [-]

I just now read that one post. It isn't clear how you think it's relevant. I'm guessing you think that it implies that positing free will is invalid.

You don't have to believe in free will to incorporate it into a model of how humans act. We're all nominalists here; we don't believe that the concepts in our theories actually exist somewhere in Form-space.

When someone asks the question, "Should you one-box?", they're using a model which uses the concept of free will. You can't object to that by saying "You don't really have free will." You can object that it is the wrong model to use for this problem, but then you have to spell out why, and what model you want to use instead, and what question you actually want to ask, since it can't be that one.

People in the LW community don't usually do that. I see sloppy statements claiming that humans "should" one-box, based on a presumption that they have no free will. That's making a claim within a paradigm while rejecting the paradigm. It makes no sense.

Consider what Eliezer says about coin flips:

We've previously discussed how probability is in the mind. If you are uncertain about whether a classical coin has landed heads or tails, that is a fact about your state of mind, not a property of the coin. The coin itself is either heads or tails. But people forget this, and think that coin.probability == 0.5, which is the Mind Projection Fallacy: treating properties of the mind as if they were properties of the external world.

The mind projection fallacy is treating the word "probability" not in a nominalist way, but in a philosophical realist way, as if they were things existing in the world. Probabilities are subjective. You don't project them onto the external world. That doesn't make "coin.probability == 0.5" a "false" statement. It correctly specifies the distribution of possibilities given the information available within the mind making the probability assessment. I think that is what Eliezer is trying to say there.

"Free will" is a useful theoretical construct in a similar way. It may not be a thing in the world, but it is a model for talking about how we make decisions. We can only model our own brains; you can't fully simulate your own brain within your own brain; you can't demand that we use the territory as our map.

Comment author: Caspar42 17 December 2017 08:49:34AM 0 points [-]

I wrote a summary of Hanson's The Age of Em, in which I focus on the bits of information that may be policy-relevant for effective altruists. For instance, I summarize what Hanson says about em values and also have a section about AI safety.

Comment author: morganism 16 December 2017 11:27:19PM 0 points [-]
Comment author: ike 16 December 2017 10:00:40PM 0 points [-]

Have you read http://lesswrong.com/lw/rb/possibility_and_couldness/ and the related posts and have some disagreement with them?

Comment author: jimrandomh 16 December 2017 08:27:48PM 1 point [-]

"Omega" is philosophical shorthand for "please accept this part of the thought experiment as a premise". Newcomb's problem isn't supposed to be realistic, it's supposed to isolate a corner-case in reasoning and let us consider it apart from everything else. While it's true that in reality you can't assign probability 1 to Omega being a perfect predictor, the thought experiment nevertheless asks you to do so anyway--because otherwise the underlying issue would be too obscured by irrelevant details to solve it philosophically.

Comment author: Dagon 16 December 2017 05:34:36PM 1 point [-]

If you rule out probabilities of 1, what do you assign to the probability that Omega is cheating, and somehow gimmicking the boxes to change the contents the instant you indicate your choice, before the contents are revealed?

Presumably the mechanisms of "correct prediction" are irrelevant, and once your expectation that this instance will be predicted correctly gets above million-to-one, you one-box.

Comment author: PhilGoetz 16 December 2017 02:44:20PM *  1 point [-]

I don't think that what you need has any bearing on what reality has actually given you. Nor can we talk about different decision theories here--as long as we are talking about maximizing expected utility, we have our decision theory; that is enough specification to answer the Newcomb one-shot question. We can only arrive at a different outcome by stating the problem differently, or by sneaking in different metaphysics, or by just doing bad logic (in this case, usually allowing contradictory beliefs about free will in different parts of the analysis.)

Your comment implies you're talking about policy, which must be modelled as an iterated game. I don't deny that one-boxing is good in the iterated game.

My concern in this post is that there's been a lack of distinction in the community between "one-boxing is the best policy" and "one-boxing is the best decision at one point in time in a decision-theoretic analysis, which assumes complete freedom of choice at that moment." This lack of distinction has led many people into wishful or magical rather than rational thinking.

In response to What is Rational?
Comment author: DragonGod 16 December 2017 10:49:57AM 0 points [-]

It's been almost four months since I wrote this thread. I've started to see the outline of an answer to my question. Over the course of the next year, I will begin documenting it.

Comment author: IlyaShpitser 16 December 2017 07:09:28AM *  0 points [-]

anyone going to the AAAI ethics/safety conf?

Comment author: entirelyuseless 16 December 2017 03:04:22AM 1 point [-]

I considered submitting an entry basically saying this, but decided that it would be pointless since obviously it would not get any prize. Human beings do not have coherent goals even individually. Much less does humanity.

Comment author: Vaniver 16 December 2017 03:00:14AM 0 points [-]

I don't think this gets Parfit's Hitchhiker right. You need a decision theory that, when safely returned to the city, pays the rescuer even though they have no external obligation to do so. Otherwise they won't have rescued you.

Comment author: PhilGoetz 16 December 2017 01:03:00AM *  0 points [-]

I can believe that it would make sense to commit ahead of time to one-box at such an event. Doing so would affect your behavior in a way that the predictor might pick up on.

Hmm. Thinking about this convinces me that there's a big problem here in how we talk about the problem, because if we allow people who already knew about Newcomb's Problem to play, there are really 4 possible actions, not 2:

  • intended to one-box, one-boxed
  • intended to one-box, two-boxed
  • intended to two-box, one-boxed
  • intended to two-box, two-boxed

I don't know if the usual statement of Newcomb's problem specifies whether the subject learns the rules of the game before or after the predictor makes a prediction. It seems to me that's a critical factor. If the subject is told the rules of the game before the predictor observes the subject and makes a prediction, then we're just saying Omega is a very good lie detector, and the problem is not even about decision theory, but about psychology: Do you have a good enough poker face to lie to Omega? If not, pre-commit to one-box.

We shouldn't ask, "Should you two-box?", but, "Should you two-box now, given how you would have acted earlier?" The various probabilities in the present depend on what you thought in the past. Under the proposition that Omega is perfect at predicting, the person inclined to 2-box should still 2-box, 'coz that $1M probably ain't there.

So Newcomb's problem isn't a paradox. If we're talking just about the final decision, the one made by a subject after Omega's prediction, then the subject should probably two-box (as argued in the post). If we're talking about two decisions, one before and one after the box-opening, then all we're asking is whether you can convince Omega that you're going to one-box if you aren't. Then it would not be terribly hard to say that a predictor might be so good (say, an Amazing Kreskin-level cold-reader of humans, or that you are an AI) that your only hope is to precommit to one-box.
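
To make the four intention/action combinations concrete, here is a minimal Python sketch of the payoffs under one reading of this comment. Everything in it is an assumption for illustration: the predictor is taken to read the earlier intention perfectly and to fill the opaque box only if that intention was to one-box, the transparent box is given the conventional $1,000 (the thread itself only mentions the $1M), and the function names are hypothetical.

```python
from itertools import product

BIG_PRIZE = 1_000_000  # opaque box; filled only if one-boxing was predicted
SMALL_PRIZE = 1_000    # transparent box; assumed value, always present

OPTIONS = ("one-box", "two-box")


def payoff(intended: str, acted: str) -> int:
    """Payoff when the predictor keys on the earlier intention (assumption)."""
    if intended not in OPTIONS or acted not in OPTIONS:
        raise ValueError("choices must be 'one-box' or 'two-box'")
    opaque = BIG_PRIZE if intended == "one-box" else 0
    transparent = SMALL_PRIZE if acted == "two-box" else 0
    return opaque + transparent


def payoff_table() -> dict:
    """All four (intended, acted) combinations mapped to their payoffs."""
    return {(i, a): payoff(i, a) for i, a in product(OPTIONS, OPTIONS)}


if __name__ == "__main__":
    for (intended, acted), value in payoff_table().items():
        print(f"intended to {intended}, then {acted}: ${value:,}")
```

Under these assumptions the earlier intention moves the payoff by $1,000,000 while the final action moves it by only $1,000, which is one way of seeing why the two questions come apart.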

Comment author: Vaniver 15 December 2017 10:18:36PM 1 point [-]

The argument for one-boxing is that you aren't entirely sure you understand physics, but you know Omega has a really good track record--so good that it is more likely that your understanding of physics is false than that you can falsify Omega's prediction. This is a strict reliance on empirical observations as opposed to abstract reason: count up how often Omega has been right and compute a prior.

Isn't it that you aren't entirely sure that you understand psychology, or that you do understand psychology well enough to think that you're predictable? My understanding is that many people have run Newcomb's Problem-style experiments at philosophy departments (or other places) and have a sufficiently high accuracy that it makes sense to one-box at such events, even against fallible human predictors.
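
As a rough illustration of how accurate a fallible predictor has to be, here is a minimal Python sketch of the expected payoffs. It assumes the conventional prizes of $1,000,000 and $1,000 (the second figure does not appear in this thread), and it treats the predictor's accuracy as the chance it was right given whatever you end up choosing, which is exactly the modelling step other commenters here dispute. The function names are hypothetical.

```python
BIG_PRIZE = 1_000_000  # opaque box; filled only if one-boxing was predicted
SMALL_PRIZE = 1_000    # transparent box; assumed value, always present


def expected_value(choice: str, accuracy: float) -> float:
    """Expected payoff of `choice` if the predictor is right with probability `accuracy`."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if choice == "one-box":
        # The big prize is present only if one-boxing was correctly predicted.
        return accuracy * BIG_PRIZE
    if choice == "two-box":
        # The big prize is present only if the predictor wrongly expected one-boxing.
        return (1.0 - accuracy) * BIG_PRIZE + SMALL_PRIZE
    raise ValueError(f"unknown choice: {choice!r}")


def break_even_accuracy() -> float:
    """Accuracy at which both choices have the same expected payoff."""
    # Solve accuracy * B = (1 - accuracy) * B + S for accuracy.
    return (BIG_PRIZE + SMALL_PRIZE) / (2 * BIG_PRIZE)


if __name__ == "__main__":
    for accuracy in (0.5, 0.6, 0.9, 0.99):
        one = expected_value("one-box", accuracy)
        two = expected_value("two-box", accuracy)
        print(f"accuracy {accuracy}: one-box {one:,.0f}, two-box {two:,.0f}")
    print("break-even accuracy:", break_even_accuracy())
```

With these numbers the break-even point works out to (1,000,000 + 1,000) / 2,000,000 = 0.5005, so on this model even a predictor only slightly better than a coin flip would favour one-boxing.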

Comment author: Snorky 15 December 2017 08:06:25PM 0 points [-]

Illogical and not perfect intellect is trying to create perfect AI. What if human-like AI is best we can do?

P.S. Good point of view.

Comment author: PhilGoetz 15 December 2017 07:48:31PM *  0 points [-]

This was argued against in the Sequences and in general, doesn't seem to be a strong argument. It seems perfectly compatible to believe your actions follow deterministically and still talk about decision theory - all the functional decision theory stuff is assuming a deterministic decision process, I think.

It is compatible to believe your actions follow deterministically and still talk about decision theory. It is not compatible to believe your actions follow deterministically, and still talk about decision theory from a first-person point of view, as if you could by force of will violate your programming.

To ask what choice a deterministic entity should make presupposes both that it does, and does not, have choice. Presupposing a contradiction means STOP, your reasoning has crashed and you can prove any conclusion if you continue.

Comment author: PhilGoetz 15 December 2017 07:41:12PM *  2 points [-]

I think that first you should elaborate on what you mean by "the goals of humanity". Do you mean majority opinion? In that case, one goal of humanity is to have a single world religious State, although there is disagreement on what that religion should be. Other goals of humanity include eliminating homosexuality and enforcing traditional patriarchal family structures.

Okay, I admit it--what I really think is that "goals of humanity" is a nonsensical phrase, especially when spoken by an American academic. It would be a little better to talk about values instead of goals, but not much better. The phrase still implies the unspoken belief that everyone would think like the person who speaks it, if only they were smarter.

Comment author: ike 15 December 2017 07:37:17PM 0 points [-]

If they can perfectly predict your actions, then you have no choice, so talking about which choice to make is meaningless.

This was argued against in the Sequences and in general, doesn't seem to be a strong argument. It seems perfectly compatible to believe your actions follow deterministically and still talk about decision theory - all the functional decision theory stuff is assuming a deterministic decision process, I think.

Re QM: sometimes I've seen it stipulated that the world in which the scenario happens is deterministic. It's entirely possible that the amount of noise generated by QM isn't enough to affect your choice (aside from a very unlikely "your brain has a couple bits changed randomly in exactly the right way to change your choice", but that should be so many orders of magnitude unlikely that it doesn't matter in any expected utility calculation).
