
2016 LessWrong Diaspora Survey Analysis: Part Four (Politics, Calibration & Probability, Futurology, Charity & Effective Altruism)

9 ingres 10 September 2016 03:51AM

Politics

The LessWrong survey has a very involved section dedicated to politics. In previous analyses the benefits of this weren't fully realized. In the 2016 analysis we can look at not just the political affiliation of a respondent, but also what beliefs are associated with a given affiliation. The charts below summarize most of the results.

Political Opinions By Political Affiliation

[Charts: political opinions broken down by political affiliation (images not reproduced)]

Miscellaneous Politics

There were also some other questions in this section which aren't covered by the above charts.

PoliticalInterest

On a scale from 1 (not interested at all) to 5 (extremely interested), how would you describe your level of interest in politics?

1: 67 (2.182%)

2: 257 (8.371%)

3: 461 (15.016%)

4: 595 (19.381%)

5: 312 (10.163%)

Voting

Did you vote in your country's last major national election? (LW Turnout Versus General Election Turnout By Country)
Group | Turnout
LessWrong | 68.9%
Australia | 91%
Brazil | 78.90%
Britain | 66.4%
Canada | 68.3%
Finland | 70.1%
France | 79.48%
Germany | 71.5%
India | 66.3%
Israel | 72%
New Zealand | 77.90%
Russia | 65.25%
United States | 54.9%

Numbers taken from Wikipedia, accurate as of the last general election in each country listed at time of writing.

AmericanParties

If you are an American, what party are you registered with?

Democratic Party: 358 (24.5%)

Republican Party: 72 (4.9%)

Libertarian Party: 26 (1.8%)

Other third party: 16 (1.1%)

Not registered for a party: 451 (30.8%)

(option for non-Americans who want an option): 541 (37.0%)

Calibration And Probability Questions

Calibration Questions

I just couldn't analyze these, sorry guys. I put many hours into trying to get them into a decent format I could even read and that sucked up an incredible amount of time. It's why this part of the survey took so long to get out. Thankfully another LessWrong user, Houshalter, has kindly done their own analysis.

All my calibration questions were meant to satisfy a few essential properties:

  1. They should be 'self contained', i.e., something you can reasonably answer, or at least try to answer, with a 5th grade science education and normal life experience.
  2. They should, at least to a certain extent, be Fermi estimable.
  3. They should progressively scale in difficulty so you can see whether somebody understands basic probability or not (e.g., in an 'or' question, do they put a probability of less than 50% on being right?).

At least one person requested a workbook, so I might write more in the future. I'll obviously write more for the survey.

Probability Questions

Question | Mean | Median | Mode | Stdev
Please give the obvious answer to this question, so I can automatically throw away all surveys that don't follow the rules: What is the probability of a fair coin coming up heads? | 49.821 | 50.0 | 50.0 | 3.033
What is the probability that the Many Worlds interpretation of quantum mechanics is more or less correct? | 44.599 | 50.0 | 50.0 | 29.193
What is the probability that non-human, non-Earthly intelligent life exists in the observable universe? | 75.727 | 90.0 | 99.0 | 31.893
...in the Milky Way galaxy? | 45.966 | 50.0 | 10.0 | 38.395
What is the probability that supernatural events (including God, ghosts, magic, etc) have occurred since the beginning of the universe? | 13.575 | 1.0 | 1.0 | 27.576
What is the probability that there is a god, defined as a supernatural intelligent entity who created the universe? | 15.474 | 1.0 | 1.0 | 27.891
What is the probability that any of humankind's revealed religions is more or less correct? | 10.624 | 0.5 | 1.0 | 26.257
What is the probability that an average person cryonically frozen today will be successfully restored to life at some future time, conditional on no global catastrophe destroying civilization before then? | 21.225 | 10.0 | 5.0 | 26.782
What is the probability that at least one person living at this moment will reach an age of one thousand years, conditional on no global catastrophe destroying civilization in that time? | 25.263 | 10.0 | 1.0 | 30.510
What is the probability that our universe is a simulation? | 25.256 | 10.0 | 50.0 | 28.404
What is the probability that significant global warming is occurring or will soon occur, and is primarily caused by human actions? | 83.307 | 90.0 | 90.0 | 23.167
What is the probability that the human race will make it to 2100 without any catastrophe that wipes out more than 90% of humanity? | 76.310 | 80.0 | 80.0 | 22.933

 

The probability questions are probably the area of the survey I put the least effort into. My plan for next year is to overhaul these sections entirely and try including some Tetlock-esque forecasting questions, a link to some advice on how to make good predictions, etc.

Futurology

This section got a bit of a facelift this year, including new questions on cryonics, genetic engineering, and technological unemployment in addition to the questions from previous years.

Cryonics

Cryonics

Are you signed up for cryonics?

Yes - signed up or just finishing up paperwork: 48 (2.9%)

No - would like to sign up but unavailable in my area: 104 (6.3%)

No - would like to sign up but haven't gotten around to it: 180 (10.9%)

No - would like to sign up but can't afford it: 229 (13.8%)

No - still considering it: 557 (33.7%)

No - and do not want to sign up for cryonics: 468 (28.3%)

Never thought about it / don't understand: 68 (4.1%)

CryonicsNow

Do you think cryonics, as currently practiced by Alcor/Cryonics Institute will work?

Yes: 106 (6.6%)

Maybe: 1041 (64.4%)

No: 470 (29.1%)

Interestingly enough, of those who think it will work with enough confidence to say 'yes', only 14 are actually signed up for cryonics.

sqlite> select count(*) from data where CryonicsNow="Yes" and Cryonics="Yes - signed up or just finishing up paperwork";

14

sqlite> select count(*) from data where CryonicsNow="Yes" and (Cryonics="Yes - signed up or just finishing up paperwork" OR Cryonics="No - would like to sign up but unavailable in my area" OR "No - would like to sign up but haven't gotten around to it" OR "No - would like to sign up but can't afford it");

34
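
A note on the second query: the last two operands of its OR chain are bare strings rather than comparisons against Cryonics. As far as I know, SQLite treats a non-numeric string used as a boolean as false, so those two categories probably add nothing to the 34. Below is a minimal sketch of an IN-based version, run against a toy in-memory table. The rows are invented for illustration, so the printed counts say nothing about what the real survey data would return.

```python
import sqlite3

WANTS_CRYONICS = (
    "Yes - signed up or just finishing up paperwork",
    "No - would like to sign up but unavailable in my area",
    "No - would like to sign up but haven't gotten around to it",
    "No - would like to sign up but can't afford it",
)

# Toy rows (CryonicsNow, Cryonics), invented for illustration only.
rows = [
    ("Yes", WANTS_CRYONICS[0]),
    ("Yes", WANTS_CRYONICS[1]),
    ("Yes", WANTS_CRYONICS[2]),
    ("Yes", WANTS_CRYONICS[3]),
    ("Yes", "No - still considering it"),
    ("Maybe", WANTS_CRYONICS[0]),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (CryonicsNow TEXT, Cryonics TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?)", rows)

# Same shape as the original query: the last two operands are bare strings,
# not comparisons against the Cryonics column.
bare_sql = (
    "SELECT count(*) FROM data WHERE CryonicsNow = 'Yes' "
    "AND (Cryonics = ? OR Cryonics = ? OR ? OR ?)"
)
bare_count = conn.execute(bare_sql, WANTS_CRYONICS).fetchone()[0]

# IN compares the column against every listed value.
in_sql = (
    "SELECT count(*) FROM data WHERE CryonicsNow = 'Yes' "
    "AND Cryonics IN (?, ?, ?, ?)"
)
in_count = conn.execute(in_sql, WANTS_CRYONICS).fetchone()[0]

print(bare_count, in_count)
```

Bound parameters are used here mainly so the apostrophes in "haven't" and "can't" need no escaping.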

CryonicsPossibility

Do you think cryonics works in principle?

Yes: 802 (49.3%)

Maybe: 701 (43.1%)

No: 125 (7.7%)

LessWrongers seem to be very bullish on the underlying physics of cryonics even if they're not as enthusiastic about current methods in use.

The Brain Preservation Foundation also did an analysis of cryonics responses to the LessWrong Survey.

Singularity

SingularityYear

By what year do you think the Singularity will occur? Answer such that you think, conditional on the Singularity occurring, there is an even chance of the Singularity falling before or after this year. If you think a singularity is so unlikely you don't even want to condition on it, leave this question blank.

Mean: 8.110300081581755e+16

Median: 2080.0

Mode: 2100.0

Stdev: 2.847858859055733e+18

I didn't bother to filter out the silly answers for this.

Obviously it's a bit hard to see without filtering out the uber-large answers, but the median doesn't seem to have changed much from the 2014 survey.
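
To illustrate why the median survives the silly answers while the mean and standard deviation do not, here is a small sketch. The answers and the plausibility bounds (2016 to 3000) are hypothetical choices made for this example, not the survey data or any rule the survey used.

```python
from statistics import mean, median

# Hypothetical SingularityYear answers; the last one is a joke-sized outlier.
answers = [2040, 2060, 2080, 2100, 2150, 1e18]


def plausible(year, lo=2016, hi=3000):
    # The bounds are an illustrative assumption, not the survey's rule.
    return lo <= year <= hi


filtered = [y for y in answers if plausible(y)]

raw_mean, raw_median = mean(answers), median(answers)
clean_mean, clean_median = mean(filtered), median(filtered)

print("raw:     ", raw_mean, raw_median)
print("filtered:", clean_mean, clean_median)
```

A single huge answer drags the raw mean far outside any sensible range, while the median barely moves once it is filtered out.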

Genetic Engineering

ModifyOffspring

Would you ever consider having your child genetically modified for any reason?

Yes: 1552 (95.921%)

No: 66 (4.079%)

Well that's fairly overwhelming.

GeneticTreament

Would you be willing to have your child genetically modified to prevent them from getting an inheritable disease?

Yes: 1387 (85.5%)

Depends on the disease: 207 (12.8%)

No: 28 (1.7%)

I find it amusing how the strict "No" group shrinks considerably after this question.

GeneticImprovement

Would you be willing to have your child genetically modified for improvement purposes? (eg. To heighten their intelligence or reduce their risk of schizophrenia.)

Yes : 0 (0.0%)

Maybe a little: 176 (10.9%)

Depends on the strength of the improvements: 262 (16.2%)

No: 84 (5.2%)

Yes, I know 'yes' is bugged. I don't know what causes this bug, and despite my best efforts I couldn't track it down. There is also an issue here where 'reduce their risk of schizophrenia' is offered as an example, which might confuse people, but the actual science of things cuts closer to that than it does to a clean separation between disease risk and 'improvement'.

 

This question is too important to just not have an answer to so I'll do it manually. Unfortunately I can't easily remove the 'excluded' entries so that we're dealing with the exact same distribution but only 13 or so responses are filtered out anyway.

sqlite> select count(*) from data where GeneticImprovement="Yes";

1100

>>> 1100 + 176 + 262 + 84
1622
>>> 1100 / 1622
0.6781750924784217

67.8% are willing to genetically engineer their children for improvements.
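
The same arithmetic can be done for every option at once. This sketch uses the manual count of 1100 for 'Yes' together with the three counts listed under the question; the dictionary and variable names are only for illustration.

```python
# Counts for GeneticImprovement: 'Yes' from the manual sqlite query,
# the rest from the listing under the question.
counts = {
    "Yes": 1100,
    "Maybe a little": 176,
    "Depends on the strength of the improvements": 262,
    "No": 84,
}

total = sum(counts.values())
shares = {option: round(100 * n / total, 1) for option, n in counts.items()}

for option, pct in shares.items():
    print(f"{option}: {counts[option]} ({pct}%)")
```

The shares for the three non-bugged options come out the same as the percentages printed in the listing, which suggests 1622 is the denominator the listing used.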

GeneticCosmetic

Would you be willing to have your child genetically modified for cosmetic reasons? (eg. To make them taller or have a certain eye color.)

Yes: 500 (31.0%)

Maybe a little: 381 (23.6%)

Depends on the strength of the improvements: 277 (17.2%)

No: 455 (28.2%)

These numbers go about how you would expect, with people being progressively less interested the more 'shallow' a genetic change is seen as.


GeneticOpinionD

What's your overall opinion of other people genetically modifying their children for disease prevention purposes?

Positive: 1177 (71.7%)

Mostly Positive: 311 (19.0%)

No strong opinion: 112 (6.8%)

Mostly Negative: 29 (1.8%)

Negative: 12 (0.7%)

GeneticOpinionI

What's your overall opinion of other people genetically modifying their children for improvement purposes?

Positive: 737 (44.9%)

Mostly Positive: 482 (29.4%)

No strong opinion: 273 (16.6%)

Mostly Negative: 111 (6.8%)

Negative: 38 (2.3%)

GeneticOpinionC

What's your overall opinion of other people genetically modifying their children for cosmetic reasons?

Positive: 291 (17.7%)

Mostly Positive: 290 (17.7%)

No strong opinion: 576 (35.1%)

Mostly Negative: 328 (20.0%)

Negative: 157 (9.6%)

All three of these seem largely consistent with people's personal preferences about modification. Were I inclined, I could do a deeper analysis that takes survey respondents row by row and looks at the correlation between preference for one's own children and preference for others.
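
A rough sketch of what such a row-by-row analysis could look like follows. The answer pairs are hypothetical stand-ins for two survey columns (for example GeneticImprovement and GeneticOpinionI), and share_given is a helper name invented for this example.

```python
from collections import Counter

# Hypothetical (answer for own child, opinion about others) pairs.
rows = [
    ("Yes", "Positive"),
    ("Yes", "Positive"),
    ("Yes", "Mostly Positive"),
    ("No", "Negative"),
    ("No", "No strong opinion"),
    ("Maybe a little", "Mostly Positive"),
]

# Cross-tabulation: how often each pair of answers occurs together.
crosstab = Counter(rows)


def share_given(own, opinion):
    """P(opinion about others | answer for own child), from the toy rows."""
    own_total = sum(n for (o, _), n in crosstab.items() if o == own)
    return crosstab[(own, opinion)] / own_total


print(share_given("Yes", "Positive"))
print(share_given("No", "Negative"))
```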

Technological Unemployment

LudditeFallacy

Do you think the Luddite's Fallacy is an actual fallacy?

Yes: 443 (30.936%)

No: 989 (69.064%)

We can use this as an overall measure of worry about technological unemployment, which would seem to be high among the LW demographic.

UnemploymentYear

By what year do you think the majority of people in your country will have trouble finding employment for automation related reasons? If you think this is something that will never happen leave this question blank.

Mean: 2102.9713740458014

Median: 2050.0

Mode: 2050.0

Stdev: 1180.2342850727339

Question is flawed because you can't distinguish answers of "never happen" from people who just didn't see it.

Interesting question that would be fun to take a look at in comparison to the estimates for the singularity.

EndOfWork

Do you think the "end of work" would be a good thing?

Yes: 1238 (81.287%)

No: 285 (18.713%)

Fairly overwhelming consensus, but with a significant minority of people who have a dissenting opinion.

EndOfWorkConcerns

If machines end all or almost all employment, what are your biggest worries? Pick two.

Question | Count | Percent
People will just idle about in destructive ways | 513 | 16.71%
People need work to be fulfilled and if we eliminate work we'll all feel deep existential angst | 543 | 17.687%
The rich are going to take all the resources for themselves and leave the rest of us to starve or live in poverty | 1066 | 34.723%
The machines won't need us, and we'll starve to death or be otherwise liquidated | 416 | 13.55%

Question is flawed because it demanded the user 'pick two' instead of up to two.

The plurality of worries is about elites who refuse to share their wealth.

Existential Risk

XRiskType

Which disaster do you think is most likely to wipe out greater than 90% of humanity before the year 2100?

Nuclear war: +4.800% 326 (20.6%)

Asteroid strike: -0.200% 64 (4.1%)

Unfriendly AI: +1.000% 271 (17.2%)

Nanotech / grey goo: -2.000% 18 (1.1%)

Pandemic (natural): +0.100% 120 (7.6%)

Pandemic (bioengineered): +1.900% 355 (22.5%)

Environmental collapse (including global warming): +1.500% 252 (16.0%)

Economic / political collapse: -1.400% 136 (8.6%)

Other: 35 (2.217%)

Significantly more people worried about Nuclear War than last year. Effect of new respondents, or geopolitical situation? Who knows.

Charity And Effective Altruism

Charitable Giving

Income

What is your approximate annual income in US dollars (non-Americans: convert at www.xe.com)? Obviously you don't need to answer this question if you don't want to. Please don't include commas or dollar signs.

Sum: 66054140.47384

Mean: 64569.052271593355

Median: 40000.0

Mode: 30000.0

Stdev: 107297.53606321265

IncomeCharityPortion

How much money, in number of dollars, have you donated to charity over the past year? (non-Americans: convert to dollars at http://www.xe.com/ ). Please don't include commas or dollar signs in your answer. For example, 4000

Sum: 2389900.6530000004

Mean: 2914.5129914634144

Median: 353.0

Mode: 100.0

Stdev: 9471.962766896671

XriskCharity

How much money have you donated to charities aiming to reduce existential risk (other than MIRI/CFAR) in the past year?

Sum: 169300.89

Mean: 1991.7751764705883

Median: 200.0

Mode: 100.0

Stdev: 9219.941506342007

CharityDonations

How much have you donated in US dollars to the following charities in the past year? (Non-americans: convert to dollars at http://www.xe.com/) Please don't include commas or dollar signs in your answer. Options starting with "any" aren't the name of a charity but a category of charity.

Question | Sum | Mean | Median | Mode | Stdev
Against Malaria Foundation | 483935.027 | 1905.256 | 300.0 | None | 7216.020
Schistosomiasis Control Initiative | 47908.0 | 840.491 | 200.0 | 1000.0 | 1618.785
Deworm the World Initiative | 28820.0 | 565.098 | 150.0 | 500.0 | 1432.712
GiveDirectly | 154410.177 | 1429.723 | 450.0 | 50.0 | 3472.082
Any kind of animal rights charity | 83130.47 | 1093.821 | 154.235 | 500.0 | 2313.493
Any kind of bug rights charity | 1083.0 | 270.75 | 157.5 | None | 353.396
Machine Intelligence Research Institute | 141792.5 | 1417.925 | 100.0 | 100.0 | 5370.485
Any charity combating nuclear existential risk | 491.0 | 81.833 | 75.0 | 100.0 | 68.060
Any charity combating global warming | 13012.0 | 245.509 | 100.0 | 10.0 | 365.542
Center For Applied Rationality | 127101.0 | 3177.525 | 150.0 | 100.0 | 12969.096
Strategies for Engineered Negligible Senescence Research Foundation | 9429.0 | 554.647 | 100.0 | 20.0 | 1156.431
Wikipedia | 12765.5 | 53.189 | 20.0 | 10.0 | 126.444
Internet Archive | 2975.04 | 80.406 | 30.0 | 50.0 | 173.791
Any campaign for political office | 38443.99 | 366.133 | 50.0 | 50.0 | 1374.305
Other | 564890.46 | 1661.442 | 200.0 | 100.0 | 4670.805

"Bug Rights" charity was supposed to be a troll fakeout but apparently...

This table is interesting given the recent debates about how much money certain causes are 'taking up' in Effective Altruism.

Effective Altruism

Vegetarian

Do you follow any dietary restrictions related to animal products?

Yes, I am vegan: 54 (3.4%)

Yes, I am vegetarian: 158 (10.0%)

Yes, I restrict meat some other way (pescetarian, flexitarian, try to only eat ethically sourced meat): 375 (23.7%)

No: 996 (62.9%)

EAKnowledge

Do you know what Effective Altruism is?

Yes: 1562 (89.3%)

No but I've heard of it: 114 (6.5%)

No: 74 (4.2%)

EAIdentity

Do you self-identify as an Effective Altruist?

Yes: 665 (39.233%)

No: 1030 (60.767%)

The distribution given by the 2014 survey results does not sum to one, so it's difficult to determine whether Effective Altruism's membership actually went up. If we take the numbers at face value, it experienced an 11.13% increase in membership.

EACommunity

Do you participate in the Effective Altruism community?

Yes: 314 (18.427%)

No: 1390 (81.573%)

Same issue as the last question. Taking the numbers at face value, community participation went up by 5.727%.

EADonations

Has Effective Altruism caused you to make donations you otherwise wouldn't?

Yes: 666 (39.269%)

No: 1030 (60.731%)

Wowza!

Effective Altruist Anxiety

EAAnxiety

Have you ever had any kind of moral anxiety over Effective Altruism?

Yes: 501 (29.6%)

Yes but only because I worry about everything: 184 (10.9%)

No: 1008 (59.5%)


There's an ongoing debate in Effective Altruism about what kind of rhetorical strategy is best for getting people on board and whether Effective Altruism is causing people significant moral anxiety.

It certainly appears to be. But is moral anxiety effective? Let's look:

Sample Size: 244
Average amount of money donated by people anxious about EA who aren't EAs: 257.5409836065574

Sample Size: 679
Average amount of money donated by people who aren't anxious about EA who aren't EAs: 479.7501384388807

Sample Size: 249
Average amount of money donated by EAs anxious about EA: 1841.5292369477913

Sample Size: 314
Average amount of money donated by EAs not anxious about EA: 1837.8248407643312

It seems fairly conclusive that anxiety is not a good way to get people to donate more than they already are, but is it a good way to get people to become Effective Altruists?

Sample Size: 1685
P(Effective Altruist): 0.3940652818991098
P(EA Anxiety): 0.29554896142433235
P(Effective Altruist | EA Anxiety): 0.5
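
For readers who want to check the arithmetic, here is a sketch of how these three figures relate. The raw counts below are not stated in the post; they are back-calculated from the reported proportions and the sample size of 1685, so treat them as an assumption.

```python
# Counts inferred from the reported proportions (an assumption, not source data).
n = 1685      # respondents who answered both questions
n_ea = 664    # self-identified Effective Altruists
n_anx = 498   # answered a plain 'Yes' to EA anxiety
n_both = 249  # both EA and anxious

p_ea = n_ea / n
p_anx = n_anx / n
p_ea_given_anx = n_both / n_anx

# How much more likely an anxious respondent is to be an EA than a
# respondent picked at random from the sample.
lift = p_ea_given_anx / p_ea

print(p_ea, p_anx, p_ea_given_anx, round(lift, 3))
```

A lift above 1 only says that anxiety and EA identity go together in this sample. It does not say which one causes the other.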

Maybe. There is of course an argument to be made that sufficient good done by causing people anxiety outweighs feeding into people's scrupulosity, but it can be discussed after I get through explaining it on the phone to wealthy PR-conscious donors and telling the local all-kill shelter where I want my shipment of dead kittens.

EAOpinion

What's your overall opinion of Effective Altruism?

Positive: 809 (47.6%)

Mostly Positive: 535 (31.5%)

No strong opinion: 258 (15.2%)

Mostly Negative: 75 (4.4%)

Negative: 24 (1.4%)

EA appears to be doing a pretty good job of getting people to like them.

Interesting Tables

Charity Donations By Political Affiliation
Affiliation | Income | Charity Contributions | % Income Donated To Charity | Total Survey Charity % | Sample Size
Anarchist | 1677900.0 | 72386.0 | 4.314% | 3.004% | 50
Communist | 298700.0 | 19190.0 | 6.425% | 0.796% | 13
Conservative | 1963000.04 | 62945.04 | 3.207% | 2.612% | 38
Futarchist | 1497494.11 | 166254.0 | 11.102% | 6.899% | 31
Left-Libertarian | 9681635.61384 | 416084.0 | 4.298% | 17.266% | 245
Libertarian | 11698523.0 | 214101.0 | 1.83% | 8.885% | 190
Moderate | 3225475.0 | 90518.0 | 2.806% | 3.756% | 67
Neoreactionary | 1383976.0 | 30890.0 | 2.232% | 1.282% | 28
Objectivist | 399000.0 | 1310.0 | 0.328% | 0.054% | 10
Other | 3150618.0 | 85272.0 | 2.707% | 3.539% | 132
Pragmatist | 5087007.61 | 266836.0 | 5.245% | 11.073% | 131
Progressive | 8455500.44 | 368742.78 | 4.361% | 15.302% | 217
Social Democrat | 8000266.54 | 218052.5 | 2.726% | 9.049% | 237
Socialist | 2621693.66 | 78484.0 | 2.994% | 3.257% | 126
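
The "% Income Donated To Charity" column appears to be charity contributions divided by income for each affiliation. A short sketch that recomputes it for a few rows of the table is below; pct_donated is a helper name made up for this example.

```python
# (income, charity contributions) for a few rows of the table.
rows = {
    "Anarchist": (1677900.0, 72386.0),
    "Futarchist": (1497494.11, 166254.0),
    "Libertarian": (11698523.0, 214101.0),
    "Objectivist": (399000.0, 1310.0),
}


def pct_donated(income, charity):
    """Share of group income given to charity, in percent."""
    return round(100 * charity / income, 3)


percentages = {name: pct_donated(*pair) for name, pair in rows.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Note that this is a ratio of group totals, so a few large donors in a small group (the Futarchist sample is 31 people) can move it a lot.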


Number Of Effective Altruists In The Diaspora Communities
Community | Count | % In Community | Sample Size
LessWrong | 136 | 38.418% | 354
LessWrong Meetups | 109 | 50.463% | 216
LessWrong Facebook Group | 83 | 48.256% | 172
LessWrong Slack | 22 | 39.286% | 56
SlateStarCodex | 343 | 40.98% | 837
Rationalist Tumblr | 175 | 49.716% | 352
Rationalist Facebook | 89 | 58.94% | 151
Rationalist Twitter | 24 | 40.0% | 60
Effective Altruism Hub | 86 | 86.869% | 99
Good Judgement(TM) Open | 23 | 74.194% | 31
PredictionBook | 31 | 51.667% | 60
Hacker News | 91 | 35.968% | 253
#lesswrong on freenode | 19 | 24.675% | 77
#slatestarcodex on freenode | 9 | 24.324% | 37
#chapelperilous on freenode | 2 | 18.182% | 11
/r/rational | 117 | 42.545% | 275
/r/HPMOR | 110 | 47.414% | 232
/r/SlateStarCodex | 93 | 37.959% | 245
One or more private 'rationalist' groups | 91 | 47.15% | 193


Effective Altruist Donations By Political Affiliation
Affiliation | EA Income | EA Charity | Sample Size
Anarchist | 761000.0 | 57500.0 | 18
Futarchist | 559850.0 | 114830.0 | 15
Left-Libertarian | 5332856.0 | 361975.0 | 112
Libertarian | 2725390.0 | 114732.0 | 53
Moderate | 583247.0 | 56495.0 | 22
Other | 1428978.0 | 69950.0 | 49
Pragmatist | 1442211.0 | 43780.0 | 43
Progressive | 4004097.0 | 304337.78 | 107
Social Democrat | 3423487.45 | 149199.0 | 93
Socialist | 678360.0 | 34751.0 | 41

Causal graphs and counterfactuals

3 Stuart_Armstrong 30 August 2016 04:12PM

Problem solved: Found what I was looking for in: An Axiomatic Characterization of Causal Counterfactuals, thanks to Evan Lloyd.

Basically, the approach is to make every endogenous variable a deterministic function of the exogenous variables and of the other endogenous variables, and to push all the stochasticity into the exogenous variables.

 

Old post:

A problem that's come up with my definitions of stratification.

Consider a very simple causal graph:

In this setting, A and B are both booleans, and A=B with 75% probability (independently of whether A=0 or A=1).

I now want to compute the counterfactual: suppose I assume that B=0 when A=0. What would happen if A=1 instead?

The problem is that P(B|A) seems insufficient to solve this. Let's imagine the process that outputs B as a probabilistic mix of functions, that takes the value of A and outputs that of B. There are four natural functions here:

  • f0(x) = 0
  • f1(x) = 1
  • f2(x) = x
  • f3(x) = 1-x

Then one way of modelling the causal graph is as a mix 0.75f2 + 0.25f3. In that case, knowing that B=0 when A=0 implies that P(f2)=1, so if A=1, we know that B=1.

But we could instead model the causal graph as 0.5f2+0.25f1+0.25f0. In that case, knowing that B=0 when A=0 implies that P(f2)=2/3 and P(f0)=1/3. So if A=1, B=1 with probability 2/3 and B=0 with probability 1/3.

And we can design the node B, physically, to be one or another of the two distributions over functions or anything in between (the general formula is (0.5+x)f2 + x f3 + (0.25-x)f1 + (0.25-x)f0 for 0 ≤ x ≤ 0.25). But it seems that the causal graph does not capture that.
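
The dependence on x can be checked numerically. The sketch below builds the mixture from the general formula, confirms that P(B=A) stays at 3/4 for every x, and then computes the counterfactual by conditioning the function weights on the observation. The helper names (mixture, p_b_given_a, counterfactual) are invented for this example.

```python
from fractions import Fraction

# The four deterministic functions from A to B.
F = {
    "f0": lambda a: 0,
    "f1": lambda a: 1,
    "f2": lambda a: a,
    "f3": lambda a: 1 - a,
}


def mixture(x):
    """Weights (0.5+x)f2 + x f3 + (0.25-x)f1 + (0.25-x)f0, for 0 <= x <= 0.25."""
    x = Fraction(x)
    quarter = Fraction(1, 4)
    return {"f0": quarter - x, "f1": quarter - x, "f2": Fraction(1, 2) + x, "f3": x}


def p_b_given_a(weights, a, b):
    """Observational P(B=b | A=a): total weight of functions mapping a to b."""
    return sum(w for name, w in weights.items() if F[name](a) == b)


def counterfactual(weights, a_obs, b_obs, a_new, b_new):
    """P(B would be b_new under A=a_new | we observed A=a_obs, B=b_obs)."""
    consistent = {n: w for n, w in weights.items() if F[n](a_obs) == b_obs}
    total = sum(consistent.values())
    hits = sum(w for n, w in consistent.items() if F[n](a_new) == b_new)
    return hits / total


for x in (Fraction(0), Fraction(1, 8), Fraction(1, 4)):
    m = mixture(x)
    print(x, p_b_given_a(m, 0, 0), p_b_given_a(m, 1, 1), counterfactual(m, 0, 0, 1, 1))
```

Every x gives the same P(B|A), yet the counterfactual answer works out to (0.5+x)/0.75, ranging from 2/3 at x=0 to 1 at x=0.25.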

Owain Evans has said that Pearl has papers covering these kinds of situations, but I haven't been able to find them. Does anyone know any publications on the subject?

Does Evidence Have To Be Certain?

0 potato 30 March 2016 10:32AM

It seems like in order to go from P(H) to P(H|E) you have to become certain that E. Am I wrong about that? 

Say you have the following joint distribution:

P(H&E) = a
P(~H&E) = b
P(H&~E) = c
P(~H&~E) = d

where a, b, c, and d are each larger than 0.

So P(H|E) = a/(a+b). It seems like what we're doing is going from assigning ~E some positive probability to assigning it a 0 probability. Is there another way to think about it? Is there something special about evidential statements that justifies changing their probabilities without having updated on something else? 
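
The move described above can be written out as a small sketch: zero the ~E cells of the joint and renormalize what is left. The values chosen for a, b, c, and d are hypothetical and only need to be positive and sum to 1.

```python
from fractions import Fraction

# Hypothetical joint distribution over (H or ~H, E or ~E).
joint = {
    ("H", "E"): Fraction(3, 10),    # a
    ("~H", "E"): Fraction(1, 10),   # b
    ("H", "~E"): Fraction(2, 10),   # c
    ("~H", "~E"): Fraction(4, 10),  # d
}


def condition_on_e(dist):
    """Give every ~E cell probability 0 and renormalize the E cells."""
    p_e = dist[("H", "E")] + dist[("~H", "E")]
    return {
        cell: (p / p_e if cell[1] == "E" else Fraction(0))
        for cell, p in dist.items()
    }


posterior = condition_on_e(joint)
prior_h = joint[("H", "E")] + joint[("H", "~E")]
posterior_h = posterior[("H", "E")] + posterior[("H", "~E")]

print(prior_h, posterior_h)
```

Here P(H) moves from 1/2 to a/(a+b) = 3/4, and the new distribution puts no mass on ~E, which is exactly the certainty the question is asking about.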

How did my baby die and what is the probability that my next one will?

22 deprimita_patro 19 January 2016 06:24AM

Summary: My son was stillborn and I don't know why. My wife and I would like to have another child, but would very much not like to try if the probability of this occurring again is above a certain threshold (we have already settled on one). All 3 doctors I have consulted were unable to give a definitive cause of death, nor were any willing to give a numerical estimate of the probability (whether for reasons of legal risk, or something else) that our next baby will be stillborn. I am likely too mind-killed to properly evaluate my situation and would very much appreciate a probability estimate, independent of mine, of what caused my son to die, and given that cause, what is the recurrence risk?

Background: V (L's and my only living biological son) had no complications during birth, nor has he shown any signs of poor health whatsoever. L has a cousin who has had two miscarriages, and I have an aunt who had several stillbirths followed by 3 live births of healthy children. We know of no other family members that have had similar misfortunes.

J (my deceased son) was the product of a 31-week gestation. L (my wife and J's mother) is 28 years old, gravida 2, para 1. L presented to the physician's office for routine prenatal care and noted that she had not felt any fetal movement for the last five to six days. No fetal heart tones were identified. It was determined that there was an intrauterine fetal demise. L was admitted on 11/05/2015 for induction and was delivered of a nonviable, normal appearing, male fetus at approximately 1:30 on 11/06/2015.

Pro-Con Reasoning: According to a leading obstetrics textbook1, causes of stillbirth are commonly classified into 8 categories: obstetrical complications, placental abnormalities, fetal malformations, infection, umbilical cord abnormalities, hypertensive disorders, medical complications, and undetermined. Below, I'll list the percentage of stillbirths in each category (which may be used as prior probabilities) along with some reasons for or against.

Obstetrical complications (29%)

  • Against: No abruption detected. No multifetal gestation. No ruptured preterm membranes at 20-24 weeks.

Placental abnormalities (24%)

  • For: Excessive fibrin deposition (as concluded in the surgical pathology report). Early acute chorioamnionitis (as concluded in the surgical pathology report, but Dr. M claimed this was caused by the baby's death, not conversely). L has gene variants associated with deep vein thrombosis (AG on rs2227589 per 23andme raw data).
  • Against: No factor V Leiden mutation (GG on rs6025 per 23andme raw data and confirmed via independent lab test). No prothrombin gene mutation (GG on l3002432 per 23andme raw data and confirmed via independent lab test). L was negative for prothrombin G20210A mutation (as determined by lab test). Anti-thrombin III activity results were within normal reference ranges (as determined by lab test). Protein C activity results were within normal reference ranges (as determined by lab test). Protein S activity results were within normal reference ranges (as determined by lab test). Protein S antigen (free and total) results were within normal reference ranges (as determined by lab test).

Infection (13%)

  • For: During the last week of August, L visited the home of a nurse who works in a hospital that we now know had frequent cases of CMV infection. CMV antibody IgH, CMV IgG, and Parvovirus B-19 Antibody IgG values were outside of normal reference ranges.
  • Against: Dr. M discounted the viral test results as the cause of death, since the levels suggested the infection had occurred years ago, and therefore could not have caused J's death. Dr. F confirmed Dr. M's assessment.

Fetal malformations (14%)

  • Against: No major structural abnormalities. No genetic abnormalities detected (CombiSNP Array for Pregnancy Loss results showed a normal male micro array profile).

Umbilical cord abnormalities (10%)

  • Against: No prolapse. No stricture. No thrombosis.

Hypertensive disorder (9%)

  • Against: No preeclampsia. No chronic hypertension.

Medical complications (8%)

  • For: L experienced 2 nights of very painful abdominal pains that could have been contractions on 10/28 and 10/29. L remembers waking up on her back a few nights between 10/20 and 11/05 (it is unclear if this belongs in this category or somewhere else).
  • Against: No antiphospholipid antibody syndrome detected (determined via Beta-2 Glycoprotein I Antibodies [IgG, IgA, IgM] test). No maternal diabetes detected (determined via glucose test on 10/20).

Undetermined (24%)

What is the most likely cause of death? How likely is that cause? Given that cause, if we choose to have another child, then how likely is it to survive its birth? Are there any other ways I could reduce uncertainty (additional tests, etc...) that I haven't listed here? Are there any other forums where these questions are more likely to get good answers? Why won't doctors give probabilities? Help with any of these questions would be greatly appreciated. Thank you.

If your advice to me is to consult another expert (in addition to the 2 obstetricians and 1 high-risk obstetrician I already have consulted), please also provide concrete tactics as to how to find such an expert and validate their expertise.

Contact Information: If you would like to contact me, but don't want to create an account here, you can do so at deprimita.patro@gmail.com.

[1] Cunningham, F. (2014). Williams obstetrics. New York: McGraw-Hill Medical.

EDIT 1: Updated to make clear that both V and J are mine and L's biological sons.

EDIT 2: Updated to add information on family history.

EDIT 3: On PipFoweraker's advice, I added contact info.

EDIT 4: I've cross-posted this on Health Stack Exchange.

EDIT 5: I've emailed the list of authors of the most recent meta-analysis concerning causes of stillbirth. Don't expect much.

[LINK] Common fallacies in probability (when numbers aren't used)

7 Stuart_Armstrong 15 January 2016 08:29AM

Too many people attempt to use logic when they should be using probabilities - in fact, when they are using probabilities, but don't mention it. Here are some of the major fallacies caused by misusing logic and probabilities this way:

  1. "It's not certain" does not mean "It's impossible" (and vice versa).
  2. "We don't know" absolutely does not imply "It's impossible".
  3. "There is evidence against it" doesn't mean much on its own.
  4. Being impossible *in a certain model*, does not mean being impossible: it changes the issue to the probability of the model.

Common fallacies in probability

A note about calibration of confidence

12 jbay 04 January 2016 06:57AM

Background

In a recent Slate Star Codex Post (http://slatestarcodex.com/2016/01/02/2015-predictions-calibration-results/), Scott Alexander made a number of predictions and presented associated confidence levels, and then at the end of the year, scored his predictions in order to determine how well-calibrated he is. In the comments, however, there arose a controversy over how to deal with 50% confidence predictions. As an example, Scott has these predictions at 50% confidence, among his others:

Proposition | Scott's Prior | Result
A: Jeb Bush will be the top-polling Republican candidate | P(A) = 50% | A is False
B: Oil will end the year greater than $60 a barrel | P(B) = 50% | B is False
C: Scott will not get any new girlfriends | P(C) = 50% | C is False
D: At least one SSC post in the second half of 2015 will get > 100,000 hits | P(D) = 70% | D is False
E: Ebola will kill fewer people in the second half of 2015 than in the first half | P(E) = 95% | E is True

 

Scott goes on to score himself as having made 0/3 correct predictions at the 50% confidence level, which looks like significant overconfidence. He addresses this by noting that 3 data points are not much to go by, and that the score could easily have looked fine if any of those results had turned out differently. His resulting calibration curve is this:

Scott Alexander's 2015 calibration curve

 

However, the commenters had other objections about the anomaly at 50%. After all, P(A) = 50% implies P(~A) = 50%, so the choice of “I will not get any new girlfriends: 50% confidence” is logically equivalent to “I will get at least 1 new girlfriend: 50% confidence”, except that one resolves as true and the other as false. Therefore, the score at 50% seems sensitive only to the particular phrasing chosen, independent of the outcome.

One commenter suggests that close to perfect calibration at 50% confidence can be achieved by choosing whether to represent propositions as positive or negative statements by flipping a fair coin. Another suggests replacing 50% confidence with 50.1% or some other number arbitrarily close to 50%, but not equal to it. Others suggest getting rid of the 50% confidence bin altogether.

Scott recognizes that predicting A and predicting ~A are logically equivalent, and choosing to use one or the other is arbitrary. But by choosing to only include A in his data set rather than ~A, he creates a problem that occurs when P(A) = 50%, where the arbitrary choice of making a prediction phrased as ~A would have changed the calibration results despite being the same prediction.

Symmetry

This conundrum illustrates an important point about these calibration exercises. By convention, Scott phrases all of his propositions as statements to which he assigns a probability greater than or equal to 50%, recognizing that he doesn’t need to also calibrate probabilities below 50%, since the upper half of the calibration curve captures all the relevant information about his calibration.

This is because the calibration curve has a property of symmetry about the 50% mark, as implied by the mathematical relation P(X) = 1 - P(~X) and, of course, P(~X) = 1 - P(X).

We can enforce that symmetry by recognizing that when we make the claim that proposition X has probability P(X), we are also simultaneously making the claim that proposition ~X has probability 1-P(X). So we add those to the list of predictions and do the bookkeeping on them too. Since we are making both claims, why not be clear about it in our bookkeeping?

When we do this, we get the full calibration curve, and the confusion about what to do about 50% probability disappears. Scott’s list of predictions looks like this:

Proposition | Scott's Prior | Result
A: Jeb Bush will be the top-polling Republican candidate | P(A) = 50% | A is False
~A: Jeb Bush will not be the top-polling Republican candidate | P(~A) = 50% | ~A is True
B: Oil will end the year greater than $60 a barrel | P(B) = 50% | B is False
~B: Oil will not end the year greater than $60 a barrel | P(~B) = 50% | ~B is True
C: Scott will not get any new girlfriends | P(C) = 50% | C is False
~C: Scott will get new girlfriend(s) | P(~C) = 50% | ~C is True
D: At least one SSC post in the second half of 2015 will get > 100,000 hits | P(D) = 70% | D is False
~D: No SSC post in the second half of 2015 will get > 100,000 hits | P(~D) = 30% | ~D is True
E: Ebola will kill fewer people in the second half of 2015 than in the first half | P(E) = 95% | E is True
~E: Ebola will kill as many or more people in the second half of 2015 than in the first half | P(~E) = 5% | ~E is False

 

You will by now have noticed that there will always be an even number of predictions, and that half of the predictions always are true and half are always false. In most cases, like with E and ~E, that means you get a 95% likely prediction that is true and a 5%-likely prediction that is false, which is what you would expect. However, with 50%-likely predictions, they are always accompanied by another 50% prediction, one of which is true and one of which is false. As a result, it is actually not possible to make a binary prediction at 50% confidence that is out of calibration.
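The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration using the five propositions listed in the table (not Scott's full prediction list); the function names `with_complements` and `calibration_bins` are my own and not from the original post.

```python
from collections import defaultdict

# The five listed predictions: (label, stated probability in percent, outcome).
predictions = [
    ("A", 50, False),
    ("B", 50, False),
    ("C", 50, False),
    ("D", 70, False),
    ("E", 95, True),
]

def with_complements(preds):
    """Return the predictions plus the implied complement of each one."""
    doubled = []
    for label, percent, outcome in preds:
        doubled.append((label, percent, outcome))
        doubled.append(("~" + label, 100 - percent, not outcome))
    return doubled

def calibration_bins(preds):
    """Map each stated probability (percent) to the observed fraction that came true."""
    counts = defaultdict(lambda: [0, 0])  # percent -> [number true, number total]
    for _, percent, outcome in preds:
        counts[percent][0] += int(outcome)
        counts[percent][1] += 1
    return {percent: true / total for percent, (true, total) in sorted(counts.items())}

bins = calibration_bins(with_complements(predictions))
for percent, observed in bins.items():
    print(f"stated {percent:3d}%  observed {observed:.2f}")
```

Whatever the outcomes, the 50% bin receives each proposition together with its complement, so its observed frequency cannot move away from one half.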

The resulting calibration curve, applied to Scott’s predictions, looks like this:

no error bars


Sensitivity

By the way, this graph doesn’t tell the whole calibration story; as Scott noted it’s still sensitive to how many predictions were made in each bucket. We can add “error bars” that show what would have resulted if Scott had made one more prediction in each bucket, and whether the result of that prediction had been true or false. The result is the following graph:

with error bars

Note that the error bars are zero about the point of 0.5. That’s because even if one additional prediction had been added to that bucket, it would have had no effect. That point is fixed by the inherent symmetry.
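Here is one possible reading of that error-bar construction, as a rough sketch. I am assuming that the bar for a bucket spans the observed frequency if one extra prediction in that bucket came out false versus true, and that at 50% the extra prediction and its complement land in the same bucket. The function name `error_bar` is hypothetical.

```python
def error_bar(true_count, total, percent):
    """Range of observed frequency if one more prediction were added to this bucket."""
    if percent == 50:
        # The new prediction and its complement both land here: one true, one false.
        value = (true_count + 1) / (total + 2)
        return value, value
    low = true_count / (total + 1)         # the extra prediction turns out false
    high = (true_count + 1) / (total + 1)  # the extra prediction turns out true
    return low, high

print(error_bar(3, 6, 50))   # 50% bucket from the table: 3 true out of 6
print(error_bar(0, 1, 70))   # 70% bucket from the table: D was false
print(error_bar(1, 1, 95))   # 95% bucket from the table: E was true
```

Under this reading the 50% bucket has a bar of zero width, while sparsely populated buckets have wide bars.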

I believe that this kind of graph does a better job of showing someone’s true calibration. But it's not the whole story.

Ramifications for scoring calibration (updated)

Clearly, it is not possible to make a binary prediction with 50% confidence that is poorly calibrated. This shouldn’t come as a surprise; a prediction at 50% between two choices represents the correct prior for the case where you have no information that discriminates between X and ~X. However, that doesn’t mean that you can improve your ability to make correct predictions just by giving them all 50% confidence and claiming impeccable calibration! An easy way to "cheat" your way into apparently good calibration is to take a large number of predictions that you are highly (>99%) confident about, negate a fraction of them, and falsely record a lower confidence for those. If we're going to measure calibration, we need a scoring method that will encourage people to write down the true probabilities they believe, rather than faking low confidence and ignoring their data. We want people to only claim 50% confidence when they genuinely have 50% confidence, and we need to make sure our scoring method encourages that.

 

A first guess would be to look at that graph and do the classic assessment of fit: sum of squared errors. We can sum the squared error of our predictions against the ideal linear calibration curve. If we did this, we would want to make sure we summed all the individual predictions, rather than the averages of the bins, so that the binning process itself doesn’t bias our score.

If we do this, then our overall prediction score can be summarized by one number:

S = \frac{1}{N}\left(\sum_{i=1}^{N}(P(X_i)-X_i)^2 \right )

Here P(Xi) is the assigned confidence of the truth of Xi, and Xi is the ith proposition and has a value of 1 if it is True and 0 if it is False. S is the prediction score, and lower is better. Note that because these are binary predictions, the sum of squared errors gives an optimal score if you assign the probabilities you actually believe (ie, there is no way to "cheat" your way to a better score by giving false confidence).

In this case, Scott's score is S=0.139; much of this comes from the 0.4/0.6 bracket. The worst score possible would be S=1, and the best score possible is S=0. Attempting to fake perfect calibration by claiming 50% confidence for every prediction, regardless of the information you actually have available, yields S=0.25 and therefore isn't a particularly good strategy (at least, it won't make you look better-calibrated than Scott).
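A minimal sketch of the squared-error score, to make the reference values concrete. The function name `squared_error_score` and the toy prediction lists are my own; they are not Scott's data.

```python
def squared_error_score(predictions):
    """Mean squared error between stated probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - float(outcome)) ** 2 for p, outcome in predictions) / len(predictions)

# Claiming 50% for everything scores 0.25 regardless of what actually happens.
all_fifty = [(0.5, True), (0.5, False), (0.5, False), (0.5, True)]
print(squared_error_score(all_fifty))  # 0.25

# Total confidence in the wrong answer every time gives the worst possible score.
always_wrong = [(1.0, False), (0.0, True)]
print(squared_error_score(always_wrong))  # 1.0

# Total confidence in the right answer every time gives the best possible score.
always_right = [(1.0, True), (0.0, False)]
print(squared_error_score(always_right))  # 0.0
```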

Several of the commenters pointed out that log scoring is another scoring rule that works better in the general case. Before posting this I ran the calculus to confirm that the least-squares error did encourage an optimal strategy of honest reporting of confidence, but I did have a feeling that it was an ad-hoc scoring rule and that there must be better ones out there.

The logarithmic scoring rule looks like this:

S = \frac{1}{N}\sum_{i=1}^{N}X_i\ln(P(X_i))

Here again Xi is the ith proposition and has a value of 1 if it is True and 0 if it is False. The base of the logarithm is arbitrary so I've chosen base "e" as it makes it easier to take derivatives. This scoring method gives a negative number and the closer to zero the better. The log scoring rule has the same honesty-encouraging properties as the sum-of-squared-errors, plus the additional nice property that it penalizes wrong predictions of 100% or 0% confidence with an appropriate score of minus-infinity. When you claim 100% confidence and are wrong, you are infinitely wrong. Don't claim 100% confidence!
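For anyone who wants to see the honesty-encouraging property spelled out, here is a short sketch of the calculus. The symbols p (the probability you actually believe) and q (the probability you write down) are my own notation, not from the formulas above.

```latex
% Expected squared-error penalty when the believed probability is p and the stated probability is q
E_{\mathrm{sq}}(q) = p\,(1-q)^2 + (1-p)\,q^2, \qquad
\frac{dE_{\mathrm{sq}}}{dq} = -2p(1-q) + 2(1-p)q = 2(q-p) = 0 \;\Rightarrow\; q = p

% Expected log score under the same assumptions
E_{\log}(q) = p\ln q + (1-p)\ln(1-q), \qquad
\frac{dE_{\log}}{dq} = \frac{p}{q} - \frac{1-p}{1-q} = 0 \;\Rightarrow\; q = p
```

In both cases the expected score is optimised (minimised for squared error, maximised for the log score) exactly when the stated probability equals the believed one.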

In this case, Scott's score is calculated to be S=-0.42. For reference, the worst possible score would be minus-infinity, and claiming nothing but 50% confidence for every prediction results in a score of S=-0.69. This just goes to show that you can't win by cheating.

Example: Pretend underconfidence to fake good calibration

In an attempt to appear like I have better calibration than Scott Alexander, I am going to make the following predictions. For clarity I have included the inverse propositions in the list (as those are also predictions that I am making), but at the end of the list so you can see the point I am getting at a bit better.

Proposition | Quoted Prior | Result
A: I will not win the lottery on Monday | P(A) = 50% | A is True
B: I will not win the lottery on Tuesday | P(B) = 66% | B is True
C: I will not win the lottery on Wednesday | P(C) = 66% | C is True
D: I will win the lottery on Thursday | P(D) = 66% | D is False
E: I will not win the lottery on Friday | P(E) = 75% | E is True
F: I will not win the lottery on Saturday | P(F) = 75% | F is True
G: I will not win the lottery on Sunday | P(G) = 75% | G is True
H: I will win the lottery next Monday | P(H) = 75% | H is False

~A: I will win the lottery on Monday | P(~A) = 50% | ~A is False
~B: I will win the lottery on Tuesday | P(~B) = 34% | ~B is False
~C: I will win the lottery on Wednesday | P(~C) = 34% | ~C is False

 

 

 

Look carefully at this table. I've thrown in a particular mix of predictions that I will or will not win the lottery on certain days, in order to use my extreme certainty about the result to generate a particular mix of correct and incorrect predictions.

To make things even easier for me, I’m not even planning to buy any lottery tickets. Knowing this information, an honest estimate of the odds of me winning the lottery is astronomically small. The odds of winning the lottery are about 1 in 14 million (for the Canadian 6/49 lottery). I’d have to win by accident (one of my relatives buying me a lottery ticket?). Not only that, but since the lottery is only held on Wednesday and Saturday, most of these scenarios are even more implausible, since the lottery corporation would have to hold the draw by mistake.

I am confident I could make at least 1 billion similar statements of this exact nature and get them all right, so my true confidence must be upwards of (100% - 0.0000001%).

If I assemble 50 of these types of strategically-underconfident predictions (and their 50 opposites) and plot them on a graph, here’s what I get:

 Looks like good calibration...? Not so fast.

You can see that the problem with cheating doesn’t occur only at 50%. It can occur anywhere!

But here’s the trick: The log scoring algorithm rates me -0.37. If I had made the same 100 predictions all at my true confidence (99.9999999%), then my score would have been -0.000000001. A much better score! My attempt to cheat in order to make a pretty graph has only sabotaged my score.

By the way, what if I had gotten one of those wrong, and actually won the lottery one of those times without even buying a ticket? In that case my score is -0.41 (the wrong prediction had a probability of 1 in 10^9 which is about 1 in e^21, so it’s worth -21 points, but then that averages down to -0.41 due to the 49 correct predictions that are collectively worth a negligible fraction of a point).* Not terrible! The log scoring rule is pretty gentle about being very badly wrong sometimes, just as long as you aren’t infinitely wrong. However, if I had been a little less confident and said the chance of winning each time was only 1 in a million, rather than 1 in a billion, my score would have improved to -0.28, and if I had expressed only 98% confidence I would have scored -0.098, the best possible score for someone who is wrong one in every fifty times.
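The one-miss-in-fifty figures can be recomputed with a short sketch. I am assuming the log score is taken per proposition, i.e. the average log of the probability assigned to whatever actually happened; the function names `log_score` and `one_miss_in_fifty` are hypothetical.

```python
import math

def log_score(predictions):
    """Mean log of the probability assigned to what actually happened (closer to 0 is better)."""
    total = 0.0
    for p, outcome in predictions:
        total += math.log(p if outcome else 1.0 - p)
    return total / len(predictions)

def one_miss_in_fifty(confidence):
    """Score for 49 correct predictions and 1 wrong one, all stated at the same confidence."""
    return log_score([(confidence, True)] * 49 + [(confidence, False)])

for confidence in (1 - 1e-9, 1 - 1e-6, 0.98):
    print(confidence, round(one_miss_in_fifty(confidence), 3))

# Claiming 50% for everything scores ln(0.5), about -0.69.
print(round(log_score([(0.5, True), (0.5, False)]), 2))
```

Under that assumption the three confidence levels come out at roughly -0.41, -0.28 and -0.098, matching the figures quoted above.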

This has another important ramification: If you're going to honestly test your calibration, you shouldn't pick the predictions you'll make. It is easy to improve your score by throwing in a couple predictions that you are very certain about, like that you won't win the lottery, and by making few predictions that you are genuinely uncertain about. It is fairer to use a list of propositions that is generated by somebody else, and then pick your probabilities. Scott demonstrates his honesty by making public predictions about a mix of things he was genuinely uncertain about, but if he wanted to cook his way to a better score in the future, he would avoid making any predictions at the 50% category that he wasn't forced to.

 

Input and comments are welcome! Let me know what you think!

* This result surprises me enough that I would appreciate if someone in the comments can double-check it on their own. What is the proper score for being right 49 times with 1-1 in a billion certainty, but wrong once?

Computable Universal Prior

0 potato 11 December 2015 09:54AM

Suppose that instead of using 2^-K(H) we just use 2^-length(H). Does this do something obviously stupid?

Here's what I'm proposing:

Take a programming language with two characters. Assign each program a prior of 2^-length(program). If the program outputs some string, then P(string | program) = 1, else it equals 0. I figure there must be some reason people don't do this already, or else there's a bunch of people doing it. I'd be real happy to find out about either.

Clearly, it isn't a probability distribution, but we can still use it, no? 
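To make the "isn't a probability distribution" point concrete, here is a small sketch. It assumes that every binary string counts as a valid program (so the language is not prefix-free); the function name `total_weight_up_to` is hypothetical.

```python
from fractions import Fraction

def total_weight_up_to(max_length):
    """Sum of 2^-length(program) over every binary string of length 1..max_length."""
    total = Fraction(0)
    for length in range(1, max_length + 1):
        number_of_programs = 2 ** length          # all binary strings of this length
        total += number_of_programs * Fraction(1, 2 ** length)
    return total

for max_length in (1, 5, 10):
    print(max_length, total_weight_up_to(max_length))
```

Each length contributes a total weight of exactly 1, so under this assumption the sum grows without bound and the weights can only be used as relative (unnormalised) weights.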

 

 

Utility, probability and false beliefs

1 Stuart_Armstrong 09 November 2015 09:43PM

A putative new idea for AI control; index here.

This is part of the process of rigourising and formalising past ideas.

Paul Christiano recently asked why I used utility changes, rather than probability changes, to have an AI believe (or act as if it believed) false things. While investigating that, I developed several different methods for achieving the belief changes that we desired. This post analyses these methods.

 

Different models of forced beliefs

Let x and ¬x refer to the future outcome of a binary random variable X (write P(x) as a shorthand for P(X=x), and so on). Assume that we want P(x):P(¬x) to be in the 1:λ ratio for some λ (since the ratio is all that matters, λ=∞ is valid, meaning P(x)=0). Assume that we have an agent, who has utility u, has seen past evidence e, and wishes to assess the expected utility of their action a.

Typically, for expected utility, we sum over the possible worlds. In practice, we almost always sum over sets of possible worlds, the sets determined by some key features of interest. In assessing the quality of health interventions, for instance, we do not carefully and separately treat each possible position of atoms in the sun. Thus let V be the set of variables or values we care about, and v a possible value vector V can take. As usual, we'll be writing P(v) as a shorthand for P(V=v). The utility function u assigns utilities to possible v's.

One of the advantages of this approach is that it can avoid many issues of conditionals like P(A|B) when P(B)=0.

The first obvious idea is to condition on x and ¬x:

  • (1) Σv u(v)(P(v|x,e,a)+λP(v|¬x,e,a))

The second one is to use intersections rather than conditionals (as in this post):

  • (2) Σv u(v)(P(v,x|e,a)+λP(v,¬x|e,a))

Finally, imagine that we have a set of variables H, that "screen off" the effects of e and a, up until X. Let h be a set of values H can take. Thus P(x|h,e,a)=P(x|h). One could see H as the full set of possible pre-X histories, but it could be much smaller - maybe just the local environment around X. This gives a third definition:

  • (3) Σv Σh u(v)(P(v|h,x,e,a)+λP(v|h,¬x,e,a))P(h|e,a)

 

Changing and unchangeable P(x)

An important thing to note is that all three definitions are equivalent for fixed P(x), up to changes of λ. The equivalence of (2) and (1) derives from the fact that Σv u(v)(P(v,x|e,a)+λP(v,¬x|e,a)) = Σv u(v)(P(x)P(v|x,e,a)+λP(¬x)P(v|¬x,e,a)) (we write P(x) rather than P(x|e,a) since the probability of x is fixed). Thus a type (2) agent with λ is equivalent to a type (1) agent with λ'=λP(¬x)/P(x) (dividing through by the constant P(x) does not change which action is preferred).

Similarly, P(v|h,x,e,a)=P(v,h,x|e,a)/(P(x|h,e,a)*P(h|e,a)). Since P(x|h,e,a)=P(x), equation (3) reduces to Σv Σh u(v)(P(v,h,x|e,a)/P(x)+λP(v,h,¬x|e,a)/P(¬x)). Summing over h, this becomes Σv u(v)(P(v,x|e,a)/P(x)+λP(v,¬x|e,a)/P(¬x))=Σv u(v)(P(v|x,e,a)+λP(v|¬x,e,a)), ie the same as (1).
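The fixed-P(x) equivalences can be checked numerically on a toy model. This is only a sketch: the sizes of H and V, the random distributions, the value of λ, and the function names `agent1`, `agent2`, `agent3` are all made up for illustration, and x is taken to be independent of h so that P(x|h)=P(x). In this sketch the rescaling that matches agent (2) to agent (1) works out to λ'=λP(¬x)/P(x).

```python
import random

random.seed(0)

def random_distribution(size):
    """Return `size` positive numbers that sum to 1."""
    weights = [random.random() + 0.1 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

N_H, N_V = 3, 4                      # hypothetical sizes for H and V
p_x = 0.3                            # fixed P(x); P(not x) = 1 - p_x
p_h = random_distribution(N_H)       # P(h | e, a)
# p_v[h][x] is the distribution P(v | h, x, e, a); index 1 means x, index 0 means not-x
p_v = [[random_distribution(N_V) for _ in range(2)] for _ in range(N_H)]
u = [random.uniform(-1, 1) for _ in range(N_V)]

def p_v_given_x(v, x):
    """P(v | x, e, a), using the assumption that x is independent of h."""
    return sum(p_h[h] * p_v[h][x][v] for h in range(N_H))

def agent1(lam):
    return sum(u[v] * (p_v_given_x(v, 1) + lam * p_v_given_x(v, 0)) for v in range(N_V))

def agent2(lam):
    return sum(u[v] * (p_x * p_v_given_x(v, 1) + lam * (1 - p_x) * p_v_given_x(v, 0))
               for v in range(N_V))

def agent3(lam):
    return sum(u[v] * (p_v[h][1][v] + lam * p_v[h][0][v]) * p_h[h]
               for v in range(N_V) for h in range(N_H))

lam = 2.5
print(abs(agent1(lam) - agent3(lam)) < 1e-12)
print(abs(agent2(lam) - p_x * agent1(lam * (1 - p_x) / p_x)) < 1e-12)
```

Agent (2)'s value equals agent (1)'s value at the rescaled λ' multiplied by the constant P(x), which leaves the ranking of actions unchanged.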

What about non-constant P(x)? Let c(x) and c(¬x) be two contracts that pay out under x and ¬x, respectively. If the utility u is defined as 1 if a payout is received (and 0 otherwise), it's clear that both agent (1) and agent (3) assess c(x) as having an expected utility of 1 while c(¬x) has an expected utility of λ. This assessment is unchanging, whatever the probability of x. Therefore agents (1) and (3), in effect, see the odds of x as being a constant ratio 1:λ.

Agent (2), in contrast, gets a one-off artificial 1:λ update to the odds of x and then proceeds to update normally. Suppose that X is a coin toss that the agent believes is fair, having extensively observed the coin. Then it will believe that the odds are 1:λ. Suppose instead that it observes the coin has a λ:1 odds ratio; then it will believe the true odds are 1:1. It will be accurate, with a 1:λ ratio added on.

The effects of this percolate backwards in time from X. Suppose that X was to be determined by the toss of one of two unfair coins, one with odds ε:1 and one with odds 1:ε. The agent would assess the odds of the first coin being used rather than the second as around 1:λ. This update would extend to the process of choosing the coins, and anything that that depended on. Agent (1) is similar, though its update rule always assumes the odds of x:¬x being fixed; thus any information about the processes of coin selection is interpreted as a change in the probability of the processes, not a change in the probability of the outcome.

Agent (3), in contrast, is completely different. It assesses the probability of H=h objectively, but then assumes that the odds of x and ¬x, given any h, are 1:λ. Thus if given updates about the probability of which coin is used, it will assess those updates objectively, but then assume that both coins are "really" giving 1:λ odds. It cuts off the update process at h, thus ensuring that it is "incorrect" only about x and its consequences, not its pre-h causes.

 

Utility and probability: assessing goal stability

Agents with unstable goals are likely to evolve towards being (equivalent to) expected utility maximisers. The converse is more complicated, but we'll assume here that an agent's goal is stable if it is an expected utility maximiser for some probability distribution.

Which one? I've tended to shy away from changing the probability, preferring to change the utility instead. If we divide the probability in equation (2) by 1+λ, it becomes a u-maximiser with a biased probability distribution. Alternatively, if we defined u'(v,x)=u(v) and u'(v,¬x)=λu(v), then it is a u'-maximiser with an unmodified probability distribution. Since all agents are equivalent for fixed P(x), we can see that in that case, all agents can be seen as expected utility maximisers with the standard probability distribution. 

Paul questioned whether the difference was relevant. I preferred the unmodified probability distribution - maybe the agent uses the distribution for induction, maybe having false probability beliefs will interfere with AI self-improvement, or maybe agents with standard probability distributions are easier to corrige - but for agent (2) the difference seems to be arguably a matter of taste.

Note that though agent (2) is stable, its definition is not translation invariant in u. If we add c to u, we add c(P(x|e,a)+λP(¬x|e,a)) to u'. Thus, if the agent can affect the value of P(x) through its actions, different constants c likely give different behaviours.

Agent (1) is different. Except for the cases λ=0 and λ=∞, the agent cannot be an expected utility maximiser. To see this, just notice that an update about the process that could change the probability of x, gets reinterpreted as an update on the probability of that process. If we have the ε:1 and 1:ε coins, then any update about their respective probabilities of being used gets essentially ignored (as long as the evidence that the coins are biased is much stronger than the evidence as to which coin is used).

In the cases λ=0 and λ=∞, though, agent (1) is a u-maximiser that uses the probability distribution that assumes x or ¬x is certain, respectively. This is the main point of agent (1) - providing a simple maximiser for those cases.

What about agent (3)? Define u' by: u'(v,h,x)=u(v)/P(x|h), and u'(v,h,¬x)=λu(v)/P(¬x|h). Then consider the u'-maximiser:

  • (4) Σv Σh (u'(v,h,x)P(v,h,x|e,a)+u'(v,h,¬x)P(v,h,¬x|e,a))

Now P(v,h,x|e,a)=P(v|h,x,e,a)P(x|h,e,a)P(h|e,a). Because of the screening off assumptions, the middle term is the constant P(x|h). Multiplying this by u'(v,h,x)=u(v)/P(x|h) gives u(v)P(v|h,x,e,a)P(h|e,a). Similarly, the second term becomes λu(v)P(v|h,¬x,e,a)P(h|e,a). Thus a u'-maximiser, with the standard probability distribution, is the same as agent (3), thus proving the stability of that agent type.

 

Beyond the future: going crazy or staying sane

What happens after the event X has come to pass? In that case, agent (4), the u'-maximiser will continue as normal. Its behaviour will not be unusual as long as neither λ nor 1/λ is close to 0. The same goes for agent (2).

In contrast, agent (3) will no longer be stable after X, as H no longer screens off evidence after that point. And agent (1) was never stable in the first place, and now it denies all the evidence it sees to determine that impossible events actually happened. But what of those two agents, or the stable ones if λ or 1/λ were close to 0? In particular, what if λ falls below the probability that the agent is deluded in its observation of X?

In those cases, it's easy to argue that the agents would effectively go insane, believing wild and random things to justify their delusions.

But maybe not, in the end. Suppose that you, as a human, believe an untrue fact - maybe that Kennedy was killed on the 23rd of November rather than the 22nd. Maybe you construct elaborate conspiracy theories to account for the discrepancy. Maybe you posit an early mistake by some reporter that was then picked up and repeated. After a while, you discover that all the evidence you can find points to the 22nd. Thus, even though you believe with utter conviction that the assassination was on the 23rd, you learn to expect that the next piece of evidence will point to the 22nd. You look for the date-changing conspiracy, and never discover anything about it; and thus learn to expect they have covered their tracks so well they can't be detected.

In the end, the expectations of this "insane" agent could come to resemble those of normal agents, as long as there's some possibility of a general explanation of all the normal observations (eg a well-hidden conspiracy) given the incorrect assumption.

Of course, the safer option is just to corrige the agent to some sensible goal soon after X.

Does Probability Theory Require Deductive or Merely Boolean Omniscience?

4 potato 03 August 2015 06:54AM

It is often said that a Bayesian agent has to assign probability 1 to all tautologies, and probability 0 to all contradictions. My question is... exactly what sort of tautologies are we talking about here? Does that include all mathematical theorems? Does that include assigning 1 to "Every bachelor is an unmarried male"?[1] Perhaps the only tautologies that need to be assigned probability 1 are those that are Boolean theorems implied by atomic sentences that appear in the prior distribution, such as: "S or ~ S".

It seems that I do not need to assign probability 1 to Fermat's last theorem in order to use probability theory when I play poker, or try to predict the color of the next ball to come from an urn. I must assign a probability of 1 to "The next ball will be white or it will not be white", but Fermat's last theorem seems to be quite irrelevant. Perhaps that's because these specialized puzzles do not require sufficiently general probability distributions; perhaps, when I try to build a general Bayesian reasoner, it will turn out that it must assign 1 to Fermat's last theorem.

Imagine a (completely impractical, ideal, and esoteric) first order language, whose particular subjects are discrete point-like regions of space-time. There can be an arbitrarily large number of points, but it must be a finite number. This language also contains a long list of predicates like: is blue, is within the volume of a carbon atom, is within the volume of an elephant, etc., and generally any predicate type you'd like (including n place predicates).[2] The atomic propositions in this language might look something like: "(5, 0.487, -7098.6, 6000s) is Blue" or "(1, 1, 1, 1s), (-1, -1, -1, 1s) contains an elephant." The first of these propositions says that a certain point in space-time is blue; the second says that there is an elephant between two points at one second after the universe starts. Presumably, at least the denotational content of most English propositions could be expressed in such a language (I think, mathematical claims aside).

Now imagine that we collect all of the atomic propositions in this language, and assign a joint distribution over them. Maybe we choose max entropy, doesn't matter. Would doing so really require us to assign 1 to every mathematical theorem? I can see why it would require us to assign 1 to every tautological Boolean combination of atomic propositions [for instance: "(1, 1, 1, 1s), (-1, -1, -1, 1s) contains an elephant OR ~((1, 1, 1, 1s), (-1, -1, -1, 1s) contains an elephant)"], but that would follow naturally as a consequence of filling out the joint distribution. Similarly, all the Boolean contradictions would be assigned zero, just as a consequence of filling out the joint distribution table with a set of reals that sum to 1.
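A toy version of "filling out the joint distribution table" can be sketched as follows. The three atomic propositions, the random weights, and the function name `probability` are invented for illustration; any weights that sum to 1 would do.

```python
import itertools
import random

random.seed(1)

atoms = ["elephant_here", "point_is_blue", "carbon_here"]  # hypothetical atomic propositions

# Every row of the joint table: one truth assignment to all atoms, with a weight.
worlds = [dict(zip(atoms, values))
          for values in itertools.product([True, False], repeat=len(atoms))]
raw = [random.random() for _ in worlds]
weights = [r / sum(raw) for r in raw]          # arbitrary weights that sum to 1

def probability(formula):
    """P(formula) = total weight of the rows in which the formula is true."""
    return sum(w for world, w in zip(worlds, weights) if formula(world))

tautology = lambda w: w["elephant_here"] or not w["elephant_here"]
contradiction = lambda w: w["elephant_here"] and not w["elephant_here"]
contingent = lambda w: w["elephant_here"] and w["point_is_blue"]

print(probability(tautology))      # sums every row
print(probability(contradiction))  # sums no rows
print(probability(contingent))
```

The tautology picks up every row and the contradiction picks up none, so their probabilities are forced to 1 and 0 (up to floating-point rounding) without anything being said about mathematical theorems.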

A similar argument could be made using intuitions from algorithmic probability theory. Imagine that we know that some data was produced by a distribution which is output by a program of length n in a binary programming language. We want to figure out which distribution it is. So, we assign each binary string a prior probability of 2^-n. If the language allows for comments, then simpler distributions will be output by more programs, and we will add the probability of all programs that print that distribution.[3] Sure, we might need an oracle to figure out if a given program outputs anything at all, but we would not need to assign a probability of 1 to Fermat's last theorem (or at least I can't figure out why we would). The data might be all of your sensory inputs, and n might be Graham's number; still, there's no reason such a distribution would need to assign 1 to every mathematical theorem.

Conclusion

A Bayesian agent does not require mathematical omniscience, or logical (if that means anything more than Boolean) omniscience, but merely Boolean omniscience. All that Boolean omniscience means is that for whatever atomic propositions appear in the language of the agent (i.e., the language that forms the set of propositions that constitute the domain of the probability function), any tautological Boolean combination of those propositions must be assigned a probability of 1, and any contradictory Boolean combination of those propositions must be assigned 0. As far as I can tell, the whole notion that Bayesian agents must assign 1 to tautologies and 0 to contradictions comes from the fact that when you fill out a table of joint distributions (or follow the Kolmogorov axioms in some other way) all of the Boolean theorems get a probability of 1. This does not imply that you need to assign 1 to Fermat's last theorem, even if you are reasoning probabilistically in a language that is very expressive.[4]

Some Ways To Prove This Wrong:

Show that a really expressive semantic language, like the one I gave above, implies PA if you allow Boolean operations on its atomic propositions. Alternatively, you could show that Solomonoff induction can express PA theorems as propositions with probabilities, and that it assigns them 1. This is what I tried to do, but I failed on both occasions, which is why I wrote this. 


[1] There are also interesting questions about the role of tautologies that rely on synonymy in probability theory, and whether they must be assigned a probability of 1, but I decided to keep it to mathematics for the sake of this post. 

[2] I think this language is ridiculous, and openly admit it has next to no real world application. I stole the idea for the language from Carnap.

[3] This is a sloppily presented approximation to Solomonoff induction as n goes to infinity. 

[4] The argument above is not a mathematical proof, and I am not sure that it is airtight. I am posting this to the discussion board instead of a full-blown post because I want feedback and criticism. !!!HOWEVER!!! if I am right, it does seem that folks on here, at MIRI, and in the Bayesian world at large, should start being more careful when they think or write about logical omniscience. 

 

 

The fairness of the Sleeping Beauty

0 MrMind 07 July 2015 08:25AM

This post will attempt (yet another) analysis of the problem of the Sleeping Beauty, in terms of Jaynes' framework "probability as extended logic" (aka objective Bayesianism).

TL;DR: The problem of the Sleeping Beauty reduces to interpreting the sentence “a fair coin is tossed”: it can mean either that no result of the toss is favoured, or that the coin toss is not influenced by anthropic information, but not both at the same time. Fairness is a property in the mind of the observer that must be further clarified: the two meanings cannot be confused.

What I hope to show is that the two standard solutions, 1/3 and 1/2 (the 'thirder' and the 'halfer' solutions), are both consistent and correct, and the confusion lies only in the incorrect specification of the sentence "a fair coin is tossed".

The setup is given both in the LessWrong wiki and in Wikipedia, so I will not repeat it here.

I'm going to symbolize the events in the following way: 

- It's Monday = Mon
- It's Tuesday = Tue
- The coin landed heads = H
- The coin landed tails = T
- statement "A and B" = A & B
- statement "not A" = ~A

The problem setup leads to some uncontroversial attributions of logical structure:

1)    H = ~T (the coin can land only on heads or tails)

2)    Mon = ~Tue (if it's Tuesday, it cannot be Monday, and vice versa)

And of probability:

3)    P(Mon|H) = 1 (upon learning that the coin landed heads, the Sleeping Beauty knows that it’s Monday)

4)    P(T|Tue) = 1 (upon learning that it’s Tuesday, the Sleeping Beauty knows that the coin landed tails)

Using the indifference principle, we can also derive another equation.

Let's say that the Sleeping Beauty is awakened and told that the coin landed tails, but nothing else. Since she has no information useful to distinguish between Monday and Tuesday, she should assign both events equal probability. That is:

5)    P(Mon|T) = P(Tue|T)

Which gives

6)    P(Mon & T) = P(Mon|T)P(T) = P(Tue|T)P(T) = P(Tue & T)

It's here that the analysis between "thirder" and "halfer" starts to diverge.

The Wikipedia article says "Guided by the objective chance of heads landing being equal to the chance of tails landing, it should therefore hold that". We know however that there's no such thing as 'the objective chance'.

Thus, "a fair coin will be tossed", in this context, will mean different things for different people.

The thirders interpret the sentence to mean that beauty learns no new facts about the coin upon learning that it is Monday.

They thus make the assumption:

(TA) P(T|Mon) = P(H|Mon)

So:

7)    P(Mon & H) = P(H|Mon)P(Mon) = P(T|Mon)P(Mon) = P(Mon & T)

From 6) and 7) we have:

8)    P(Mon & H) = P(Mon & T) = P(Tue & T)

And since those events are mutually exclusive and exhaustive, P(Mon & H) = 1/3.

And indeed from 8) and 3):

9)    1/3 =  P(Mon & H) = P(Mon|H)P(H) = P(H)

So that, under TA, P(H) = 1/3 and P(T) = 2/3.

Notice also that, since on Monday the coin landed either heads or tails, TA gives P(H|Mon) = 1/2.

The thirder analysis of the Sleeping Beauty problem is thus one in which "a fair coin is tossed" means "Sleeping Beauty receives no information about the coin from anthropic information".

There is however another way to interpret the sentence, that is the halfer analysis:

(HA) P(T) = P(H)

Here, a fair coin is tossed means simply that we assign no preference to either side of the coin.

Obviously, from 1):

10)  P(T) + P(H) = 1

So that, from 10) and HA)

11) P(H) = 1/2, P(T) = 1/2

But let’s not stop here, let’s calculate P(H|Mon).

First of all, from 3) and 11)

12) P(H & Mon) = P(H|Mon)P(Mon) = P(Mon|H)P(H) = 1/2

From 5) and 11) also

13) P(Mon & T) = 1/4

But from 12) and 13) we get

14) P(Mon) = P(Mon & T) + P(Mon & H) = 1/4 + 1/2 = 3/4

So that, from 12) and 14)

15) P(H|Mon) = P(H & Mon) / P(Mon) = (1/2) / (3/4) = 2/3

We have seen that either P(H) = 1/2 and P(H|Mon) = 2/3 (under HA), or P(H) = 1/3 and P(H|Mon) = 1/2 (under TA).
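As a quick check on the arithmetic, the two readings can be written out as joint distributions over (day, coin) and the quantities recomputed with exact fractions. This is only a sketch: the function names are hypothetical, and the joint tables are filled in directly from equations 8) and 11) to 13) above.

```python
from fractions import Fraction as F

def thirder_joint():
    # TA: P(Mon & H) = P(Mon & T) = P(Tue & T), and the three cells sum to 1.
    cell = F(1, 3)
    return {("Mon", "H"): cell, ("Mon", "T"): cell, ("Tue", "T"): cell}

def halfer_joint():
    # HA: P(H) = P(T) = 1/2; heads implies Monday, tails splits evenly (eq. 5).
    return {("Mon", "H"): F(1, 2), ("Mon", "T"): F(1, 4), ("Tue", "T"): F(1, 4)}

def p_heads(joint):
    # Marginal probability of heads.
    return sum(p for (day, coin), p in joint.items() if coin == "H")

def p_heads_given_monday(joint):
    # Conditional probability of heads once Beauty learns it is Monday.
    p_mon = sum(p for (day, coin), p in joint.items() if day == "Mon")
    return joint[("Mon", "H")] / p_mon

for name, joint in [("thirder", thirder_joint()), ("halfer", halfer_joint())]:
    print(name, p_heads(joint), p_heads_given_monday(joint))
# thirder 1/3 1/2
# halfer 1/2 2/3
```

In both tables the prior and the Monday-conditional probability of heads differ, which is the point made next: learning the day shifts the distribution under either interpretation.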

Nick Bostrom is correct in saying that self-locating information changes the probability distribution, but this is true in both interpretations.

The Sleeping Beauty problem reduces to interpreting the sentence “a fair coin is tossed”: it can mean either that neither outcome of the toss is favoured, or that the coin toss is not influenced by anthropic information. That is, you can attribute the fairness of the coin to either the prior or the posterior distribution.

Either P(H)=P(T) or P(H|Mon)=P(T|Mon), but both at the same time is not possible.

If probability were a physical property of the coin, then so would be its fairness. But since the causal interactions of the coin possess both kinds of indifference (balance and independence from the future), that would make the two probabilities equivalent.

That such is not the case just means that fairness is a property in the mind of the observer that must be further clarified, since the two meanings cannot be confused.

Presidents, asteroids, natural categories, and reduced impact

1 Stuart_Armstrong 06 July 2015 05:44PM

A putative new idea for AI control; index here.

EDIT: I feel this post is unclear, and will need to be redone again soon.

This post attempts to use the ideas developed about natural categories in order to get high impact from reduced impact AIs.

 

Extending niceness/reduced impact

I recently presented the problem of extending AI "niceness" given some fact X, to niceness given ¬X, choosing X to be something pretty significant but not overwhelmingly so - the death of a president. By assumption we had a successfully programmed niceness, but no good definition (this was meant to be "reduced impact" in a slight disguise).

This problem turned out to be much harder than expected. It seems that the only way to do so is to require the AI to define values dependent on a set of various (boolean) random variables Zj that did not include X/¬X. Then as long as the random variables represented natural categories, given X, the niceness should extend.

What did we mean by natural categories? Informally, it means that X should not appear in the definitions of these random variables. For instance, nuclear war is a natural category; "nuclear war XOR X" is not. Actually defining this was quite subtle; diverting through the grue and bleen problem, it seems that we had to define how we update X and the Zj given the evidence we expected to find. This was put in equation as picking Zj's that minimize

  • Variance{log[ P(X∧Z|E)*P(¬X∧¬Z|E) / (P(X∧¬Z|E)*P(¬X∧Z|E)) ]}

where E is the random variable denoting the evidence we expected to find. Note that if we interchange X and ¬X, the ratio inverts, the log changes sign - but this makes no difference to the variance. So we can equally well talk about extending niceness given X to ¬X, or niceness given ¬X to X.

 

Perfect and imperfect extensions

The above definition would work for a "perfectly nice AI". That could be an AI that would be nice, given any combination of estimates of X and Zj. In practice, because we can't consider every edge case, we would only have an "expectedly nice AI". That means that the AI can fail to be nice in certain unusual and unlikely edge cases, in certain strange sets of values of Zj that almost never come up...

...or at least, that almost never come up, given X. Since the "expected niceness" was calibrated given X, such an expectedly nice AI may fail to be nice if ¬X results in a substantial change in the probability of the Zj (see the second failure mode in this post; some of the Zj may be so tightly coupled to the value of X that an expected niceness AI considers them fixed, and this results in problems if ¬X happens and their values change).

One way of fixing this is to require that the "swing" of the Zj be small upon changing X to ¬X or vice versa. Something like, for all values of {aj}, the ratio P({Zj=aj} | X) / P({Zj=aj} | ¬X) lies between 1/100 and 100. This means that a reasonably good "expected niceness" calibrated on the Zj will transfer from X to ¬X (though the error may grow). This approach has some other advantages, as we'll see in the next section.
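The swing restriction can be sketched as a simple check over the joint assignments of the Zj. This is an illustration only: the function name `swing_ok`, the dictionary representation and the example numbers are all assumptions, and the bound of 100 is just the figure suggested above.

```python
def swing_ok(p_given_x, p_given_not_x, bound=100.0):
    """Check that P({Zj=aj} | X) / P({Zj=aj} | not X) lies in [1/bound, bound]
    for every joint assignment {aj}. Both arguments map an assignment (a tuple
    of booleans) to its conditional probability. Hypothetical helper."""
    for assignment, p in p_given_x.items():
        q = p_given_not_x[assignment]
        if p == 0 or q == 0:
            return False  # an infinite or zero ratio always violates the bound
        ratio = p / q
        if not (1 / bound <= ratio <= bound):
            return False
    return True

# Two boolean Zj whose distribution shifts only mildly between X and not-X.
mild_x = {(0, 0): 0.4, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.1}
mild_not_x = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

# A single Zj so tightly coupled to X that it is almost fixed given X.
tight_x = {(0,): 0.9999, (1,): 0.0001}
tight_not_x = {(0,): 0.5, (1,): 0.5}

print(swing_ok(mild_x, mild_not_x))    # True
print(swing_ok(tight_x, tight_not_x))  # False
```

The second case is the kind of tightly coupled variable described above: an AI calibrated given X would treat it as fixed, so it is excluded.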

Of course, problems arise if the Zj are defined by smashing together events relatively independent of X with some that are very strongly dependent on X, into the same variable. Thus we'll further require that Zj cannot be decomposed into natural subvariables whose values swing more under changes from X to ¬X.

 

High impact from reduced impact

With this, we are finally in a position to define a reduced impact AI design that can have a large impact. Imagine that an asteroid is coming to hit the Earth. Two AIs are charged with targeting a laser at it. The laser isn't perfectly effective, and the asteroid isn't perfectly certain to hit the Earth anyway: let's say aiming the laser correctly will reduce the odds of impact by a factor of 10 or so (well within the probability swing restrictions of the previous section).

In the original formulation of this, the AIs were tasked with outputting the x and y coordinates to aim the laser. Here, instead, they'll be tasked with constructing the motivation of a robotic subagent that will move around for a while, see things and interact, before physically aiming the laser in the right direction. This task is harder.

Then the top level AI charged with aiming the y coordinates is given the following task:

  • It is motivated to cause the correct y-coordinate to be outputted.
  • If the x-coordinate AI is not turned on (call this event ¬X), it is strongly motivated to have reduced impact.
  • It must phrase the robotic subagent's utility function in terms of the Zj that obey all the properties above.
  • The evidence E considered for the equation governing the naturalness of Zj is considered (for example) to be a random sampling of space at the time both AIs are turned on. Most importantly, this is defined to be a theoretical process that the AI cannot affect physically.

So what will the AI do? It will produce a robot that will try to aim the y-coordinate of the laser correctly, given ¬X, and not expand dangerously. However, we've set things up so that X is virtually certain, so the robot will see another, x-aiming robot, wander around. The chances of this happening, given ¬X, are minuscule, so the swing of the probability from ¬X to X is huge. So, by the restrictions we've put on the Zj, the robot will not be motivated to change its behaviour just because it sees the other robot.

And if the laser is fired, the impact on humanity will, conversely, not be that large, so well within the parameters of the Zj. Thus the "general reduced impact" portion of the robot's motivation will be preserved.

And, of course, the x-robot will have the mirror image motivations, so they will both aim correctly, it seems. This should evade the problem described here.

 

Failure modes

There are three ways the approach could fail: one bad way, and two graceful ways. The bad way is if the "no decomposition" requirements (one of them is presented above, the other is in this post) fail. In this case, the niceness or reduced impact are not guaranteed to be extended to ¬X, and the AI's behaviour can become undefined. Note that this is not bad optimisation pressure, just an absence of optimisation pressure - but that could still be potentially dangerous.

The first graceful failure happens if the AI is not able to create a reduced impact sub-agent given the restrictions on the Zj above. In that case, the AI does nothing. The second graceful failure happens if the AI evades our attempts to increase its impact, given ¬X. In that case, it simply becomes a reduced impact AI that does little. Not ideal, but not deadly.

 

Overall status: I'm not sure the idea is sound, at least not yet. Critiques welcome.

Grue, Bleen, and natural categories

3 Stuart_Armstrong 06 July 2015 01:47PM

A putative new idea for AI control; index here.

In a previous post, I looked at unnatural concepts such as grue (green if X was true, blue if it was false) and bleen. This was to enable one to construct the natural categories that extend AI behaviour, something that seemed surprisingly difficult to do.

The basic idea discussed in the grue post was that the naturalness of grue and bleen seemed dependent on features of our universe - mostly, that it was easy to tell whether an object was "currently green" without knowing what time it was, but we could not know whether the object was "currently grue" without knowing the time.

So the naturalness of the category depended on the type of evidence we expected to find. Furthermore, it seemed easier to discuss whether a category is natural "given X", rather than whether that category is natural in general. However, we know the relevant X in the AI problems considered so far, so this is not a problem.

 

Natural category, probability flows

Fix a boolean random variable X, and assume we want to check whether the boolean random variable Z is a natural category, given X.

If Z is natural (for instance, it could be the colour of an object, while X might be the brightness), then we expect to uncover two types of evidence:

  • those that change our estimate of X; this causes probability to "flow" as follows (or in the opposite directions):

  • ...and those that change our estimate of Z:

Or we might discover something that changes our estimates of X and Z simultaneously. If the probability flows to X and Z in the same proportions, we might get:

What is an example of an unnatural category? Well, if Z is some sort of grue/bleen-like object given X, then we can have Z = X XOR Z', for Z' actually a natural category. This sets up the following probability flows, which we would not want to see:

More generally, Z might be constructed so that X∧Z, X∧¬Z, ¬X∧Z and ¬X∧¬Z are completely distinct categories; in that case, there are more forbidden probability flows:

and

In fact, there are only really three "linearly independent" probability flows, as we shall see.

 

Less pictures, more math

Let's represent the four possible states of affairs by four weights (not probabilities):

Since everything is easier when it's linear, let's set w11 = log(P(X∧Z)) and similarly for the other weights (we neglect cases where some events have zero probability). Two sets of weights correspond to the same probabilities iff you get from one set to the other by multiplying by a strictly positive number. For logarithms, this corresponds to adding the same constant to all the log-weights. So we can normalise our log-weights (select a single set of representative log-weights for each possible set of probabilities) by choosing the w such that

w11 + w12 + w21 + w22 = 0.

Thus the probability "flows" correspond to adding together two such normalised 2x2 matrices, one for the prior and one for the update. Composing two flows means adding two change matrices to the prior.

Four variables, one constraint: the set of possible log-weights is three dimensional. We know we have two allowable probability flows, given naturalness: those caused by changes to P(X), independent of P(Z), and vice versa. Thus we are looking for a single extra constraint to keep Z natural given X.

A little thought reveals that we want to keep constant the quantity:

w11 + w22 - w12 - w21.

This preserves all the allowed probability flows and rules out all the forbidden ones. Translating this back to the general case, let "e" be the evidence we find. Then if Z is a natural category given X and the evidence e, the following quantity is the same for all possible values of e:

log[P(X∧Z|e)*P(¬X∧¬Z|e) / (P(X∧¬Z|e)*P(¬X∧Z|e))].

If E is a random variable representing the possible values of e, this means that we want

log[P(X∧Z|E)*P(¬X∧¬Z|E) / (P(X∧¬Z|E)*P(¬X∧Z|E))]

to be constant, or, equivalently, seeing the posterior probabilities as random variables dependent on E:

  • Variance{log[ P(X∧Z|E)*P(¬X∧¬Z|E) / (P(X∧¬Z|E)*P(¬X∧Z|E)) ]} = 0.

Call that variance the XE-naturalness measure. If it is zero, then Z defines an XE-natural category. Note that this does not imply that Z and X are independent, or independent conditional on E. Just that they are, in some sense, "equally (in)dependent whatever E is".
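To make the measure concrete, here is a minimal sketch that computes the variance of the log cross-ratio over a small, made-up set of evidence values. Everything named here is an assumption for illustration: the helper names, the two hypothetical pieces of evidence (each with probability 0.5), and the choice to make X and Z' independent given each e. The "grue" variable is built as Z = X XOR Z', as in the unnatural example earlier.

```python
import math

def log_cross_ratio(joint):
    """log[ P(X,Z|e) P(~X,~Z|e) / (P(X,~Z|e) P(~X,Z|e)) ] for one value of e.
    `joint` maps (x, z) booleans to a strictly positive probability."""
    return math.log(
        joint[(True, True)] * joint[(False, False)]
        / (joint[(True, False)] * joint[(False, True)])
    )

def xe_naturalness(evidence):
    """Variance over E of the log cross-ratio.
    `evidence` is a list of (P(e), joint given e) pairs."""
    mean = sum(p_e * log_cross_ratio(j) for p_e, j in evidence)
    return sum(p_e * (log_cross_ratio(j) - mean) ** 2 for p_e, j in evidence)

def natural_joint(p_x, p_z):
    """Posterior in which X and Z are updated separately (independent given e)."""
    return {(x, z): (p_x if x else 1 - p_x) * (p_z if z else 1 - p_z)
            for x in (True, False) for z in (True, False)}

def grue_joint(p_x, p_z_prime):
    """Posterior for Z = X XOR Z', where Z' is independent of X given e."""
    base = natural_joint(p_x, p_z_prime)
    return {(x, x != z_prime): p for (x, z_prime), p in base.items()}

# Hypothetical evidence: (P(e), P(X|e), P(Z|e) or P(Z'|e)).
posteriors = [(0.5, 0.3, 0.6), (0.5, 0.7, 0.2)]
natural = [(p_e, natural_joint(p_x, p_z)) for p_e, p_x, p_z in posteriors]
grue = [(p_e, grue_joint(p_x, p_z)) for p_e, p_x, p_z in posteriors]

print(xe_naturalness(natural))  # zero up to floating-point error
print(xe_naturalness(grue))     # clearly positive
```

In this toy case the grue variable's log cross-ratio works out to 2·log((1-P(Z'|e))/P(Z'|e)), so any evidence that moves the estimate of Z' makes the variance non-zero, while the natural variable's cross-ratio stays fixed.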

 

Almost natural category

The advantage of that last formulation becomes visible when we consider that the evidence which we uncover is not, in the real world, going to perfectly mark Z as natural, given X. To return to the grue example, though most evidence we uncover about an object is going to be the colour or the time rather than some weird combination, there is going to be somebody who will write things like "either the object is green, and the sun has not yet set in the west; or instead perchance, those two statements are both alike in falsity". Upon reading that evidence, if we believe it in the slightest, the variance can no longer be zero.

Thus we cannot expect that the above XE-naturalness be perfectly zero, but we can demand that it be low. How low? There seems no principled way of deciding this, but we can make one attempt: that we cannot lower it by decomposing Z.

What do we mean by that? Well, assume that Z is a natural category, given X and the expected evidence, but Z' is not. Then we can define a new boolean category Y to be Z with high probability, and Z' otherwise. This will still have low XE-naturalness measure (as Z does) but is obviously not ideal.

Reversing this idea, we say Z defines an "XE-almost natural category" if there is no "more XE-natural" category that extends X∧Z (and similarly for the other three conjunctions). Technically, if

X∧Z = X∧Y,

then Y must have an XE-naturalness measure equal to or greater than that of Z. And similarly for X∧¬Z, ¬X∧Z, and ¬X∧¬Z.

Note: I am somewhat unsure about this last definition; the concept I want to capture is clear (Z is not the combination of more XE-natural subvariables), but I'm not certain the definition does it.

 

Beyond boolean

What if Z takes n values, rather than being a boolean? This can be treated simply.

If we set the wjk to be log-weights as before, there are 2n free variables. The normalisation constraint is that they all sum to a constant. The "permissible" probability flows are given by flows from X to ¬X (adding a constant to the first column, subtracting it from the second) and pure changes in Z (adding constants to various rows, summing to 0). There are 1+ (n-1) linearly independent ways of doing this.

Therefore we are looking for 2n-1 -(1+(n-1))=n-1 independent constraints to forbid non-natural updating of X and Z. One basis set for these constraints could be to keep constant the values of

wj1 + w(j+1)2 - wj2 - w(j+1)1,

where j ranges between 1 and n-1.

This translates to variance constraints of the type:

  • Variance{log[ P(X∧{Z=j}|E)*P(¬X∧{Z=j+1}|E) / (P(X∧{Z=j+1}|E)*P(¬X∧{Z=j}|E)) ]} = 0.

But those are n-1 different possible variances. What is the best global measure of XE-naturalness? It seems it could simply be

  • Maxjk Variance{log[ P(X∧{Z=j}|E)*P(¬X∧{Z=k}|E) / (P(X∧{Z=k}|E)*P(¬X∧{Z=j}|E)) ]} = 0.

If this quantity is zero, it naturally sends all variances to zero, and, when not zero, is a good candidate for the degree of XE-naturalness of Z.

The extension to the case where X takes multiple values is straightforward:

  • Maxjklm Variance{log[ P({X=l}∧{Z=j}|E)*P({X=m}∧{Z=k}|E) / (P({X=l}∧{Z=k}|E)*P({X=m}∧{Z=j}|E)) ]} = 0.

Note: if ever we need to compare the XE-naturalness of random variables taking different numbers of values, it may become necessary to divide these quantities by the number of variables involved, or maybe substitute a more complicated expression that contains all the different possible variances, rather than simply the maximum.

 

And in practice?

In the next post, I'll look at using this in practice for an AI, to evade presidential deaths and deflect asteroids.

Utility vs Probability: idea synthesis

4 Stuart_Armstrong 27 March 2015 12:30PM

A putative new idea for AI control; index here.

This post is a synthesis of some of the ideas from utility indifference and false miracles, in an easier-to-follow format that illustrates better what's going on.

 

Utility scaling

Suppose you have an AI with a utility u and a probability estimate P. There is a certain event X which the AI cannot affect. You wish to change the AI's estimate of the probability of X, by, say, doubling the odds ratio P(X):P(¬X). However, since it is dangerous to give an AI false beliefs (they may not be stable, for one), you instead want to make the AI behave as if it were a u-maximiser with doubled odds ratio.

Assume that the AI is currently deciding between two actions, α and ω. The expected utility of action α decomposes as:

u(α) = P(X)u(α|X) + P(¬X)u(α|¬X).

The utility of action ω is defined similarly, and the expected gain (or loss) of utility by choosing α over ω is:

u(α)-u(ω) = P(X)(u(α|X)-u(ω|X)) + P(¬X)(u(α|¬X)-u(ω|¬X)).

If we were to double the odds ratio, the expected utility gain becomes:

u(α)-u(ω) = (2P(X)(u(α|X)-u(ω|X)) + P(¬X)(u(α|¬X)-u(ω|¬X)))/Ω,    (1)

for some normalisation constant Ω = 2P(X)+P(¬X), independent of α and ω.

We can reproduce exactly the same effect by instead replacing u with u', such that

  • u'( |X)=2u( |X)
  • u'( |¬X)=u( |¬X)

Then:

u'(α)-u'(ω) = P(X)(u'(α|X)-u'(ω|X)) + P(¬X)(u'(α|¬X)-u'(ω|¬X)),

= 2P(X)(u(α|X)-u(ω|X)) + P(¬X)(u(α|¬X)-u(ω|¬X)).    (2)

This is the same expression as (1), up to the positive constant factor Ω, which does not change which action is preferred. Thus we can accomplish, via utility manipulation, exactly the same effect on the AI's behaviour as by changing its probability estimates.

Notice that we could also have defined

  • u'( |X)=u( |X)
  • u'( |¬X)=(1/2)u( |¬X)

This is just the same u', scaled.

The utility indifference and false miracles approaches were just special cases of this, where the odds ratio was sent to infinity/zero by multiplying by zero. But the general result is that one can start with an AI with utility/probability estimate pair (u,P) and map it to an AI with pair (u',P) which behaves similarly to (u,P'). Changes in probability can be replicated as changes in utility.
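A small numerical sketch of the equivalence between (1) and (2) follows. The utilities, the value of P(X) and all function names are made up for illustration; the only thing taken from the text is the construction itself (double the odds of X, or instead double u in the X worlds).

```python
def expected_gain(p_x, u, a, b, odds_factor=1.0):
    """Expected gain of action a over b when the odds P(X):P(not X) are
    multiplied by odds_factor and renormalised, as in equation (1)."""
    w_x = odds_factor * p_x
    w_not_x = 1 - p_x
    norm = w_x + w_not_x  # the normalisation constant called Omega in the text
    gain = w_x * (u[a]["X"] - u[b]["X"]) + w_not_x * (u[a]["notX"] - u[b]["notX"])
    return gain / norm

def scale_utility(u, factor):
    """u' with u'(.|X) = factor * u(.|X) and u'(.|not X) = u(.|not X)."""
    return {act: {"X": factor * v["X"], "notX": v["notX"]} for act, v in u.items()}

# Hypothetical utilities for the two actions in each world.
u = {"alpha": {"X": 1.0, "notX": 0.0}, "omega": {"X": 0.0, "notX": 0.8}}
p_x = 0.4

original = expected_gain(p_x, u, "alpha", "omega")                     # -0.08
doubled_odds = expected_gain(p_x, u, "alpha", "omega", odds_factor=2)  # eq. (1)
scaled_u = expected_gain(p_x, scale_utility(u, 2), "alpha", "omega")   # eq. (2)

print(original, doubled_odds, scaled_u)
```

With these numbers the unmodified agent prefers ω, while both the doubled-odds agent and the scaled-utility agent prefer α; the two modified gains differ only by the factor Ω = 2P(X)+P(¬X) = 1.4.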

 

Utility translating

In the previous section, we multiplied certain utilities by two. In doing so, we implicitly used the zero point of u. But utility is invariant under translation, so this zero point is not actually anything significant.

It turns out that we don't need to care about this - any zero will do, what matters simply is that the spread between options is doubled in the X world but not in the ¬X one.

But that relies on the AI being unable to affect the probability of X and ¬X itself. If the AI has an action that will increase (or decrease) P(X), then it becomes very important where we set the zero before multiplying. Setting the zero in a different place is isomorphic with adding a constant to the X world and not the ¬X world (or vice versa). Obviously this will greatly affect the AI's preferences between X and ¬X.

One way of avoiding the AI affecting X is to set this constant so that u'(X)=u'(¬X), in expectation. Then the AI has no preferences between the two situations, and will not seek to boost one over the other. However, note that u(X) is an expected utility calculation. Therefore:

  1. Choosing the constant so that u'(X)=u'(¬X) requires accessing the AI's probability estimate P for various worlds; it cannot be done from outside, by multiplying the utility, as the previous approach could.
  2. Even if u'(X)=u'(¬X), this does not mean that u'(X|Y)=u'(¬X|Y) for every event Y that could happen before X does. Simple example: X is a coin flip, and Y is the bet of someone on that coin flip, someone the AI doesn't like.

This explains all the complexity of the utility indifference approach, which is essentially trying to decompose possible universes (and adding constants to particular subsets of universes) to ensure that u'(X|Y)=u'(¬X|Y) for any Y that could happen before X does.

Humans get different counterfactuals

2 Stuart_Armstrong 23 March 2015 02:54PM

A putative new idea for AI control; index here.

A lot of my ideas rely upon taking counterfactuals of events that have tiny ε probabilities of ever happening. This introduces some extra risks, mainly centred around human behaviour. For the counterfactual methods to work, we need the alternate world to be sensible, without things going crazy due to human action.

For instance, imagine that X defines an event where an AI would get turned on, with ¬X (of probability ε) corresponding to the AI failing to get turned on. There are two risks here: the first is that humans would react by saying "wow, an event of probability ε actually happened; all our models must be wrong! Let's go crazy! Yay!". The second is that humans react by saying: "well, that was embarrassing - let's turn the AI on anyway."

To avoid this issue, imagine the following setup:

The "ON" signal first goes through an event Y, which has 99% chance of letting it through, then to the event X, which (as before) has a 1-ε chance of letting it through. The setup is designed so that humans cannot distinguish between ¬Y (the signal gets blocked at the first stage) and ¬X (the signal gets blocked at the second stage). This only needs to fool humans, not the AI itself.

The AI defines counterfactuals, as before, by looking at ¬X (possibly conditioning on Y versus ¬Y, if this is needed). Everything proceeds as previously from its perspective.

From the human perspective, however, the ¬X world is not distinguishable from the ¬Y one. Given (¬Y or ¬X), humans would conclude that ¬Y is the much more likely option:

P(¬Y|¬Y or ¬X)≈1-100ε.

So the ¬X counterfactual world (for the AI) is one where humans behave as if they were in the ¬Y world.
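The approximation can be checked directly. The sketch below assumes the two-stage reading of the setup (the signal reaches X only if Y lets it through); the function name and the sample value of ε are hypothetical.

```python
def p_not_y_given_blocked(eps, p_y=0.99):
    """P(not Y | signal blocked), where the signal is blocked either at the
    first stage (not Y) or, having passed Y, at the second stage (not X)."""
    p_not_y = 1 - p_y        # blocked at the first stage
    p_y_not_x = p_y * eps    # passed Y, then blocked at X
    return p_not_y / (p_not_y + p_y_not_x)

eps = 1e-6
exact = p_not_y_given_blocked(eps)
approx = 1 - 100 * eps
print(exact, approx)
```

For small ε the exact value is close to 1 - 99ε, so 1 - 100ε is a slightly conservative round figure; either way, a human observing a blocked signal should attribute it almost entirely to ¬Y.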

And ¬Y has one chance in a hundred of happening, which is unlikely, but not enough for humans to assume that their whole model of reality is wrong. Also, this is sufficiently likely that humans would give serious thought as to what to do in the ¬Y case, maybe arranging various pre-commitments or insurance options, making it possible to arrange (with high probability) that humans don't just ignore the result and try again immediately.

Note that this method can't be used (obviously) if ¬X is something hideously dangerous (like an unleashed UFAI), but in all other cases, it seems implementable.

Closest stable alternative preferences

3 Stuart_Armstrong 20 March 2015 12:41PM

A putative new idea for AI control; index here.

There's a result that's almost a theorem, which is that an agent that is an expected utility maximiser is stable under self-modification (or the creation of successor sub-agents).

Of course, this needs to be for "reasonable" utility, where no other agent cares about the internal structure of the agent (just its decisions), where the agent is not under any "social" pressure to make itself into something different, where the boundedness of the agent itself doesn't affect its motivations, and where issues of "self-trust" and acausal trade don't affect it in relevant ways, etc...

So quite a lot of caveats, but the result is somewhat stronger in the opposite direction: an agent that is not an expected utility maximiser is under pressure to self-modify into one that is. Or, more correctly, into an agent that is isomorphic with an expected utility maximiser (an important distinction).

What is this "pressure" that agents are "under"? The known result is that if an agent obeys four simple axioms, then its behaviour must be isomorphic with an expected utility maximiser. If we assume the Completeness axiom (trivial) and Continuity (subtle), then violations of Transitivity or Independence correspond to situations where the agent can be money pumped: it loses resources or power for no gain at all. The more likely the agent is to face these situations, the more pressure it is under to behave as an expected utility maximiser, or simply lose out.
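A toy money pump illustrates the transitivity case. This is a sketch with invented names and numbers: an agent with cyclic preferences A over B, B over C, C over A pays a small fee every time it is offered something it prefers to what it currently holds.

```python
# (x, y) in PREFERS means the agent strictly prefers x to y.
# The cycle A > B > C > A violates transitivity (hypothetical example).
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

def run_pump(start, offers, fee):
    """Offer a sequence of swaps; the agent pays `fee` for each swap it accepts."""
    holding, money = start, 0.0
    for offered in offers:
        if (offered, holding) in PREFERS:
            holding = offered
            money -= fee
    return holding, money

holding, money = run_pump(start="C", offers=["B", "A", "C"], fee=1.0)
print(holding, money)  # C -3.0
```

After three trades the agent holds exactly what it started with and is three fees poorer; repeating the offers repeats the loss, which is the sense in which the agent "loses resources for no gain at all".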

 

Unbounded agents

I have two models for how idealised agents could deal with this sort of pressure. The first, post-hoc, is the unlosing agent I described here. The agent follows whatever preferences it has, but keeps track of its past decisions, and whenever it is in a position to violate transitivity or independence in a way that it would suffer from, it makes another decision instead.

Another, pre-hoc, way of dealing with this is to make an "ultra choice" and choose not between decisions, but between all possible input-output maps (equivalently, between all possible decision algorithms), looking to the expected consequences of each one. This reduces the choices to a single choice, where issues of transitivity or independence need not necessarily apply.

 

Bounded agents

Actual agents will be bounded, unlikely to be able to store and consult their entire history when making every single decision, and unable to look at the whole future of their interactions to make a good ultra choice. So how would they behave?

This is not determined directly by their preferences, but by some sort of meta-preferences. Would they make an approximate ultra-choice? Or maybe build up a history of decisions, and then simplify it (when it gets too large to easily consult) into a compatible utility function? This is also determined by their interactions - an agent that makes a single decision has no pressure to be an expected utility maximiser, one that makes trillions of related decisions has a lot of pressure.

It's also notable that different types of boundedness (storage space, computing power, time horizons, etc...) have different consequences for unstable agents, and would converge to different stable preference systems.

 

Investigation needed

So what is the point of this post? It isn't presenting new results; it's more an attempt to launch a new sub-field of investigation. We know that many preferences are unstable, and that the agent is likely to make them stable over time, either through self-modification, subagents, or some other method. There are also suggestions for preferences that are known to be unstable, but have advantages (such as resistance to Pascal Muggings) that standard maximalisation does not.

Therefore, instead of saying "that agent design can never be stable", we should be saying "what kind of stable design would that agent converge to?", "does that convergent stable design still have the desirable properties we want?" and "could we get that stable design directly?".

The first two things I found in this area were that traditional satisficers could converge to vastly different types of behaviour in an essentially unconstrained way, and that a quasi-expected utility maximiser of utility u might converge to an expected utility maximiser, but it might not be u that it maximises.

In fact, we need not look only at violations of the axioms of expected utility; they are but one possible reason for decision behaviour instability. Here are some that spring to mind:

  1. Non-independence and non-transitivity (as above).
  2. Boundedness of abilities.
  3. Adversaries and social pressure.
  4. Evolution (survival cost to following “odd” utilities (eg time-dependent preference)).
  5. Unstable decision theories (such as CDT).

Now, some categories (such as "Adversaries and social pressure") may not possess a tidy stable solution, but it is still worth asking what setups are more stable than others, and what the convergence rules are expected to be.

Anti-Pascaline agent

4 Stuart_Armstrong 12 March 2015 02:17PM

A putative new idea for AI control; index here.

Pascal's wager-like situations come up occasionally with expected utility, making some decisions very tricky. It means that events of the tiniest of probability could dominate the whole decision - intuitively unobvious, and a big negative for a bounded agent - and that expected utility calculations may fail to converge.

There are various principled approaches to resolving the problem, but how about an unprincipled approach? We could try and bound utility functions, but the heart of the problem is not high utility, but high utility combined with low probability. Moreover, this has to behave sensibly with respect to updating.

 

The agent design

Consider a UDT-ish agent A looking at input-output maps {M} (ie algorithms that could determine every single possible decision of the agent in the future). We allow probabilistic/mixed output maps as well (hence A has access to a source of randomness). Let u be a utility function, and set 0 < ε << 1 to be the precision. Roughly, we'll be discarding the highest (and lowest) utilities that are below probability ε. There is no fundamental reason that the same ε should be used for highest and lowest utilities, but we'll keep it that way for the moment.

The agent is going to make an "ultra-choice" among the various maps M (ie fixing its future decision policy), using u and ε to do so. For any M, designate by A(M) the decision of the agent to use M for its decisions.

Then, for any map M, set max(M) to be the lowest number s.t. P(u ≥ max(M)|A(M)) ≤ ε. In other words, if the agent decides to use M as its decision policy, this is the maximum utility that can be achieved if we ignore the highest valued ε of the probability distribution. Similarly, set min(M) to be the highest number s.t. P(u ≤ min(M)|A(M)) ≤ ε.

Then define the utility function uMε, which is simply u, bounded between max(M) and min(M). Now calculate the expected value of uMε given A(M), call this Eε(u|A(M)).

The agent then chooses the M that maximises Eε(u|A(M)). Call this the ε-precision u-maximising algorithm.
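For a single fixed map M with a discrete utility distribution, the clipped expectation Eε(u|A(M)) can be sketched as below. Two assumptions are made loudly here: for discrete distributions the "lowest number" and "highest number" in the definitions are read as an infimum and a supremum (they need not be attained), and ε is taken to be below 1/2 so that max(M) ≥ min(M). The function names and example distributions are hypothetical.

```python
def upper_cut(dist, eps):
    """max(M): infimum of {s : P(u >= s) <= eps}. For a discrete distribution
    this is the largest support value v with P(u >= v) > eps."""
    tail = 0.0
    for value, prob in sorted(dist, reverse=True):
        tail += prob
        if tail > eps:
            return value
    raise ValueError("probabilities must sum to more than eps")

def lower_cut(dist, eps):
    """min(M): supremum of {s : P(u <= s) <= eps}, i.e. the smallest support
    value v with P(u <= v) > eps."""
    tail = 0.0
    for value, prob in sorted(dist):
        tail += prob
        if tail > eps:
            return value
    raise ValueError("probabilities must sum to more than eps")

def eps_expectation(dist, eps):
    """Expected value of u clipped to [min(M), max(M)].
    `dist` is a list of (utility, probability) pairs."""
    hi, lo = upper_cut(dist, eps), lower_cut(dist, eps)
    return sum(prob * min(max(value, lo), hi) for value, prob in dist)

# A Pascal-style gamble: huge payoff, tiny probability (made-up numbers).
pascal = [(1e9, 1e-6), (0.0, 1 - 1e-6)]
plain = sum(v * p for v, p in pascal)
print(plain, eps_expectation(pascal, eps=1e-3))

# Payoff of 1 with probability just below or just above eps = 0.01.
print(eps_expectation([(1.0, 0.0075), (0.0, 0.9925)], eps=0.01))
print(eps_expectation([(1.0, 0.015), (0.0, 0.985)], eps=0.01))
```

Under these assumptions the Pascal-style gamble, whose plain expectation is about 1000, is valued at 0, and a payoff occurring with probability below ε is ignored while one just above ε is counted in full.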

 

Stability of the design

The above decision process is stable, in that there is a single ultra-choice to be made, and clear criteria for making that ultra-choice. Realistic and bounded agents, however, cannot calculate all the M in sufficient detail to get a reasonable outcome. So we can ask whether the design is stable for a bounded agent.

Note that this question is underdefined, as there are many ways of being bounded, and many ways of cashing out ε-precision u-maximising into bounded form. Most likely, this will not be a direct expected utility maximalisation, so the algorithm will be unstable (prone to change under self-modification). But how exactly it's unstable is an interesting question.

I'll look at one particular situation: one where A was tasked with creating subagents that would go out and interact with the world. These agents are short-sighted: they apply ε-precision u-maximising not to the ultra-choice, but to each individual expected utility calculation (we'll assume the utility gains and losses for each decision are independent).

A has a single choice: what to set ε to for the subagents. Intuitively, it would seem that A would set ε lower than its own value; this could correspond roughly to an agent self-modifying to remove the ε-precision restriction from itself, converging on becoming a u-maximiser. However:

  • Theorem: There are (stochastic) worlds in which A will set the subagent precision to be higher, lower or equal to its own precision ε.

The proof will be by way of illustration of the interesting things that can happen in this setup. Let B be the subagent whose precision A sets.

Let C(p) be a coupon that pays out 1 with probability p. xC(p) simply means the coupon pays out x instead of 1. Each coupon costs ε² utility. This is negligible, and only serves to break ties. Then consider the following worlds:

  • In W1, B will be offered the possibility of buying C(0.75ε).
  • In W2, B will be offered the possibility of buying C(1.5ε).
  • In W3, B will be offered the possibility of buying C(0.75ε), and the offer will be made twice.
  • In W4, B will be offered, with 50% probability, the possibility of buying C(1.5ε).
  • In W5, B will be offered, with 50% probability, the possibility of buying C(1.5ε), and otherwise the possibility of buying 2C(1.5ε).
  • In W6, B will be offered, with 50% probability, the possibility of buying C(0.75ε), and otherwise the possibility of buying 2C(1.5ε).
  • In W7, B will be offered, with 50% probability, the possibility of buying C(0.75ε), and otherwise the possibility of buying 2C(1.05ε).

From A’s perspective, the best input-output maps are: in W1, don’t buy, in W2, buy, in W3, buy both, in W4, don’t buy (because the probability of getting above 0 utility by buying, is, from A's initial perspective, 1.5ε/2 = 0.75ε).

W5 is more subtle, and interesting – essentially A will treat 2C(1.5ε) as if it were C(1.5ε) (since the probability of getting above 1 utility by buying is 1.5ε/2 = 0.75ε, while the probability of getting above zero by buying is (1.5ε+1.5ε)/2=1.5ε). Thus A would buy everything offered.

Similarly, in W6, the agent would buy everything, and in W7, the agent would buy nothing (since the probability of getting above zero by buying is now (1.05ε + 0.75ε)/2 = 0.9ε).

So in W1 and W2, the agent can leave the sub-agent precision at ε. In W3, it needs to lower it below 0.75ε. In W4, it needs to raise it above 1.5ε. In W5 it can leave it alone, while in W6 it must lower it below 0.75ε, and in W7 it must raise it above 1.05ε.
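The arithmetic behind these seven cases can be checked with a short sketch. It is an illustration under stated assumptions, not the definition of the ε-precision algorithm: probabilities are written in units of ε, terms of order ε² (including the coupon cost) are dropped, A is assumed to want every coupon bought exactly when the overall chance of a positive payout reaches ε, and the short-sighted B is assumed to buy C(p) exactly when p reaches its own precision. The names WORLDS, chance_of_payout and needed_adjustment are hypothetical.

```python
from fractions import Fraction as F

# Each world lists (probability the offer is made, coupon success probability in units of eps).
WORLDS = {
    "W1": [(F(1), F(3, 4))],
    "W2": [(F(1), F(3, 2))],
    "W3": [(F(1), F(3, 4)), (F(1), F(3, 4))],      # the same offer is made twice
    "W4": [(F(1, 2), F(3, 2))],
    "W5": [(F(1, 2), F(3, 2)), (F(1, 2), F(3, 2))],
    "W6": [(F(1, 2), F(3, 4)), (F(1, 2), F(3, 2))],
    "W7": [(F(1, 2), F(3, 4)), (F(1, 2), F(21, 20))],
}

def chance_of_payout(offers):
    """Chance (in units of eps, to first order) of ending above zero if every offer is accepted."""
    return sum(p_offer * p_win for p_offer, p_win in offers)

def needed_adjustment(offers):
    """How B's precision must move relative to eps (= 1 in these units) to copy A's choice."""
    wins = [p_win for _, p_win in offers]
    if chance_of_payout(offers) >= 1:              # assumed rule: A wants every coupon bought
        return "leave" if min(wins) >= 1 else "lower"
    return "leave" if max(wins) < 1 else "raise"   # A wants nothing bought

for name, offers in WORLDS.items():
    print(name, chance_of_payout(offers), needed_adjustment(offers))
```

Under these assumptions the three outcomes "leave", "lower" and "raise" all occur, which is the content of the theorem.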

 

Irrelevant information

One nice feature about this approach is that it ignores irrelevant information. Specifically:

  • Theorem: Assume X is a random variable that is irrelevant to the utility function u. If A (before knowing X) has to design successor agents that will exist after X is revealed, then (modulo a few usual assumptions about only decisions mattering, not internal thought processes) it will make these successor agents isomorphic to copies of itself, i.e. ε-precision u-maximising algorithms (potentially with a different way of breaking ties).

These successor agents are not the short-sighted agents of the previous model, but full ultra-choice agents. Their ultra-choice is over all decisions to come, while A's ultra-choice (which is simply a choice) is over all agent designs.

For the proof, I'll assume X is boolean valued (the general proof is similar). Let M be the input-output map A would choose for itself, if it were to make all the decisions itself rather than just designing a subagent. Now, it's possible that M(X) will be different from M(¬X) (here M(X) and M(¬X) are contractions of the input-output map by adding in one of the inputs).

Define the new input-output map M' by defining a new internal variable Y in A (recall that A has access to a source of randomness). Since this variable is new, M is independent of the value of Y. Then M' is defined as M with X and Y permuted. Since both Y and X are equally irrelevant to u, Eε(u|A(M))=Eε(u|A(M')), so M' is an input-output map that fulfils the ε-precision u-maximising. And M'(X)=M'(¬X), so M' is independent of X.

Now consider the subagent that runs the same algorithm as A, and has seen X. Because of the irrelevance of X, M'(X) will still fulfil ε-precision u-maximising (we can express any fact relevant to u in the form of Zs, with P(Z)=P(Z|X), and then the algorithm is the same).

Similarly, a subagent that has seen ¬X will run M'(¬X). Putting these together, the subagent will expect to run M'(X) with probability P(X) and M'(¬X) with probability P(¬X)=1-P(X).

Since M'(X)=M'(¬X), this whole thing is just M'. So if A creates a copy of itself (possibly tweaking the tie-breaking so that M' is selected), then it will achieve its maximum according to ε-precision u-maximising.

False thermodynamic miracles

13 Stuart_Armstrong 05 March 2015 05:04PM

A putative new idea for AI control; index here. See also Utility vs Probability: idea synthesis.

Ok, here is the problem:

  • You have to create an AI that believes (or acts as if it believed) that event X is almost certain, while you believe that X is almost impossible. Furthermore, you have to be right. To make things more interesting, the AI is much smarter than you, knows everything that you do (and more), and has to react sensibly when event X doesn't happen.

Answers will be graded on mathematics, style, colours of ink, and compatibility with the laws of physics. Also, penmanship. How could you achieve this?


More marbles and Sleeping Beauty

4 Manfred 23 November 2014 02:00AM

I

Previously I talked about an entirely uncontroversial marble game: I flip a coin, and if Tails I give you a black marble, if Heads I flip another coin to either give you a white or a black marble.

The probabilities of seeing a black or a white marble are 3/4 and 1/4 respectively, and the probabilities of Heads and Tails are 1/2 each.

The marble game is analogous to how a 'halfer' would think of the Sleeping Beauty problem - the claim that Sleeping Beauty should assign probability 1/2 to Heads relies on the claim that your information for the Sleeping Beauty problem is the same as your information for the marble game - same possible events, same causal information, same mutual exclusivity and exhaustiveness relations.

So what's analogous to the 'thirder' position, after we take into account that we have this causal information? Is it some difference in causal structure, or some non-causal anthropic modification, or something even stranger?

As it turns out, nope, it's the same exact game, just re-labeled.

In the re-labeled marble game you still have two unknown variables (represented by flipping coins), and you still have a 1/2 chance of black and Tails, a 1/4 chance of black and Heads, and a 1/4 chance of white and Heads.

And then to get the thirds, you ask the question "If I get a black marble, what is the probability of the faces of the first coin?" Now you update to P(Heads|black)=1/3 and P(Tails|black)=2/3.
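The two questions can be read off the same joint distribution. The sketch below just encodes the three outcomes stated above and conditions on the marble colour; the helper name conditional is made up for the illustration.

```python
from fractions import Fraction as F

# Joint distribution of (first coin, marble colour) in the marble game.
joint = {
    ("Tails", "black"): F(1, 2),
    ("Heads", "black"): F(1, 4),
    ("Heads", "white"): F(1, 4),
}

def conditional(joint, coin, marble):
    """P(coin | marble), computed by renormalising the outcomes that match the marble."""
    evidence = sum(p for (_, m), p in joint.items() if m == marble)
    return joint.get((coin, marble), F(0)) / evidence

p_heads = sum(p for (c, _), p in joint.items() if c == "Heads")
p_heads_given_black = conditional(joint, "Heads", "black")
print(p_heads)               # the halfer-style question: P(Heads)
print(p_heads_given_black)   # the thirder-style question: P(Heads | black)
```

Nothing about the game changes between the two answers; only the question asked of the distribution does.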

II

Okay, enough analogies. What's going on with these two positions in the Sleeping Beauty problem?

[Diagrams 1 and 2: images not reproduced]

Here are two different diagrams, which are really re-labelings of the same diagram. The first labeling is the problem where P(Heads|Wake) = 1/2. The second labeling is the problem where P(Heads|Wake) = 1/3. The question at hand is really - which of these two math problems corresponds to the word problem / real world situation?

As a refresher, here's the text of the Sleeping Beauty problem that I'll use: Sleeping Beauty goes to sleep in a special room on Sunday, having signed up for an experiment. A coin is flipped - if the coin lands Heads, she will only be woken up on Monday. If the coin lands Tails, she will be woken up on both Monday and Tuesday, but with memories erased in between. Upon waking up, she then assigns some probability to the coin landing Heads, P(Heads|Wake).

Diagram 1:  First a coin is flipped to get Heads or Tails. There are two possible things that could be happening to her, Wake on Monday or Wake on Tuesday. If the coin landed Heads, then she gets Wake on Monday. If the coin landed Tails, then she could either get Wake on Monday or Wake on Tuesday (in the marble game, this was mediated by flipping a second coin, but in this case it's some unspecified process, so I've labeled it [???]).  Because all the events already assume she Wakes, P(Heads|Wake) evaluates to P(Heads), which just as in the marble game is 1/2.

This [???] node here is odd, can we identify it as something natural? Well, it's not Monday/Tuesday, like in diagram 2 - there's no option that even corresponds to Heads & Tuesday. I'm leaning towards the opinion that this node is somewhat magical / acausal, just hanging around because of analogy to the marble game. So I think we can take it out. A better causal diagram with the halfer answer, then, might merely be Coin -> (Wake on Monday / Wake on Tuesday), where Monday versus Tuesday is not determined at all by a causal node, merely informed probabilistically to be mutually exclusive and exhaustive.

Diagram 2:  A coin is flipped, Heads or Tails, and also it could be either Monday or Tuesday. Together, these have a causal effect on her waking or not waking - if Heads and Monday, she Wakes, but if Heads and Tuesday, she Doesn't wake. If Tails, she Wakes. Her pre-Waking prior for Heads is 1/2, but upon waking, the event Heads, Tuesday, Don't Wake gets eliminated, and after updating P(Heads|Wake)=1/3.
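A minimal sketch of the update in Diagram 2 follows. It assumes, beyond what the text states, that Monday and Tuesday each get prior probability 1/2 and that the day is independent of the coin; the function name wakes is hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Diagram 2: coin and day are independent parents of the Wake node.
# Assumption: each (coin, day) combination starts with probability 1/4.
prior = {(coin, day): F(1, 4)
         for coin, day in product(["Heads", "Tails"], ["Monday", "Tuesday"])}

def wakes(coin, day):
    # She sleeps through Tuesday only when the coin landed Heads.
    return not (coin == "Heads" and day == "Tuesday")

# Condition on Wake: drop the eliminated event and renormalise.
awake = {event: p for event, p in prior.items() if wakes(*event)}
p_wake = sum(awake.values())
p_heads_given_wake = sum(p for (coin, _), p in awake.items() if coin == "Heads") / p_wake
print(p_heads_given_wake)
```

The prior for Heads is still 1/2 here; the 1/3 comes entirely from eliminating the single event (Heads, Tuesday).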

There's a neat asymmetry here. In diagram 1, when the coin was Heads she got the same outcome no matter the value of [???], and only when the coin was Tails were there really two options. In Diagram 2, when the coin is Heads, two different things happen for different values of the day, while if the coin is Tails the same thing happens no matter the day.

 

Do these seem like accurate depictions of what's going on in these two different math problems? If so, I'll probably move on to looking closer at what makes the math problem correspond to the word problem.

Deriving probabilities from causal diagrams

5 Manfred 13 November 2014 12:28AM

What this is: an attempt to examine how causal knowledge gets turned into probabilistic predictions.

I'm not really a fan of any view of probability that involves black boxes. I want my probabilities (or more practically, the probabilities of toy agents in toy problems I consider) to be derivable from what I know in a nice clear way, following some desideratum of probability theory at every step.

Causal knowledge sometimes looks like a black box, when it comes to assigning probabilities, and I would like to crack open that box and distribute the candy inside to smiling children.

What this is not: an attempt to get causal diagrams from constraints on probabilities.

That would be silly - see Pearl's article that was recently up here. Our reasonable desire is the reverse: getting the constraints on probabilities from the causal diagrams.

 

The Marble Game

Consider marbles. First, I use some coin-related process to get either Heads or Tails. If Tails, I give you a black marble. If Heads, I use some other process to choose between giving you a black marble or a white marble.

Causality is an important part of the marble game. If I manually interfere with the process that gives Heads or Tails, this can change the probability you should assign of getting a black marble. But if I manually interfere with the process that gives you white or black marbles, this won't change your probability of seeing Heads or Tails.

 

What I'd like versus what is

The fundamental principle of putting numbers to beliefs, that always applies, is to not make up information. If I don't know of any functional differences between two events, I shouldn't give them different probabilities. But going even further - if I learn a little information, it should only change my probabilities a little.

The general formulation of this is to make your probability distribution consistent with what you know, in the way that contains the very least information possible (or conversely, the maximum entropy). This is how to not make up information.

I like this procedure; if we write down pieces of knowledge as mathematical constraints, we can find the correct distribution by solving a single optimization problem. Very elegant. Which is why it's a shame that this isn't at all what we do for causal problems.

Take the marble game. To get our probabilities, we start with the first causal node, figure out the probability of Heads without thinking about marbles at all (that's easy, it's 1/2), and then move on to the marbles while taking the coin as given (3/4 for black and 1/4 for white).

One cannot do this problem without using causal information. If we neglect the causal diagram, our information is the following: A: We know that Heads and Tails are mutually exclusive and exhaustive (MEE), B: we know that getting a black marble and getting a white marble are MEE, and C: we know that if the coin is Tails, you'll get a black marble.

This leaves three MEE options: Tails and Black (TB), HB, and HW. Maximizing entropy, they all get probability 1/3.
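As a rough numerical check of this claim, one can search a grid of distributions over (TB, HB, HW) for the one with the highest entropy. The grid resolution N and the helper entropy are choices made for this sketch, not part of the argument.

```python
import math

def entropy(probs):
    """Shannon entropy in nats, treating 0*log(0) as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

N = 60  # grid resolution: every probability is a multiple of 1/N
candidates = [
    (tb / N, hb / N, (N - tb - hb) / N)   # (P(TB), P(HB), P(HW))
    for tb in range(N + 1)
    for hb in range(N + 1 - tb)
]
best = max(candidates, key=entropy)
causal = (0.5, 0.25, 0.25)  # the answer obtained with the causal diagram

print(best, entropy(best))
print(causal, entropy(causal))
```

The uniform distribution wins the search, and the causal answer has strictly lower entropy, so it cannot come from this unconstrained maximisation.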

One could alternately think of it like this: if we don't have the causal part of the problem statement (the causal diagram D), we don't know whether the coin causes the marble choice, or the marble causes the coin choice - why not pick a marble first, and if it's W we give you an H coin, but if it's B we flip the coin? Heck, why have one cause the other at all? Indeed, you should recover the 1/3 result if you average over all the consistent causal diagrams.

So my question is - what causal constraints is our distribution subject to, and what is it optimizing? Not piece by piece, but all at once?

 

Rephrasing the usual process

One method is to just do the same steps as usual, but to think of the rationale in terms of knowledge / constraints and maximum entropy.

We start with the coin, and we say "because the coin's result isn't caused by the marbles, no information pertaining to marbles matters here. Therefore, P(H|ABCD) is just P(H|A) = 1/2" (First application of maximum entropy). Then we move on to the marbles, and applying information B and C, plus maximum entropy a second time, we learn that P(B|ABCD) = 3/4. All that our causal knowledge really meant for our probabilities was the equation P(H|ABCD)=P(H|A).
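The two applications of maximum entropy can be written out step by step. This is only a sketch of the procedure as described; the dictionary names are invented for the illustration.

```python
from fractions import Fraction as F

# Step 1: the coin has no causal parents, so marble information is ignored and
# maximum entropy over {H, T} gives 1/2 each.
p_coin = {"H": F(1, 2), "T": F(1, 2)}

# Step 2: taking the coin as given, spread probability evenly over the marbles
# that are still allowed. Constraint C says Tails forces a black marble.
allowed_marbles = {"H": ["black", "white"], "T": ["black"]}
p_marble_given_coin = {
    coin: {marble: F(1, len(options)) for marble in options}
    for coin, options in allowed_marbles.items()
}

# Combine the two steps with the chain rule.
p_black = sum(p_coin[c] * p_marble_given_coin[c].get("black", F(0)) for c in p_coin)
print(p_black)
```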

Alternatively, what if we only wanted to maximize something once, but let causal knowledge change the thing we were maximizing? We can say something like "we want to minimize the amount of information about the state of the coin, since that's the first causal node, and then minimize the amount of information about its descendant node, the marble." Although this could be represented as one equation using linear multipliers, it's clearly the same process just with different labels.

 

Is it even possible to be more elegant?

Both of these approaches are... functional. I like the first one a lot better, because I don't want to even come close to messing with the principle of maximum entropy / minimal information. But I don't like that we never get to apply this principle all at once. Can we break our knowledge down more so that everything happens nicely and elegantly?

The way we stated our knowledge above was as P(H|ABCD) = P(H|A). But this is equivalent to the statement that there's a symmetry between the left and right branches coming out of the causal node. We can express this symmetry using the equivalence principle as P(H)=P(T), or as P(HB)+P(HW)=P(TB).

But note that this is just hiding what's going on, because the equivalence principle is just a special case of the maximum entropy principle - we might as well just require that P(H)=1/2 but still say that at the end we're "maximizing entropy subject to this constraint."
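To see what "maximizing entropy subject to this constraint" gives, here is a small grid search with P(H) pinned to 1/2. The grid resolution N is an arbitrary choice for the sketch.

```python
import math

def entropy(probs):
    """Shannon entropy in nats, treating 0*log(0) as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

N = 60
# Constraint P(H) = P(T): P(TB) is fixed at 1/2, and P(HB) + P(HW) = 1/2.
candidates = [(0.5, hb / N, 0.5 - hb / N) for hb in range(N // 2 + 1)]
best = max(candidates, key=entropy)   # (P(TB), P(HB), P(HW))
print(best)
```

With the constraint in place, a single maximisation recovers 1/2, 1/4, 1/4, but all the causal content has been smuggled in through the constraint itself.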

 

Answer: Probably not

The general algorithm followed above is, for each causal node, to insert the condition that the probabilities of outputs of that node, given the starting information including the causal diagram, are equal to the probabilities given only the starting information related to that node or its parents - information about the descendants does not help determine probabilities of the parents.

Anthropic signature: strange anti-correlations

51 Stuart_Armstrong 21 October 2014 04:59PM

Imagine that the only way that civilization could be destroyed was by a large pandemic that occurred at the same time as a large recession, so that governments and other organisations were too weakened to address the pandemic properly.

Then if we looked at the past, as observers in a non-destroyed civilization, what would we expect to see? We could see years with no pandemics or no recessions; we could see mild pandemics, mild recessions, or combinations of the two; we could see large pandemics with no or mild recessions; or we could see large recessions with no or mild pandemics. We wouldn't see large pandemics combined with large recessions, as that would have caused us to never come into existence. These are the only things ruled out by anthropic effects.

Assume that pandemics and recessions are independent (at least, in any given year) in terms of "objective" (non-anthropic) probabilities. Then what would we see? We would see that pandemics and recessions appear to be independent when either of them is of small intensity. But as the intensity rose, they would start to become anti-correlated, with a large version of one completely precluding a large version of the other.

The effect is even clearer if we have a probabilistic relation between pandemics, recessions and extinction (something like: extinction risk proportional to product of recession size times pandemic size). Then we would see an anti-correlation rising smoothly with intensity.
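A toy calculation can make the smooth version concrete. The sketch assumes, purely for illustration, that pandemic and recession sizes are independent and uniform on a grid in [0, 1], and that a year with sizes (x, y) is survived with probability 1 - x*y; it then computes the covariance seen by survivors within a band of mild sizes and within a band of severe sizes.

```python
from fractions import Fraction as F
from itertools import product

def surviving_covariance(levels):
    """Covariance of (pandemic, recession) sizes among surviving years.

    Before selection, both sizes are independent and uniform over `levels`.
    Assumed toy model: a year with sizes (x, y) survives with probability 1 - x*y.
    """
    pairs = list(product(levels, repeat=2))
    weights = [1 - x * y for x, y in pairs]
    total = sum(weights)
    mean_x = sum(w * x for w, (x, _) in zip(weights, pairs)) / total
    mean_y = sum(w * y for w, (_, y) in zip(weights, pairs)) / total
    mean_xy = sum(w * x * y for w, (x, y) in zip(weights, pairs)) / total
    return mean_xy - mean_x * mean_y

mild = [F(k, 20) for k in range(0, 11)]      # sizes 0.0 to 0.5
severe = [F(k, 20) for k in range(10, 21)]   # sizes 0.5 to 1.0
cov_mild = surviving_covariance(mild)
cov_severe = surviving_covariance(severe)
print(float(cov_mild), float(cov_severe))
```

In this model both covariances are negative, and the severe band is more strongly anti-correlated than the mild one, even though the underlying events are independent.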

Thus one way of looking for anthropic effects in humanity's past is to look for different classes of incidents that are uncorrelated at small magnitude, and anti-correlated at large magnitudes. More generally, to look for different classes of incidents where the correlation changes at different magnitudes - without any obvious reasons. That might be the signature of an anthropic disaster we missed - or rather, that missed us.

Thought experiments on simplicity in logical probability

5 Manfred 20 August 2014 05:25PM

A common feature of many proposed logical priors is a preference for simple sentences over complex ones. This is sort of like an extension of Occam's razor into math. Simple things are more likely to be true. So, as it is said, "why not?"

 

Well, the analogy has some wrinkles - unlike hypothetical rules for the world, logical sentences do not form a mutually exclusive set. Instead, for every sentence A there is a sentence not-A with pretty much the same complexity, and probability 1-P(A). So you can't make the probability smaller for all complex sentences, because their negations are also complex sentences! If you don't have any information that discriminates between them, A and not-A will both get probability 1/2 no matter how complex they get.

But if our agent knows something that breaks the symmetry between A and not-A, like that A belongs to a mutually exclusive and exhaustive set of sentences with differing complexities, then it can assign higher probabilities to simpler sentences in this set without breaking the rules of probability. Except, perhaps, the rule about not making up information.

The question: is the simpler answer really more likely to be true than the more complicated answer, or is this just a delusion? If so, is it for some ontologically basic reason, or for a contingent and explainable reason?

 

There are two complications to draw your attention to. The first is in what we mean by complexity. Although it would be nice to use the Kolmogorov complexity of any sentence, which is the length of the shortest program that prints the sentence, such a thing is uncomputable by the kind of agent we want to build in the real world. The only thing our real-world agent is assured of seeing is the length of the sentence as-is. We can also find something in between Kolmogorov complexity and length by doing a brief search for short programs that print the sentence - this meaning is what is usually meant in this article, and I'll call it "apparent complexity."

The second complication is in what exactly a simplicity prior is supposed to look like. In the case of Solomonoff induction the shape is exponential - more complicated hypotheses are exponentially less likely. But why not a power law? Why not even a Poisson distribution? Does the difficulty of answering this question mean that thinking that simpler sentences are more likely is a delusion after all?

 

Thought experiments:

1: Suppose our agent knew from a trusted source that some extremely complicated sum could only be equal to A, or to B, or to C, which are three expressions of differing complexity. What are the probabilities?

 

Commentary: This is the most sparse form of the question. Not very helpful regarding the "why," but handy to stake out the "what." Do the probabilities follow a nice exponential curve? A power law? Or, since there are just the three known options, do they get equal consideration?

This is all based off intuition, of course. What does intuition say when various knobs of this situation are tweaked - if the sum is of unknown complexity, or of complexity about that of C? If there are a hundred options, or countably many? Intuitively speaking, does it seem like favoring simpler sentences is an ontologically basic part of your logical prior?
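To get a feel for how much the shape of the prior matters, here is a small sketch comparing the three candidate shapes on made-up numbers. The complexities assigned to A, B and C, and the exponent of the power law, are hypothetical.

```python
def normalise(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical apparent complexities (say, in bits) of the three candidate answers.
complexity = {"A": 5, "B": 10, "C": 20}

exponential = normalise({k: 2.0 ** -c for k, c in complexity.items()})
power_law = normalise({k: c ** -2.0 for k, c in complexity.items()})
uniform = normalise({k: 1.0 for k in complexity})

for label, dist in [("exponential", exponential), ("power law", power_law), ("uniform", uniform)]:
    print(label, {k: round(p, 4) for k, p in dist.items()})
```

The exponential prior puts almost everything on the simplest option, the power law is much gentler, and the uniform prior ignores complexity altogether, so the three intuitions give very different answers even in this sparse case.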

 

2: Consider subsequences of the digits of pi. If I give you a pair (n,m), you can tell me the m digits following the nth digit of pi. So if I start a sentence like "the subsequence of digits of pi (10^100, 10^2) = ", do you expect to see simpler strings of digits on the right side? Is this a testable prediction about the properties of pi?

 

Commentary: We know that there is always a short-ish program to produce the sequences, which is just to compute the relevant digits of pi. This sets a hard upper bound on the possible Kolmogorov complexity of sequences of pi (that grows logarithmically as you increase m and n), and past a certain m this will genuinely start restricting complicated sequences, and thus favoring "all zeros" - or does it?
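For concreteness, the "short-ish program" can look something like the sketch below, which uses Machin's formula with integer arithmetic. The function names and the convention that the leading 3 counts as digit 1 are choices made here, and this approach is only practical for small n; it is nowhere near reaching n = 10^100.

```python
def arctan_inverse(x, unity):
    """arctan(1/x) scaled by `unity`, via the alternating Taylor series in integers."""
    term = unity // x
    total = term
    x_squared = x * x
    divisor, sign = 3, -1
    while term:
        term //= x_squared
        total += sign * (term // divisor)
        divisor += 2
        sign = -sign
    return total

def pi_digit_string(count, guard=10):
    """First `count` digits of pi as a string ('31415...'), using Machin's formula.

    `guard` extra digits are computed and discarded to absorb rounding error.
    """
    unity = 10 ** (count - 1 + guard)
    scaled_pi = 4 * (4 * arctan_inverse(5, unity) - arctan_inverse(239, unity))
    return str(scaled_pi // 10 ** guard)

def subsequence(n, m):
    """The m digits following the nth digit of pi (the leading 3 counts as digit 1)."""
    return pi_digit_string(n + m)[n:n + m]

print(subsequence(1, 5))
```

The program's length does not depend on n and m except through writing the two numbers down, which is where the logarithmic bound on complexity comes from.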

After all, this is weak tea compared to an exponential simplicity prior, for which the all-zero sequence would be hojillions of times more likely than a messy one. On the other hand, an exponential curve allows sequences with higher Kolmogorov complexity than the computation of the digits of pi.

Does the low-level view outlined in the first paragraph above demonstrate that the exponential prior is bunk? Or can you derive one from the other with appropriate simplifications (keeping in mind Kolmogorov complexity vs. apparent complexity)? Does pi really contain more long simple strings than expected, and if not what's going on with our prior?

 

3: Suppose I am writing an expression that I want to equal some number you know - that is, the sentence "my expression = your number" should be true. If I tell you the complexity of my expression, what can you infer about the likelihood of the above sentence?

 

Commentary: If we had access to the Kolmogorov complexity of your number, then we could completely rule out answers that were too K-simple to work. With only an approximation, it seems like we can still say that simple answers are less likely up to a point. Then as my expression gets more and more complicated, there are more and more available wrong answers (and, outside of the system a bit, it becomes less and less likely that I know what I'm doing), and so probability goes down.

In the limit that my expression is much more complex than your number, does an elegant exponential distribution emerge from underlying considerations?

Top-Down and Bottom-Up Logical Probabilities

2 Manfred 22 July 2014 08:53AM

I.

I don't know very much model theory, and thus I don't fully understand Hutter et al.'s logical prior, detailed here, but nonetheless I can tell you that it uses a very top-down approach. About 60% of what I mean is that the prior is presented as a completed object with few moving parts, which fits the authors' mathematical tastes and proposed abstract properties the function should have. And for another thing, it uses model theory - a dead giveaway.

There are plenty of reasons to take a top-down approach. Yes, Hutter et al.'s function isn't computable, but sometimes the properties you want require uncomputability. And it's easier to come up with something vaguely satisfactory if you don't have to have many moving parts. This can range from "the prior is defined as a thing that fulfills the properties I want" on the lawful good side of the spectrum, to "clearly the right answer is just the exponential of the negative complexity of the statement, duh".

Probably the best reason to use a top-down approach to logical uncertainty is so you can do math to it. When you have some elegant description of global properties, it's a lot easier to prove that your logical probability function has nice properties, or to use it in abstract proofs. Hence why model theory is a dead giveaway.

There's one other advantage to designing a logical prior from the top down, which is that you can insert useful stuff like a complexity penalty without worrying too much. After all, you're basically making it up as you go anyhow, so you don't have to worry about where it comes from like you would if you were going from the bottom up.

A bottom-up approach, by contrast, starts with an imagined agent with some state of information and asks what the right probabilities to assign are. Rather than pursuing mathematical elegance, you'll see a lot of comparisons to what humans do when reasoning through similar problems, and demands for computability from the outset.

For me, a big opportunity of the bottom-up approach is to use desiderata that look like principles of reasoning. This leads to more moving parts, but also outlaws some global properties that don't have very compelling reasons behind them.

 

II.

Before we get to the similarities, rather than the differences, we'll have to impose the condition of limited computational resources. A common playing field, as it were. It would probably serve just as well to extend bottom-up approaches to uncomputable heights, but I am the author here, and I happen to be biased towards the limited-resources case.

The part of top-down assignment using limited resources will be played by a skeletonized pastiche of Paul Christiano's recent report:

i. No matter what, with limited resources we can only assign probabilities to a limited pool of statements. Accordingly, step one is to use some process to choose the set S0 of statements (and their negations) to assign probabilities.

ii. Then we use something like a weakened consistency condition (that can be decided between pairs of sentences in polynomial time) to set constraints on the probability function over S0. For example, sentences that are identical except for a double-negation have to be given the same probability.

iii. Christiano constructs a description-length-based "pre-prior" function that is bigger for shorter sentences. There are lots of options for different pre-priors, and I think this is a pretty good one.

iv. Finally, assign a logical probability function over S0 that is as similar as possible to the pre-prior while fulfilling the consistency condition. Christiano measures similarity using cross-entropy between the two functions, so that the problem is one of minimizing cross-entropy subject to a finite list of constraints. (Even if the pre-prior decreases exponentially, this doesn't mean that complicated statements will have exponentially low logical probability, because of the condition from step two that P(a statement) + P(its negation) = 1 - in a state of ignorance, everything still gets probability 1/2. The pre-prior only kicks in when there are more options with different description lengths.)

Next, let's look at the totally different world of a bottom-up assignment of logical probabilities, played here by a mildly rephrased version of my past proposal.

i. Pick a set of sentences S1 to try and figure out the logical probabilities of.

ii. Prove the truth or falsity of a bunch of statements in the closure of S1 under conjunction and negation (i.e. if sentences a and b are in S1, a&b is in the closure of S1).

iii. Assign a logical probability function over the closure of S1 under conjunction with maximum entropy, subject to the constraints proved in part two, plus the constraints that each sentence & its negation has probability 0.
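A toy version of these three steps, for two sentences and one proved theorem, might look like the sketch below. It simplifies the proposal by working directly with truth assignments to the sentences in S1, and it assumes the only proved constraint is that "a & not-b" has probability 0; the sentence names and the theorem are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

sentences = ["a", "b"]  # step i: the set S1

# Step ii (hypothetical result): we proved "a implies b",
# so the conjunction a & not-b must get probability 0.
def ruled_out(world):
    return world["a"] and not world["b"]

worlds = [dict(zip(sentences, values))
          for values in product([True, False], repeat=len(sentences))]
allowed = [w for w in worlds if not ruled_out(w)]

# Step iii: with only "probability 0" constraints, the maximum entropy
# distribution is uniform over the truth assignments that remain.
def probability(holds):
    return F(sum(1 for w in allowed if holds(w)), len(allowed))

print(probability(lambda w: w["a"]))               # P(a)
print(probability(lambda w: w["b"]))               # P(b)
print(probability(lambda w: w["a"] and w["b"]))    # P(a & b)
```

Each sentence and its negation automatically sum to 1 here, because every remaining truth assignment makes exactly one of them true.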

These turn out to be really similar! Look in step three of my bottom-up example - there's even a sneakily-inserted top-down condition about going through every single statement and checking an aspect of consistency. In the top-down approach, every theorem of a certain sort is proved, while in the bottom-up approach there are allowed to be lots of gaps - but the same sorts of theorems are proved. I've portrayed one as using proofs only about sentences in S0, and the other as using proofs in the entire closure of S1 under conjunction, but those are just points on an available continuum (for more discussion, see Christiano's section on positive semidefinite methods).

The biggest difference is this "pre-prior" thing. On the one hand, it's essential for giving us guarantees about inductive learning. On the other hand, what piece of information do we have that tells us that longer sentences really are less likely? I have unresolved reservations, despite the practical advantages.

 

III.

A minor confession - my choice of Christiano's report was not coincidental at all. The causal structure went like this:

Last week - Notice dramatic similarities in what gets proved and how it gets used between my bottom-up proposal and Christiano's top-down proposal.

Now - Write post talking about generalities of top-down and bottom-up approaches to logical probability, and then find as a startling conclusion the thing that motivated me to write the post in the first place.

The teeensy bit of selection bias here means that though these similarities are cool, it's hard to draw general conclusions.

So let's look at one more proposal, this one due to Abram Demski, modified to use limited resources.

i. Pick a set of sentences S2 to care about.

ii. Construct a function on sentences in S2 that is big for short sentences and small for long sentences.

iii. Start with the set of sentences that are axioms - we'll shortly add new sentences to the set.

iv. Draw a sentence from S2 with probability proportional to the function from step two.

v. Do a short consistency check (can use a weakened consistency condition, or just limited time) between this sentence and the sentences already in the set. If it's passed, add the sentence to the set.

vi. Keep doing steps four and five until you've either added or ruled out all the sentences in S2.

vii. The logical probability of a sentence is defined as the probability that it ends up in our set after going through this process. We can find this probability using Monte Carlo by just running the process a bunch of times and counting up what portion of the time each sentence is in the set by the end.

Okay, so this one looks pretty different. But let's look for the similarities. The exact same kinds of things get proved again - weakened or scattershot consistency checks between different sentences. If all you have in S2 are three mutually exclusive and exhaustive sentences, the one that's picked first wins - meaning that the probability function over what sentence gets picked first is acting like our pre-prior.
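The three-sentence case can be simulated directly. The sketch below is a simplified stand-in for the process in steps i to vii: there are no axioms, the "consistency check" just says that mutually exclusive sentences cannot be accepted together, and the weights standing in for the step-two function are made up.

```python
import random

def run_process(weights, consistent, rng):
    """One pass of the sampling process; returns the set of accepted sentences."""
    names = list(weights)
    accepted, rejected = set(), set()
    while len(accepted) + len(rejected) < len(names):
        # Step iv: draw a sentence with probability proportional to its weight.
        s = rng.choices(names, weights=[weights[n] for n in names])[0]
        if s in accepted or s in rejected:
            continue                      # already decided on an earlier draw
        # Step v: keep the sentence only if it passes the consistency check.
        if consistent(s, accepted):
            accepted.add(s)
        else:
            rejected.add(s)
    return accepted

def estimate(weights, consistent, trials=20000, seed=0):
    """Step vii: Monte Carlo estimate of each sentence's chance of being accepted."""
    rng = random.Random(seed)
    counts = dict.fromkeys(weights, 0)
    for _ in range(trials):
        for s in run_process(weights, consistent, rng):
            counts[s] += 1
    return {s: c / trials for s, c in counts.items()}

# Three mutually exclusive sentences: a new one is consistent only if nothing is accepted yet.
weights = {"A": 4.0, "B": 2.0, "C": 1.0}   # hypothetical values of the step-two function
probs = estimate(weights, lambda s, accepted: not accepted)
print(probs)
```

The estimates should come out close to the normalised weights 4/7, 2/7 and 1/7, which is the sense in which the first-pick distribution acts like a pre-prior.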

So even though the method is completely different, what's really going on is that sentences are being given measure that looks like the pre-prior, subject to the constraints of weakened consistency (via rejection sampling) and normalization (keep repeating until all statements are checked).

In conclusion: not everything is like everything else, but some things are like some other things.

How do you notice when you are ignorant of necessary alternative hypotheses?

16 [deleted] 24 June 2014 06:12PM

So I just wound up in a debate with someone over on Reddit about the value of conventional academic philosophy.  He linked me to a book review, in which both the review and the book are absolutely godawful.  That is, the author (and the reviewer following him) starts with ontological monism (the universe only contains a single kind of Stuff: mass-energy), adds in the experience of consciousness, reasons deftly that emergence is a load of crap... and then arrives at the conclusion of panpsychism.

WAIT HOLD ON, DON'T FLAME YET!

Of course panpsychism is bunk.  I would be embarrassed to be caught upholding it, given the evidence I currently have, but what I want to talk about is the logic being followed.

1) The universe is a unified, consistent whole.  Good!

2) The universe contains the experience/existence of consciousness.  Easily observable.

3) If consciousness exists, something in the universe must cause or give rise to consciousness.  Good reasoning!

4) "Emergence" is a non-explanation, so that can't be it.  Good!

5) Therefore, whatever stuff the unified universe is made of must be giving rise to consciousness in a nonemergent way.

6) Therefore, the stuff must be innately "mindy".

What went wrong in steps (5) and (6)?  The man was actually reasoning more-or-less correctly!  Given the universe he lived in, and the impossibility of emergence, he reallocated his probability mass to the remaining answer.  When he had eliminated the impossible, whatever remained, however low its prior, must be true.

The problem was, he eliminated the impossible, but overlooked a vast space of possible hypotheses that he didn't know about (but which we do): the most common of these is the computational theory of mind and consciousness, which says that we are made of cognitive algorithms.  A Solomonoff Inducer can just go on to the next length of bit-strings describing Turing machines, but we can't.

Now, I can spot the flaw in the reasoning here.  What frightens me is: what if I'm presented with some similar argument, and I can't spot the flaw?  What if, instead, I just neatly and stupidly reallocate my belief to what seems to me to be the only available alternative, while failing to go out and look for alternatives I don't already know about?  Notably, it seems like expected evidence is conserved, but expecting to locate new hypotheses means I should be reducing my certainty about all currently-available hypotheses now, so that I have some probability mass left to divide among the new possibilities.
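A toy sketch of the difference a reserved "hypothesis I haven't thought of" bucket makes. All numbers and the `eliminate` helper are hypothetical, chosen only to show the mechanics of renormalizing after one hypothesis is ruled out.

```python
def eliminate(prior, impossible):
    """Drop hypotheses judged impossible and renormalize what remains."""
    remaining = {h: p for h, p in prior.items() if h not in impossible}
    total = sum(remaining.values())
    return {h: p / total for h, p in remaining.items()}

# Closed hypothesis space: everything not "emergence" must be "panpsychism".
closed = eliminate({"emergence": 0.99, "panpsychism": 0.01}, {"emergence"})

# Open hypothesis space: some mass is held back for unknown alternatives.
open_ended = eliminate(
    {"emergence": 0.90, "panpsychism": 0.01, "unknown alternative": 0.09},
    {"emergence"},
)
```

In the closed version all the mass lands on panpsychism; in the open version most of it lands on the unknown alternative, which is a prompt to go looking rather than a conclusion.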

If you can notice when you're confused, how do you notice when you're ignorant?

Common sense quantum mechanics

11 dvasya 15 May 2014 08:10PM

Related to: Quantum physics sequence.

TLDR: Quantum mechanics can be derived from the rules of probabilistic reasoning. The wavefunction is a mathematical vehicle to transform a nonlinear problem into a linear one. The Born rule that is so puzzling for MWI results from the particular mathematical form of this functional substitution.

This is a brief overview of a recent paper in Annals of Physics (recently mentioned in Discussion):

Quantum theory as the most robust description of reproducible experiments (arXiv)

by Hans De Raedt, Mikhail I. Katsnelson, and Kristel Michielsen. Abstract:

It is shown that the basic equations of quantum theory can be obtained from a straightforward application of logical inference to experiments for which there is uncertainty about individual events and for which the frequencies of the observed events are robust with respect to small changes in the conditions under which the experiments are carried out.

In a nutshell, the authors use the "plausible reasoning" rules (as in, e.g., Jaynes' Probability Theory) to recover the quantum-physical results for the EPR and Stern-Gerlach experiments by adding a notion of experimental reproducibility in a mathematically well-formulated way and without any "quantum" assumptions. Then they show how the Schrodinger equation (SE) can be obtained from the nonlinear variational problem on the probability P for the particle-in-a-potential problem when the classical Hamilton-Jacobi equation holds "on average". The SE allows one to transform the nonlinear variational problem into a linear one, and in the course of said transformation, the (real-valued) probability P and the action S are combined in a single complex-valued function ~P^(1/2) exp(iS), which becomes the argument of the SE (the wavefunction).

This casts the "serious mystery" of Born probabilities in a new light. Instead of the observed frequency being the square(d amplitude) of the "physically fundamental" wavefunction, the wavefunction is seen as a mathematical vehicle to convert a difficult nonlinear variational problem for inferential probability into a manageable linear PDE, where it so happens that the probability enters the wavefunction under a square root.

Below I will excerpt some math from the paper, mainly to show that the approach actually works, but outlining just the key steps. This will be followed by some general discussion and reflection.

1. Plausible reasoning and reproducibility

The authors start from the usual desiderata that are well laid out in Jaynes' Probability Theory and elsewhere, and add to them another condition:

There may be uncertainty about each event. The conditions under which the experiment is carried out may be uncertain. The frequencies with which events are observed are reproducible and robust against small changes in the conditions.

Mathematically, this is a requirement that the probability P(x|θ,Z) of observation x, given an uncertain experimental parameter θ and the rest of our knowledge Z, is maximally robust to small changes in θ, with that robustness independent of θ. Using log-probabilities, this amounts to minimizing the "evidence"

Ev = ln[ P(x|θ+ε,Z) / P(x|θ,Z) ]

for any small ε so that |Ev| is not a function of θ (but the probability is).

2. The Einstein-Podolsky-Rosen-Bohm experiment

There is a source S that, when activated, sends a pair of signals to two routers R_1 and R_2. Each router then sends the signal to one of its two detectors D_{i,+} or D_{i,–} (i = 1, 2). Each router can be rotated and we denote as θ the angle between them. The experiment is repeated N times yielding the data set {x_1,y_1}, {x_2,y_2}, ... {x_N,y_N} where x and y are the outcomes from the two detectors (+1 or –1). We want to find the probability P(x,y|θ,Z).

After some calculations it is found that the single-trial probability can be expressed as P(x,y|θ,Z) = (1 + xy E_12(θ)) / 4, where E_12(θ) = Σ_{x,y=±1} xy P(x,y|θ,Z) is a periodic function.

From the properties of Bernoulli trials it follows that, for a data set of N trials with n_xy total outcomes of each type {x,y},

P(n_++, n_+–, n_–+, n_–– | θ,N,Z) = N! Π_{x,y=±1} P(x,y|θ,Z)^(n_xy) / n_xy!

and expanding this in a Taylor series it is found that

Ev = –(N ε^2 / 2) Σ_{x,y=±1} (1 / P(x,y|θ,Z)) (∂P(x,y|θ,Z)/∂θ)^2 + O(ε^3).

The expression in the sum is the Fisher information I_F for P. The maximum robustness requirement means it must be minimized. Writing it down as I_F = (dE_12(θ)/dθ)^2 / (1 – E_12(θ)^2), one finds that E_12(θ) = cos(θ I_F^(1/2) + φ), and since E_12 must be periodic in angle, I_F^(1/2) is a natural number, so the smallest possible value is I_F = 1. Choosing φ = π it is found that E_12(θ) = –cos(θ), and we obtain the result that

P(x,y|θ,Z) = (1 – xy cos θ) / 4,

which is the well-known correlation of two spin-1/2 particles in the singlet state.
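For readers who want to check the Fisher-information step, here is a short worked sketch. It uses only the single-trial form P(x,y|θ,Z) = (1 + xy E_12(θ))/4 stated above; writing E for E_12 and a prime for d/dθ is shorthand introduced here.

```latex
\begin{align}
\sum_{x,y=\pm 1} \frac{1}{P(x,y|\theta,Z)}
  \left(\frac{\partial P(x,y|\theta,Z)}{\partial\theta}\right)^{2}
 &= \sum_{x,y=\pm 1} \frac{(xy\,E'/4)^{2}}{(1+xy\,E)/4}
  = \frac{E'^{2}}{4}\left[\frac{2}{1+E}+\frac{2}{1-E}\right]
  = \frac{E'^{2}}{1-E^{2}}, \\
I_F = \frac{E'^{2}}{1-E^{2}} = \text{const}
 &\;\Longrightarrow\;
 E(\theta) = \cos\!\left(\theta\sqrt{I_F}+\varphi\right), \\
I_F = 1,\ \varphi=\pi
 &\;\Longrightarrow\;
 E(\theta) = -\cos\theta .
\end{align}
```

The second line can be verified by substitution: with E = cos(θ√I_F + φ) one has E'^2 = I_F sin^2(θ√I_F + φ) = I_F (1 – E^2).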

Needless to say, our derivation did not use any concepts of quantum theory. Only plain, rational reasoning strictly complying with the rules of logical inference and some elementary facts about the experiment were used.

3. The Stern-Gerlach experiment

This case is analogous to, and simpler than, the previous one. The setup contains a source emitting a particle with magnetic moment S, a magnet with field in the direction a, and two detectors D+ and D–.

Similarly to the previous section, P(x|θ,Z) = (1 + xE(θ)) / 2, where E(θ) = P(+|θ,Z) – P(–|θ,Z) is an unknown periodic function. By complete analogy we seek the minimum of I_F and find that E(θ) = ±cos(θ), so that

P(x|θ,Z) = (1 ± x cos θ) / 2.

In quantum theory, [this] equation is in essence just the postulate (Born’s rule) that the probability to observe the particle with spin up is given by the square of the absolute value of the amplitude of the wavefunction projected onto the spin-up state. Obviously, the variability of the conditions under which an experiment is carried out is not included in the quantum theoretical description. In contrast, in the logical inference approach, [equation] is not postulated but follows from the assumption that the (thought) experiment that is being performed yields the most reproducible results, revealing the conditions for an experiment to produce data which is described by quantum theory.

To repeat: there are no wavefunctions in the present approach. The only assumption is that a dependence of outcome on particle/magnet orientation is observed with robustness/reproducibility.

4. Schrodinger equation

A particle is located in unknown position θ on a line segment [–L, L]. Another line segment [–L, L] is uniformly covered with detectors. A source emits a signal and the particle's response is detected by one of the detectors.

After going to the continuum limit of infinitely many infinitely small detectors and accounting for translational invariance it is possible to show that the position of the particle θ and of the detector x can be interchanged so that dP(x|θ,Z)/dθ = –dP(x|θ,Z)/dx.

In exactly the same way as before we need to minimize Ev by minimizing the Fisher information, which is now

I_F = ∫ dx (1 / P(x|θ,Z)) (∂P(x|θ,Z)/∂x)^2.

However, simply solving this minimization problem will not give us anything new because nothing so far accounted for the fact that the particle moves in a potential. This needs to be built into the problem. This can be done by requiring that the classical Hamilton-Jacobi equation holds on average. Using the Lagrange multiplier method, we now need to minimize the functional

Here S(x) is the action (Hamilton's principal function). This minimization yields solutions for the two functions P(x|θ,Z) and S(x). It is a difficult nonlinear minimization problem, but it is possible to find a matching solution in a tractable way using a mathematical "trick". It is known that standard variational minimization of the functional

yields the Schrodinger equation for its extrema. On the other hand, if one makes the substitution combining two real-valued functions P and S into a single complex-valued ψ,

Q is immediately transformed into F, concluding the derivation of the Schrodinger equation. Incidentally, ψ is constructed so that P(x|θ,Z) = |ψ(x|θ,Z)|^2, which is the Born rule.
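The substitution can be checked explicitly. The sketch below writes out one consistent choice of the two functionals; the mass m, potential V(x), energy E, Lagrange multiplier λ, and the exact normalization of the phase are notation assumed here for illustration and may differ from the paper's conventions. P abbreviates P(x|θ,Z).

```latex
\begin{align}
F &= \int dx \left\{ \frac{1}{P}\left(\frac{\partial P}{\partial x}\right)^{2}
     + \lambda\left[\left(\frac{\partial S}{\partial x}\right)^{2}
     + 2m\,\bigl(V(x)-E\bigr)\right] P \right\}, \\
Q &= \int dx \left\{ 4\,\frac{\partial\psi^{*}}{\partial x}\frac{\partial\psi}{\partial x}
     + 2m\lambda\,\bigl(V(x)-E\bigr)\,\psi^{*}\psi \right\}, \\
\psi &= \sqrt{P}\;e^{\,i\sqrt{\lambda}\,S/2}
 \;\Longrightarrow\;
 4\left|\frac{\partial\psi}{\partial x}\right|^{2}
 = \frac{1}{P}\left(\frac{\partial P}{\partial x}\right)^{2}
 + \lambda\left(\frac{\partial S}{\partial x}\right)^{2} P,
 \qquad \psi^{*}\psi = P .
\end{align}
```

Inserting the last line into Q reproduces F term by term. Varying Q with respect to ψ* gives –4ψ'' + 2mλ(V – E)ψ = 0, which has the form of the time-independent Schrodinger equation if one identifies λ with 4/ħ^2.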

Summing up the meaning of Schrodinger equation in the present context:

Of course, a priori there is no good reason to assume that on average there is agreement with Newtonian mechanics ... In other words, the time-independent Schrodinger equation describes the collective of repeated experiments ... subject to the condition that the averaged observations comply with Newtonian mechanics.

The authors then proceed to derive the time-dependent SE (independently from the stationary SE) in a largely similar fashion.

5. What it all means

Classical mechanics assumes that everything about the system's state and dynamics can be known (at least in principle). It starts from axioms and proceeds to derive its conclusions deductively (as opposed to inductive reasoning). In this respect quantum mechanics is to classical mechanics what probabilistic logic is to classical logic.

Quantum theory is viewed here not as a description of what really goes on at the microscopic level, but as an instance of logical inference:

in the logical inference approach, we take the point of view that a description of our knowledge of the phenomena at a certain level is independent of the description at a more detailed level.

and

quantum theory does not provide any insight into the motion of a particle but instead describes all what can be inferred (within the framework of logical inference) from or, using Bohr’s words, said about the observed data

Such a treatment of QM is similar in spirit to Jaynes' Information Theory and Statistical Mechanics papers (I, II). Traditionally statistical mechanics/thermodynamics is derived bottom-up from the microscopic mechanics and a series of postulates (such as ergodicity) that allow us to progressively ignore microscopic details under strictly defined conditions. In contrast, Jaynes starts with minimum possible assumptions:

"The quantity x is capable of assuming the discrete values xi ... all we know is the expectation value of the function f(x) ... On the basis of this information, what is the expectation value of the function g(x)?"

and proceeds to derive the foundations of statistical physics from the maximum entropy principle. Of course, these papers deserve a separate post.

This community should be particularly interested in how this all aligns with the many-worlds interpretation. Obviously, any conclusions drawn from this work can only apply to the "quantum multiverse" level and cannot rule out or support any other many-worlds proposals.

In quantum physics, MWI does quite naturally resolve some difficult issues in the "wavefunction-centric" view. However, we see that the concept of the wavefunction is not really central to quantum mechanics. This removes the whole problem of wavefunction collapse that MWI seeks to resolve.

The Born rule is arguably a big issue for MWI. But here it essentially boils down to "x is quadratic in t where t = sqrt(x)". Without the wavefunction (only probabilities) the problem simply does not appear.

Here is another interesting conclusion:

if it is difficult to engineer nanoscale devices which operate in a regime where the data is reproducible, it is also difficult to perform these experiments such that the data complies with quantum theory.

In particular, this relates to the decoherence of a system via random interactions with the environment. Thus decoherence becomes not a physical, intrinsically quantum phenomenon of "worlds drifting apart", but a property of experiments that are not well-isolated from the influence of the environment and therefore not reproducible. Well-isolated experiments are robust (and described by "quantum inference") and poorly-isolated experiments are not (hence quantum inference does not apply).

In sum, it appears that quantum physics when viewed as inference does not require many-worlds any more than probability theory does.

Quantum versus logical bombs

13 Stuart_Armstrong 17 November 2013 03:14PM

Child, I'm sorry to tell you that the world is about to end. Most likely. You see, this madwoman has designed a doomsday machine that will end all life as we know it - painlessly and immediately. It is attached to a supercomputer that will calculate the 10^100th digit of pi - if that digit is zero, we're safe. If not, we're doomed and dead.

However, there is one thing you are allowed to do - switch out the logical trigger and replace it with a quantum trigger, which instead generates a quantum event that will prevent the bomb from triggering with 1/10th measure squared (in the other cases, the bomb goes off). Are you OK paying €5 to replace the triggers like this?

If you treat quantum measure squared exactly as probability, then you shouldn't see any reason to replace the trigger. But if you believed in many worlds quantum mechanics (or think that MWI is possibly correct with non-zero probability), you might be tempted to accept the deal - after all, everyone will survive in one branch. But strict total utilitarians may still reject the deal. Unless they refuse to treat quantum measure as akin to probability in the first place (meaning they would accept all quantum suicide arguments), they tend to see a universe with a tenth of measure-squared as exactly equally valued to a 10% chance of a universe with full measure. And they'd even do the reverse, replace a quantum trigger with a logical one, if you paid them €5 to do so.
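The comparison can be made concrete with a toy calculation. The two valuation functions below are hypothetical stand-ins: `total_utilitarian` values a world in proportion to its measure, while `extinction_averse` (a square root, chosen arbitrarily) represents someone who treats total extinction as disproportionately bad.

```python
import math

def expected_value(chance_world_survives, surviving_measure, value_of_measure):
    """Expected value of a trigger: probability of survival times value of what survives."""
    return chance_world_survives * value_of_measure(surviving_measure)

def total_utilitarian(measure):
    # Value is linear in measure: a tenth of the measure is worth a tenth.
    return measure

def extinction_averse(measure):
    # Hypothetical concave valuation: any surviving branch counts for a lot.
    return math.sqrt(measure)

# Logical trigger: 10% chance that everything survives at full measure.
# Quantum trigger: certainty that a tenth of the measure survives.
logical_tu = expected_value(0.1, 1.0, total_utilitarian)
quantum_tu = expected_value(1.0, 0.1, total_utilitarian)
logical_ea = expected_value(0.1, 1.0, extinction_averse)
quantum_ea = expected_value(1.0, 0.1, extinction_averse)
```

With the linear valuation the two triggers come out equal, so the €5 decides it either way; with any concave valuation the quantum trigger wins.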

Still, most people, in practice, would choose to change the logical bomb for a quantum bomb, if only because they were slightly uncertain about their total utilitarian values. It would seem self evident that risking the total destruction of humanity is much worse than reducing its measure by a factor of 10 - a process that would be undetectable to everyone.

Of course, once you agree with that, we can start squeezing. What if the quantum trigger only has 1/20 measure-squared "chance" of saving us? 1/1000? 1/10000? If you don't want to fully accept the quantum immortality arguments, you need to stop - but at what point?

Of all the SIA-doomsdays in the all the worlds...

4 Stuart_Armstrong 18 October 2013 12:56PM

Ideas developed with Paul Almond, who kept on flogging a dead horse until it started showing signs of life again.

Doomsday, SSA and SIA

Imagine there's a giant box filled with people, and clearly labelled (inside and out) "(year of some people's lord) 2013". There's another giant box somewhere else in space-time, labelled "2014". You happen to be currently in the 2013 box.

Then the self-sampling assumption (SSA) produces the doomsday argument. It works approximately like this: SSA has a preference for universes with smaller numbers of observers (since it's more likely that you're one-in-a-hundred than one-in-a-billion). Therefore we expect that the number of observers in 2014 is smaller than we would otherwise "objectively" believe: the likelihood of doomsday is higher than we thought.
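A minimal numerical sketch of the SSA shift, under assumptions made purely for illustration: two candidate worlds with equal prior, the same 2013 population, and different 2014 populations, with all observers in both boxes as the reference class. The function name and the population figures are hypothetical.

```python
def ssa_posterior(prior, observers_2013, observers_2014):
    """Posterior over worlds given 'I find myself in the 2013 box', under SSA."""
    weights = {}
    for world, p in prior.items():
        n13 = observers_2013[world]
        n14 = observers_2014[world]
        # SSA likelihood: chance that a random observer in this world is in 2013.
        weights[world] = p * n13 / (n13 + n14)
    total = sum(weights.values())
    return {world: w / total for world, w in weights.items()}

prior = {"doom": 0.5, "no doom": 0.5}
posterior = ssa_posterior(
    prior,
    observers_2013={"doom": 100, "no doom": 100},
    observers_2014={"doom": 10, "no doom": 1000},
)
```

Here the posterior on the "doom" world rises above its prior, which is the doomsday shift described above.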

What about the self-indication assumption (SIA) - that makes the doomsday argument go away, right? Not at all! SIA has no effect on the number of observers expected in 2014, but increases the expected number of observers in 2013. Thus we still expect the number of observers in 2014 to be lower than we otherwise thought. There's an SIA doomsday too!

Enter causality

What's going on? SIA was supposed to defeat the doomsday argument! What happens is that I've implicitly cheated - by naming the boxes "2013" and "2014", I've heavily implied that these "boxes" figuratively correspond to two successive years. But then I've treated them as independent for SIA, like two literal distinct boxes.

Freakonomics Study Investigates Decision-Making and Estimated Prior Probabilities

3 telms 30 July 2013 04:08AM

The Freakonomics web site is currently conducting online research that appears, to this properly hypothesis-blinded participant, to be investigating decision-making and estimated prior probability of success. You can participate yourself at http://www.freakonomics.com/experiments/.

The study asks the participant to choose a yes/no decision that they would be willing to commit to making on the basis of a random coin toss. (Well, actually, the random decay of an atomic nucleus, but they use coin flip graphics.) In my case, the only decision I was willing to make on such a random basis is something with very low risks: namely, the decision whether or not to quit twisting my hair. I accepted the obligation to change my behavior based on a coin toss, and the coin toss says I gotta change.

Breaking a habit of such long standing will be difficult. Past behavior is the best predictor of future behavior, and all that, so when they asked how LIKELY I thought it would be that my hair-twisting habit would stick despite my best efforts to get rid of it, I estimated 90%. Yet I also claimed that I WILL PROBABLY (not certainly, but probably) conquer the habit.

Yes, I recognize the dissonance between these two statements. It intrigues me. Is it perhaps the intent of the experiment to create explicit, conscious, cognitive dissonance like this in some participants, and see what difference it makes to outcomes?

They could easily have phrased the odds question in the inverse form. They COULD have asked how likely I thought it was that I would SUCCEED in achieving my goal. That would align neatly with my statement of commitment and yield no dissonance. I could make the usual biased assumptions that strength of willpower is the same as odds of success, and over-estimate those success odds accordingly.

I don't actually know that the study cares about this, but this is what I would care about if I were the researchers.

The Freakonomics people will be following up over time by email. They're also checking on me through a friend, so there is every possibility that they expect to see an interaction between social involvement in the decision's outcome and the presence of cognitive dissonance, which is believed to drive SOCIAL behavior more strongly than it drives personal decisions kept to oneself.

I'm posting this to increase my social commitment, of course. I also posted on Facebook. It's terrible to have a psychologically trained participant make assumptions about your research project and leverage those assumptions to the max for imaginary ends. But that's life in social science. :)

[LINK] If correlation doesn’t imply causation, then what does?

4 Strilanc 12 July 2013 05:39AM

A post about how, for some causal models, causal relationships can be inferred without doing experiments that control one of the random variables.

If correlation doesn’t imply causation, then what does?

To help address problems like the two example problems just discussed, Pearl introduced a causal calculus. In the remainder of this post, I will explain the rules of the causal calculus, and use them to analyse the smoking-cancer connection. We’ll see that even without doing a randomized controlled experiment it’s possible (with the aid of some reasonable assumptions) to infer what the outcome of a randomized controlled experiment would have been, using only relatively easily accessible experimental data, data that doesn’t require experimental intervention to force people to smoke or not, but which can be obtained from purely observational studies.

Caught in the glare of two anthropic shadows

17 Stuart_Armstrong 04 July 2013 07:54PM

This article consists of original new research, so would not get published on Wikipedia!

The previous post introduced the concept of the anthropic shadow: the fact that certain large and devastating disasters cannot be observed in the historical record, because if they had happened, we wouldn't be around to observe them. This absence forms an “anthropic shadow”.

But that was the result for a single category of disasters. What would happen if we consider two independent classes of disasters? Would we see a double shadow, or would one ‘overshadow’ the other?

To answer that question, we’re going to have to analyse the anthropic shadow in more detail, and see that there are two separate components to it:

  • The first is the standard effect: humanity cannot have developed a technological civilization, if there were large catastrophes in the recent past.
  • The second effect is the lineage effect: humanity cannot have developed a technological civilization, if there was another technological civilization in the recent past that survived to today (or at least, we couldn't have developed the way we did).

To illustrate the difference between the two, consider the following model. Segment time into arbitrary “eras”. In a given era, a large disaster may hit with probability q, or a small disaster may independently hit with probability q (hence with probability q^2, there will be both a large and a small disaster). A small disaster will prevent a technological civilization from developing during that era; a large one will prevent such a civilization from developing in that era or the next one.

If it is possible for a technological civilization to develop (no small disasters that era, no large ones in the preceding era, and no previous civilization), then one will do so with probability p. We will assume p constant: our model will only span a time frame where p is unchanging (maybe it's over the time period after the rise of big mammals?)
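The model as described so far is simple enough to simulate. The sketch below is one reading of it, with hypothetical function names and parameter values: each history runs until the first civilization appears, and we then look back at what that civilization would see in its own record.

```python
import random

def first_civilization(q, p, max_eras, rng):
    """Simulate eras until a civilization arises; return (era, large_disasters) or None."""
    large = []
    for t in range(max_eras):
        large_now = rng.random() < q   # large disaster this era
        small_now = rng.random() < q   # independent small disaster this era
        large.append(large_now)
        large_before = t > 0 and large[t - 1]
        possible = not (small_now or large_now or large_before)
        if possible and rng.random() < p:
            return t, large            # stop here: no earlier civilization existed
    return None

def observed_large_disaster_rates(q=0.2, p=0.1, max_eras=500, runs=20000, seed=0):
    """Frequency of large disasters one and two eras before the civilization's era."""
    rng = random.Random(seed)
    one_back = two_back = histories = 0
    for _ in range(runs):
        result = first_civilization(q, p, max_eras, rng)
        if result is None or result[0] < 2:
            continue
        era, large = result
        histories += 1
        one_back += large[era - 1]
        two_back += large[era - 2]
    return one_back / histories, two_back / histories

rate_one_back, rate_two_back = observed_large_disaster_rates()
```

In this sketch the era just before the civilization never contains a large disaster, however large q is: that is the standard shadow. The rate two eras back can be compared with q to look for any lineage effect.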

[Link]: Anthropic shadow, or the dark dusk of disaster

10 Stuart_Armstrong 04 July 2013 07:52PM

From a paper by Milan M. Ćirković, Anders Sandberg, and Nick Bostrom:

We describe a significant practical consequence of taking anthropic biases into account in deriving predictions for rare stochastic catastrophic events. The risks associated with catastrophes such as asteroidal/cometary impacts, supervolcanic episodes, and explosions of supernovae/gamma-ray bursts are based on their observed frequencies. As a result, the frequencies of catastrophes that destroy or are otherwise incompatible with the existence of observers are systematically underestimated. We describe the consequences of this anthropic bias for estimation of catastrophic risks, and suggest some directions for future work.

There cannot have been a large disaster on Earth in the last millennium, or we wouldn't be around to see it. There can't have been a very large disaster on Earth in the last ten thousand years, or we wouldn't be around to see it. There can't have been a huge disaster on Earth in the last million years, or we wouldn't be around to see it. There can't have been a planet-destroying disaster on Earth... ever.

Thus the fact that we exist precludes us seeing certain types of disasters in the historical record; as we get closer and closer to the present day, the magnitude of the disasters we can see goes down. These missing disasters form the "anthropic shadow", somewhat visible in the top right of this diagram:

Hence even though it looks like the risk is going down (the magnitude is diminishing as we approach the present), we can't rely on this being true: it could be a purely anthropic effect.

 

[LINK] Bets do not (necessarily) reveal beliefs

12 Cyan 27 May 2013 08:13PM

When does a bet fail to reveal your true beliefs? When it hedges a risk in your portfolio.

If this claim does not immediately strike you as obviously true, you may benefit from reading this post by econblogger Noah Smith. Excerpt:

 

...Alex Tabarrok famously declared that "a bet is a tax on bullshit".

But this idea, attractive as it is, is not quite true. The reason is something that I've decided to call the Fundamental Error of Risk. It's a mistake that most people make (myself often included!), and that an intro finance class spends months correcting. The mistake is looking at the risk and return of single assets instead of total portfolios. Basically, the risk of an asset - which includes a bet! - is based mainly on how that asset relates to other assets in your portfolio.
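A toy example of a hedge that looks like a belief. Everything here is hypothetical: the bettor has log utility (an assumption, not something the excerpt specifies), believes an event has probability 0.45, and already stands to lose 50 out of a wealth of 100 if the event occurs.

```python
import math

def expected_log_wealth(p_event, wealth, loss_if_event, stake=0.0):
    """Expected log of final wealth; an even-odds bet on the event pays +stake or -stake."""
    wealth_if_event = wealth - loss_if_event + stake
    wealth_if_not = wealth - stake
    return (p_event * math.log(wealth_if_event)
            + (1 - p_event) * math.log(wealth_if_not))

p = 0.45                                    # bettor's actual belief
bet_expected_gain = p * 20 - (1 - p) * 20   # even-odds bet of 20 on the event
without_bet = expected_log_wealth(p, wealth=100, loss_if_event=50)
with_bet = expected_log_wealth(p, wealth=100, loss_if_event=50, stake=20)
```

The bet has negative expected monetary value at the bettor's own probability, yet it raises expected log wealth because it pays off exactly when the existing exposure hurts. An observer who reads the bet as "this person thinks the event is more likely than not" would be wrong.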

 

Probabilistic Löb theorem

24 Stuart_Armstrong 26 April 2013 06:45PM

In this post (based on results from MIRI's recent workshop), I'll be looking at whether reflective theories of logical uncertainty (such as Paul's design) still suffer from Löb's theorem.

Theories of logical uncertainty are theories which can assign probability to logical statements. Reflective theories are theories which know something about themselves within themselves. In Paul's theory, there is an external P, in the meta language, which assigns probabilities to statements, an internal P, inside the theory, that computes probabilities of coded versions of the statements inside the language, and a reflection principle that relates these two P's to each other.

And Löb's theorem is the result that if a (sufficiently complex, classical) system can prove that "a proof of Q implies Q" (often abbreviated as □Q → Q), then it can prove Q. What would be the probabilistic analogue? Let's use □_a Q to mean P('Q')≥1-a (so that □_0 Q is the same as the old □Q; see this post on why we can interchange probabilistic and provability notions). Then Löb's theorem in a probabilistic setting could be:

Probabilistic Löb's theorem: for all a<1, if the system can prove □_a Q → Q, then the system can prove Q.

To understand this condition, we'll go through the proof of Löb's theorem in a probabilistic setting, and see if and when it breaks down. We'll conclude with an example to show that any decent reflective probability theory has to violate this theorem.

Logic in the language of probability

12 Stuart_Armstrong 26 April 2013 06:45PM

This post is a minor note, to go along with the post on the probabilistic Löb theorem. It simply seeks to justify why terms like "having probability 1" are used interchangeably with "provable" and why implication symbols "→" can be used in a probabilistic setting.

Take a system of classical logic, with a single rule of inference: modus ponens:

From A and A→B, deduce B.

Having a single rule of inference isn't much of a restriction, because you can replace other rules of inference ("from A_1, A_2, ... and A_n, deduce B") with an axiom or axiom schema ("A_1∧A_2∧...∧A_n → B") and then use modus ponens on that axiom to get the other rule of inference.

In this logical system, I'm now going to make some purely syntactical changes - not changing the meaning of anything, just the way we write things. For any sentence A that doesn't contain an implication arrow →, replace

A with P(A)=1.

Similarly, replace any sentence of the type

A → B with P(B|A)=1.

This is recursive, so we replace

(A → B) → C with P(C | P(B|A)=1 )=1.

And instead of using modus ponens, we'll use a combined Bayesian inference and law of total probability:

From P(A)=1 and P(B|A)=1, deduce P(B)=1.
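The probabilistic rule is a one-line consequence of the product rule and monotonicity of probability, sketched here for completeness:

```latex
\begin{equation}
P(B) \;\ge\; P(A \wedge B) \;=\; P(B \mid A)\,P(A) \;=\; 1 \cdot 1 \;=\; 1
\quad\Longrightarrow\quad P(B) = 1 .
\end{equation}
```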

Estimate Stability

6 lukeprog 13 April 2013 06:33PM

I've been trying to get clear on something you might call "estimate stability." Steven Kaas recently posted my question to StackExchange, but we might as well post it here as well:

I'm trying to reason about something I call "estimate stability," and I'm hoping you can tell me whether there’s some relevant technical language...
What do I mean by "estimate stability?" Consider these three different propositions:
  1. We’re 50% sure that a coin (known to be fair) will land on heads.
  2. We’re 50% sure that Matt will show up at the party.
  3. We’re 50% sure that Strong AI will be invented by 2080.
These estimates feel different. One reason they feel different is that the estimates have different degrees of "stability." In case (1) we don't expect to gain information that will change our probability estimate. But for cases (2) and (3), we may well come upon some information that causes us to adjust the estimate either up or down.
So estimate (1) is more "stable," but I'm not sure how this should be quantified. Should I think of it in terms of running a Monte Carlo simulation of what future evidence might be, and looking at something like the variance of the distribution of the resulting estimates? What happens when it’s a whole probability distribution for e.g. the time Strong AI is invented? (Do you calculate the stability of the probability density for every year, then average the result?)
Here are some other considerations that would be useful to relate more formally to considerations of estimate stability:
  • If we’re estimating some variable, having a narrow probability distribution (prior to future evidence with respect to which we’re trying to assess the stability) corresponds to having a lot of data. New data, in that case, would make less of a contribution in terms of changing the mean and reducing the variance.
  • There are differences in model uncertainty between the three cases. I know what model to use when predicting a coin flip. My method of predicting whether Matt will show up at a party is shakier, but I have some idea of what I’m doing. With the Strong AI case, I don’t really have any good idea of what I’m doing. Presumably model uncertainty is related to estimate stability, because the more model uncertainty we have, the more we can change our estimate by reducing our model uncertainty.
  • Another difference between the three cases is the degree to which our actions allow us to improve our estimates, increasing their stability. For example, we can reduce the uncertainty and increase the stability of our estimate about Matt by calling him, but we don’t really have any good ways to get better estimates of Strong AI timelines (other than by waiting).
  • Value-of-information affects how we should deal with delay. Estimates that are unstable in the face of evidence we expect to get in the future seem to imply higher VoI. This creates a reason to accept delays in our actions. Or if we can easily gather information that will make our estimates more accurate and stable, that means we have more reason to pay the cost of gathering that information. If we expect to forget information, or expect our future selves not to take information into account, dynamic inconsistency becomes important. This is another reason why estimates might be unstable. One possible strategy here is to precommit to have our estimates regress to the mean.
Thanks for any thoughts!

Seize the Maximal Probability Moment

24 diegocaleiro 28 February 2013 11:22AM

Try and remember 3 or 4 things that you think would be effective hacks for your life but you have not so far implemented. Really, find three. 

 

Probably that was not so hard.

 

Now think of the moment in time at which you had the maximal probability of implementing each hack. Sometimes you had no idea that was the moment. But sometimes you did, like when a friend tells you "I just read this great paper on how people report cartoons being funnier when their face is shaped in a more smiling fashion." and you thought "Great! I may one day implement the algorithm: if studying, force a smile".

You knew you didn't plan to read the article, you knew you trusted that friend, and you knew you'd either forget it later or, in any case, that from that moment on the likelihood of you implementing the algorithm would only drop.

So my hack of the day is: If you feel you are likely at the maximal probability moment to start a new policy, start immediately.

 

My friend was telling me about how he went abroad to research: "...so at this place and people there used very strong lights as cognitive enhancement and yadda yadda yadda... (stopped listening for 40s) yadda yadda yadda.... and I wrote a paper on ..."  By that time my room had an extra 110W light working.   

 

Just now₀ I thought: It was good I installed that light. Why didn't I do the same when I felt like finding a personalized shirt website where the front would be "I Don't want to talk about: [list]" and the back "Pick your topic: [list]" to once and for all stop the gossip and sports ice-breakers? 

I didn't seize the maximal probability moment. That's what happened.

 

Then I noticed that that₁ was the maximal probability moment to install in my mind the maximal probability moment algorithm. I did, and that₂ was the maximal probability moment of writing this post.

Now if you'll excuse me, I have₃ a shirt to buy.

[Link] On the Height of a Field

11 badger 02 January 2013 11:20AM

Mark Eichenlaub posted a great little case-study about the difficulty of updating beliefs, even over trivial matters like the slope of a baseball field. The basic story of Bayes-updating assumes the likelihood of evidence in different states is obvious, but feedback between observations and judgments about likelihood quickly complicates the situation:

The story of how belief is supposed to work is that for each bit of evidence, you consider its likelihood under all the various hypotheses, then multiplying these likelihoods, you find your final result, and it tells you exactly how confident you should be. If I can estimate how likely it is for Google Maps and my GPS to corroborate each other given that they are wrong, and how likely it is given that they are right, and then answer the same question for every other bit of evidence available to me, I don’t need to estimate my final beliefs – I calculate them. But even in this simple testbed of the matter of a sloped baseball field, I could feel my biases coming to bear on what evidence I considered, and how strong and relevant that evidence seemed to me.  The more I believed the baseball field was sloped, the more relevant (higher likelihood ratio) it seemed that there was that short steep hill on the side, and the less relevant that my intuition claimed the field was flat. The field even began looking more sloped to me as time went on, and I sometimes thought I could feel the slope as I ran, even though I never had before.

That’s what I was interested in here. I wanted to know more about the way my feelings and beliefs interacted with the evidence and with my methods of collecting it. It is common knowledge that people are likely to find what they’re looking for whatever the facts, but what does it feel like when you’re in the middle of doing this, and can recognizing that feeling lead you to stop?

Edit: Title changed from "An Empirical Evaluation into Runner's High," the original title of the article, to match the author's new title.

Some scary life extension dilemmas

2 Ghatanathoah 01 January 2013 06:41PM

Let's imagine a life extension drug has been discovered.  One dose of this drug extends one's life by 49.99 years.  This drug also has a mild cumulative effect: if it is given to someone who has been dosed with it before, it will extend their life by 50 years.

Under these constraints the most efficient way to maximize the amount of life extension this drug can produce is to give every dose to one individual.  If there were one dose available for each of the seven billion people alive on Earth, then giving every person one dose would result in a total of 349,930,000,000 years of life gained.  If one person was given all the doses a total of 349,999,999,999.99 years of life would be gained.  Sharing the life extension drug equally would result in a net loss of almost 70 million years of life.  If you're concerned about people's reaction to this policy then we could make it a big lottery, where every person on Earth gets a chance to gamble their dose for a chance at all of them.
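The totals in this paragraph can be checked with a few lines of exact decimal arithmetic. This is only a sketch of the sums described above; the names `shared_total` and `concentrated_total` are made up for illustration.

```python
from decimal import Decimal

FIRST_DOSE = Decimal("49.99")   # years gained from a person's first dose
REPEAT_DOSE = Decimal("50")     # years gained from each later dose
PEOPLE = 7_000_000_000          # one dose available per person


def shared_total(people):
    """Years gained if every person takes exactly one dose."""
    return people * FIRST_DOSE


def concentrated_total(doses):
    """Years gained if a single person takes every dose."""
    return FIRST_DOSE + (doses - 1) * REPEAT_DOSE


shared = shared_total(PEOPLE)
concentrated = concentrated_total(PEOPLE)
print(shared, concentrated, concentrated - shared)
```

The gap between the two totals is one hundredth of a year for every dose after the first, which is where the figure of almost 70 million years comes from.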

Now, one could make certain moral arguments in favor of sharing the drug.  I'll get to those later.  However, it seems to me that gambling your dose for a chance at all of them isn't rational from a purely self-interested point of view either.  You will not win the lottery.  Your chances of winning this particular lottery are almost 7,000 times worse than your chances of winning the powerball jackpot.  If someone gave me a dose of the drug, and then offered me a chance to gamble in this lottery, I'd accuse them of Pascal's mugging.

Here's an even scarier thought experiment.  Imagine we invent the technology for whole brain emulation.  Let "x" equal the amount of resources it takes to sustain a WBE through 100 years of life.  Let's imagine that with this particular type of technology, it costs 10x to convert a human into a WBE and it costs 100x to sustain a biological human through the course of their natural life.  Let's have the cost of making multiple copies of a WBE once they have been converted be close to 0.

Again, under these constraints it seems like the most effective way to maximize the amount of life extension done is to convert one person into a WBE, then kill everyone else and use the resources that were sustaining them to make more WBEs, or extend the life of more WBEs.  Again, if we are concerned about people's reaction to this policy we could make it a lottery.  And again, if I was given a chance to play in this lottery I would turn it down and consider it a form of Pascal's mugging.

I'm sure that most readers, like myself, would find these policies very objectionable.  However, I have trouble finding objections to them from the perspective of classical utilitarianism.  Indeed, most people have probably noticed that these scenarios are very similar to Nozick's "utility monster" thought experiment.  I have made a list of possible objections to these scenarios that I have been considering:

1. First, let's deal with the unsatisfying practical objections.  In the case of the drug example, it seems likely that a more efficient form of life extension will be developed in the future.  In that case it would be better to give everyone the drug to sustain them until that time.  However, this objection, like most practical ones, seems unsatisfying.  It seems like there are strong moral objections to not sharing the drug.

Another pragmatic objection is that, in the case of the drug scenario, the lucky winner of the lottery might miss their friends and relatives who have died.  And in the WBE scenario it seems like the lottery winner might get lonely being the only person on Earth.  But again, this is unsatisfying.  If the lottery winner were allowed to share their winnings with their immediate social circle, or if they were a sociopathic loner who cared nothing for others, it still seems bad that they end up killing everyone else on Earth.   

2. One could use the classic utilitarian argument in favor of equality: diminishing marginal utility.  However, I don't think this works.  Humans don't seem to experience diminishing returns from lifespan in the same way they do from wealth.  It's absurd to argue that a person who lives to the ripe old age of 60 generates less utility than two people who die at age 30 (all other things being equal).  The reason the DMU argument works when arguing for equality of wealth is that people are limited in their ability to get utility from their wealth, because there is only so much time in the day to spend enjoying it.  Extended lifespan removes this restriction, making a longer-lived person essentially a utility monster.

3. My intuitions about the lottery could be mistaken.  It seems to me that if I was offered the possibility of gambling my dose of life extension drug with just one other person, I still wouldn't do it.  If I understand probabilities correctly, then gambling for a chance at living either 0 or 99.99 additional years is equivalent to having a certainty of an additional 49.995  years of life, which is better than the certainty of 49.99 years of life I'd have if I didn't make the gamble.  But I still wouldn't do it, partly because I'd be afraid I'd lose and partly because I wouldn't want to kill the person I was gambling with. 

So maybe my horror at these scenarios is driven by that same hesitancy.  Maybe I just don't understand the probabilities right.  But even if that is the case, even if it is rational for me to gamble my dose with just one other person, it doesn't seem like the gambling would scale.  I will not win the "lifetime lottery."

4. Finally, we have those moral objections I mentioned earlier.  Utilitarianism is a pretty awesome moral theory under most circumstances.  However, when it is applied to scenarios involving population growth and scenarios where one individual is vastly better at converting resources into utility than their fellows, it tends to produce very scary results.  If we accept the complexity of value thesis (and I think we should), this suggests that there are other moral values that are not salient in the "special case" of scenarios with no population growth or utility monsters, but become relevant in scenarios where there are.

For instance, it may be that prioritarianism is better than pure utilitarianism, and in this case sharing the life extension method might be best because of the benefits it accords the least off.  Or it may be (in the case of the WBE example) that having a large number of unique, worthwhile lives in the world is valuable because it produces experiences like love, friendship, and diversity. 

My tentative guess at the moment is that there probably are some other moral values that make the scenarios I described morally suboptimal, even though they seem to make sense from a utilitarian perspective.  However, I'm interested in what other people think.  Maybe I'm missing something really obvious.

 

EDIT:  To make it clear, when I refer to "amount of years added" I am assuming for simplicity's sake that all the years added are years that the person whose life is being extended wants to live and that contain a large amount of positive experiences. I'm not saying that lifespan is exactly equivalent to utility. The problem I am trying to resolve is that the scenarios I've described seem to maximize the number of positive events it is possible for the people in the scenario to experience, even though they involve killing the majority of people involved.  I'm not sure "positive experiences" is exactly equivalent to "utility" either, but it's likely a much closer match than lifespan.

A solvable Newcomb-like problem - part 3 of 3

3 Douglas_Reay 06 December 2012 01:06PM

This is the third part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

   Part 1 - stating the problem
   Part 2 - some mathematics
   Part 3 - towards a solution

 


 

In many situations we can say "For practical purposes a probability of 0.9999999999999999999 is close enough to 1 that for the sake of simplicity I shall treat it as being 1, without that simplification altering my choices."

However, there are some situations where the distinction does significantly alter the character of a situation so, when one is studying a new situation and one is not sure yet which of those two categories the situation falls into, the cautious approach is to re-frame the probability as being (1 - δ) where δ is small (eg 10 to the power of -12), and then examine the characteristics of the behaviour as δ tends towards 0.

LessWrong wiki describes Omega as a super-powerful AI analogous to Laplace's demon, who knows the precise location and momentum of every atom in the universe, limited only by the laws of physics (so, if time travel isn't possible and some of our current thoughts on Quantum Mechanics are correct, then Omega's knowledge of the future is probabilistic, being limited by uncertainty).

For the purposes of Newcomb's problem, and the rationality of Fred's decisions, it doesn't matter how close to that level of power Omega actually is.   What matters, in terms of rationality, is the evidence available to Fred about how close Omega is to having that level of power; or, more precisely, the evidence available to Fred relevant to Fred making predictions about Omega's performance in this particular game.

Since this is a key factor in Fred's decision, we ought to be cautious.  Rather than specify when setting up the problem that Fred knows with a certainty of 1 that Omega does have that power, it is better to specify a concrete level of evidence that would lead Fred to assign a probability of (1 - δ) to Omega having that power, then examine how the rational choice of option in the box problem changes as δ tends towards 0.

The Newcomb-like problem stated in part 1 of this sequence contains an Omega that it is rational for Fred to assign a less than unity probability of being able to perfectly predict Fred's choices.  By using bets as analogies to the sort of evidence Fred might have available to him, we create an explicit variable that we can then manipulate to alter the precise probability Fred assigns to Omega's abilities.

The other nice feature of the Newcomb-like problem given in part 1, is that it is explicitly solvable using the mathematics given in part 2.  By making randomness an external feature (the device Fred brings with him) rather than purely a feature of Fred's internal mind, we can acknowledge the question of Omega being able to predict quantum events, capture it as a variable, and take it into account when setting out the payoff matrix for the problem.

This means that, instead of Fred having to think "When I walked into this room I was determined to pick one-box.  As far as anyone knew or could predict, including myself, I intended to pick one-box.  However nothing I do now can change Omega's decision - the money is already in the box.  So I've nothing to lose by changing my mind."; Fred can now allocate a specific probability to whether Omega could predict Fred's chance of changing his mind in such circumstances, and Fred can take that into account in his strategy by making his chance of changing strategy explicit and external - basing it upon a random number device.

Or, to put it another way, we are modelling a rational human who has a specific finite chance of talking himself into overriding a pre-committed strategy, as being made up from two components: a component that will infallibly stick to a pre-committed strategy plus a component with a known chance of change; we then treat the combined rational human as being someone infallibly committed to a meta-strategy that includes a chance of change - a mixed equilibrium, from Omega's point of view.

 

Ok, time to look at the numbers and draw a pretty diagram...

Fred is Player A, and he has two meta options:
  A1 - play it safe.  100% chance of one-box and 0% chance of two-box
  A2 - take a risk.   Mentally pre-commit to using the device to produce 99% chance of one-box and 1% chance of two-box.

Omega is Player B, and he has two meta options:
  B1 - reward risk.  Not attempt to distinguish between the mental state of Fred taking 1% risk and Fred playing it safe.
  B2 - punish risk.  Attempt to distinguish and, if Omega guesses Fred is taking risk rather than playing safe, punish it.

We'll start out by assuming that if Omega does attempt to distinguish, then Omega will have a 1 in 10,000 false positive rate (thinking Fred is going to use the device, when actually Fred intends to play it safe) and a 1 in 10,000 false negative rate (thinking Fred is going to play it safe, when actually Fred intends to use the device).

A1 vs B1

Fred gains $1,000,000
Omega loses $1,000,000 to Fred but gains $1,000,000,000 from Alpha, for a net gain of $999,000,000

A2 vs B1

99% of the time, Fred gains $1,000,000 and Omega net gains $999,000,000
1% of the time, Fred gains $1,001,000 and Omega net loses $10,001,001,000

Combining those gives an average of:
Fred gains: $1,000,010
Omega gains: $888,999,990

A1 vs B2

99.99% of the time, Omega correctly discerns that Fred is playing safe
Fred gains $1,000,000
Omega gains $999,000,000

0.01% of the time, Omega falsely believes that Fred is taking a risk, and punishes that by putting $0 in Box A
Fred gains $0
Omega loses $10,000,000,000

Combining those gives an average of:
Fred gains: $999,900
Omega gains: $997,900,100

A2 vs B2

In 100 trials out of 1,000,000 trials Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device.  Of these:

In 1 trial out of 1,000,000 trials: Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device and, in this case, the device picks two-box
==> Fred gains $1,001,000
==> Omega loses $10,001,001,000

In 99 trials out of 1,000,000 trials: Omega incorrectly thinks Fred will play it safe, when actually Fred takes the risk of using the device and, in this case, the device picks one-box
==> Fred gains $1,000,000
==> Omega gains $999,000,000

In 999,900 trials out of 1,000,000 trials Omega correctly thinks that Fred is going to take the risk of using the device.  Of those:

In 9,999 trials out of 1,000,000 trials: Omega correctly thinks that Fred is going to take the risk of using the device and, in this case, the device picks two-box
==> Fred gains $1,000
==> Omega gains $999,999,000

In 989,901 trials out of 1,000,000 trials: Omega correctly thinks that Fred is going to take the risk of using the device and, in this case, the device picks one-box
==> Fred gains $0
==> Omega loses $10,000,000,000

Combining those gives an average of:
Fred gains $110
Omega loses $9,888,922,110

 

If Fred's best estimate, beforehand, of the chances of Omega choosing meta strategy B1 is P, then

Fred's expected gain from choosing meta strategy A1 would be: $1,000,000 P + $999,900 (1-P)
Fred's expected gain from choosing meta strategy A2 would be: $1,000,010 P + $110 (1-P)

We can ask ourselves at what value of P these two strategies would return the same expected gain:

$1,000,000 P + $999,900 (1-P)   =    $1,000,010 P + $110 (1-P)
                       $999,790 (1-P)   =                $10 P
                       $999,790           =        $999,800 P
                                          P    = 0.99999
                                      (1-P)  = 0.00001

In other words, Fred thinking that Omega has even a 1 in 100,000 chance of being vindictive should be enough to deter Fred from taking the risky strategy.

But how does that look from Omega's point of view?   If Omega thinks that Fred's chance of picking meta strategy A1 is Q, then what is the cost to Omega of picking B2 1 in 100,000 times?

Omega's expected gain from choosing meta strategy B1 would be: $999,000,000 Q + $888,999,990 (1-Q)
Omega's expected gain from choosing meta strategy B2 would be: $997,900,100 Q - $9,888,922,110 (1-Q)

0.99999 { $999,000,000 Q + $888,999,990 (1-Q)  } + 0.00001 { $997,900,100 Q - $9,888,922,110 (1-Q) }
= (1 - 0.00001) { $888,999,990 + $110,000,010 Q } + 0.00001 { - $9,888,922,110  + $10,886,822,210 Q  }
= $888,999,990 + $110,000,010 Q + 0.00001 { - $9,888,922,110  + $10,886,822,210 Q - $888,999,990 - $110,000,010 Q }
= $888,999,990 + $110,000,010 Q + 0.00001 { - $10,777,922,100 + $10,776,822,200 Q }
= ( $888,999,990 - $107,779.221 ) + ( $110,000,010 + $107,768.222 ) Q
= $888,892,211 + $110,107,778 Q

Compared with always playing B1, which is worth $888,999,990 + $110,000,010 Q, that is a difference of - $107,779 + $107,768 Q: a cost to Omega of about $11 per game if Fred always plays it safe (Q = 1), rising to about $107,779 per game if Fred always takes the risk (Q = 0).
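The four cell averages, and the break-even value of P, can be recomputed directly from the per-outcome payoffs listed for each case. The sketch below does that with exact fractions. It assumes the 1% device setting and the 1 in 10,000 error rates stated above; the function and constant names are invented for illustration.

```python
from fractions import Fraction as F

# (Fred's gain, Omega's net gain) for each outcome, as listed in the cases above.
ONE_BOX_FULL = (1_000_000, 999_000_000)       # Fred one-boxes, box A holds $1,000,000
TWO_BOX_FULL = (1_001_000, -10_001_001_000)   # Fred two-boxes, box A holds $1,000,000
ONE_BOX_EMPTY = (0, -10_000_000_000)          # Fred one-boxes, box A holds $0
TWO_BOX_EMPTY = (1_000, 999_999_000)          # Fred two-boxes, box A holds $0

DEVICE_TWO_BOX = F(1, 100)   # chance the device picks two-box under A2
ERROR_RATE = F(1, 10_000)    # Omega's false positive and false negative rate


def cell(fred_takes_risk, omega_punishes):
    """Average (Fred, Omega) payoff for one pairing of meta strategies."""
    two_box = DEVICE_TWO_BOX if fred_takes_risk else F(0)
    if not omega_punishes:
        empty = F(0)              # B1: box A is always filled
    elif fred_takes_risk:
        empty = 1 - ERROR_RATE    # B2 spots the risk unless it is a false negative
    else:
        empty = ERROR_RATE        # B2 wrongly suspects a safe Fred
    outcomes = [
        ((1 - empty) * (1 - two_box), ONE_BOX_FULL),
        ((1 - empty) * two_box, TWO_BOX_FULL),
        (empty * (1 - two_box), ONE_BOX_EMPTY),
        (empty * two_box, TWO_BOX_EMPTY),
    ]
    fred = sum(p * pay[0] for p, pay in outcomes)
    omega = sum(p * pay[1] for p, pay in outcomes)
    return fred, omega


a1b1, a2b1 = cell(False, False), cell(True, False)
a1b2, a2b2 = cell(False, True), cell(True, True)

# P at which Fred is indifferent between A1 and A2.
gain_if_b1 = a2b1[0] - a1b1[0]   # what risk earns Fred when Omega plays B1
loss_if_b2 = a1b2[0] - a2b2[0]   # what risk costs Fred when Omega plays B2
break_even_p = loss_if_b2 / (loss_if_b2 + gain_if_b1)

for name, value in [("A1 vs B1", a1b1), ("A2 vs B1", a2b1),
                    ("A1 vs B2", a1b2), ("A2 vs B2", a2b2)]:
    print(name, [float(v) for v in value])
print("break-even P:", float(break_even_p))
```

Changing `DEVICE_TWO_BOX` or `ERROR_RATE` gives a quick way to see how the break-even point moves when Fred takes a different amount of risk or Omega discerns more or less accurately.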

 

Perhaps a meta strategy of 1% chance of two-boxing is not Fred's optimal meta strategy.  Perhaps, at that level compared to Omega's ability to discern, it is still worth Omega investing in being vindictive occasionally, in order to deter Fred from taking risk.   But, given sufficient data about previous games, Fred can make a guess at Omega's ability to discern.  And, likewise Omega, by including in the record of past games occasions when Omega has falsely accused a human player of taking risk, can signal to future players where Omega's boundaries are.   We can plot graphs of these to find the point at which Fred's meta strategy and Omega's meta strategy are in equilibrium - where if Fred took any larger chances, it would start becoming worth Omega's while to punish risk sufficiently often that it would no longer be in Fred's interests to take the risk.   Precisely where that point is will depend on the numbers we picked in Part 1 of this sequence.  By exploring the space created by using each variable number as a dimension, we can divide it into regions characterised by which strategies dominate within that region.

Extrapolating that as δ tends towards 0 should then carry us closer to a convincing solution to Newcomb's Problem.

 


 

  Back to Part 1 - stating the problem
  Back to Part 2 - some mathematics
  This is   Part 3 - towards a solution

A solvable Newcomb-like problem - part 2 of 3

0 Douglas_Reay 03 December 2012 04:49PM

This is the second part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

   Part 1 - stating the problem
   Part 2 - some mathematics
   Part 3 - towards a solution

 


 

In game theory, a payoff matrix is a way of presenting the results of two players simultaneously picking options.

For example, in the Prisoner's Dilemma, Player A gets to choose between option A1 (Cooperate) and option A2 (Defect) while, at the same time Player B gets to choose between option B1 (Cooperate) and option B2 (Defect).   Since years spent in prison are a negative outcome, we'll write them as negative numbers:

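[Payoff matrix for the Prisoner's Dilemma: rows are Player A's options A1 and A2, columns are Player B's options B1 and B2.]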

So, if you look at the bottom right hand corner, at the intersection of Player A defecting (A2) and Player B defecting (B2) we see that both players end up spending 4 years in prison.   Whereas, looking at the bottom left we see that if A defects and B cooperates, then Player A ends up spending 0 years in prison and Player B ends up spending 5 years in prison.

Another familiar example we can present in this form is the game Rock-Paper-Scissors.

We could write it as a zero sum game, with a win being worth 1, a tie being worth 0 and a loss being worth -1:

But it doesn't change the mathematics if we give both players 2 points each round just for playing, so that a win becomes worth 3 points, a tie becomes worth 2 points and a loss becomes worth 1 point.  (Think of it as two players in a game show being rewarded by the host, rather than the players making a direct bet with each other.)
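Both versions of the Rock-Paper-Scissors matrix can be written out in a few lines. This is a sketch that follows the numbering used in this post (option 1 is Rock, option 2 is Paper, and option 3 is assumed to be Scissors); the names `outcome`, `zero_sum` and `game_show` are illustrative.

```python
OPTIONS = ["Rock", "Paper", "Scissors"]   # options 1, 2 and 3 for either player


def outcome(mine, theirs):
    """+1 if option index `mine` beats `theirs`, 0 for a tie, -1 for a loss."""
    # Each option beats the one just before it in the cycle Rock, Paper, Scissors.
    return {0: 0, 1: 1, 2: -1}[(mine - theirs) % 3]


# Entry [a][b] is (Player A's payoff, Player B's payoff).
zero_sum = [[(outcome(a, b), outcome(b, a)) for b in range(3)] for a in range(3)]

# Same game with 2 points added to every payoff: win 3, tie 2, loss 1.
game_show = [[(pa + 2, pb + 2) for pa, pb in row] for row in zero_sum]

for label, matrix in [("zero sum", zero_sum), ("game show", game_show)]:
    print(label)
    for name, row in zip(OPTIONS, matrix):
        print(f"  A plays {name:8}", row)
```

Adding the same constant to every entry leaves each player's ranking of the outcomes unchanged, which is why the mathematics is unaffected.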

If you are Player A, and you are playing against a Player B who always chooses option B1 (Rock), then your strategy is clear.  You choose option A2 (Paper) each time.  Over 10 rounds, you'd expect to end up with $30 compared to B's $10.

Let's imagine a slightly more sophisticated Player B, who always picks Rock in the first round, and then for all other rounds picks whatever would beat Player A's choice the previous round.   This strategy would do well against someone who always picked the same option each round, but it is deterministic and, if we guess it correctly in advance, we can design a strategy that beats it every time.  (In this case, picking Paper-Rock-Scissors then repeating back to Paper).   In fact whatever strategy B comes up with, if that strategy is deterministic and we guess it in advance, then we end up with $30 and B ends up with $10.
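A short simulation shows how a correctly guessed deterministic strategy gets beaten every round. It is a sketch using the 3/2/1 scoring described above; `b_move` and `play` are illustrative names.

```python
# BEATS[x] is the option that beats x.
BEATS = {"Rock": "Paper", "Paper": "Scissors", "Scissors": "Rock"}
POINTS = {"win": 3, "tie": 2, "loss": 1}


def b_move(a_history):
    """Player B: Rock first, then whatever beats A's previous move."""
    return "Rock" if not a_history else BEATS[a_history[-1]]


def play(rounds=10):
    """Player A knows B's rule and always plays what beats B's next move."""
    a_history, a_total, b_total = [], 0, 0
    for _ in range(rounds):
        b = b_move(a_history)
        a = BEATS[b]
        if a == b:
            a_result, b_result = "tie", "tie"
        elif BEATS[b] == a:
            a_result, b_result = "win", "loss"
        else:
            a_result, b_result = "loss", "win"
        a_total += POINTS[a_result]
        b_total += POINTS[b_result]
        a_history.append(a)
    return a_history, a_total, b_total


history, a_total, b_total = play()
print(history[:4], a_total, b_total)
```

Player A's moves cycle through Paper, Rock, Scissors, and A collects 3 points every round to B's 1.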

What if B has a deterministic strategy that B picked in advance and doesn't change, but we don't know at the start of the first round what it is?   In theory B might have picked any of the 3-to-the-power-of-10 deterministic strategies that are indistinguishable from each other over a 10 round duel but, in practice, humans tend to favour some strategies over others so, if you know humans and the game of Rock-Paper-Scissors better than Player B does, you have a better than even chance of guessing his pattern and coming out ahead in the later rounds of the duel.

But there's a danger to that.  What if you have overestimated your comparative knowledge level and Player B uses your overconfidence to lure you into thinking you've cracked B's pattern, while really B is laying a trap, increasing the predictability of Player A's moves so Player B can then take advantage of that to work out which moves will trump them?  This works better in a game like poker, where the stakes are not the same each round, but it is still possible in Rock-Paper-Scissors, and you can imagine variants of the game where the host varies the payoff matrix by increasing the lose-tie-win rewards from 1,2,3 in the first round, to 2,4,6 in the second round, 3,6,9 in the third round, and so on.

This is why the safest strategy is not to have a deterministic strategy but, instead, to use a source of random bits to pick, each round, option 1 with a probability of 33%, option 2 with a probability of 33% or option 3 with a probability of 33% (modulo rounding).  You might not get to take advantage of any predictability that becomes apparent in your opponent's strategy, but neither can you be fooled into becoming predictable yourself.

On a side note, this still applies even when there is only one round, because unaided humans are not as good at coming up with random bits as they think they are.  Someone who has observed many first time players will notice that first time players more often than not choose Rock as their 'random' first move, rather than Paper or Scissors.  If such a person were confident that they were playing a first time player, they might therefore pick Paper as their first move more frequently than not.  Things soon get very Sicilian (in the sense of the duel between Westley and Vizzini in the film The Princess Bride) after that, because a yet more sophisticated player who guessed their opponent would try this, could then pick Scissors.  And so ad infinitum, with ever more implausible levels of discernment being required to react on the next level up.

We can imagine a tournament set up between 100 players taken randomly from the expertise distribution of game players, each player submitting a Python program that always plays the same first move, and for each of the remaining 9 rounds produces a move determined solely by the moves so far in that duel.  The tournament organiser would then run every player's program once against the programs of each of the other 99 players, so on average each player would collect 99x10x2 = $1,980.

We could make things more complex by allowing the programs to use, as an input, how much money their opponent has won so far during the tournament; or iterate over running the tournament several times, to give each player an 'expertise' rating which the program in the following tournament could then use.  We could allow the tournament host to subtract from each player a sum of money depending upon the size of program that player submitted (and how much memory or cpu it used).   We could give each player a limited ration of random bits, so when facing a player with a higher expertise rating they might splurge and make their move on all 10 rounds completely random, and when facing a player with a lower expertise they might conserve their supply by trying to 'out think' them.

There are various directions we could take this, but the one I want to look at here is what happens when you make the payoff matrix asymmetric.  What happens if you make the game unfair, so not only does one player have more at stake than the other player, but the options are not even either, for example:

You still have the circular Rock-Paper-Scissors dynamic where:
   If B chose B3, then A wants most to have chosen A1
   If A chose A1, then B wants most to have chosen B2
   If B chose B2, then A wants most to have chosen A3
   If A chose A3, then B wants most to have chosen B1
   If B chose B1, then A wants most to have chosen A2
   If A chose A2, then B wants most to have chosen B3

so everything wins against at least one other option, and loses against at least one other option.   However Player B is clearly now in a better position, because B wins ties, and B's wins (a 9, an 8 and a 7) tend to be larger than A's wins (a 9, a 6 and a 6).

What should Player A do?  Is the optimal safe strategy still to pick each option with an equal weighting?

Well, it turns out the answer is: no, an equal weighting isn't the optimal response.   Neither is just picking the same 'best' option each time.  Instead, what you do is pick your 'best' option a bit more frequently than an equal weighting would suggest, but not so much that the opponent can steal away that gain by reliably choosing the specific option that trumps yours.   Rather than duplicate material already well presented on the web, I will point you at two lecture courses on game theory that explain how to calculate the exact probability to assign to each option:

You do this by using the indifference theorem to arrive at a set of linear equations, which you can then solve to arrive at a mixed equilibrium where neither player increases their expected utility by altering the probability weightings they assign to their options.
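As a minimal sketch of that indifference calculation: a player's mix is chosen so that the opponent's expected payoff is the same for every option the opponent could pick, which gives a small system of linear equations. The asymmetric matrix used in this post is not reproduced here, so the example below uses a made-up variant of the 3/2/1 game in which Player B gets 5 points instead of 3 for beating Rock with Paper. The helper names `solve_linear` and `indifference_mix` are illustrative, and the result is only an equilibrium mix when every probability it returns lies between 0 and 1.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a must be invertible)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def indifference_mix(opponent_payoff):
    """My mix that leaves the opponent indifferent between all their options.

    opponent_payoff[i][j] is the opponent's payoff when I play i and they play j.
    """
    n = len(opponent_payoff)
    rows, rhs = [], []
    for j in range(n - 1):
        # Expected payoff of their option j equals that of their last option.
        rows.append([opponent_payoff[i][j] - opponent_payoff[i][n - 1]
                     for i in range(n)])
        rhs.append(0)
    rows.append([1] * n)   # my probabilities sum to 1
    rhs.append(1)
    return solve_linear(rows, rhs)


# Player B's payoffs; rows are A's Rock, Paper, Scissors, columns are B's.
symmetric_b = [[2, 3, 1], [1, 2, 3], [3, 1, 2]]
hypothetical_b = [[2, 5, 1], [1, 2, 3], [3, 1, 2]]   # made-up asymmetric variant

print(indifference_mix(symmetric_b))
print(indifference_mix(hypothetical_b))
```

In the symmetric game the mix comes out as equal thirds. In the made-up variant Player A plays Rock less often, because Rock is the option B profits most from beating, and shifts that weight to Scissors.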

 

The TL;DR; points to take away

If you are competing in what is effectively a simultaneous option choice game, with a being who you suspect may have expertise at the game equal to or higher than yours, you can nullify their advantage by picking a strategy that, each round, chooses randomly (using a weighting) between the available options.

Depending upon the details of the payoff matrix, there may be one option that it makes sense for you to pick most of the time but, unless that option is strictly better than all your other choices no matter what option your opponent picks, there is still utility to gain from occasionally picking the other options in order to keep your opponent on their toes.

 


 

  Back to Part 1 - stating the problem
  This is  Part 2 - some mathematics
  Next to Part 3 - towards a solution

A solvable Newcomb-like problem - part 1 of 3

1 Douglas_Reay 03 December 2012 09:26AM

This is the first part of a three post sequence on a problem that is similar to Newcomb's problem but is posed in terms of probabilities and limited knowledge.

   Part 1 - stating the problem
   Part 2 - some mathematics
   Part 3 - towards a solution

 


 

Omega is an AI, living in a society of AIs, who wishes to enhance his reputation in that society for being successfully able to predict human actions.  Given some exchange rate between money and reputation, you could think of that as a bet between him and another AI, let's call it Alpha.  And since there is also a human involved, for the sake of clarity, to avoid using "you" all the time, I'm going to sometimes refer to the human using the name "Fred".

 

Omega tells Fred:

I'd like you to pick between two options, and I'm going to try to predict which option you're going to pick.
    Option "one box" is to open only box A, and take any money inside it
    Option "two box" is to open both box A and box B, and take any money inside them

but, before you pick your option, declare it, then open the box or boxes, there are three things you need to know.

 

Firstly, you need to know the terms of my bet with Alpha.

If Fred picks option "one box" then:
   If box A contains $1,000,000 and box B contains $1,000 then Alpha pays Omega $1,000,000,000
   If box A contains $0              and box B contains $1,000 then Omega pays Alpha $10,000,000,000
   If anything else, then both Alpha and Omega pay Fred $1,000,000,000,000

If Fred picks option "two box" then:
   If box A contains $1,000,000 and box B contains $1,000 then Omega pays Alpha $10,000,000,000
   If box A contains $0              and box B contains $1,000 then Alpha pays Omega $1,000,000,000
   If anything else, then both Alpha and Omega pay Fred $1,000,000,000,000

 

Secondly, you should know that I've already placed all the money in the boxes that I'm going to, and I can't change the contents of the boxes between now and when you do the opening, because Alpha is monitoring everything.  I've already made my prediction, using a model I've constructed of your likely reactions based upon your past actions.

You can use any method you like to choose between the two options, short of contacting another AI, but be warned that if my model predicted that you'd use a method which introduces too large a random element (such as tossing a coin) then, while I may lose my bet with Alpha, I'll certainly have made sure you won't win the $1,000,000.  Similarly, if my model predicted that you'd make an outside bet with another human (let's call him George) to alter the value of winning $1,001,000 from me, I'd have also taken that into account.  (I say "human" by the way, because my bet with Alpha is about my ability to predict humans, so if you contact another AI, such as trying to lay a side bet with Alpha to skim some of his winnings, that invalidates not only my game with you, but also my bet with Alpha, and there are no winnings to skim.)

 

And, third and finally, you need to know my track record in previous similar situations.

I've played this game 3,924 times over the past 100 years (i.e. since the game started), with humans picked at random from the full variety of the population.   The outcomes were:
   3000 times players picked option "one box" and walked away with $1,000,000
   900  times players picked option "two box" and walked away with $1,000
   24 times players flipped a coin or were otherwise too random.  Of those players:
        12 players picked option "one box" and walked away with $0
        12 players picked option "two box" and walked away with $1,000

Never has anyone ever ended up walking away with $1,001,000 by picking option "two box".
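To make the numbers above easier to check, here is a small sketch that tallies Omega's side of the bet with Alpha over the stated track record. It assumes box B always held $1,000 (which is consistent with every listed outcome) and ignores the "anything else" clause, since no listed outcome triggers it. The names omega_net and record are made up for this example.

```python
from fractions import Fraction

WIN = 1_000_000_000      # Alpha pays Omega this amount
LOSS = 10_000_000_000    # Omega pays Alpha this amount


def omega_net(choice, box_a):
    """Omega's gain from one game, assuming box B holds $1,000."""
    if choice == "one box":
        return WIN if box_a == 1_000_000 else -LOSS
    # choice == "two box"
    return -LOSS if box_a == 1_000_000 else WIN


# (number of players, choice, contents of box A), inferred from the payouts listed.
record = [
    (3000, "one box", 1_000_000),  # walked away with $1,000,000
    (900, "two box", 0),           # walked away with $1,000
    (12, "one box", 0),            # too random, walked away with $0
    (12, "two box", 0),            # too random, walked away with $1,000
]

total_players = sum(n for n, _, _ in record)
omega_total = sum(n * omega_net(choice, box_a) for n, choice, box_a in record)
games_won = sum(n for n, choice, box_a in record if omega_net(choice, box_a) > 0)
hit_rate = Fraction(games_won, total_players)

# Fraction of games Omega must win just to break even against Alpha.
break_even = Fraction(LOSS, WIN + LOSS)

print(total_players)  # 3924
print(omega_total)    # 3792000000000
print(hit_rate)       # 326/327
print(break_even)     # 10/11
```

Under those assumptions, the only games on record that cost Omega money are the 12 where a too-random player ended up picking "one box".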

 

Omega stops talking.   You are standing in a room containing two boxes, labelled "A" and "B", which are both currently closed.  Everything Omega said matches what you expected him to say, as the conditions of the game are always the same and are well known - you've talked with other human players (who confirmed it is legit) and listened to their advice.   You've not contacted any AIs, though you have read the published statement from Alpha that also confirms the terms of the bet and details of the monitoring.  You've not made any bets with other humans, even though your dad did offer to bet you a bottle of whiskey that you'd be one of them too smart alecky fools who walked away with only $1,000.  You responded by pre-committing to keep any winnings you make between you and your banker, and to never let him know.

The only relevant physical object you've brought along is a radioactive-decay-based random number generator, whose result Omega would have been unable to predict in advance, just in case you decide to use it as a factor in your choice.  It isn't a coin, giving only a 50% chance of "one box" and a 50% chance of "two box".   You can set arbitrary odds (tell it to generate a random integer between 0 and any positive integer you give it, up to 10 to the power of 100).   Omega said in his spiel the phrase "too large a random element" but didn't specify where that boundary was.
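For concreteness, here is a sketch of how a generator like that could be turned into a choice at arbitrary odds. It assumes "between 0 and N" means 0 to N inclusive, and it uses Python's secrets module as a software stand-in for the radioactive decay device. The names draw and choose are made up for this example, and the 999-in-1000 odds are only an illustration, not a recommendation.

```python
import secrets


def draw(n):
    """Stand-in for the device: uniform random integer from 0 to n inclusive."""
    return secrets.randbelow(n + 1)


def choose(numerator, denominator, rng=draw):
    """Return "one box" with probability numerator/denominator, else "two box"."""
    if not (0 <= numerator <= denominator) or denominator < 1:
        raise ValueError("need 0 <= numerator <= denominator and denominator >= 1")
    r = rng(denominator - 1)  # uniform over 0 .. denominator-1
    return "one box" if r < numerator else "two box"


# Example: pick "one box" 999 times in 1000.
print(choose(999, 1000))

# Exhaustive check of the odds, feeding in every possible device output once.
outcomes = [choose(999, 1000, rng=lambda n, r=r: r) for r in range(1000)]
one_box_count = outcomes.count("one box")
print(one_box_count)  # 999
```

Because Python integers are unbounded, the same function works with a denominator as large as 10 to the power of 100.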

What do you do?   Or, given that such a situation doesn't exist yet, and we're talking about a Fred in a possible future, what advice would you give to Fred on how to choose, were he to ever end up in such a situation?

Pick "one box"?   Pick "two box"?   Or pick randomly between those two choices and, if so, at what odds?

And why?


 

         Part 1 - stating the problem
next   Part 2 - some mathematics
         Part 3 - towards a solution
