
In response to Zombies Redacted
Comment author: CronoDAS 02 July 2016 09:04:32PM 4 points

Can you make "something" with the same input-output behavior as a human, and have that thing not be conscious? It doesn't have to be atom-by-atom identical.

In response to comment by CronoDAS on Zombies Redacted
Comment author: Eliezer_Yudkowsky 02 July 2016 09:08:53PM 8 points

Sure. Measure a human's input and output. Play back the recording. Or did you mean across all possible cases? In the latter case see http://lesswrong.com/lw/pa/gazp_vs_glut/

Comment author: Eliezer_Yudkowsky 27 April 2016 06:13:26PM 2 points
Comment author: Eliezer_Yudkowsky 20 March 2016 02:41:58AM *  10 points

Ed Fredkin has since sent me a personal email:

By the way, the story about the two pictures of a field, with and without army tanks in the picture, comes from me. I attended a meeting in Los Angeles, about half a century ago where someone gave a paper showing how a random net could be trained to detect the tanks in the picture. I was in the audience. At the end of the talk I stood up and made the comment that it was obvious that the picture with the tanks was made on a sunny day while the other picture (of the same field without the tanks) was made on a cloudy day. I suggested that the "neural net" had merely trained itself to recognize the difference between a bright picture and a dim picture.

Comment author: Eliezer_Yudkowsky 29 January 2016 01:04:40AM 1 point

Moving to Discussion.

Comment author: Eliezer_Yudkowsky 18 December 2015 07:39:55PM 6 points

Please don't.

Comment author: CarlShulman 17 September 2015 03:02:48AM 2 points

Of course, with this model it's a bit of a mystery why A gave B a reward function that gives 1 per block, instead of one that gives 1 for the first block and a penalty for additional blocks. Basically, why program B with a utility function so seriously out of whack with what you want when programming one perfectly aligned would have been easy?

Comment author: Eliezer_Yudkowsky 18 September 2015 07:57:18PM 2 points

I assume the point of the toy model is to explore corrigibility or other mechanisms that are supposed to kick in after A and B end up not perfectly value-aligned, or maybe just to show an example of why a non-value-aligning solution for A controlling B might not work, or maybe specifically to exhibit a case of a not-perfectly-value-aligned agent manipulating its controller.

Comment author: Eliezer_Yudkowsky 18 September 2015 07:51:54PM 6 points

When I consider this as a potential way to pose an open problem, the main thing that jumps out at me as being missing is something that doesn't allow A to model all of B's possible actions concretely. The problem is trivial if A can fully model B, precompute B's actions, and precompute the consequences of those actions.

The levels of 'reason for concern about AI safety' might ascend something like this:

  • 0 - system with a finite state space you can fully model, like Tic-Tac-Toe
  • 1 - you can't model the system in advance and therefore it may exhibit unanticipated behaviors on the level of computer bugs
  • 2 - the system is cognitive, and can exhibit unanticipated consequentialist or goal-directed behaviors, on the level of a genetic algorithm finding an unanticipated way to turn the CPU into a radio or Eurisko hacking its own reward mechanism
  • 3 - the system is cognitive and humanish-level general; an uncaught cognitive pressure towards an outcome we wouldn't like means facing something like a smart cryptographic adversary that is going to deeply ponder any way to work around anything it sees as an obstacle
  • 4 - the system is cognitive and superintelligent; its estimates are always at least as good as our estimates; the expected agent-utility of the best strategy we can imagine when we imagine ourselves in the agent's shoes, is an unknowably severe underestimate of the expected agent-utility of the best strategy the agent can find using its own cognition

We want to introduce something into the toy model to at least force solutions past level 0. This is doubly true because levels 0 and 1 are in some sense 'straightforward' and therefore tempting for academics to write papers about (because they know that they can write the paper); so if you don't force their thinking past those levels, I'd expect that to be all that they wrote about. You don't get into the hard problems with astronomical stakes until levels 3 and 4. (Level 2 is the most we can possibly model using running code with today's technology.)
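Level 0 is meant literally: Tic-Tac-Toe's reachable state space is small enough to enumerate outright, so there is nothing the system can do that we haven't precomputed. A minimal sketch of that exhaustive enumeration (my own illustration, not from the comment):

```python
# Exhaustively enumerate every position reachable by legal Tic-Tac-Toe play,
# illustrating a "level 0" system whose full state space we can model.

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for i, j, k in WIN_LINES:
        if board[i] != ' ' and board[i] == board[j] == board[k]:
            return board[i]
    return None

def explore(board, player, seen):
    """Depth-first enumeration of all positions reachable by legal play."""
    if board in seen:
        return
    seen.add(board)
    if winner(board) or ' ' not in board:   # game over: don't expand further
        return
    other = 'O' if player == 'X' else 'X'
    for i, cell in enumerate(board):
        if cell == ' ':
            explore(board[:i] + player + board[i + 1:], other, seen)

seen = set()
explore(' ' * 9, 'X', seen)
print(len(seen))  # 5478 legal positions, counting the empty board
```

The whole game fits in a few thousand states; nothing about the system can surprise you. The hard levels begin exactly where this kind of exhaustive precomputation stops being possible.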

Comment author: gwern 19 August 2015 02:38:06AM *  10 points

Yes, but that shows that Eliezer probably misremembered what the 40% referred to. In that study, "40%" refers not to how many didn't benefit, but rather to the maximal benefit on a particular measure of fitness received by any of the participants:

For example, the team found that training improved maximum oxygen consumption, a measure of a person’s ability to perform work, by 17% on average. But the most trainable volunteers gained over 40%, and the least trainable showed no improvement at all. Similar patterns were seen with cardiac output, blood pressure, heart rate and other markers of fitness.

Alternately, he might've been rounding the subsequent statistic:

Bouchard reported that the impact of training on insulin sensitivity – a marker of risk for diabetes and heart disease – also varied. It improved in 58% of the volunteers following exercise, but in 42% it showed no improvement or, in a few cases, may have got worse.

So, how many is many? What fraction of the subjects were resistant on the various metrics? Unfortunately, the NS article doesn't give exactly what we want to know, so we need to find the original scientific papers to figure it out ourselves, but the NS article doesn't give citations either, forcing us to fact-check it the hard way (a long time in Google Scholar punching in names and keywords).

Tracking down sources for this article is quite difficult. Bouchard quickly pulls up a bunch of papers all revolving around similar data from what is called the HERITAGE Family Study, which has apparently been running since 1995 (the abstract to "The HERITAGE family study: Aims, design, and measurement protocol", 1995, describes it as in progress), and there are a lot of papers on various minutiae of it. So we need to search with 'HERITAGE'.

The final paragraph about the 51/72 genes seems to be sourced from "Endurance training-induced changes in insulin sensitivity and gene expression", which was published around 2004, consistent with the NS date. The general stuff about responses to exercise is much harder to track down, but after quite a bit of browsing through Google Scholar, I think it's all summarized in "Individual differences in response to regular physical activity", Bouchard & Rankinen 2001, which sounds promising since its abstract mentions "For example, Vo2_max responses to standardized training programs have ranged from almost no gain up to 100% increase in large groups of sedentary individuals".

This review covers 4 major categories:

  1. VO2_max: "The average increase reached 384 mL O2 with an SD of 202 mL O2"; citing:

    • Bouchard, C., P. An, T. Rice, et al. Familial aggregation of VO2max response to exercise training: results from the HERITAGE Family Study. J. Appl. Physiol. 87:1003–1008, 1999.
  2. heart-rate during exercise, "heart rate during submaximal exercise at 50 W"; "A mean decrease of 11 beats·min⁻¹ was observed among the 727 subjects with complete data. However, the SD reached 10 beats."

    • original to this review, it seems
  3. blood lipids, HDL-C: "They found that when the distribution of the percent changes in HDL-C was broken down into quartiles, the first quartile actually experienced a decrease in HDL-C of 9.3%, whereas the fourth quartile registered a mean increase of 18%." Cited to:

    • Leon, A. S., T. Rice, S. Mandel, et al. Blood lipid response to 20 weeks of supervised exercise in a large biracial population: the HERITAGE Family Study. Metabolism 49:513–520, 2000.
  4. blood pressure, "systolic blood pressure during exercise in relative steady state at 50 W"; "Among these subjects, the mean decrease in SBP during cycling at 50 W was 8.2 mm Hg (SD 11.8)"

    • original to this review

So that covers 4 of the markers mentioned in the NS link. In those 4 cases, going by the graphs (the data are highly non-normal, so you can't just estimate from the mean/SD), I'd guesstimate that 5-20% of each show <=0 benefit from the 20 weeks of endurance exercise.
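As a rough cross-check of that eyeball range, here is what a naive normal approximation would give for the fraction with zero or negative response, using the (mean, SD) pairs quoted above. This is exactly the shortcut the skewed graphs don't license, so treat it only as a sanity check, not a substitute for reading the figures:

```python
# Naive normal-approximation cross-check of the non-responder fractions.
# The underlying HERITAGE distributions are skewed, so these are rough only.
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# (mean improvement, SD) pairs as quoted from Bouchard & Rankinen 2001
markers = {
    'VO2max (mL O2)':         (384, 202),
    'heart rate (beats/min)': (11, 10),
    'SBP at 50 W (mm Hg)':    (8.2, 11.8),
}

for name, (mean, sd) in markers.items():
    frac = normal_cdf(-mean / sd)   # P(response <= 0) under normality
    print(f'{name}: ~{frac:.1%} non-responders')
```

These come out to roughly 3%, 14%, and 24% respectively, which brackets the 5-20% guesstimate; the spread between markers is one more reason to want the actual graphs rather than summary statistics.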

That leaves the insulin sensitivity one, which seems to be "Effects of Exercise Training on Glucose Homeostasis: The HERITAGE Family Study", Boulé et al 2005. The graphs are hilarious, almost exactly 50-50 looking, and so correspond to the NS summary of 58%/42%.

(The papers don't seem to include any correlation matrices, but this is definitely a problem which calls out for dimensionality reduction: presumably resistance on all 4 measurements correlates, and you could extract an 'exercise-resistance factor' which would be more informative than looking at things piecemeal. Since correlations between the 4 measurements are not given, it's possible that they are independent and so only ~0.2^4 or <1% of the subjects were exercise-resistant on all 4 measures, but that would surprise me: it would be strange if one's insulin improved but not VO2_max or cholesterol. I don't have any guesses on how large this 'exercise-resistance factor' might be, though.)
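The gap between the independent case (~0.2^4 = 0.16%) and the correlated case can be made concrete with a one-factor simulation. The factor loading of 0.7 below is a purely hypothetical value for illustration, not anything estimated from the HERITAGE data:

```python
# How a shared latent 'exercise-resistance factor' inflates the joint
# non-responder rate above the independence baseline of 0.2**4 = 0.16%.
# The 0.7 loading is an assumed illustrative value, not from the data.
import random

random.seed(0)

def joint_resistance_rate(loading, trials=200_000):
    # Each subject has a latent factor f; each of 4 markers is
    # loading*f + independent noise, scaled to unit variance.
    # A subject is 'resistant' on a marker if its response lands in the
    # bottom 20% (z < -0.8416); we count subjects resistant on all four.
    noise_w = (1 - loading ** 2) ** 0.5
    z_cut = -0.8416  # 20th percentile of a standard normal
    hits = 0
    for _ in range(trials):
        f = random.gauss(0, 1)
        if all(loading * f + noise_w * random.gauss(0, 1) < z_cut
               for _ in range(4)):
            hits += 1
    return hits / trials

print(joint_resistance_rate(0.0))  # independent markers: ~0.0016
print(joint_resistance_rate(0.7))  # shared factor: far higher, ~3%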

Not all of these measures are equally important, and weight does not seem to be included, judging by Bouchard's silence on individual differences there. He does cite some interesting studies on resistance of body weight to change, such as two twin studies.


So going by the HERITAGE data described in that NS link, exercise resistance is real in maybe a fifth of the population, but mostly on invisible markers. 40%, however, is too high: only 1 of the 5 measured things went that high, and the specific fractions were not mentioned, so most likely Eliezer misremembered one of the other two statistics as the more important one.

Comment author: Eliezer_Yudkowsky 19 August 2015 06:29:54PM 7 points

I recall originally reading something about a measure of exercise-linked gene expression and I'm pretty sure it wasn't that New Scientist article, but regardless, it's plausible that some mismemory occurred and this more detailed search screens off my memory either way. 20% of the population being immune to exercise seems to match real-world experience a bit better than 40% so far as my own eye can see - I eyeball-feel more like a 20% minority than a 40% minority, if that makes sense. I have revised my beliefs to match your statements. Thank you for tracking that down!

Comment author: 2irons 29 July 2015 03:41:35PM *  6 points

"Does this change your confidence in Bob managing your retirement investments?" - if he never held himself out as a quant based investor or using a method reliant on much analytical or quantitative research I wouldn't worry about it.

Maybe he's been good at choosing ETFs because he's great at listening to investor and trader chat - he can feel which arguments are about to dominate the market and allocates capital accordingly. Maybe he sits by a high-performing proprietary trading team at his bank and you're piggybacking off all their trades at a fraction of the fee. As a fund manager, I know several other managers who would have no hope of following most of the articles on this website and who misunderstand probability at basic levels (this has been teased out by in-depth conversations on things like card counting, where they are high-conviction yet wrong), yet who I'd still have to concede are likely to continue outperforming me in the market because they are great at the parts that count.

I think Nassim Taleb puts this best in Antifragile:

“In one of the rare noncharlatanic books in finance, descriptively called What I Learned Losing A Million Dollars, the protagonist makes a big discovery. He remarks that a fellow called Joe Siegel, the most active trader in a commodity called “green lumber” actually thought that it was lumber painted green (rather than freshly cut lumber, called green because it had not been dried). And he made a living, even a fortune trading the stuff! Meanwhile the narrator was into theories of what caused the price of commodities to move and went bust.

The fact is that predicting the order flow in lumber and the price dynamics narrative had little to do with these details - not the same thing. Floor traders are selected in the most nonnarrative manner, just by evolution in the sense that nice arguments don't make much difference."

Perhaps I'm being a bit harsh focusing on an analogy but I think there might be a wider point. Producing the right or wrong answers in one domain isn't necessarily a useful predictor of someone's ability to produce the right or wrong answer in another - even when they are highly connected.

Comment author: Eliezer_Yudkowsky 29 July 2015 08:27:14PM 5 points

"Does somebody being right about X increase your confidence in their ability to earn excess returns on a liquid equity market?" has to be the worst possible question to ask about whether being right in one thing should increase your confidence about them being right elsewhere. Liquid markets are some of the hardest things in the entire world to outguess! Being right about MWI is enormously easier than being right about what Microsoft stock will do relative to the rest of the S&P 500 over the next 6 months.

There's a gotcha to the gotcha which is that you have to know from your own strength how hard the two problems are - financial markets are different from, e.g., the hard problem of conscious experience, in that we know exactly why it's hard to predict them, rather than just being confused. Lots of people don't realize that MWI is knowable. Nonetheless, going from MWI to Microsoft stock behavior is like going from 2 + 2 = 4 to MWI.

Comment author: Eitan_Zohar 08 July 2015 06:14:32PM *  0 points

Sign up for cryonics. All of your subjective future will continue into quantum worlds that care enough to revive you, without regard for worlds where the cryonics organization went bankrupt or there was a nuclear war.

Doesn't this mean that you should deliberately avoid finding out whether cryonics can actually preserve your information in a retrievable way, because if it can't it would eliminate the vast majority of the worlds that would have brought you back? Whereas if you don't know it remains undetermined. Am I getting this right?

Comment author: Eliezer_Yudkowsky 14 July 2015 06:35:16PM 2 points

You're confusing subjective probability and objective quantum measure. If you flip a quantum coin, half your measure goes to worlds where it comes up heads and half goes to where it comes up tails. This is an objective fact, and we know it solidly. If you don't know whether cryonics works, you're probably still already localized by your memories and sensory information to either worlds where it works or worlds where it doesn't; all or nothing, even if you're ignorant of which.
