Comment author: Thrasymachus 22 April 2016 10:21:07PM -1 points [-]

I'm not sure. The way to see whether there is sleepwalk bias is to try to gather a representative sample of predictions/warnings and see how they turn out. Yet this is pretty hard to do: I can think of examples (like those mentioned in the post) where the disaster was averted, but I can think of others where the disaster happened despite warnings (I'd argue climate change fits into this category, for example).

Comment author: Vaniver 09 January 2016 04:59:19PM *  3 points [-]

Formatting note: the brackets for links are greedy, so you need to escape them with a \ to avoid a long link.

[Testing] a long [link](https://www.google.com/)

Testing] a long [link

\[Testing\] a short [link](https://www.google.com/)

[Testing] a short link


principally because health is so important for our life and happiness we're less willing to sacrifice it to preserve face (I'd wager it is an even better tax on bs than money).

I agree that I expect people to be more willing to trade money for face than health for face. I think the system is slanted too heavily towards face, though.

I should also point out that this is mostly a demand side problem. If it were only a supply side problem, MetaMed could have won, but it's not--people are interested in face more than they're interested in health (see the example of the outdated brochure that was missing the key medical information, but looked like how a medical brochure is supposed to look).

It'd be surprising for IBM to unleash Watson on a very particular aspect of medicine (therapeutic choice in oncology) if simple methods could beat doctors across most of the board.

My understanding is that this is correct for the simple techniques, but incorrect for the complicated techniques. That is, you're right that a single linear regression can't replace a GP but a NLP engine plus a twenty questions bot plus a causal network probably could. (I unfortunately don't have any primary sources at hand; medical diagnostics is an interest but most of the academic citations I know are all machine diagnostics, since that's what my research was in.)

I should also mention that, from the ML side, the technical innovation of Watson is the NLP engine. That is, a patient could type English into a keyboard and Watson would mostly understand what they were saying, instead of needing a nurse or doctor to translate the English into the format required by the diagnostic tool. The main challenge with uptake of the simple techniques historically was that they only did the final computation, while most of the work in diagnostics is collecting the information from the patient. And so if the physician is 78% accurate and the linear regression is 80% accurate, is it really worth running the numbers for those extra 2%?

From a business standpoint, I think it's obvious why IBM is moving slowly; just like with self-driving cars, the hard problems are primarily legal and social, not technical. Even if Watson has half the error rate of a normal doctor, the legal liability status is very different, just like a self-driving car with half the error rate of a human driver would result in more lawsuits for the manufacturer, not fewer. As well, if the end goal is to replace doctors, the right way to do that is to imperceptibly hand more and more work over to the machines, not to jump out of the gate with a "screw you, humans!"

I agree this should have happened sooner: that Atul Gawande's surgical checklist happened within living memory is amazing, but it is catching on, and (mildly against hansonian explanations) has been propelled by better outcomes.

So, just like the Hansonian view of Effective Altruism is that it replaces Pretending to Try not with Actually Trying but with Pretending to Actually Try, if there is sufficient pressure to pretend to care about outcomes then we should expect people to move towards better outcomes as their pretending has nonzero effort.

But I think you can look at the historical spread of anesthesia vs. the historical spread of antiseptics to get a sense of the relative importance of physician convenience and patient outcomes. (This is, I think, a point brought up by Gawande.)


I think I agree with your observations about MetaMed's competition but not necessarily about your interpretation. That is, MetaMed could have easily failed for both the reasons that its competition was strong and that its customers weren't willing to pay for its services. I put more weight on the latter because the experience that MetaMed reported was mostly not "X doesn't want to pay $5k for what they can get for free from NICE" but "X agrees that this is worth $100k to them, but would like to only pay me $5k for it." (This could easily be a selection effect issue, where everyone who would choose NICE instead is silent about it.)


However, this data by and large does not exist: much of medicine is still at the stage of working out whether something works generally, rather than delving into differential response and efficacy. It is not clear it ever will - humans might be sufficiently similar to one another that for almost all of them one treatment will be the best. The general success of increasing protocolization in medicine is some further weak evidence of this point.

This is why I'm most optimistic about machine medicine, because it basically means instead of going to a doctor (who is tired / stressed / went to medical school twenty years ago and only sort of keeps up) you go to the interactive NICE protocol bot, which asks you questions / looks at your SNPs and tracked weight/heart rate/steps/sleep/etc. data / calls in a nurse or technician to investigate a specific issue, diagnoses the issue and prescribes treatment, then follows up and adjusts its treatment outcome expectations accordingly.

Comment author: Thrasymachus 26 January 2016 07:46:17PM *  1 point [-]

(Sorry for delay, and thanks for the formatting note.)

My knowledge is not very up to date re. machine medicine, but I did get to play with some of the commercially available systems, and I wasn't hugely impressed. There may be far more impressive results yet to be released commercially, but (appealing to my priors) I think I would have heard of them, as they would be a gamechanger for global health. Also, if the fairly advanced knowledge work of primary care could be done by computer, I'd expect a lot of jobs without the protective features of medicine to be automated first.

I agree that machine medicine along the lines you suggest will be superior to human performance, and I anticipate this being achieved fairly soon (even if I am right that it hasn't already happened). I think medicine will survive less on the cognitive skill required than through technical facility and social interaction, where machines comparatively lag (though of course I anticipate they will steadily get better at this too).

I grant a hansonian account can accommodate the sort of 'guided by efficacy' data I suggest via 'pretending to actually try' considerations, but I would suggest this almost becomes an epicycle: any data which supports medicine being about healing can be explained away by the claim that doctors are only pretending to be about healing as a circuitous route to signalling. I would say the general ethos of medicine (EBM, proliferation of trials) looks like a pro tanto reason in favour of its being about healing, and divergence from this (e.g. what happened to Semmelweis, other lags) is better explained by doctors being imperfect and selfish, and patients irrational, rather than both parties adeptly following a signalling account.

But I struggle to see what evidence could neatly distinguish between these cases. If you have an idea, I'd be keen to hear it. :)

I agree with the selection worry re. MetaMed's customers: they are also presumably selected from people whom modern medicine didn't help, which may have further effects (not to mention making MetaMed's task harder, as their pool will be harder to treat than unselected-for-failure cases who see the doctor 'first line'). I'd also (with all respect meant to the staff of MetaMed) suggest its staff may not be the most objective sources on why it failed: I'd guess people would prefer to say their startup failed because of the market or product-market fit, rather than 'actually, our product was straight worse than our competitors'.

Comment author: gwern 24 January 2016 07:33:57PM 5 points [-]
Comment author: Thrasymachus 26 January 2016 07:25:11PM 1 point [-]

I put an inline link in the post. Have I missed a norm about putting related posts I have written in the post more prominently?

Comment author: Vaniver 31 December 2015 07:44:06PM *  6 points [-]

Three main sources. (But first the disclaimer About Isn't About You seems relevant--that is, even if medicine is all a sham (which I don't believe), participating in the medical system isn't necessarily a black mark on you personally.)

First is Robin Hanson's summary on the literature on health economics. The medicine tag on Robin's blog has a lot, but a good place to start is probably Cut Medicine in Half and Medicine as Scandal followed by Farm and Pet Medicine and Dog vs. Cat Medicine. To summarize it shortly, it looks like medical spending is driven by demand effects (we care so we spend to show we care) rather than supply effects (medicine is better so we consume more) or efficacy (we don't keep good records of how effective various doctors are). His proposal for how to fund medicine shows what he thinks a more sane system would look like. (As 'cut medicine in half' suggests, he doesn't think the average medical spending has a non-positive effect, but that the marginal medical spending does, to a very deep degree.)

Second is the efficiency literature on medicine. This is statisticians and efficiency experts and so on trying to apply standard industrial techniques to medicine and getting pushback that looks ludicrous to me. For example, human diagnosticians perform at the level of, or worse than, simple algorithms (I'm talking linear regressions here, not even neural networks or decision trees or so on), and this has been known in the efficiency literature for well over fifty years. Only in rare cases does this actually get implemented in practice (for example, a flowchart for dealing with heart attacks in emergency rooms was popularized a few years back and seems to have had widespread acceptance). It's kind of horrifying to realize that our society is smarter about streamlining the production of cars than about streamlining the production of health, especially given the truly horrifying scale of medical errors. Stories like Semmelweis and the difficulty of getting doctors to wash their hands between patients further support this view.
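(To make "simple algorithms" concrete, here is a toy sketch of the kind of linear-regression diagnostic rule those clinical-vs-statistical comparisons use. The data and features are entirely synthetic and assumed for illustration, not drawn from any study.)

```python
import numpy as np

# Synthetic example of an "actuarial" diagnostic rule: an ordinary linear
# regression over a few patient features, thresholded to give a diagnosis.
rng = np.random.default_rng(0)

n = 1000
# Three hypothetical standardized patient features (e.g. lab values).
X = rng.normal(size=(n, 3))
# The true condition depends linearly on the features, plus noise.
latent = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(size=n)
y = (latent > 0).astype(float)

# Fit ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Diagnose "positive" when the fitted score crosses 0.5.
pred = (A @ coef > 0.5).astype(float)
accuracy = (pred == y).mean()
print(f"linear-rule accuracy: {accuracy:.2f}")  # well above the 50% base rate
```

The point of the literature is not that such a rule is clever, but that even something this crude, fed the same information a clinician has, tends to match or beat the clinician's judgement.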

Third is from 'the other side'; my father was a pastor and thus spent quite some time with dying people and their families. His experience, which is echoed by Yvain in Who By Very Slow Decay and seems to be the common opinion among end-of-life professionals in general, is that the person receiving end-of-life care generally doesn't want it and would rather die in peace, and the people around them insist that they get it (mostly so that they don't seem heartless). As Yvain puts it:

Robin Hanson sometimes writes about how health care is a form of signaling, trying to spend money to show you care about someone else. I think he’s wrong in the general case – most people pay their own health insurance – but I think he’s spot on in the case of families caring for their elderly relatives. The hospital lawyer mentioned during orientation that it never fails that the family members who live in the area and have spent lots of time with their mother/father/grandparent over the past few years are willing to let them go, but someone from 2000 miles away flies in at the last second and makes ostentatious demands that EVERYTHING POSSIBLE must be done for the patient.

Once you really grok that a huge amount of medical spending is useless torture, and if you are familiar with what it looks like to design a system to achieve an end, it becomes impossible to see the point of our medical system as healing people.

[edit]And look at today's Hanson post!

Comment author: Thrasymachus 09 January 2016 03:23:18PM *  6 points [-]

I broadly differ with the hansonian take on medicine. I don't think MetaMed went bust because it offered more effective healing in a market that doesn't really demand healing; rather, medicine is about healing, generally does this pretty well, and MetaMed was unable to provide a significant edge in performance over standard medicine. (I should note I am a doctor, albeit a somewhat contrarian one. I wrote the 80k careers guide on medicine.)


I think medicine is generally less fertile ground for hansonian signalling accounts, principally because health is so important for our life and happiness we're less willing to sacrifice it to preserve face (I'd wager it is an even better tax on bs than money). If the efficacy of marginal health spending is near zero in rich countries, that seems evidence in support of, 'medicine is really about healing' - we want to live healthily so much we chase the returns curve all the way to zero!

There are all manner of ways in which western world medicine does badly, but I think sometimes the faults are overblown, and the remainder are best explained by human failings rather than medicine being a sham practice:

1) My understanding of the algorithms for diagnosis is that although linear regressions and other simple methods can beat humans at very precise diagnostic questions (e.g. 'Given these factors of a patient who is mentally ill, what is their likelihood of committing suicide?'), humans still perform better in messier (and more realistic) situations. It'd be surprising for IBM to unleash Watson on a very particular aspect of medicine (therapeutic choice in oncology) if simple methods could beat doctors across most of the board.

(I'd be very interested to see primary sources if my conviction is mistaken)

2) Medicine has become steadily more protocolized, and clinical decision rules, standard operating procedures and standards of care are proliferating rapidly. I agree this should have happened sooner: that Atul Gawande's surgical checklist happened within living memory is amazing, but it is catching on, and (mildly against hansonian explanations) has been propelled by better outcomes.

I can't speak for the US, but there are clear protocols in the UK about initial emergency management of heart attacks. Indeed, take a gander at the UK's 'NICE Pathways' which gives a flow chart on how to act in all circumstances where a heart attack is suspected.

3) I agree that the lack of efficacy information about individual doctors isn't great. Reliable data on this is far from trivial to acquire, however, and that, together with doctors' understandable self-interest in not being too closely monitored, seems to explain this lacuna as well as the hansonian story does. (Patients tend to want to know this information when it is available, which doesn't fit well with them colluding with their doctors and family in a medical ritual unconnected to their survival.)

4) Over-treatment is rife, but the US is generally held up as an anti-exemplar of this fault, and (at least judging by the anecdotes) medics in the UK are better (albeit still far from perfect) at avoiding flogging the patient to death with medical torture. Outside of this zero or negative margin, performance is better: it is unclear how much is attributable to medicine, but life expectancy and disease-free life expectancy are rising, and age-standardized mortality rates for most conditions are declining.


Now, why Metamed failed (I appreciate one should get basically no credit for predicting a start up will fail given this is the usual outcome, but I called it a long time ago):

MetaMed's business model relied on there being a lot of low-hanging fruit to pluck: that in many cases, a diagnosis or treatment would elude the clinician because they weren't apprised of the most recent evidence, were only able to deal in generalities rather than personalized recommendations, or were just less adept at synthesizing the available evidence.

If it were MetaMed versus the average doctor - the one who spends next to no time reading academic papers, who is incredibly busy, stressed out, and so on - you'd be forgiven for thinking that MetaMed had an edge. However, medics (especially generalists) long ago realized they have no hope of keeping abreast of a large medical literature on their own. Enter division of labour: they instead commission the relevant experts to survey, aggregate and summarize the current state of the evidence base, leaving them the simpler task of applying it in their practice. To keep it up to date, they commission the experts to repeat this fairly often.

I mentioned NICE (the National Institute for Clinical Excellence) earlier. They're a body in the UK responsible (inter alia) for deciding when drugs and treatments get funded on the NHS. They spend a vast amount of time on evidence synthesis and meta-analysis. To see what sort of work this produces, google 'NICE {condition}'. An example for depression is here. Although I think the UK is world-leading in this respect, there are similar bodies in other countries, as well as commercial organizations (e.g. Uptodate).

Against this, MetaMed never had an edge: they didn't have groups of subject-matter experts to call upon for each condition or treatment in question, nor (despite a lot of mathsy inclination amongst them) did they by and large have parity in meta-analysis, evidence synthesis and related skills. They were also outmatched in the quantity of man-hours that could be deployed, and by the great headstart NICE et al. already had. When their website was still up I looked at some of their example reports, and my view was that they were significantly inferior to what you could get from NICE (for free!) or from Uptodate and similar services at far lower fees.

MetaMed might have had a hope if, in the course of producing these general evidence summaries, a lot of fine-grained data were being aggregated away to produce something 'one size fits all' - their edge would then be going back to the original data to find that although drug X is generally good for a condition, in one's particular case, in virtue of age, genotype, or whatever else, drug Y is superior.

However, this data by and large does not exist: much of medicine is still at the stage of working out whether something works generally, rather than delving into differential response and efficacy. It is not clear it ever will - humans might be sufficiently similar to one another that for almost all of them one treatment will be the best. The general success of increasing protocolization in medicine is some further weak evidence of this point.


I generally adduce MetaMed as an example of rationalist overconfidence: the belief that insurgent Bayesians can simply trounce the relevant professionals at what those professionals purport to do, thanks to signalling etc. But again, given the expectation was for it to fail (as most startups do), this doesn't provide much evidence. Had it succeeded, I'd have updated much more strongly towards the magic of rationalism meaning you can win, and the world being generally dysfunctional.

Comment author: Thrasymachus 26 December 2015 12:05:22AM 4 points [-]

Congratulations on doing this sort of careful self-analysis. I'd like to recommend a further improvement.

Pre-register/Publish your intentions for these trials and analysis in advance

The file-drawer problem is well-known, as are the risks of post-hoc changes in analysis. Publishing what data you are gathering and the analyses you will perform on it reassure more skeptical people against both of these worries, and seems pretty easy to do.

Comment author: estimator 03 April 2015 06:57:08PM *  12 points [-]

I was kinda surprised to see IQ as an external factor; my impression is that internal vs. external locus of control is actually personality traits vs. circumstances and environment, and IQ obviously falls into the first category.

If you consider IQ and mental health external factors, what are the internal factors, then? Willpower? But willpower is determined by the brain structure just as IQ and mental health and other personality traits.

Basically, if you assign everything to the "external" category, so that the "internal" is an empty set (or almost empty), then one's success is determined by "external" factors. No surprise here.

Comment author: Thrasymachus 05 April 2015 12:58:59AM 2 points [-]

[I've seen your follow-up post on discussion. I thought it would be best to reply to both here.]

It may be that everything is determined by prior events all the way to the big bang. So there's no 'internal willer' isolated from previous events that can steer us one way or another. But we can keep talking about 'internal' and 'external' loci of control on a compatibilist view of free will (which I'd guess is the common view, including amongst those affirming an internal locus of control).

On this sort of view, internal factors are just those our choices can change - external factors, those they cannot. If I want to run faster, how much time I spend training is an internal factor: it influences how fast I can run, and I can choose (in the compatibilist sense) how much time I spend training. If I have a dense hemiparesis secondary to a birth injury, that's an external factor - it also influences how fast I can run (indeed, whether I can run at all), and I can't choose whether or not to have a hemiparesis.

So I take those with an internal locus of control to think that - in the main - the outcomes that matter are mainly sensitive to factors that in turn are sensitive to our choices (how hard I work, how long I practice, etc.), whilst those with an external locus of control say that these things are primarily determined by factors outside of that person's control.

It seems clear to me that IQ should be in the 'external factors' camp: IQ seems to be set early in life, has a large heritable component, and the non-heritable part is likely due to environmental factors that I also can't change for myself, either at the time or retroactively. The failure of brain-training programs suggests that you can't improve your IQ by any feat of effort. And we know it has all sorts of influences on how our lives turn out. If I have (due to factors outside my control) an IQ more than one standard deviation below the mean, I won't be able to become a doctor or a physicist (or, indeed, join the US armed services) - no matter what else I do. Mutatis mutandis for cases where it might not serve as a strict bar but as a variable handicap (c.f. evidence that the beneficial effects of IQ have no clear ceiling).

The alternative account you propose for demarcating 'external' versus 'internal' factors - internal factors are those causally distal to your brain's neural output - looks too broad: all internal factors need to be downstream of our neural output, but that isn't sufficient. The hemiparesis case I allude to above would be one example - that I can't move one side of my body is due to my neural output, but that is because of this insult which wasn't due to my neural output. I think the same applies for other cases of brain damage and particular types of mental illness: indeed, this is implicitly recognised by the criminal justice system.

(I've added remarks to this effect in the body of the post - thanks for this comment!)

Comment author: Thrasymachus 27 January 2015 01:26:17AM 12 points [-]

A general point: it is generally helpful to express risks as absolute risks (or numbers needed to treat) whenever possible, as odds ratios or percentage changes are hard to interpret, and often mislead when the 'base rate' risk is minute: people's brains often mistake '25% increased risk of Y if you do X' for 'a 25% risk of Y if you do X'.

E.g. pretend mobile phone use really caused an exotic brain cancer, and the OR is 10. But the base rate of this brain cancer is something like 1 in ten million. So although mobile phone users are 10 times more likely to get exotic brain cancer, their absolute risk only goes up from 1 in ten million to 1 in a million, so the absolute risk reduction from avoiding mobile phones would be a 0.00009% mortality benefit - just over a million people would have to stop using mobiles to avoid a single case of exotic brain cancer.
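The arithmetic can be checked directly (hypothetical numbers as above: the risk moving from 1 in ten million to 1 in a million corresponds to a tenfold relative risk, which for an outcome this rare is essentially the same as the odds ratio):

```python
# Absolute risk reduction (ARR) and number needed to treat (NNT)
# for the hypothetical mobile-phone example.
base_rate = 1 / 10_000_000   # baseline risk of the exotic brain cancer
relative_risk = 10           # risk multiplier for mobile-phone users

exposed_risk = base_rate * relative_risk   # 1 in a million
arr = exposed_risk - base_rate             # absolute risk reduction from quitting
nnt = 1 / arr                              # people who must quit to avert one case

print(f"absolute risk reduction: {arr:.7%}")   # about 0.00009%
print(f"number needed to treat:  {nnt:,.0f}")  # just over a million
```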

Comment author: homunq 22 August 2014 04:04:38PM *  2 points [-]

Bump.

(I realize you're busy, this is just a friendly reminder.)

Also, I added one clause to my comment above: the bit about "imperfectly measured", which is of course usually the case in the real world.

Comment author: Thrasymachus 19 November 2014 01:07:04AM 0 points [-]

Belatedly updated. Thanks for your helpful comments!

Comment author: homunq 02 August 2014 05:58:39PM *  10 points [-]

Great article overall. Regression to the mean is a key fact of statistics, and far too few people incorporate it into their intuition.

But there's a key misunderstanding in the second-to-last graph (the one with the drawn-in blue and red "outcome" and "factor"). The black line, indicating a correlation of 1, corresponds to nothing in reality. The true correlation is the line from the vertical tangent point at the right (marked) to the vertical tangent point at the left (unmarked). If causality indeed runs from "factor" (height) to "outcome" (skill), that's how much extra skill an extra helping of height will give you. Thus, the diagonal red line should follow this direction, not be parallel to the 45 degree black line. If you draw this line, you'll notice that each point on it has equal vertical distance to the top and bottom of the elliptical "envelope" (which is, of course, not a true envelope for all the probability mass, just an indication that probability density is higher for any point inside than any point outside).

Things are a little more complex if the correlation is due to a mutual cause, "reverse" causation (from "outcome" to "factor"), or if "factor" is imperfectly measured. In that case, the line connecting the vertical tangents may not correspond to anything in reality, though it's still what you should follow to get the "right" (minimum expected squared error) answer.
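The geometric point can be checked numerically. A minimal sketch with simulated data (the correlation of 0.6 and the variable names are assumed for illustration): the best-fit line's slope is r·(σ_outcome/σ_factor), so whenever |r| < 1 it is shallower than the 45-degree line, and predictions for extreme factor values regress toward the mean.

```python
import numpy as np

# Simulate a bivariate normal "factor" and "outcome" with correlation r,
# both standardized, and fit a straight line by least squares.
rng = np.random.default_rng(42)

n = 100_000
r = 0.6  # assumed correlation between "factor" (height) and "outcome" (skill)
factor = rng.normal(size=n)
outcome = r * factor + np.sqrt(1 - r**2) * rng.normal(size=n)

slope, intercept = np.polyfit(factor, outcome, 1)
print(f"fitted slope: {slope:.2f}")  # close to r = 0.6, not 1.0

# Expected outcome for someone 2 SDs above the mean in the factor:
print(f"prediction at factor = 2: {slope * 2 + intercept:.2f}")  # near 1.2, not 2
```

This is exactly the "line connecting the vertical tangent points" described above: for each value of the factor, it passes through the mean of the outcomes at that value.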

This may seem to be a nitpick, but to me, this kind of precision is key to getting your intuition right.

Comment author: Thrasymachus 03 August 2014 09:34:44PM 3 points [-]

Thanks for this important spot - I don't think it is a nitpick at all. I'm switching jobs at the moment, but I'll revise the post (and diagrams) in light of this. It might be a week though, sorry!

Comment author: ChristianKl 27 July 2014 12:35:14AM 1 point [-]

What is interesting is the strength of these relationships appear to deteriorate as you advance far along the right tail.

I read that claim as saying that if you sample the 45% to 55% percentile you will get a stronger correlation than if you sample the 90% to 100% percentile. Is that what you are arguing?

Comment author: Thrasymachus 02 August 2014 03:07:33AM 3 points [-]

This was badly written, especially as it invites confusion with range restriction. Sorry! I should just have said: "what is interesting is that extreme values of the predictors seldom pick out the most extreme outcomes".
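A quick simulation makes the restated claim concrete (the correlation of 0.6 and sample size are assumed purely for illustration): even with a solid correlation, the people in the top 1% of the predictor mostly are not in the top 1% of the outcome.

```python
import numpy as np

# Simulate a standardized predictor and outcome with correlation r, then
# ask how much the top 1% on the predictor overlaps the top 1% on the outcome.
rng = np.random.default_rng(0)

n = 1_000_000
r = 0.6
predictor = rng.normal(size=n)
outcome = r * predictor + np.sqrt(1 - r**2) * rng.normal(size=n)

k = n // 100                                  # size of the top 1%
top_pred = np.argsort(predictor)[-k:]         # indices of the predictor's top 1%
top_out = np.argsort(outcome)[-k:]            # indices of the outcome's top 1%
overlap = len(np.intersect1d(top_pred, top_out)) / k
print(f"share of predictor top-1% also in outcome top-1%: {overlap:.2f}")  # well below 1
```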

View more: Next