Previous:

  1. "Living Forever is Hard, or, the Gompertz Curve"
  2. "Living Forever is Hard, part 2: Adult Longevity"

From the excellent Fight Aging! blog comes a pointer to "A Histogram of Results from Life Span Studies", a graph of thousands of animal studies by Kingsley G. Morse Jr. (updated version from mailing list):

[Image: histogram of lifespan-change results across thousands of animal studies]

(This is not the same as a funnel plot: the y-axis is the number of studies reporting that percentage gain, and has nothing to do with each study's sample size n.)


On the closed GRG mailing list, the compiler says:

Many test the same intervention on a different strain of the same species, or with a different dose.

I asked some questions, and Steven B. Harris replied:

“I was gratified to be able to answer promptly, and I did. I said I didn't know.”  (Twain)

Though I can observe that the center of that distribution isn't very far from no-effect, and one would expect that there's a publication bias toward reporting positive effects vs. null effects. I would think that could account for it entirely.
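A quick way to see how far publication bias alone can shift a histogram like this is to simulate it. The sketch below assumes, purely for illustration, that every intervention truly does nothing, measured effects are pure noise, and positive results are more likely to be published; all the probabilities are made-up parameters, not estimates from the actual literature:

```python
import random

random.seed(0)

# Illustrative sketch (not real data): simulate null interventions whose
# measured lifespan change is pure noise, then "publish" each study with
# higher probability when its observed effect is positive.
true_effect = 0.0          # every intervention does nothing
noise_sd = 10.0            # measurement noise, in % lifespan change
n_studies = 100_000

published = []
for _ in range(n_studies):
    observed = random.gauss(true_effect, noise_sd)
    p_publish = 0.9 if observed > 0 else 0.4   # assumed bias toward positives
    if random.random() < p_publish:
        published.append(observed)

mean_published = sum(published) / len(published)
# The mean of the published studies is pulled noticeably above the true 0%.
print(round(mean_published, 1))
```

With these (invented) publication probabilities, the published literature's mean effect ends up a few percent above zero even though no intervention works, which is the sense in which bias "could account for it entirely".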

There's also the problem we've discussed before, that feeding animals stuff they don't like the taste of amounts to calorie restriction. So this clouds the issues terribly in non-CR studies, unless you're very, VERY careful to control for it somehow.

The relevance of this summary graph to news like the C60 rodent life extension experiment is obvious. Reading GRG has been interesting and educational about that experiment; a rough summary of points made by various people including myself:

  • contradictory median/lifespan figures
  • duplicate image
  • small sample
  • doses of C60 small enough that the direct antioxidant activity can't be responsible
  • justifying cites not published when experiment started
  • the C60 was administered for only a brief period (I think the analogy given was 'imagine taking a supplement only during your 40s and doubling your lifespan')
  • the massive life extension observed in the olive-oil-only rats - not doubling, but still really implausible
18 comments:

As expected, the C60 result has failed to replicate (and comes with bonus toxicity and serious purity concerns in the commercially-available C60). Just another day in the animal-studies Replication Crisis, and another result that was too good to be true.

This was/is hard for me to understand. I don't see why the fact that the distribution of effects is centered around zero is so revealing. I guess it suggests that there haven't been big research programs on things which are actually successful.

I guess it also suggests a prior distribution for this recent C60 experiment (which I don't know anything about) indicating that a big effect is weak evidence. Perhaps this was what you were trying to communicate.

I guess it also suggests a prior distribution for this recent C60 experiment (which I don't know anything about) indicating that a big effect is weak evidence.

Yes. It'd be interesting if someone could reverse-engineer a reasonable-looking prior distribution from the graph and tell us what it implies about the specific C60 result.
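As a sketch of what that reverse-engineering might look like: if one modeled the histogram's spread of true effects as a normal prior and the C60 report as a noisy observation, a standard normal-normal conjugate update gives the shrunken estimate. Every number below (prior mean and spread, reported gain, standard error) is an illustrative guess, not a fit to the actual graph:

```python
import math

# Hedged sketch of the "reverse-engineer a prior" idea: true effects theta
# ~ Normal(mu0, tau) from the histogram, the study's reported gain ~
# Normal(theta, se), then the posterior mean is the precision-weighted
# average of prior mean and report. All parameters are illustrative.
mu0, tau = 2.0, 8.0        # assumed prior: small mean gain, modest spread (%)
reported, se = 90.0, 30.0  # assumed C60 report: huge gain, small-sample noise

precision = 1 / tau**2 + 1 / se**2
post_mean = (mu0 / tau**2 + reported / se**2) / precision
post_sd = math.sqrt(1 / precision)

# A prior concentrated near zero shrinks the eye-popping result hard.
print(f"posterior: {post_mean:.1f}% +/- {post_sd:.1f}%")  # → posterior: 7.8% +/- 7.7%
```

Under these assumed numbers, a reported +90% collapses to a posterior of under +10%, which is the qualitative point: against a prior built from thousands of near-null studies, one small noisy study moves belief very little.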

It took me a while to figure out what's so disturbing about this graph, and I'm still not sure I get it. Is it strange and unexpected that attempted life-extending drugs shorten lifespans as often as they increase them? Or is it disturbing that the drugs are very likely to simply do nothing at all?

Or does this graph only represent trials of one drug? I see a single drug mentioned specifically, but the graph is also labeled as created from a massive compilation of data from multiple sources. Could someone explain this to me?

Is it strange and unexpected that attempted life-extending drugs shorten lifespans as often as they increase them?

The activities of people like Ray Kurzweil, who take literally hundreds of supplements, suggest that this is strange and unexpected, and cannot be deprecated enough. It also particularly cautions us against caring about any supplement or drug without dozens of positive studies (at a minimum).

Or is it disturbing that the drugs are very likely to simply do nothing at all?

That's also disturbing. It suggests - pace Algernon's law - that there is no good simple intervention.

The fact that this is a pretty smooth bell curve also indicates, as far as I can tell, that the general field has found no intervention at all that works - because then there would be dozens or hundreds of studies tweaking it and replicating it and investigating why it works. We don't see a bimodal hump with most interventions at net 0% and a second bell curve of caloric restriction/intermittent fasting centered at e.g. +40%.

(This latter interpretation could be wrong, since the data is so heterogeneous. It could be that the apparent excess of studies around +100% represents many of the CR/IF studies.)

So this graph does represent compiled data about a lot of different drugs, rather than just one drug?

Yes.

Looks to me like the peak of the bell curve falls at about +10%. That's in the same ballpark as what I'd expect from placebo, but nonetheless a little higher; if we're instead looking at a lot of low-impact interventions, the really interesting question is how parallelizable they are. Unfortunately, if the natural variance in lifespan is anything to go by I suspect the answer is "not very".
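On the parallelizability question, a toy comparison of two assumed models: small gains that multiply independently, versus gains that all work through one shared, saturating mechanism. Both the +10% per-intervention figure and the +40% cap are invented for illustration, not biological claims:

```python
# Toy sketch of "parallelizability": if k independent interventions each
# multiply lifespan by 1.10, the stacked fractional gain is 1.10**k - 1;
# if they all act through one shared mechanism that caps out at +40%,
# stacking is additive only up to that cap.

def stacked_independent(k, gain=0.10):
    return (1 + gain) ** k - 1          # fractional lifespan gain

def stacked_shared(k, gain=0.10, cap=0.40):
    return min(cap, k * gain)           # same pathway: additive until the cap

for k in (1, 3, 5, 10):
    print(k, round(stacked_independent(k), 2), stacked_shared(k))
```

The two models diverge fast: ten independently-multiplying +10% interventions would give +159%, while ten interventions sharing one capped pathway stall at +40% - which is why "not very parallelizable" is the pessimistic reading of natural lifespan variance.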

There's also that really sharp spike at 0%, but I think we can probably put some of that down to psychology.

That's in the same ballpark as what I'd expect from placebo, but nonetheless a little higher

Or stuff like publication bias (one reason I was interested in whether a funnel plot could be formed from the data).

I don't understand what you're trying to say:

suggest that this is strange and unexpected, and cannot be deprecated enough.

It suggests - pace Algernon's law - that there is no good simple intervention.

No major evolutionary incentive to extend lifespan much beyond the point people would be likely to have died from violence or accidents or disease, so Algernon's law shouldn't necessarily apply here.

For humans, it should; while average mortality is high even in the Paleolithic, this is the usual infant mortality skew. If you make it to adulthood... Old kin are still kin and can be useful, even if only a little bit - selection can still act on that.

Sure, to a certain point. But there's a limit to how much extension one will get just from that. Assume for example that post-infancy there's a 2% annual chance of random death due to violence, disease, etc. in the native environment. Then there's about a 3/4 chance that they will be dead by age 75, assuming a roughly constant hazard. Given that, there's little evolutionary incentive to push intrinsic mortality down much past that age. This is of course a toy model, but the basic point is sound.
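The toy model's arithmetic can be checked directly: with a constant 2% annual hazard, only about 22% survive to 75, i.e. roughly 3/4 are dead by then. A two-line check, assuming the constant-hazard model above:

```python
# Constant-hazard toy model: with a 2%/year chance of death from violence,
# disease, etc., the probability of surviving to age t is 0.98**t.
annual_mortality = 0.02
survival_75 = (1 - annual_mortality) ** 75
print(round(survival_75, 2))  # → 0.22, so ~3/4 die before age 75
```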

Yikes. That image is doing horrible things to the LW layout on my browser. Maximizing the window doesn't seem to help.

ETA: Seems to be fixed now. Thanks.

Upvoted; very interesting.

[anonymous]

How interesting?

Very.

"Interventions Tend To Combine Synergistically To Extend Life Span A Little, But The Typical Improvement Is Statistically Insignificant"; another production from Kingsley's ginormous spreadsheet. Abstract:

"I used the results of previous life span experiments to test the statistical hypothesis that interventions tend to work as well individually as in combination. I included previous experimental results reporting the risk of people dying from any cause, and changes to the mean and maximum life spans of lab animals, Combinations trended toward extending life spans more in all three, but the improvements were statistically insignificant."

My comments on GRG:

"If I'm understanding this right, you're treating this as basically a vote-counting model: a single study yields a single data point of a single/multiple binary variable & a average lifespan increase % (experimental minus control).

This seems like it could be masking a huge amount of variables and relevant info. For example, do multiple intervention studies administer the same net amount of substances as the single intervention studies? If a multiple intervention studies administers 10mg of 10 substances and a single intervention administers 100mg of 1 substance, and there are increasing returns like a U-curve, then the combined additive or multiplicative effect of the multiple-interventions could equal the single intervention. Or could there be a bias in subject selection? It's probably easier to get big humans to eat multiple substances than a tiny hydra or yeast cell, and human studies don't seem to work well in general regardless of multiple vs single, so that could produce a lack of effectiveness (the multiples look ineffective, because they tend to be done in humans; and the single look effective, because smaller simpler animals will tend more to receive singles). You can probably think of a few other plausible covariates."