ChristianKl comments on Should effective altruists care about the US gov't shutdown and can we do anything? - Less Wrong

-2 Post author: Ishaan 01 October 2013 08:24PM


Comment author: ChristianKl 02 October 2013 04:49:29PM *  1 point [-]

Outside view

You might be but I'm not really.

But we are making progress in biology at a rapid rate. For example, the use of genetic markers to figure out how to treat different cancers was first proposed in the early 1990s and is now a highly successful clinical method.

That's a crude method of measuring success.

The cost of developing new drugs rises exponentially, per Eroom's law. Big Pharma constantly lays off people.

A problem like obesity grows worse over the years instead of improving. Diabetes gets worse.

Even if you say that science isn't about solving real-world issues but about knowledge, I also think that a replication rate of 11% in the case of breakthrough cancer research indicates that the field is not good at finding out what's going on.

Comment author: JoshuaZ 02 October 2013 04:58:04PM 0 points [-]

That's a crude method of measuring success.

It isn't a metric of success. It is an example, one of many in the biological sciences.

The cost of developing new drugs rises exponentially, per Eroom's law.

This is likely due more to policy and legal issues than to how biologists are thinking. Clinical trials have gotten larger.

A problem like obesity grows worse over the years instead of improving. Diabetes gets worse.

A systemic problem, but one that has even less to do with biological research than Eroom's law. Obesity is not due to a lack of theoretical underpinnings in biology.

Even if you say that science isn't about solving real-world issues but about knowledge, I also think that a replication rate of 11% in the case of breakthrough cancer research indicates that the field is not good at finding out what's going on.

The question isn't whether the field is very good. The question is whether the problems which we both agree exist are due at all to not enough theory. File drawer effects, cognitive biases, and bad experimental design are all issues here, none of which fall into that category.

Comment author: ChristianKl 02 October 2013 05:24:25PM 1 point [-]

It isn't a metric of success. It is an example, one of many in the biological sciences.

Then on what grounds do you claim that the field is successful? How would you know if it weren't?

Obesity is not due to a lack of theoretical underpinnings in biology.

I'm not saying that the field lacks theoretical underpinnings, but that the underpinnings are of bad quality.

The question isn't whether the field is very good. The question is whether the problems which we both agree exist are due at all to not enough theory. File drawer effects, cognitive biases, and bad experimental design are all issues here, none of which fall into that category.

Questions about designing experiments in a way that they produce reproducible results instead of merely significant p-values are theoretical issues.

The question is whether the problems which we both agree exist are due at all to not enough theory.

"Enough theory" sounds like an attempt to quantify the amount of theory. That's not what I advocate. Theories don't get better through an increase in their quantity. Good theoretical thinking can simplify models and result in less complex theory.

Comment author: JoshuaZ 02 October 2013 09:21:47PM -1 points [-]

Then on what grounds do you claim that the field is successful? How would you know if it weren't?

That's a good question, but in this context, seeing a variety of novel discoveries in the last few years indicates a somewhat successful field. By the same token, I'm curious what makes you think this isn't a successful field?

Questions about designing experiments in a way that they produce reproducible results instead of merely significant p-values are theoretical issues.

I've already mentioned the file drawer problem. I'm curious, do you think that is a theoretical problem? If so, this may come down in part to a very different notion of what theory means.

Theories don't get better through an increase in their quantity. Good theoretical thinking can simplify models and result in less complex theory.

You seem to be treating biology to some extent like it is physics. But these are complex systems. What makes you think that such approaches will be at all successful?

Comment author: ChristianKl 03 October 2013 11:21:43AM *  1 point [-]

That's a good question, but in this context, seeing a variety of novel discoveries in the last few years indicates a somewhat successful field. By the same token, I'm curious what makes you think this isn't a successful field?

The fact that Big Pharma has to lay off a lot of scientists is a real-world indication that the model of finding a drug target, screening thousands of compounds against it, running those compounds through clinical trials to find out whether they are any good, and then coming out with drugs that cure important illnesses at the other end has stopped producing results. Eroom's law.

I've already mentioned the file drawer problem. I'm curious, do you think that is a theoretical problem?

Saying that there's a file drawer problem is quite easy. That, however, is not a solution. I think your problem is that you can't imagine a theory that would solve the problem. That's typical. If it were easy to imagine a theoretical breakthrough beforehand, it wouldn't be much of a breakthrough.

Look at the theoretical breakthrough of moving from a model of numbers as IV+II=VI to 4+2=6. If you had talked with Pythagoras, he probably couldn't have imagined a theoretical breakthrough like that.

You seem to be treating biology to some extent like it is physics. But these are complex systems. What makes you think that such approaches will be at all successful?

I don't. I don't know much about physics. Paleo/Quantified Self people found the effect of taking Vitamin D in the morning through phenomenology. The community is relatively small, and the amount of work that's invested into the theoretical underpinnings is small.

From my exposure to the field of biology from various angles, I think there are a lot of areas where things aren't clear and there is room for improvement at the level of epistemology and ontology.

I just recently preordered two Angel sensors from the crowdfunding website Indiegogo. I think that the money the company gets will do much more to advance medicine than the average NIH grant.

Comment author: JoshuaZ 03 October 2013 12:52:34PM 0 points [-]

The fact that Big Pharma has to lay off a lot of scientists is a real-world indication that the model of finding a drug target, screening thousands of compounds against it, running those compounds through clinical trials to find out whether they are any good, and then coming out with drugs that cure important illnesses at the other end has stopped producing results.

This seems like extremely weak evidence. Diminishing marginal returns are common in many areas. For example, engineering better trains happened a lot in the second half of the 19th century and the early 20th century. That slowed down, not because of some lack of theoretical background, but because the technology reached maturity. Now, improvements in train technology do occur, but slowly.

Saying that there's a file drawer problem is quite easy. That, however, is not a solution. I think your problem is that you can't imagine a theory that would solve the problem. That's typical. If it were easy to imagine a theoretical breakthrough beforehand, it wouldn't be much of a breakthrough.

On the contrary. We have ways of handling the file drawer problem, and they aren't theory-based. Pre-registration of studies works. It isn't even clear to me what it would mean to have a theoretical solution to the file drawer problem, given that it is a problem about scientific culture, a type of problem that exists in any field. It makes about as much sense to talk about how having better theory could somehow solve Type I errors.

Look at the theoretical breakthrough of moving from a model of numbers as IV+II=VI to 4+2=6. If you had talked with Pythagoras, he probably couldn't have imagined a theoretical breakthrough like that.

The ancient Greeks used the Babylonian number system and the Greek system. They did not use Roman numerals.

Comment author: ChristianKl 03 October 2013 03:16:20PM -1 points [-]

It isn't even clear to me what it would mean to have a theoretical solution to the file drawer problem, given that it is a problem about scientific culture, a type of problem that exists in any field.

The file drawer problem is about an effect. If you can estimate exactly how large that effect is, then when you look at the question of whether to take a certain drug, you have solved the problem, because you can just run the numbers.
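
To make the "run the numbers" idea concrete, here is a minimal Python sketch, not from the thread: the publication rule (only results with z > 1.96 get published) and all effect sizes are invented for illustration. If you can model the selection process exactly, a maximum-likelihood fit to the truncated distribution recovers the true effect from the published results alone.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)
true_effect, se = 0.1, 0.2   # hypothetical true drug effect and per-study SE
crit = 1.96 * se             # significance cutoff that gates publication

# Simulate many studies, but "publish" only the significant ones.
estimates = rng.normal(true_effect, se, 100_000)
published = estimates[estimates > crit]

def neg_loglik(mu):
    # Likelihood of the published estimates under a normal truncated at `crit`.
    log_pdf = stats.norm.logpdf(published, mu, se)
    log_pub_prob = stats.norm.logsf(crit, mu, se)   # P(a study gets published)
    return -(log_pdf - log_pub_prob).sum()

naive = published.mean()     # biased upward by the file drawer
corrected = optimize.minimize_scalar(neg_loglik, bounds=(-1, 1),
                                     method="bounded").x
print(f"naive published mean: {naive:.3f}, corrected MLE: {corrected:.3f}")
```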

On the contrary. We have ways of handling the file drawer problem, and they aren't theory-based. Pre-registration of studies works.

The concept of the file drawer problem first appeared in 1976, if I can trust Google Ngrams.

How much money do you think it cost to run the experiments to come up with the concept of the file drawer problem and the concept of pre-registration of studies? I don't think that's knowledge that got created by running expensive experiments. It came from people engaging in theoretical thinking.

It makes about as much sense to talk about how having better theory could somehow solve Type I errors.

Type I errors are a feature of frequentist statistics. If you don't use null hypotheses, you don't make Type I errors. Bayesians don't make Type I errors because they don't have null hypotheses.

Comment author: Lumifer 03 October 2013 03:55:10PM 4 points [-]

Type I errors are a feature of frequentist statistics. If you don't use null hypotheses, you don't make Type I errors. Bayesians don't make Type I errors because they don't have null hypotheses.

LOL. That's, um, not exactly true.

Let's take a new drug trial. You want to find out whether the drug has certain (specific, detectable) effects. Could you please explain how a Bayesian approach to the results of the trial would make it impossible to make a Type I error, that is, a false positive: decide that the drug does have effects while in fact it does not?

Comment author: ChristianKl 03 October 2013 04:08:22PM -1 points [-]

Let's take a new drug trial. You want to find out whether the drug has certain (specific, detectable) effects.

I don't. A real Bayesian doesn't. The Bayesian wants to know the probability with which the drug will improve the well-being of a patient.

The output of a Bayesian analysis isn't a truth value but a probability.
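
As a minimal sketch of that framing (the trial counts below are made up for illustration): rather than accepting or rejecting a null hypothesis, the analysis ends in a posterior probability that the drug beats placebo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical trial outcomes: patients who improved out of patients treated.
drug_improved, drug_n = 34, 50
placebo_improved, placebo_n = 25, 50

# With Beta(1, 1) priors on each improvement rate, the posterior is
# Beta(1 + successes, 1 + failures); sample both posteriors.
drug_post = rng.beta(1 + drug_improved, 1 + drug_n - drug_improved, 100_000)
placebo_post = rng.beta(1 + placebo_improved,
                        1 + placebo_n - placebo_improved, 100_000)

# The output is a probability, not an accept/reject verdict.
print("P(drug improvement rate > placebo's):", (drug_post > placebo_post).mean())
```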

Comment author: Lumifer 03 October 2013 04:19:59PM 4 points [-]

The output of a Bayesian analysis isn't a truth value but a probability.

So is the output of a frequentist analysis.

However real life is full of step functions which translate probabilities into binary decisions. The FDA needs to either approve the drug or not approve the drug.

Saying "I will never make a Type I error because I will never make a hard decision" doesn't look good as evidence for the superiority of Bayes...

Comment author: gwern 03 October 2013 04:35:09PM 1 point [-]

How much money do you think it cost to run the experiments to come up with the concept of the file drawer problem and the concept of pre-registration of studies? I don't think that's knowledge that got created by running expensive experiments. It came from people engaging in theoretical thinking.

The earliest citation in the Rosenthal paper that coined the term 'file drawer' is to a 1959 paper by one Theodore Sterling; I jailbroke this to "Publication Decisions and Their Possible Effects on Inferences Drawn from tests of Significance - or Vice Versa".

After some background about NHST on page 1, Sterling immediately begins tallying tests of significance in a year's worth of 4 psychology journals on page 2, and discovers that, e.g., of 106 tests, 105 rejected the null hypothesis. On page 3, he discusses how this bias could come about.

So at least in this very early discussion of publication bias, it was driven by people engaged in empirical thinking.
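
A toy simulation of the mechanism Sterling tallied (all parameters invented): when only significant results escape the file drawer, the published record looks uniformly positive, and a substantial share of it consists of false positives.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, n_per_group, alpha = 5000, 30, 0.05
prior_true, effect_size = 0.10, 0.5   # assume 10% of tested hypotheses are real

published_true = published_false = 0
for _ in range(n_tests):
    real = rng.random() < prior_true
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(effect_size if real else 0.0, 1.0, n_per_group)
    if stats.ttest_ind(a, b).pvalue < alpha:   # only significant results appear
        if real:
            published_true += 1
        else:
            published_false += 1

total = published_true + published_false
print(f"published: {total} studies, all significant; "
      f"{published_false / total:.0%} are false positives")
```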

Comment author: ChristianKl 04 October 2013 10:14:36AM *  0 points [-]

After some background about NHST on page 1, Sterling immediately begins tallying tests of significance in a year's worth of 4 psychology journals on page 2, and discovers that, e.g., of 106 tests, 105 rejected the null hypothesis. On page 3, he discusses how this bias could come about.

I think doing a literature review is engaging in using other people's data. For the sake of this discussion, JoshuaZ claimed that Einstein was doing theoretical work when he worked with other people's data.

If I want to draw information from a literature review to gather insights, I don't need expensive equipment. JoshuaZ claimed that you need expensive equipment to gather new insights in biology. I claim that's not true. I claim that there is enough published information that's not well organised into theories that you can make major advances in biology without needing to buy any equipment.

As far as I understand, you don't run experiments on participants to see whether Dual 'n' back works. You simply gather Dual 'n' back data from other people and try doing it yourself to know what it feels like. That's not expensive. You don't need to write large grants to get a lot of money to do that kind of work.

You do need some money to pay your bills. Einstein made that money by being a patent clerk. I don't know how you make your money to live. Of course you don't have to tell, and I respect it if that's private information.

For all I know, you could be making money by being a patent clerk like Einstein.

A scientist who can't work on his grant projects because of the government shutdown could use his free time to do the kind of work that you are doing.

If you don't like the label "theoretic", that's fine. If you want to propose a different label that distinguishes your approach from the fancy-expensive-experiments approach, I'm open to using another label.

I think in the last decades we have had an explosion in the amount of data in biology. I think that organising that data into theories lags behind. I think it takes less effort to advance biology by organising data into theories and doing a bit of phenomenology than by pushing for more expensive-equipment-produced knowledge.

If I phrase it that way, would you agree?

Comment author: gwern 04 October 2013 03:30:29PM *  0 points [-]

I claim that there is enough published information that's not well organised into theories that you can make major advances in biology without needing to buy any equipment.

This can be true but also suboptimal. I'm sure that given enough cleverness and effort, we could extract a lot of genetic causes out of existing SNP databases - but why bother when we can wait a decade and sequence everyone for $100 a head? People aren't free, and equipment both complements and substitutes for them.

As far as I understand, you don't run experiments on participants to see whether Dual 'n' back works. You simply gather Dual 'n' back data from other people and try doing it yourself to know what it feels like. That's not expensive. You don't need to write large grants to get a lot of money to do that kind of work.

I assume you're referring to my DNB meta-analysis? Yes, it's not gathering primary data - I did think about doing that early on, which is why I carefully compiled all anecdotes mentioning IQ tests in my FAQ, but I realized that between the sheer heterogeneity, lack of a control group, massive selection effects, etc, the data was completely worthless.

But I can only gather the studies into a meta-analysis because people are running these studies. And I need a lot of data to draw any kind of conclusion. If n-back studies had stopped in 2010, I'd be out of luck, because with the studies up to 2010, I can exclude zero as the net effect, but I can't make a rigorous statement about the effect of passive vs active control groups. (In fact, it's only with the last 3 or 4 studies that the confidence intervals for the two groups stopped overlapping.) And these studies are expensive. I'm corresponding with one study author to correct the payment covariate, and it seems that on average participants were paid $600 - and there were 40, so they blew $24,000 just on paying the subjects, never mind paying for the MRI machine, the grad students, the professor time, publication, etc. At this point, the total cost of the research must be well into the millions of dollars.

It's true that it's a little irritating that no one has published a meta-analysis on DNB, and that it's not that difficult for a random person like myself to do it, since it requires little in the way of resources - but that doesn't change the fact that I still needed these dozens of professionals to run all these very expensive experiments to provide grist for the mill.
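
For readers unfamiliar with the mechanics, the kind of subgroup comparison described above can be sketched in a few lines of Python; the effect sizes and standard errors here are invented placeholders, not the actual DNB data.

```python
import numpy as np

def pool(effects, ses):
    """Fixed-effect (inverse-variance) pooled estimate with a 95% CI."""
    w = 1 / np.asarray(ses) ** 2
    est = np.sum(w * np.asarray(effects)) / np.sum(w)
    se = np.sqrt(1 / np.sum(w))
    return est, (est - 1.96 * se, est + 1.96 * se)

# Placeholder standardized mean differences (d) and standard errors.
passive = pool([0.6, 0.5, 0.7, 0.4], [0.20, 0.25, 0.30, 0.22])
active = pool([0.1, 0.0, 0.2, -0.1], [0.21, 0.24, 0.28, 0.25])
print("passive-control pooled d, 95% CI:", passive)
print("active-control pooled d, 95% CI:", active)
# Non-overlapping intervals here would mirror the observation that the
# apparent n-back gain shrinks once studies use active control groups.
```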

To go way up to Einstein, he was drawing on a lot of expensive data like that which showed the Mercury anomaly, and then was verified by very expensive data (I shudder to think how much those expeditions must have cost in constant dollars). Without that data, he would just be another... string theorist. Not Einstein.

You do need some money to pay your bills. Einstein made that money by being a patent clerk. I don't know how you make your money to live. Of course you don't have to tell, and I respect it if that's private information. For all I know, you could be making money by being a patent clerk like Einstein.

Not by being a patent clerk, no. :)

A scientist who can't work on his grant projects because of the government shutdown could use his free time to do the kind of work that you are doing.

To a very limited extent. There have to be enough studies to productively review, and there have to be no existing reviews you're duplicating. To give another example: suppose I had been furloughed and wanted to work on a creatine meta-analysis. I get as far as I've gotten now - not that hard, maybe 10 hours of work - and I realize there are only 3 studies. Now what? Well, what I am doing is waiting a few months for 2 scientists to reply, and then I'll wait another 5 or 10 years for governments to fund more psychology studies which happen to use creatine. But in no way can I possibly "finish" this even given months of government-shutdown-time.

I think in the last decades we have had an explosion in the amount of data in biology. I think that organising that data into theories lags behind. I think it takes less effort to advance biology by organising data into theories and doing a bit of phenomenology than by pushing for more expensive-equipment-produced knowledge.

I don't think that's a stupid or obviously incorrect claim, but I don't think it's right. Equipment is advancing fast (if not always as fast as my first example of genotyping/sequencing), so it'd be surprising to me if you could do more work by ignoring potential new data and reprocessing old work, and more generally, even though stuff like meta-analysis is accessible to anyone for free (case in point: myself), we don't see anyone producing any impressive discoveries. Case in point: more than a few researchers already believed n-back might be an artifact of the control groups before I started my meta-analysis - my results are a welcome confirmation, not a novel discovery; or to use your vitamin D example, yes, it's cool that we found an effect of vitamin D on sleep (I certainly believe it), but the counterfactual of "QS does not exist" is not "vitamin D's effect on sleep goes unknown" but "Gominak discovers the effect on her patients and publishes a review paper in 2012 arguing that vitamin D affects sleep".

Comment author: Eugine_Nier 04 October 2013 03:25:00AM 0 points [-]

That's a good question, but in this context, seeing a variety of novel discoveries in the last few years indicates a somewhat successful field.

No, seeing a bunch of novel true discoveries indicates a successful field. However, it's normally hard to independently verify the truth of novel discoveries except in cases where those discoveries have applications.

Comment author: JoshuaZ 04 October 2013 03:43:52AM -1 points [-]

This seems like a nitpick more than a serious remark: obviously one is talking about the true discoveries, and giving major examples of them in biology is not at all difficult. The discovery of RNA interference is on the biochem end of things, while a great number of discoveries have occurred in paleontology, as well as in using genetics to trace population migrations (both human and non-human).

it's normally hard to independently verify the truth of novel discoveries except in cases where those discoveries have applications.

So one question here is: for what types of discoveries is your prior high that the discovery is bogus? And how will you tell? General skepticism probably makes sense for a lot of medical "breakthroughs", but there's a lot of biology other than those.

Comment author: gwern 04 October 2013 02:55:11PM *  0 points [-]

Even if you say that science isn't about solving real-world issues but about knowledge, I also think that a replication rate of 11% in the case of breakthrough cancer research indicates that the field is not good at finding out what's going on.

I don't think a flat replication rate of 11% tells us anything without recourse to additional considerations. It's sort of like a Umeshism: if your experiments are not routinely failing, you aren't really experimenting. The best we can say is that 0% and 100% are both suboptimal...

For example, if I was told that anti-aging research was having a 11% replication rate for its 'stopping aging' treatments, I would regard this as shockingly too high and a collective crime on par with the Nazis, and if anyone asked me, would tell them that we need to spend far far more on anti-aging research because we clearly are not trying nearly enough crazy ideas. And if someone told me the clinical trials for curing balding were replicating at 89%, I would be a little uneasy and wonder what side-effects we were exposing all these people to.

(Heck, you can't even tell much about the quality of the research from just a flat replication rate. If the prior odds are 1 in 10,000, then 11% looks pretty damn good. If the prior odds are 1 in 5, pretty damn bad.)
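
That parenthetical can be made quantitative with a standard Ioannidis-style calculation; the alpha and power values below are illustrative assumptions, not figures from the thread. Under these assumptions, an 11% replication rate is roughly what you would expect if about one tested hypothesis in a hundred or two is true.

```python
def expected_replication_rate(prior, alpha=0.05, power=0.8):
    """P(an exact replication is significant | the original was significant)."""
    # Positive predictive value: P(hypothesis true | original significant).
    ppv = power * prior / (power * prior + alpha * (1 - prior))
    # A true effect replicates with ~power, a false positive with ~alpha.
    return ppv * power + (1 - ppv) * alpha

for prior in (1 / 10_000, 1 / 100, 1 / 5):
    print(f"prior P(true) = {prior:.4f}: expected replication rate ~ "
          f"{expected_replication_rate(prior):.0%}")
```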

What I would accept as a useful invocation of an 11% rate is, say, an economic analysis of the benefits showing that this represents over-investment (for example, falling pharmacorp share prices) or surprise by planners/scientists/CEOs/bureaucrats where they had held more optimistic assumptions (and so investment is likely being wasted). That sort of thing.

Comment author: Lumifer 04 October 2013 04:09:10PM 1 point [-]

Replication rate of experiments is quite different from the success rate of experiments.

An 11% success rate is often shockingly high. An 11% replication rate means the researchers are sloppy, value publishing over confidence in the results, and likely do way too much throwing of spaghetti at the wall...

Comment author: gwern 04 October 2013 04:49:03PM *  0 points [-]

Even granting your distinction, the exact same argument still applies: just substitute in an additional rate of, say, a 10% chance of going from replication to whatever you choose to define as 'success'. You cannot say that an 11% replication rate and then a 1.1% success rate is optimal - or suboptimal - without doing more intellectual work!

Comment author: Lumifer 04 October 2013 05:02:07PM *  2 points [-]

No, I don't think so. An 11% replication rate means that 89% of the published results are junk and external observers have no problem seeing that. Which implies that if those who published them were a bit more honest/critical/responsible, they should have been able to do a better job of controlling for the effects which led them to think there's statistical significance when in fact there's none.

If the prior odds are 1:10,000, you have no business publishing results at a 0.05 confidence level.

Comment author: gwern 04 October 2013 05:39:46PM 0 points [-]

An 11% replication rate means that 89% of the published results are junk and external observers have no problem seeing that.

Yes, so? As Edison said, I have discovered 999 ways to not build a lightbulb.

Which implies that if those who published them were a bit more honest/critical/responsible, they should have been able to do a better job of controlling for the effects which led them to think there's statistical significance when in fact there's none.

Huh? No. As I already said, you cannot go from replication rate to judgment of the honesty, competency, or insight of researchers without additional information. Most obviously, it's going to be massively influenced by the prior odds of the hypotheses.

If the prior odds are 1:10,000, you have no business publishing results at a 0.05 confidence level.

No one has any business publishing at an arbitrary confidence level, which should be chosen with respect to some even half-assed decision analysis. 1:10,000 or 1:1000, doesn't matter.

Comment author: Lumifer 04 October 2013 06:23:43PM 1 point [-]

As Edison said, I have discovered 999 ways to not build a lightbulb.

You're still ignoring the difference between a failed experiment and a failed replication.

Edison did not publish 999 papers each of them claiming that this is the way to build the lightbulb (at p=0.05).

you cannot go from replication rate to judgment of the honesty, competency, or insight of researchers without additional information. Most obviously, it's going to be massively influenced by the prior odds of the hypotheses.

And what exactly prevents the researchers from considering the prior odds when they are trying to figure out whether their results are really statistically significant?

I disagree with you -- if a researcher consistently publishes research that cannot be replicated I will call him a bad researcher.

Comment author: gwern 04 October 2013 06:45:08PM 0 points [-]

You're still ignoring the difference between a failed experiment and a failed replication. Edison did not publish 999 papers each of them claiming that this is the way to build the lightbulb (at p=0.05).

So? What does this have to do with my point about optimizing return from experimentation?

And what exactly prevents the researchers from considering the prior odds when they are trying to figure out whether their results are really statistically significant?

Nothing. But no one does that because to point out that a normal experiment has resulted in a posterior probability of <5% is not helpful since that could be said of all experiments, and to run a single experiment so high-powered that it could single-handedly overcome the prior probability is ludicrously wasteful. You don't run a $50m clinical trial enrolling 50,000 people just because some drug looks interesting.

I disagree with you -- if a researcher consistently publishes research that cannot be replicated I will call him a bad researcher.

Too bad. You should get over that.

Comment author: Lumifer 04 October 2013 07:17:32PM *  1 point [-]

I think our disagreement comes (at least partially) from different views on what publishing research means.

I see your position as looking on publishing as something like "We did A, B, and C. We got the results X and Y. Take it for what it is. The end."

I'm looking on publishing more like this: "We did multiple experiments which did not give us the magical 0.05 number so we won't tell you about them. But hey, try #39 succeeded and we can publish it: we did A39, B39, and C39 and got the results X39 and Y39. The results are significant so we believe them to be meaningful and reflective of actual reality. Please give our drug to your patients."

The realities of scientific publishing are unfortunate (and yes, I know of efforts to ameliorate the problem in medical research). If people published all their research ("We did 50 runs with the following parameters, all failed, sure #39 showed statistical significance but we don't believe it") I would have zero problems with it. But that's not how the world currently works.

P.S. By the way, here is some research which failed replication (via this)

Comment author: gwern 04 October 2013 08:09:32PM 0 points [-]

The realities of scientific publishing are unfortunate (and yes, I know of efforts to ameliorate the problem in medical research). If people published all their research ("We did 50 runs with the following parameters, all failed, sure #39 showed statistical significance but we don't believe it") I would have zero problems with it. But that's not how the world currently works.

That would be a better world. But in this world, it would still be true that there is no universal, absolute, optimal percentage of experiments failing to replicate, and the optimal percentage is set by decision-theoretic/economic concerns.