The Decline Effect and the Scientific Method (article @ the New Yorker)

First, as a physicist, I do have to point out that this article concerns mainly softer sciences, e.g. psychology, medicine, etc.

A summary of explanations for this effect:

  • "The most likely explanation for the decline is an obvious one: regression to the mean. As the experiment is repeated, that is, an early statistical fluke gets cancelled out."
  • "Jennions, similarly, argues that the decline effect is largely a product of publication bias, or the tendency of scientists and scientific journals to prefer positive data over null results, which is what happens when no effect is found."
  • "Richard Palmer... suspects that an equally significant issue is the selective reporting of results—the data that scientists choose to document in the first place. ... Palmer emphasizes that selective reporting is not the same as scientific fraud. Rather, the problem seems to be one of subtle omissions and unconscious misperceptions, as researchers struggle to make sense of their results."
  • "According to Ioannidis, the main problem is that too many researchers engage in what he calls “significance chasing,” or finding ways to interpret the data so that it passes the statistical test of significance—the ninety-five-per-cent boundary invented by Ronald Fisher. ... The current “obsession” with replicability distracts from the real problem, which is faulty design."
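To make the first two explanations concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration, not something taken from the article: the function name `simulate_decline`, the noise model (unit-variance Gaussian observations), the sample sizes, and the publication threshold are all hypothetical. It assumes the true effect is exactly zero, lets many labs run the same small study, "publishes" only the estimates that clear a threshold, and then replicates each published study once with no filter.

```python
import random
import statistics


def simulate_decline(true_effect=0.0, n_per_study=20, n_labs=2000,
                     threshold=0.4, seed=0):
    """Toy model of publication bias followed by regression to the mean.

    All parameters are illustrative assumptions, not values from the article.
    Returns (mean published estimate, mean replication estimate).
    """
    rng = random.Random(seed)

    def run_study():
        # One study: the sample mean of n noisy observations of the effect.
        samples = [rng.gauss(true_effect, 1.0) for _ in range(n_per_study)]
        return statistics.fmean(samples)

    # Round 1: every lab runs the study, but only large estimates get published.
    first_round = [run_study() for _ in range(n_labs)]
    published = [estimate for estimate in first_round if estimate > threshold]

    # Round 2: each published finding is replicated once, reported regardless.
    replications = [run_study() for _ in published]

    return statistics.fmean(published), statistics.fmean(replications)


original, replication = simulate_decline()
print(f"mean published effect:   {original:.2f}")
print(f"mean replication effect: {replication:.2f}")
```

Under these assumptions the published estimates average above the threshold by construction, while the replications average close to zero: the apparent effect "declines" even though nothing about the world changed.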

These are problems with how the scientific method is applied, not with the principle of the method itself. Certainly, it's important to address them. I think the reason they appear so often in the softer sciences is that biological entities are enormously complex, so higher-level ideas that make large generalizations are more susceptible to random error and statistical anomalies, as well as to personal bias, conscious and unconscious.

For those who haven't read it, take a look at Richard Feynman on cargo cult science if you want a good lecture on experimental design.


For those who haven't read it, take a look at Richard Feynman on cargo cult science if you want a good lecture on experimental design.

I loved it. I have a question for anyone who might know: In that 1974 speech, Richard Feynman made a very specific criticism of experimental psychology. He mentioned an "a-number-one experiment" on lab rats running through a maze by a "Mr. Young" in 1937, which corrected for a hugely non-intuitive experimental design error. But then, according to Feynman:

The next experiment, and the one after that, never referred to Mr. Young. They never used any of his criteria of putting the corridor on sand, or being very careful. They just went right on running rats in the same old way, and paid no attention to the great discoveries of Mr. Young, and his papers are not referred to, because he didn't discover anything about the rats.

Who was "Mr. Young?" Did Richard Feynman succeed in drawing attention to this problem within the field of experimental psychology? Has Mr. Young been cited in any papers since 1974?

There may be a reason that more detailed Feynman references haven't been found: copyright. His papers are difficult to access (by appointment only to those "conducting research for which it is necessary"), not digitized, and you're not allowed to reproduce them without permission from the Caltech ('California Institute of Technology') archives and his heirs. Thus, everything in their archives is apparently still on paper & undigitized, in this year of our lord 2023, despite his being one of the most important & popular Nobelist physicists of all time, whose apparently quite comprehensive papers* are stored in the archives of one of the wealthiest & most high-tech institutes in the world. As usual, copyright is why we can't have nice things.

So, it's entirely possible that 'Young' would pop up almost immediately if you went to LA and started browsing.

(I found this out while idly rechecking the citations about Feynman's IQ being 'only 125' or whatever. I noticed that Gleick did provide a citation for that in the endnotes, and simply hadn't included any body indication of that - I hate that style of citation for precisely this reason - and I noticed it by accident. He cites a speech Feynman gave to his old high school, for which the Caltech archive has both transcript & audio - but you aren't allowed access to it. Which is probably why only two authors appear to have ever cited the speech, despite it sounding like a major source on Feynman's childhood; presumably they are the only ones to have made the pilgrimage. So, given the difficulty of accessing Feynman's papers, the complete absence of any publicly-known information about Young tells us little about what the papers might contain.)

* Source is Feynman himself, regularly updating the corpus, so thickly covering the post-war period and presumably especially before 1974:

The Richard Phillips Feynman Papers were given to Caltech by Richard Feynman and Gweneth Feynman in two main installments.

The first group of papers, now boxes 1-20 of the collection, was donated by Richard Feynman himself beginning in 1968, with additions later. It contains materials dating from about 1933 to 1970. The second group occupies boxes 21-90. It was given to Caltech by Feynman's widow Gweneth early in 1989. Group 2 contains papers primarily from the 1970s and 1980s, although some older material is present. Supplements since 1994 occupy three boxes and have come from various donors outside the Feynman family.

We've found the source of the story. (The full background turns out to be much too much of a rabbit-hole to be a comment, however, or even a post.)

Extensive searching does not turn up a single result for such a person or study. The only result I can find that seems in the right vein is on a different aspect of rat behavior. What turned up more than anything else were quotations of the essay itself.

Apparently there is some doubt as to whether he actually exists.

The study mentioned in your first link is most likely "Preferences and demands of the white rat for food", by Paul Thomas Young. This paper includes a section tantalisingly named "Spatial factors in the feeding behavior of rats", which turns out not to be related to Feynman's story:

Eight rats showed marked individual differences in spatial behavior between the extremes of right and left dominance. The tendency of an animal to eat the test-food in a given position, right or left, frequently appears instead of preferential discrimination. Two attempts were made to control the factor of position.

  1. The path of approach to the test-foods was moved bit by bit to the right or left.

  2. The angular position of the pair of food-containers in relation to the line of approach was varied so as to advance one and withdraw the other.

Both methods gave the same result. It was possible to reduce, eliminate and even reverse the tendency of an animal to eat the food in a given right or left position. When the spatial advantage of both foods was the same, the conditions were most favorable for discovering the rat's food preferences. The animal could not, however, be forced into making a choice. Preference is assumed to depend upon organic factors rather than upon the environmental arrangement of test-foods.

Young's biographical note does not include a list of papers, but his Distinguished Scientific Contribution Award note does:

For his lifelong study of hedonic processes in behavior. Convinced of their significance for psychology, he endeavored to give objective reference and experimental validity to the concept. Although hedonic theorizing ran counter to the prevailing temper, he persisted in the belief that the control of behavior must be analyzed for affective value as well as intensity value. His research on preference showed the effect of experience in modifying acceptability; his work on need-free organisms clarified acceptance and appetitive behavior. Most recently, he has been examining composite stimuli and preference. Current renewed interest in hedonic theory rests in good measure on his experimental demonstrations and theoretical arguments.

(continued below)

Your second link was more than I came up with. Thanks!

Here's a list of Young's scientific publications. I've highlighted five with titles which, to me, seemed relevant to the question.

1918

  • An experimental study of mixed feelings. American Journal of Psychology, 29, 237-271.

  • The localization of feeling. American Journal of Psychology, 29, 420-430.

  • Tunable bars, and some demonstrations with a simple bar and a stethoscope. Psychological Bulletin, 15, 293-300.

1921

  • Pleasantness and unpleasantness in relation to organic response. American Journal of Psychology, 32, 38-53.

  • The vibrations of a tuning fork. Science, 54, 604-605.

1922

  • Movements of pursuit and avoidance as expressions of simple feeling. American Journal of Psychology, 33, 511-526.

  • Series of difference tones obtained from tunable bars. American Journal of Psychology, 33, 385-393.

1923

  • Constancy of affective judgment to odors. Journal of Experimental Psychology, 6, 182-191.

  • A differential color mixer with stationary disks. Journal of Experimental Psychology, 6, 323-343.

1924

  • The phenomenological point of view. Psychological Review, 31, 288-296.

1925

  • The coexistence and localization of feelings. British Journal of Psychology, 15, 356-362.

  • The phenomena of organic set. Psychological Review, 32, 472-478.

1927

  • With R. Gundlach & D. A. Rothschild. A test and analysis of set. Journal of Experimental Psychology, 10, 247-280.

  • An analysis of observation in the field of affective psychology. In, Proceedings and Papers of the VIIIth International Congress of Psychology. Groningen: Noordhoff.

  • Studies in affective psychology: I. The localization and spatial character of pleasantness and unpleasantness. American Journal of Psychology, 38, 157-167.

  • Studies in affective psychology: II. The case for the affective processes. American Journal of Psychology, 38, 167-175.

  • Studies in affective psychology: III. The "trained" observer in affective psychology. American Journal of Psychology, 38, 175-185.

  • Studies in affective psychology: IV. The logic of affective psychology. American Journal of Psychology, 38, 186-189.

  • Studies in affective psychology: V. The framework of psychology. American Journal of Psychology, 38, 189-193.

1928

  • Auditory localization with acoustical transposition of the ears. Journal of Experimental Psychology, 11, 399-429.

  • Class-room demonstration of double images. American Journal of Psychology, 40, 497.

  • Studies in affective psychology: VI. Preferential discrimination of the white rat for different kinds of grain. American Journal of Psychology, 40, 372-394.

  • Studies in affective psychology: VII. Conflict of movement in relation to unpleasant feeling. American Journal of Psychology, 40, 394-400.

1930

  • Studies in affective psychology: VIII. The scale of values method. American Journal of Psychology, 42, 17-27.

  • Studies in affective psychology: IX. The point of view of affective psychology. American Journal of Psychology, 42, 27-35.

  • Studies in affective psychology: X. Some general conclusions. American Journal of Psychology, 42, 35-37.

1931

  • The role of head movements in auditory localization. Journal of Experimental Psychology, 14, 95-124.

  • Sex differences in handwriting. Journal of Applied Psychology, 15, 486-498.

  • With W. L. Morgan & E. H. Kniep. Studies in affective psychology: XI. Individual differences in affective reaction to odors. American Journal of Psychology, 43, 406-414.

  • With W. L. Morgan & E. H. Kniep. Studies in affective psychology: XII. The relation between age and affective reaction to odors. American Journal of Psychology, 43, 414-421.

1932

  • The relation of bright and dull pressure to affectivity. American Journal of Psychology, 44, 780-784.

  • Relative food preferences of the white rat. Journal of Comparative Psychology, 14, 297-319.

1933

  • Relative food preferences of the white rat. II. Journal of Comparative Psychology, 15, 149-165.

  • Food preferences and the regulation of eating. Journal of Comparative Psychology, 15, 167-176.

  • With R. K. Compton. A study of organic set: Immediate reproduction of spatial patterns presented by successive points to different senses. Journal of Experimental Psychology, 16, 775-797.

  • Memory for pleasant, unpleasant, and indifferent pairs of words. Journal of Experimental Psychology, 16, 454-455.

  • Motivation of human and animal behavior. Ann Arbor, Mich.: Edwards Brothers.

1936

  • Motivation of behavior, the fundamental determinants of human and animal activity. New York: Wiley.

1937

  • A group experiment upon the affective reaction to odors. American Journal of Psychology, 49, 277-286.

  • Is cheerfulness-depression a general temperamental trait? Psychological Review, 44, 313-319.

  • Laughing and weeping, cheerfulness and depression: A study of moods among college students. Journal of Social Psychology, 8, 311-334.

  • A study upon the recall of pleasant and unpleasant words. American Journal of Psychology, 49, 581-596.

  • Reversal of auditory localization. Psychological Review, 44, 505-520.

1938

  • With W. F. Thomas. Liking and disliking persons. Journal of Social Psychology, 9, 169-188.

  • Preferences and demands of the white rat for food. Journal of Comparative Psychology, 26, 545-588.

  • An hypothesis concerning the mechanism of appetite. Psychological Bulletin, 35, 716-717.

1940

  • Reversal of food preferences of the white rat through controlled pre-feeding. Journal of General Psychology, 22, 33-66.

  • With J. R. Wittenborn. Food preferences of rachitic and normal rats. Journal of Comparative Psychology, 30, 261-276.

  • Emotion in man and animal, a psychological interpretation. Ann Arbor, Mich.: Edwards Brothers.

(continued below)

1941

  • The experimental analysis of appetite. Psychological Bulletin, 38, 129-164.

  • With W. B. Singer. Studies in affective reaction: I. A new affective rating-scale. Journal of General Psychology, 24, 281-301.

  • With W. B. Singer. Studies in affective reaction: II. Dependence of affective ratings upon the stimulus-situation. Journal of General Psychology, 24, 303-325.

  • With W. B. Singer. Studies in affective reaction: III. The specificity of affective reactions. Journal of General Psychology, 24, 327-341.

  • Motivation. In W. S. Monroe (Ed.), Encyclopedia of educational research. New York: Macmillan.

1942

  • With W. F. Thomas. A study of organic set: Immediate reproduction, by different muscle groups, of patterns presented by successive visual flashes. Journal of Experimental Psychology, 30, 347-367.

1943

  • Emotion in man and animal, its nature and relation to attitude and motive. New York: Wiley.

1944

  • Studies of food preference, appetite and dietary habit: I. Running activity and dietary habit of the rat in relation to food preference. Journal of Comparative Psychology, 37, 327-370.

  • Studies of food preference, appetite and dietary habit: II. Group self-selection maintenance as a method in the study of food preferences. Journal of Comparative Psychology, 37, 371-391.

  • Food Preferences, food habits, and appetites of the rat. Report, Feb. 13, National Research Council, Committee on Food Habits, Washington, D. C.

1945

  • With J. P. Chaplin. Studies of food preference, appetite and dietary habit: III. Palatability and appetite in relation to bodily need. Comparative Psychology Monographs, 18, No. 3.

  • Studies of food preference, appetite and dietary habit: IV. The balance between hunger and thirst. Journal of Comparative Psychology, 38, 135-174.

  • Studies of food preference, appetite and dietary habit: V. Techniques for testing food preference and the significance of results obtained with different methods. Comparative Psychology Monographs, 19, No. 1.

  • With H. B. Carlson & R. P. Fischer. Improvement in elementary psychology as related to intelligence. Psychological Bulletin, 42, 27-34.

1946

  • Studies of food preference, appetite and dietary habit: VI. Habit, palatability and the diet as factors regulating the selection of food by the rat. Journal of Comparative Psychology, 39, 139-176.

  • With J. A. Yavitz. Activities in which college students experience success and failure and those in which they wish to be more successful. Journal of Social Psychology, 24, 131-148.

  • Motivation. In P. L. Harriman (Ed.), Encyclopedia of psychology. New York: Philosophical Library.

  • La emocion en el hombre y en el animal. (Trans. by Emilia Mira) Buenos Aires: Nova.

1947

  • Studies of food preference, appetite and dietary habit: VII. Palatability in relation to learning and performance. Journal of Comparative and Physiological Psychology, 40, 37-72.

  • Motivation, feeling, and emotion. In T. G. Andrews (Ed.), Methods of psychology. New York: Wiley.

1948

  • Studies of food preference, appetite and dietary habit: VIII. Food-seeking drives, palatability and the law of effect. Journal of Comparative and Physiological Psychology, 41, 269-300.

  • Appetite, palatability and feeding habit: A critical review. Psychological Bulletin, 45, 289-320.

1949

  • Studies of food preference, appetite and dietary habit: IX. Palatability versus appetite as determinants of the critical concentrations of sucrose and sodium chloride. Comparative Psychology Monographs, 19(5), 1-44.

  • With J. P. Chaplin. Studies of food preference, appetite and dietary habit: X. Preferences of adrenalectomized rats for salt solutions of different concentrations. Comparative Psychology Monographs, 19(5), 45-74.

  • Food-seeking drive, affective process, and learning. Psychological Review, 56, 98-121.

  • Emotion as disorganized response: A reply to Professor Leeper. Psychological Review, 56, 184-191.

  • Mechanical aids for patients with muscular disability. Journal of Bone and Joint Surgery, 31-A, 428-430.

1950

  • Motivation. In W. S. Monroe (Ed.), Encyclopedia of educational research. (Rev. ed.) New York: Macmillan.

1951

  • Motivation of animal behavior. In C. P. Stone (Ed.), Comparative psychology. (3rd ed.) New York: Prentice-Hall.

1952

  • With H. W. Richey. Diurnal drinking patterns in the rat. Journal of Comparative and Physiological Psychology, 45, 80-89.

  • With A. W. Heyer & H. W. Richey. Drinking patterns in the rat following water deprivation and subcutaneous injections of sodium chloride. Journal of Comparative and Physiological Psychology, 45, 90-95.

  • The role of hedonic processes in the organization of behavior. Psychological Review, 59, 249-262.

  • Motivation, affectivite et emotion. In T. G. Andrews (Ed.), Methodes de la psychologie. (Trans. by C. Nony) Paris: Presses Universitaires.

1953

  • With J. T. Greene. Quantity of food ingested as a measure of relative acceptability. Journal of Comparative and Physiological Psychology, 46, 288-294.

  • With J. T. Greene. Relative acceptability of saccharine solutions as revealed by different methods. Journal of Comparative and Physiological Psychology, 46, 295-298.

  • Differential color-mixers. American Journal of Psychology, 66, 312-313.

  • Motivation. In W. Yust (Ed.), Encyclopedia Britannica. Chicago: Encyclopedia Britannica Press.

1954

  • With C. Pfaffman, V. G. Dethier, C. P. Richter, & E. Stellar. The preparation of solutions for research in chemoreception and food acceptance. Journal of Comparative and Physiological Psychology, 47, 93-96.

  • With E. H. Shuford, Jr. Intensity, duration, and repetition of hedonic processes as related to acquisition of motives. Journal of Comparative and Physiological Psychology, 47, 298-305. (Reprinted: Indianapolis, Ind.: Bobbs-Merrill. No. P-371.)

1955

  • With E. H. Shuford, Jr. Quantitative control of motivation through sucrose solutions of different concentrations. Journal of Comparative and Physiological Psychology, 48, 114-118.

  • Are there degrees of preference? American Journal of Psychology, 68, 318-319.

  • The role of hedonic processes in motivation. In M. R. Jones (Ed.), Nebraska symposium on motivation: 1955. Lincoln: Univer. Nebraska Press.

1956

  • With J. L. Falk. The relative acceptability of sodium chloride solutions as a function of concentration and water need. Journal of Comparative and Physiological Psychology, 49, 569-575.

  • With J. L. Falk. The acceptability of tap water and distilled water to nonthirsty rats. Journal of Comparative and Physiological Psychology, 49, 336-338.

1957

  • With D. Asdourian. Relative acceptability of sodium chloride and sucrose solutions. Journal of Comparative and Physiological Psychology, 50, 499-503.

  • Continuous recording of the fluid-intake of small animals. American Journal of Psychology, 70, 295-298.

  • Psychologic factors regulating the feeding process. (University of Minnesota Symposium on Nutrition and Behavior, April 27, 1956) American Journal of Clinical Nutrition, 5, 154-161.

1958

  • With J. L. Falk & W. E. Kappauf. Running activity and preference as related to concentration of sodium chloride solutions. American Journal of Psychology, 71, 255-262.

1959

  • The role of affective processes in learning and motivation. Psychological Review, 66, 104-125. (Reprinted: Indianapolis, Ind.: Bobbs-Merrill. No. P-372.)

1960

  • Motivation. In W. H. Crouse (Ed.), Encyclopedia of science and technology. New York: McGraw-Hill.

  • Isohedonic contour maps. Psychological Reports, 7, 478.

1961

  • Motivation and emotion, a survey of the determinants of human and animal activity. New York: Wiley.

1962

  • With W. E. Kappauf. Apparatus and procedures for studying taste-preferences in the white rat. American Journal of Psychology, 75, 482-484.

  • With K. R. Christensen. Algebraic summation of hedonic processes. Journal of Comparative and Physiological Psychology, 55, 332-336.

  • Methods for the study of feeling and emotion. In D. K. Candland (Ed.), Emotion, bodily change. Princeton, N. J.: Van Nostrand.

  • Drives. In L. Shores (Ed.), Collier's encyclopedia. New York: Crowell-Collier.

  • Feeling. In L. Shores (Ed.), Collier's encyclopedia. New York: Crowell-Collier.

1963

  • With R. H. Schulte. Isohedonic contours and tongue activity in three gustatory areas of the rat. Journal of Comparative and Physiological Psychology, 56, 465-475.

  • With C. H. Madsen, Jr. Individual isohedons in sucrose-sodium chloride and sucrose-saccharin gustatory areas. Journal of Comparative and Physiological Psychology, 56, 903-909.

  • With R. G. Burright & L. J. Tromater. Preferences of the white rat for solutions of sucrose and quinine hydrochloride. American Journal of Psychology, 76, 205-217.

  • Motivation. In A. Deutsch (Ed.), The encyclopedia of mental health. New York: Franklin Watts.

1964

  • With C. L. Trafton. Activity contour maps as related to preference in four gustatory stimulus areas of the rat. Journal of Comparative and Physiological Psychology, 58, 68-75.

  • With C. L. Trafton. Psychophysical studies of taste preference and fluid intake. In M. Wayner (Ed.), Thirst, proceedings of the First International Symposium on Thirst in the Regulation of Body Water held at the Florida State University in Tallahassee, May 1963. Oxford, England: Pergamon Press.

1965

  • Hedonic organization and regulation of behavior. Psychological Review, in press.

  • Physiological drives. In E. L. Sills (Ed.), International encyclopedia of the social sciences. New York: Macmillan, in press.

  • Emotion: The concept. In E. L. Sills (Ed.), International encyclopedia of the social sciences. New York: Macmillan, in press.

I have now read through:

None of them mention any experiments which match Feynman's description even a little bit, and the hits for 'sand' all refer to rats digging, not constructing mazes.

For anyone interested in helping with this problem, I have started a page in my wikipedia sandbox.

https://en.wikipedia.org/wiki/User:Niels_Olson/Young%27s_1937_experiment_on_rats

I noticed something interesting: in Google Scholar, when you punch in Young as author and the reasonable search terms 'rat' 'maze' 'sand' restricted to before Feynman's lecture, only 3 items pop up.

I don't have access to the 3, so I've requested them: http://lesswrong.com/lw/ji3/lesswrong_help_desk_free_paper_downloads_and_more/auye

(Frustratingly, Young wrote a whole textbook on rats/mice available on the Internet Archive - the year before Feynman says he did the experiment! Another textbook, Emotion in man and animal: its nature and dynamic basis, isn't on IA but is in Google Books; checking it with a few keywords like 'sand' and 'smell' and 'third', doesn't seem to throw up any particularly good hits.)

Emotion in man and animal can now be read on Hathitrust: https://catalog.hathitrust.org/Record/000426365 Checking the ToC doesn't turn up anything relevant, and an additional search for 'maze' shows some maze-running mice experiments but not the one in question.

I wonder if it's possible this is the wrong Young? It is not that rare a US surname (far from it, #28 in 1990). Thinking about it, isn't calling him "Mr. Young" a little odd? P.T. Young definitely had a PhD and was a tenured professor, so it's a bit disrespectful to not refer to him as 'Dr Young' or 'Professor Young'. (And some quick skimming doesn't turn up any obvious connections between Young's University of Illinois and Feynman, so how did he hear of it?)


One interesting lead showed up on Twitter: Marvin Minsky on Usenet 10 April 1993 (sci.bio "Puling Habits out of Rats") in response to someone asking 'who was Mr Young and whatever happened with the mouse studies':

What happened around 1937 was that

  • [possibility #] 5. B. F. Skinner developed ways to control all those external variables by enclosing the experiment in a sealed, soundproof, lightproof, etc., box. The results were reliably reproducible, and a great deal was learned. The boxes were soon named "Skinner Boxes" and became the new paradigm for studying animal learning. Skinner and many others switched to pigeons, for various reasons, but others continued to use rats.

When I was undergraduate in the late '40s, I hung around that lab and helped with some switching and sequencing stuff to make the experiments more convenient. I don't remember the name of Young, but it was folklore that the change was because someone had found that rats appeared to be able to navigate by distant cues, e.g., the appearance of the ceiling, so that the traditional open-topped maze experiments might be flawed.

It is not Feynman, but we seem to have a confirmed anecdote of a similar problem in rat experiments. From "Shortcut Learning in Deep Neural Networks", Geirhos et al 2020:

2.1 Shortcut learning in Comparative Psychology: unintended cue learning

Rats learned to navigate a complex maze apparently based on subtle colour differences—very surprising given that the rat retina has only rudimentary machinery to support at best somewhat crude colour vision. Intensive investigation into this curious finding revealed that the rats had tricked the researchers: They did not use their visual system at all in the experiment and instead simply discriminated the colours by the odour of the colour paint used on the walls of the maze. Once smell was controlled for, the remarkable colour discrimination ability disappeared... [Nicholas Rawlins, personal communication with F.A.W. some time in the early 1990s, confirmed via email on 12.11.2019]

I asked on Twitter if Rawlins had seen this first hand or if it was secondhand, and the second author stated:

yes, the anecdote happened as described in Nicholas Rawlins' laboratory at Oxford, confirmed in personal communication with Felix Wichmann in Nov '19

The main problems here appear to be post hoc analysis and the file drawer effect. One reform that would make a huge difference would be a register of trials in advance of the trial taking place, including details of how they propose to analyze the data. Ideally, journals would accept papers for publication on the basis of the entry in the register, before the data arrives.
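A rough way to see why post hoc analysis matters, under the simplifying assumption (mine, not the article's) that each alternative analysis is an independent test of a true null at the 5% level: the chance that at least one analysis comes out "significant" grows quickly with the number of analyses tried. The function name below is hypothetical.

```python
def chance_of_false_positive(n_analyses, alpha=0.05):
    """Probability that at least one of n independent tests of a true null
    hypothesis passes the significance threshold alpha by chance alone.

    Independence is an illustrative assumption; real analyses of the same
    data set are usually correlated, which changes the exact numbers.
    """
    return 1.0 - (1.0 - alpha) ** n_analyses


for n in (1, 5, 20):
    print(f"{n:2d} analyses tried: {chance_of_false_positive(n):.0%}")
```

Registering a single analysis in advance pins the number of analyses at one, which keeps the false-positive rate at the nominal level.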

A register of proposals does seem like it would help to keep scientists honest, a step towards Feynman's "utter honesty." I would hesitate to say that journals should accept papers based solely on that register, though. Sometimes, the proposal might not end up as a wholly accurate description of the actual experiment, for a variety of reasons. I think that making both available would be a good way to judge how well the result actually applies to what is claimed by the scientists who published it. They could offer explanations for any differences between the proposal and the actual, and peer reviewers could give more thorough critiques with this extra information.

What's the reason to not demand that all experiments be videoed in their entirety?

You seem to be trying to accommodate the way scientists and journals already operate:

Sometimes, the proposal might not end up as a wholly accurate description of the actual experiment, for a variety of reasons.

It might not be bad to accommodate them, but the primary and central purpose of science is to know – to produce shared knowledge of the world.

I think an ideal journal would allow scientists to change their registered proposal – possibly. That would also be recorded in the journal's register, if they accept the changes.

Maybe I'm in a bad mood, but it's especially galling how terrible all of this still is, e.g. NOT sharing all scientific results with the public for publicly funded research.

Why can't all of this be done in the open, on the researcher's blog? They register a proposal by publishing a post describing it, in as much detail as is feasible, e.g. including code they're registering to use on the data they collect. They record video of the entire experiment (where feasible); they publish that to YouTube. They publish all of their data. They perform their analysis – the exact one described in their registration post – and then publish a blog post, or a whole series of posts, about their analysis.

If the researchers want to change a registered, but un-performed, experiment, they publish a post describing their changes, in detail comparable to the original registration.
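One way readers could check that "the exact one described in their registration post" is what actually got run: publish a digest of the analysis code alongside the registration, then compare it against the code used later. This is only a sketch of the idea, not an existing registry's mechanism; the helper names `fingerprint` and `matches_registration` and the example script text are hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the given text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def matches_registration(analysis_code: str, registered_digest: str) -> bool:
    """True if the code that was run is byte-for-byte what was registered."""
    return fingerprint(analysis_code) == registered_digest


# Hypothetical registered analysis script, published before data collection.
registered_code = "compare group means with a two-sided t-test at alpha = 0.05\n"
registered_digest = fingerprint(registered_code)

# Later: the analysis that was actually run (here, silently altered).
code_actually_run = "compare group means with a one-sided t-test at alpha = 0.05\n"

print(matches_registration(registered_code, registered_digest))
print(matches_registration(code_actually_run, registered_digest))
```

A registered change would simply be a new post with a new digest, so the history of what was planned, and when, stays visible.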

Blog posts don't need to be open for anyone to comment on. Researchers could explicitly invite other individuals or 'anyone with X degree in Y from an accredited institution recognized by professional association Z'.

The relevant people could comment on the registered proposal, on registered changes, on the documentation of the performance of the experiment itself, and on interpretation of the registered analysis.

Why do we need journals? Why do we want journals?

A recent psychology graduate tells me that what I propose here happens all the time already! I'd be curious to know where to go to find out more.

Note also the follow-up article by the author on Wired, much of which centers on him saying "No, you guys, this doesn't mean you can believe in anything you want."

As usual, I can't tell what the hell TLP is trying to argue.

First, as a physicist, I do have to point out that this article concerns mainly softer sciences, e.g. psychology, medicine, etc.

It seems to me that biology and medicine are "softer" sciences than chemistry or physics, not due to the subject matter, but mainly because the scientists behave in softer, less rigorous ways, such as those suggested as explanations for this article's findings.

What do you think?

Personally, I see them as being softer because they're about less fundamental, i.e. higher order, systems. Experiments and theories in biology and medicine treat high-level concepts (e.g. organs, tissues, symptoms of diseases) as basic. This is somewhat of a necessity, because the tools to model such things as complex collections of and interactions among the component particles don't exist yet.

However, when they treat conclusions drawn from those high-level approximations as universal, problems inevitably arise. We might all have organs that seem the same, and are constructed by the same cellular machinery in large part, but there's still a lot of room for variation. Even subtle variations might have effects that wouldn't be obvious from theories based on high-level approximations. This is where the lack of rigor is a factor: even if scientists understand that they're using high-level approximations, they may not want to admit a lack of universality in a result. (I have noticed this to be especially true in psychology.)

I think the focus on universal theories is a shame; something can be perfectly worthwhile even if it's only useful in some circumstances. Many theories in physics are like this; the usual example is, of course, Newtonian mechanics vs. Einsteinian relativity. A description of some biological or psychological phenomenon can still be useful even if it doesn't apply to every possible organism/brain. There is, however, the problem that determining the set of circumstances in which such a description might be useful can also be more difficult due to the high-level, complex nature of the subjects.

An example of a theory in psychology I find to be useful even if not universal is Kohlberg's stages of moral development. Kohlberg himself was quite convinced the stages were universal, and made a few contortions to try to keep it that way. Given that the original theory was formulated based solely on interviews with males in the USA, a limited range of applicability – based largely on culture and to a smaller degree gender, especially as one looks at the higher stages – is more likely. However, the theory is still a good way to understand how people in the USA think about moral issues; it just might not be as good a way to understand how people in other cultures and social settings think.

Maybe scientists in soft fields behave differently, but this correlation with other notions of "soft" needs to be explained. It could be a founder effect, but that doesn't seem plausible to me.

"If your experiment needs statistics, you ought to have done a better experiment." - Rutherford

One problem is that soft sciences need statistics, or at least make them more tempting. I don't think it's useful to phrase this in terms of blame, as if the soft scientists were simply less able to resist the temptation.

(I agree with everything the Dreaded Anomaly said, too.)

I have the strong impression that many of these problems can be avoided by using Bayesian statistics and including reasonable prior information (possibly hierarchical). See Gelman on this topic. I suppose that this illustrates that prior information is especially important in more complex fields.
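As a rough illustration of how a prior tempers a surprising result, here is the textbook normal-normal conjugate update. It is not taken from Gelman's work specifically, and the numbers are made up: a skeptical prior centered on zero with standard deviation 0.2, and a single noisy study reporting an effect of 0.6 with standard error 0.3. The function name `posterior_normal` is hypothetical.

```python
import math


def posterior_normal(prior_mean, prior_sd, estimate, standard_error):
    """Combine a normal prior with a normally distributed estimate.

    Returns (posterior_mean, posterior_sd). The posterior mean is a
    precision-weighted average of the prior mean and the observed estimate.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / standard_error ** 2
    posterior_precision = prior_precision + data_precision
    posterior_mean = (
        prior_precision * prior_mean + data_precision * estimate
    ) / posterior_precision
    return posterior_mean, math.sqrt(1.0 / posterior_precision)


# Illustrative numbers only: skeptical prior, one noisy but exciting study.
mean, sd = posterior_normal(prior_mean=0.0, prior_sd=0.2, estimate=0.6, standard_error=0.3)
print(f"posterior mean {mean:.3f}, posterior sd {sd:.3f}")
```

With these inputs the reported 0.6 shrinks to roughly 0.18, which is the kind of discounting a first, underpowered finding arguably deserves. A hierarchical model would go further by estimating the prior spread from many related studies rather than assuming it.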