New article on in vitro iterated embryo selection

11 CarlShulman 08 August 2013 07:28PM

The article by Robert Sparrow:

A series of recent scientific results suggest that, in the not-too-distant future, it will be possible to create viable human gametes from human stem cells. This paper discusses the potential of this technology to make possible what I call ‘in vitro eugenics’: the deliberate breeding of human beings in vitro by fusing sperm and egg derived from different stem-cell lines to create an embryo and then deriving new gametes from stem cells derived from that embryo. Repeated iterations of this process would allow scientists to proceed through multiple human generations in the laboratory. In vitro eugenics might be used to study the heredity of genetic disorders and to produce cell lines of a desired character for medical applications. More controversially, it might also function as a powerful technology of ‘human enhancement’ by allowing researchers to use all the techniques of selective breeding to produce individuals with a desired genotype.

Quote:

However, to my knowledge, the current paper is the first to explicitly discuss the possibility of the iterative use of this technology for reproductive purposes

The possibility was discussed in MIRI's "Uncertain Future" toy forecasting model back in 2009, and the analysis formulated a few years before that.

ETA: And further discussed in James Miller's recent book, "Singularity Rising."

Why do theists, undergrads, and Less Wrongers favor one-boxing on Newcomb?

15 CarlShulman 19 June 2013 01:55AM

Follow-up to: Normative uncertainty in Newcomb's problem

Philosophers and atheists break for two-boxing; theists and Less Wrong break for one-boxing
Personally, I would one-box on Newcomb's Problem. Conditional on one-boxing for lawful reasons, one-boxing earns $1,000,000, while two-boxing, conditional on two-boxing for lawful reasons, delivers only $1,000. But this seems to be firmly a minority view in philosophy, and numerous heuristics about expert opinion suggest that I should re-examine it.

In the PhilPapers survey, Philosophy undergraduates start off divided roughly evenly between one-boxing and two-boxing:

Newcomb's problem: one box or two boxes?

Other 142 / 217 (65.4%)
Accept or lean toward: one box 40 / 217 (18.4%)
Accept or lean toward: two boxes 35 / 217 (16.1%)

But philosophy faculty, who have learned more (less likely to have no opinion), and been subject to further selection, break in favor of two-boxing:

Newcomb's problem: one box or two boxes?

Other 441 / 931 (47.4%)
Accept or lean toward: two boxes 292 / 931 (31.4%)
Accept or lean toward: one box 198 / 931 (21.3%)

Specialists in decision theory (who are also more atheistic, more compatibilist about free will, and more physicalist than faculty in general) are even more convinced:

Newcomb's problem: one box or two boxes?

Accept or lean toward: two boxes 19 / 31 (61.3%)
Accept or lean toward: one box 8 / 31 (25.8%)
Other 4 / 31 (12.9%)

Looking at the correlates of answers about Newcomb's problem, two-boxers are more likely to believe in physicalism about consciousness, atheism about religion, and other positions generally popular around here (which are also usually, but not always, in the direction of philosophical opinion). Zooming in on one correlate: most theists with an opinion are one-boxers, while atheists break for two-boxing:

Newcomb's problem: two boxes (correlation: 0.125)

           one box            two boxes
atheism    28.6% (145/506)    48.8% (247/506)
theism     40.8% (40/98)      31.6% (31/98)

Response pairs: 655   p-value: 0.001

Less Wrong breaks overwhelmingly for one-boxing in survey answers for 2012:

NEWCOMB'S PROBLEM
One-box: 726, 61.4%
Two-box: 78, 6.6%
Not sure: 53, 4.5%
Don't understand: 86, 7.3%
No answer: 240, 20.3%

When I elicited LW confidence levels in a poll, a majority indicated 99%+ confidence in one-boxing, and 77% of respondents indicated 80%+ confidence.

What's going on?

I would like to understand what is driving this difference of opinion. My poll was a (weak) test of the hypothesis that Less Wrongers were more likely to account for uncertainty about decision theory: since on the standard Newcomb's problem one-boxers get $1,000,000, while two-boxers get $1,000, even a modest credence in the correct theory recommending one-boxing could justify the action of one-boxing.
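That justification can be made explicit with a little arithmetic. The sketch below uses a deliberately crude toy model (the function name and simplifications are mine, not from the poll): if the one-boxing theories are right, a reliable predictor means one-boxing yields the large prize and two-boxing the small one; if the two-boxing theories are right, the boxes' contents are fixed and two-boxing simply gains the small prize.

```python
def one_box_edge(credence_one_boxing, big=1_000_000, small=1_000):
    """Expected dollar gain of one-boxing over two-boxing in a toy model.

    Assumes: if one-boxing theories are correct, the predictor is reliable,
    so one-boxing yields `big` and two-boxing yields `small`; if two-boxing
    theories are correct, the contents are fixed and two-boxing simply
    gains the extra `small`.
    """
    gain_if_one_boxing_right = big - small   # $999,000 at the standard payoffs
    loss_if_two_boxing_right = small         # the forgone visible $1,000
    return (credence_one_boxing * gain_if_one_boxing_right
            - (1 - credence_one_boxing) * loss_if_two_boxing_right)

# Even 1% credence in one-boxing theories favors one-boxing at 1000:1:
print(one_box_edge(0.01))                          # +9000.0 in expectation
# But not at a 2:1 payoff ratio:
print(one_box_edge(0.01, big=2_000, small=1_000))  # negative
```

On this model, a modest credence in one-boxing theories dominates the decision precisely because the payoffs are so lopsided.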

If new graduate students read the computer science literature on program equilibrium, including some local contributions like Robust Cooperation in the Prisoner's Dilemma and A Comparison of Decision Algorithms on Newcomblike Problems, I would guess they would tend to shift more towards one-boxing. Thinking about what sort of decision algorithms it is rational to program, or what decision algorithms would prosper over numerous one-shot Prisoner's Dilemmas with visible source code, could also shift intuitions. A number of philosophers I have spoken with have indicated that frameworks like the use of causal models with nodes for logical uncertainty are meaningful contributions to thinking about decision theory. However, I doubt that for those with opinions, the balance would swing from almost 3:1 for two-boxing to 9:1 for one-boxing, even concentrating on new decision theory graduate students.
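To make "decision algorithms with visible source code" concrete, the simplest such agent, often called CliqueBot in these discussions, cooperates exactly when the opponent's program is a copy of itself. In this minimal sketch a name tag stands in for full program text (the proof-based FairBot of the Robust Cooperation paper is strictly more permissive, cooperating whenever it can prove the opponent cooperates back):

```python
# Players are (source, strategy) pairs; a strategy maps its own "source"
# and the opponent's "source" (here, just a name tag standing in for full
# program text) to a move, "C" (cooperate) or "D" (defect).
def play(p1, p2):
    src1, strat1 = p1
    src2, strat2 = p2
    return strat1(src1, src2), strat2(src2, src1)

# CliqueBot: cooperate only with exact copies of itself.
cliquebot = ("cliquebot", lambda me, opp: "C" if opp == me else "D")
# DefectBot: defect unconditionally.
defectbot = ("defectbot", lambda me, opp: "D")

print(play(cliquebot, cliquebot))  # ('C', 'C')
print(play(cliquebot, defectbot))  # ('D', 'D') -- CliqueBot isn't exploited
```

Two CliqueBots achieve mutual cooperation in a one-shot Prisoner's Dilemma while remaining unexploitable by defectors, which is the kind of consideration that pulls intuitions toward one-boxing.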

On the other hand, there may be an effect of unbalanced presentation to non-experts. Less Wrong is on average less philosophically sophisticated than professional philosophers, and since philosophical training is associated with a shift towards two-boxing, some of the difference in opinion could reflect a difference in training. Moreover, postings here on decision theory have almost all either argued for or assumed one-boxing as the correct response to Newcomb's problem. If academic decision theorists were making arguments for two-boxing here, or if pro-one-boxing social pressure were reduced, Less Wrong opinion might shift towards two-boxing.

Less Wrongers, what's going on here? What are the relative causal roles of these and other factors in this divergence?

ETA: The SEP article on Causal Decision Theory.

Normative uncertainty in Newcomb's problem

6 CarlShulman 16 June 2013 02:16AM

Here is Wikipedia's description of Newcomb's problem:

The player of the game is presented with two boxes, one transparent (labeled A) and the other opaque (labeled B). The player is permitted to take the contents of both boxes, or just the opaque box B. Box A contains a visible $1,000. The contents of box B, however, are determined as follows: At some point before the start of the game, the Predictor makes a prediction as to whether the player of the game will take just box B, or both boxes. If the Predictor predicts that both boxes will be taken, then box B will contain nothing. If the Predictor predicts that only box B will be taken, then box B will contain $1,000,000.

Nozick also stipulates that if the Predictor predicts that the player will choose randomly, then box B will contain nothing.

By the time the game begins, and the player is called upon to choose which boxes to take, the prediction has already been made, and the contents of box B have already been determined. That is, box B contains either $0 or $1,000,000 before the game begins, and once the game begins even the Predictor is powerless to change the contents of the boxes. Before the game begins, the player is aware of all the rules of the game, including the two possible contents of box B, the fact that its contents are based on the Predictor's prediction, and knowledge of the Predictor's infallibility. The only information withheld from the player is what prediction the Predictor made, and thus what the contents of box B are.

Most of this is a fairly general thought experiment for thinking about different decision theories, but one element stands out as particularly arbitrary: the ratio between the amount the Predictor may place in box B and the amount in box A. In the Newcomb formulation conveyed by Nozick, this ratio is 1000:1, but this is not necessary. Most decision theories that recommend one-boxing do so as long as the ratio is greater than 1.

The 1000:1 ratio strengthens the intuition for one-boxing, which is helpful for illustrating why one might find one-boxing plausible. However, given uncertainty about normative decision theory, the decision to one-box can diverge from one's best guess at the best decision theory: e.g. if I think there is only a 1 in 10 chance that the correct decision theory recommends one-boxing, I may still one-box on Newcomb's problem at a potential payoff ratio of 1000:1, but not if the ratio is only 2:1.
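The arithmetic behind that example is simple. In a toy model where a correct one-boxing theory implies a reliable predictor, and a correct two-boxing theory implies the extra small prize is a pure gain, one-boxing maximizes expected money whenever one's credence in one-boxing theories exceeds 1/ratio (the function name and model are illustrative, not from the post):

```python
def break_even_credence(ratio):
    """Minimum credence in one-boxing theories at which one-boxing
    maximizes expected money, in a toy model: one-box iff
    q * (ratio - 1) * small > (1 - q) * small, i.e. q > 1 / ratio."""
    return 1 / ratio

print(break_even_credence(1000))  # 0.001: even 0.1% credence suffices
print(break_even_credence(2))     # 0.5: a coin-flip credence is needed at 2:1
```

So at 1000:1 the action of one-boxing can be nearly insensitive to one's theoretical best guess, while at 2:1 it tracks it closely.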

So the question, "would you one-box on Newcomb's problem, given your current state of uncertainty?" is not quite the same as "would the best decision theory recommend one-boxing?" This occurred to me in the context of this distribution of answers among target philosophy faculty from the PhilPapers Survey:

Newcomb's problem: one box or two boxes?

Accept: two boxes 13 / 31 (41.9%)
Accept: one box 7 / 31 (22.6%)
Lean toward: two boxes 6 / 31 (19.4%)
Agnostic/undecided 2 / 31 (6.5%)
Other 2 / 31 (6.5%)
Lean toward: one box 1 / 31 (3.2%)


If all of these answers are about the correct decision theory (rather than what to do in the actual scenario), then two-boxing is the clear leader, with a roughly 2.4:1 ratio of support (accept or lean, 19 vs. 8) in its favor, but this skew would seem far short of that needed to justify 1000:1 confidence in two-boxing on Newcomb's Problem.

Here are Less Wrong survey answers for 2012:

NEWCOMB'S PROBLEM
One-box: 726, 61.4%
Two-box: 78, 6.6%
Not sure: 53, 4.5%
Don't understand: 86, 7.3%
No answer: 240, 20.3%

Here one-boxing is overwhelmingly dominant. I'd like to sort out how much of this is disagreement about theory, and how much reflects the extreme payoffs in the standard Newcomb formulation. So, I'll be putting a poll in the comments below.

[Retracted] Simpson's paradox strikes again: there is no great stagnation?

30 CarlShulman 30 July 2012 05:55PM

ETA: The table linked by Landsburg has been called into serious question by Evan Soltas [H.T. CronoDAS]. I edited the post to leave only the table to provide context for the comment discussion of its status.

Economist Steve Landsburg has a post [H.T. David Henderson] about the supposed stagnation of median wages in the United States in recent decades. In the linked table median wages have risen for: 

Satire of Journal of Personality and Social Psychology's publication bias

26 CarlShulman 05 June 2012 12:08AM

Follow-up to:  Follow-up on ESP study: "We don't publish replications", Using degrees of freedom to change the past for fun and profit

As I discussed in the above posts, the Journal of Personality and Social Psychology, a leading psych journal, published a deeply flawed parapsychology study (see the second post for details) which had apparently been tortured to produce results. Then they rejected an attempted replication that found no effect, citing a sadly typical policy of not publishing replications. Some of you may enjoy reading one enterprising researcher's amusing satire article, purportedly (not actually) "tallying" past confirmations and disconfirmations in JPSP and drawing conclusions.

 

ETA: To clarify the last sentence, they didn't really find 4800+ confirmation and two disconfirmations. As they say in small print, the data were made up. It's right by the chart.

Using degrees of freedom to change the past for fun and profit

41 CarlShulman 07 March 2012 02:51AM

Follow-up to: Follow-up on ESP study: "We don't publish replications", Feed the Spinoff Heuristic!

Related to: Parapsychology: the control group for science, Dealing with the high quantity of scientific error in medicine

Using the same method as in Study 1, we asked 20 University of Pennsylvania undergraduates to listen to either “When I’m Sixty-Four” by The Beatles or “Kalimba.” Then, in an ostensibly unrelated task, they indicated their birth date (mm/dd/yyyy) and their father’s age. We used father’s age to control for variation in baseline age across participants. An ANCOVA revealed the predicted effect: According to their birth dates, people were nearly a year-and-a-half younger after listening to “When I’m Sixty-Four” (adjusted M = 20.1 years) rather than to “Kalimba” (adjusted M = 21.5 years), F(1, 17) = 4.92, p = .040

That's from "False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant," which runs simulations of a version of Shalizi's "neutral model of inquiry," with random (null) experimental results, augmented with a handful of choices in the setup and analysis of an experiment. Even before accounting for publication bias, these few choices produced a desired result "significant at the 5% level" 60.7% of the time, and at the 1% level 21.5% of the time.
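The mechanics are easy to reproduce. Below is a toy simulation in the spirit of (not a reimplementation of) that paper: every experiment is pure noise, but the analyst may pick between two correlated dependent variables, their average, or a larger sample, and reports whichever comparison clears p < .05. The specific flexibility set and function names are illustrative, and a normal approximation stands in for the t-test to avoid dependencies:

```python
import math
import random

def p_two_sided(a, b):
    """Approximate two-sided p-value for a difference in means.
    Uses a normal approximation (z-test) rather than a t-test to stay
    dependency-free; slightly anticonservative at these sample sizes."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def simulate(n_experiments=2000, alpha=0.05, seed=0):
    """False-positive rates under the null with one pre-specified test
    vs. a few researcher degrees of freedom (second correlated DV,
    averaged DVs, adding 10 subjects per group if nothing 'worked')."""
    rng = random.Random(seed)

    def draw(n):  # two correlated outcome measures, no true effect
        dv1 = [rng.gauss(0, 1) for _ in range(n)]
        dv2 = [0.5 * x + math.sqrt(0.75) * rng.gauss(0, 1) for x in dv1]
        return dv1, dv2

    honest = flexible = 0
    for _ in range(n_experiments):
        c1, c2 = draw(20)   # control group
        t1, t2 = draw(20)   # "treatment" group (same null distribution)
        p_prespecified = p_two_sided(c1, t1)
        honest += p_prespecified < alpha
        tests = [p_prespecified,
                 p_two_sided(c2, t2),
                 p_two_sided([(x + y) / 2 for x, y in zip(c1, c2)],
                             [(x + y) / 2 for x, y in zip(t1, t2)])]
        if min(tests) >= alpha:  # optional stopping: collect more data
            ec1, _ = draw(10)
            et1, _ = draw(10)
            tests.append(p_two_sided(c1 + ec1, t1 + et1))
        flexible += min(tests) < alpha
    return honest / n_experiments, flexible / n_experiments

honest_rate, flexible_rate = simulate()
print(f"pre-specified analysis: {honest_rate:.1%}")   # near the nominal 5%
print(f"with flexible analysis: {flexible_rate:.1%}")  # well above 5%
```

The exact inflation depends on the flexibility set; the paper's four combined degrees of freedom reached 60.7% at the nominal 5% level.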

I found it because of another paper claiming time-defying effects, during a search through all of the papers on Google Scholar citing Daryl Bem's precognition paper, which I discussed in a past post about the problems of publication bias and selection over the course of a study. For Bem, Richard Wiseman established a registry for the methods, and tests of the registered studies could be set prior to seeing the data (in addition to avoiding the file drawer).

Now a number of purported replications have been completed, with several available as preprints online, including a large "straight replication" carefully following the methods in Bem's paper, with some interesting findings discussed below. The picture does not look good for psi, and is a good reminder of the sheer cumulative power of applying a biased filter to many small choices.


"The Journal of Real Effects"

13 CarlShulman 05 March 2012 03:07AM

Luke's recent post mentioned that The Lancet has a policy encouraging the advance registration of clinical trials, while mine examined an apparent case study of data-peeking and on-the-fly transformation of studies. But how much variation is there across journals on such dimensions? Are there journals that buck the standards of their fields (demanding registration, p=0.01 rather than p=0.05 where the latter is typical in the field, advance specification of statistical analyses and subject numbers, etc)? What are some of the standouts? Are there fields without any such?

I wonder if there is a niche for a new open-access journal, along the lines of PLoS, with standards strict enough to reliably exclude false-positives. Some possible titles:

 

  • The Journal of Real Effects
  • (Settled) Science
  • Probably True
  • Journal of Non-Null Results, Really
  • Too Good to Be False
  • _________________?

 

Feed the spinoff heuristic!

49 CarlShulman 09 February 2012 07:41AM

Follow-up to:

Parapsychology: the control group for science

Some Heuristics for Evaluating the Soundness of the Academic Mainstream in Unfamiliar Fields

Recent renewed discussions of the parapsychology literature and Daryl Bem's recent precognition article brought to mind the "market test" of claims of precognition. Bem tells us that random undergraduate students were able to predict with 53% accuracy where an erotic image would appear in the future. If this effect was actually real, I would rerun the experiment before corporate earnings announcements, central bank interest rate changes, etc, and change the images based on the reaction of stocks and bonds to the announcements. In other words, I could easily convert "porn precognition" into "hedge fund trillionaire precognition."

If I was initially lacking the capital to make trades, I could publish my predictions online using public-key cryptography and amass an impressive track record before recruiting investors. If anti-psi prejudice was a problem, no one need know how I was making my predictions. Similar setups could exploit other effects claimed in the parapsychology literature (e.g. the remote viewing of the Scientologist-founded Stargate Project of the U.S. federal government). Those who assign a lot of credence to psi may want to actually try this, but for me this is an invitation to use parapsychology as a control group for science, and to ponder a general heuristic for crudely estimating the soundness of academic fields for outsiders.
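The commit-now, reveal-later step described above can be done with an even simpler primitive than public-key signatures: a salted hash commitment. A minimal sketch (the prediction string is of course made up):

```python
import hashlib
import secrets

def commit(prediction: str) -> tuple[str, str]:
    """Publish the digest now; reveal prediction and nonce after the event.
    The random nonce prevents anyone from guessing a low-entropy
    prediction by brute-forcing its hash."""
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256(f"{nonce}:{prediction}".encode()).hexdigest()
    return digest, nonce

def verify(digest: str, prediction: str, nonce: str) -> bool:
    """Anyone can check that the revealed prediction matches the digest
    that was published before the event."""
    return hashlib.sha256(f"{nonce}:{prediction}".encode()).hexdigest() == digest

digest, nonce = commit("index rises after tomorrow's announcement")
# ... the event happens; the prediction and nonce are then published ...
assert verify(digest, "index rises after tomorrow's announcement", nonce)
```

A signed commitment would additionally prove authorship, but the hash alone suffices to prove the prediction predated the event.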

One reason we trust that physicists and chemists have some understanding of their subjects is that they produce valuable technological spinoffs with concrete and measurable economic benefit. In practice, I often make use of the spinoff heuristic: If an unfamiliar field has the sort of knowledge it claims, what commercial spinoffs and concrete results ought it to be producing? Do such spinoffs exist? What are the explanations for their absence?

For psychology, I might cite systematic desensitization of specific phobias such as fear of spiders, cognitive-behavioral therapy, and military use of IQ tests (with large measurable changes in accident rates, training costs, etc). In financial economics, I would raise the hundreds of billions of dollars invested in index funds, founded in response to academic research, and their outperformance relative to managed funds. Auction theory powers tens of billions of dollars of wireless spectrum auctions, not to mention evil dollar-auction sites.

This seems like a great task for crowdsourcing: the cloud of LessWrongers has broad knowledge, and sorting real science from cargo cult science is core to being Less Wrong. So I ask you, Less Wrongers, for your examples of practical spinoffs (or suspicious absences thereof) of sometimes-denigrated fields in the comments. Macroeconomics, personality psychology, physical anthropology, education research, gene-association studies, nutrition research, wherever you have knowledge to share.

ETA: This academic claims to be trying to use the Bem methods to predict roulette wheels, and to have passed statistical significance tests on his first runs. Such claims have been made for casinos in the past, but always trailed away in failures to replicate, repeat, or make actual money. I expect the same to happen here. 

Robopocalypse author cites Yudkowsky's paperclip scenario

3 CarlShulman 17 July 2011 02:18AM

In this Bloggingheads clip. Apparently the book is going to be made into a big movie by Steven Spielberg.

Follow-up on ESP study: "We don't publish replications"

71 CarlShulman 12 July 2011 08:48PM

Related to: Parapsychology: the control group for science, Dealing with the high quantity of scientific error in medicine

Some of you may remember past Less Wrong discussion of the Daryl Bem study, which claimed to show precognition, and was published with much controversy in a top psychology journal, JPSP. The editors and reviewers explained their decision by saying that the paper was clearly written and used standard experimental and statistical methods so that their disbelief in it (driven by physics, the failure to show psi in the past, etc) was not appropriate grounds for rejection. 

Because of all the attention received by the paper (unlike similar claims published in parapsychology journals) it elicited a fair amount of both critical review and attempted replication. Critics pointed out that the hypotheses were selected and switched around 'on the fly' during Bem's experiments, with the effect sizes declining with sample size (a strong signal of data mining). More importantly, Richard Wiseman established a registry for advance announcement of new Bem replication attempts.

A replication registry guards against publication bias, and at least 5 attempts were registered. As far as I can tell, at the time of this post the subsequent replications have, unsurprisingly, failed to replicate Bem's results.[1] However, JPSP and the other high-end psychology journals refused to publish the results, citing standing policies of not publishing straight replications.

From the journals' point of view, this (common) policy makes sense: bold new claims will tend to be cited more and raise journal status (which depends on citations per article), even though this means most of the 'discoveries' they publish will be false despite their p-values. However, this means that overall the journals are giving career incentives for scientists to massage and mine their data for bogus results, but not to challenge bogus results by others. Alas.


[1] A purported "successful replication" by a pro-psi researcher in Vienna turns out to be nothing of the kind: it is a study conducted in 2006, retitled to take advantage of the attention on Bem's article and selectively pulled from the file drawer.

ETA: The Wikipedia article on Daryl Bem makes an unsourced claim that one of the registered studies has replicated Bem.

ETA2: Samuel Moulton, who formerly worked with Bem, mentions an unpublished (no further details) failed replication of Bem's results conducted before Bem submitted his article (the failed replication was not mentioned in the article).

ETA3: There is mention of a variety of attempted replications at this blog post, with 6 failed replications, and 1 successful replication from a pro-psi researcher (not available online). It is based on this ($) New Scientist article.

ETA4: This large study performs an almost straight replication of Bem (same methods, same statistical tests, etc) and finds the effect vanishes.

ETA5: Apparently, the mentioned replication was again submitted to the British Journal of Psychology:

When we submitted it to the British Journal of Psychology, it was finally sent for peer review. One referee was very positive about it but the second had reservations and the editor rejected the paper. We were pretty sure that the second referee was, in fact, none other than Daryl Bem himself, a suspicion that the good professor kindly confirmed for us. It struck us that he might possibly have a conflict of interest with respect to our submission. Furthermore, we did not agree with the criticisms and suggested that a third referee be brought in to adjudicate. The editor rejected our appeal.
