All of Grant Demaree's Comments + Replies

That implies the ability to mix and match human chromosomes commercially is really far off

I agree that the issues of avoiding damage and having the correct epigenetics seem like huge open questions, and successfully switching a fruit fly chromosome isn't sufficient to settle them

Would this sequence be sufficient?

1. Switch a chromosome in a fruit fly
Success = normal fruit fly development

2a. Switch a chromosome in a rat
Success = normal rat development

2b. (in parallel, doesn't depend on 2a) Combine several chromosomes in a fruit fly to optimize aggressively f... (read more)

3kman
It seems fairly straightforward to test whether a chromosome transfer protocol results in physical/genetic damage in small-scale experiments (e.g. replace chromosome X in cell A with chromosome Y from cell B, culture cell A, examine cell A's chromosomes under a microscope + sequence the genome). The epigenetics seems harder. Having a good gears-level understanding of the epigenetics of development seems necessary, because then you'd know what to measure in an experiment to test whether your protocol was epigenetically sound.

Maybe the test case is to delete one chromosome and insert another in a fruit fly. Fruit flies have only 4 pairs of chromosomes, and they're already used for genetic modifications with CRISPR

Goal = complete the insertion and still develop a normal fruit fly. I bet this is a fairly inexpensive experiment, within reach of many people on LessWrong

3kman
You probably wouldn't be able to tell if the fruit fly's development was "normal" to the same standards that we'd hold a human's development to (human development is also just way more complicated, so the results may not generalize). That said, this sort of experiment seems worth doing anyways; if someone on LW was able to just go out and do it, that would be great.

Chromosome selection seems like the most consequential idea here if it's possible

Is it possible now, even in animals? Can you isolate chromosomes without damaging them and assemble them into a viable nucleus?

Edit: also -- strong upvoted because I want to see more of this on LW. Not directly AI but massively affects the gameboard

9kman
A working protocol hasn't been demonstrated yet, but it looks like there's a decent chance it's doable with the right stitching together of existing technologies and techniques. You can currently do things like isolating a specific chromosome from a cell line, microinjecting a chromosome into the nucleus of a cell, or deleting a specific chromosome from a cell. The big open questions are around avoiding damage and having the correct epigenetics for development.

My model of "steering" the military is a little different from that. It's over a thousand partially autonomous headquarters, which each have their own interests. The right hand usually doesn't know what the left is doing

Of the thousand+ headquarters, there's probably 10 that have the necessary legitimacy and can get the necessary resources. Winning over any one of the 10 is a sufficient condition for getting the results I described above

In other words, you don't have to steer the whole ship. Just a small part of it. I bet that can be done in 6 months

I don't agree, because a world of misaligned AI is known to be really bad, whereas a world of AI successfully aligned by some opposing faction probably has a lot in common with your own values

Extreme case: ISIS successfully builds the first aligned AI and locks in its values. This is bad, but it's way better than misaligned AI. ISIS wants to turn the world into an idealized 7th Century Middle East, which is a pretty nice place compared to much of human history. There's still a lot in common with your own values

I bet that's true

But it doesn't seem sufficient to settle the issue. A world where aligning/slowing AI is a major US priority, which China sometimes supports in exchange for policy concessions, sounds like a massive improvement over today's world

The theory of impact here is that there's a lot of policy actions to slow down AI, but they're bottlenecked on legitimacy. The US military could provide legitimacy

They might also help alignment, if the right person is in charge and has a lot of resources. But even if 100% of their alignment research is noise that doesn... (read more)

1trevor
I don't know about "providing legitimacy"; that's like spending a trillion dollars in order to procure one single gold toilet seat. Gold toilet seats are great, due to the human signalling-based psychology, but it's not worth the trillion dollars. The military is not built to be easy to steer; that would be a massive vulnerability to foreign intelligence agencies.

Because maximizing the geometric rate of return, irrespective of the risk of ruin, doesn't reflect most people's true preferences

In the scenario above with the red and blue lines, the full Kelly has a 9.3% chance of losing at least half your money, but the 0.4 Kelly only has a 0.58% chance of getting an outcome at least that bad
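For anyone who wants to sanity-check numbers like these, here's a minimal Monte Carlo sketch. The parameters (a 60%-win, even-odds bet over 100 rounds) and the "end below half your starting bankroll" criterion are illustrative assumptions on my part, not the exact setup behind the red and blue lines:

```python
import random

def p_lose_half(kelly_multiple, p_win=0.6, rounds=100, trials=50_000):
    """Estimate the chance of ending with at most half the starting bankroll."""
    full_kelly = 2 * p_win - 1          # Kelly fraction for an even-odds bet
    frac = kelly_multiple * full_kelly  # fraction of bankroll staked each round
    bad = 0
    for _ in range(trials):
        bankroll = 1.0
        for _ in range(rounds):
            if random.random() < p_win:
                bankroll *= 1 + frac
            else:
                bankroll *= 1 - frac
        if bankroll <= 0.5:
            bad += 1
    return bad / trials

print("P(end below half), full Kelly:", p_lose_half(1.0))
print("P(end below half), 0.4 Kelly: ", p_lose_half(0.4))
```

Swapping in the original scenario's actual win probability, odds, and horizon should reproduce its figures.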

I agree. I think this basically resolves the issue. Once you've added a bunch of caveats:

  • The bet is mind-bogglingly favorable. More like the million-to-one, and less like the 51% doubling
  • The bet reflects the preferences of most of the world. It's not a unilateral action
  • You're very confident that the results will actually happen (we have good reason to believe that the new Earths will definitely be created)

Then it's actually fine to take the bet. At that point, our natural aversion is based on our inability to comprehend the vast scale of a million Earths. I still want to say no, but I'd probably be a yes at reflective equilibrium

Therefore... there's not much of a dilemma anymore

It doesn't matter who said an idea. I'd rather just consider each idea on its own merits

3Dagon
Unfortunately, ideas don't have many merits to consider in the pure abstract (ok, they do, but they're not the important merits. The important consideration is how well it works for decisions you are likely to face). You need to evaluate applications of ideas, as embodied by actions and consequences. For that evaluation, the actions of people who most strongly espouse an idea are a data point on the utility of the idea.
2Slider
Applying the idea to the world tends to reveal merits.

I don't think that solves it. A bounded utility function would stop you from doing infinite doublings, but it still doesn't prevent some finite number of doublings in the million-Earths case

That is, if the first round multiplies Earth a millionfold, then you just have to agree that a million Earths is at least twice as good as one Earth
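Spelled out (assuming the gamble is roughly even odds with total loss on the downside, which is my reading of the setup, and normalizing $U(\text{no Earths}) = 0$), a bounded expected-utility maximizer takes the first bet whenever

$$\tfrac{1}{2}\,U(10^6\ \text{Earths}) > U(1\ \text{Earth}) \quad\Longleftrightarrow\quad U(10^6\ \text{Earths}) > 2\,U(1\ \text{Earth}),$$

and boundedness alone doesn't rule that out.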

7interstice
Right. But just doing a finite number of million-fold-increase bets doesn't seem so crazy to me. I think this is confounded a bit by the resources in the universe being mind-bogglingly large already, so it feels hard to imagine doubling the future utility. As a thought experiment, consider the choice between the following futures: (a) guarantee of 100 million years of flourishing human civilization, but no post-humans or leaving the solar system, (b) 50% chance extinction, 50% chance intergalactic colonization and transhumanism. To me option (b) feels more intuitively appealing.

There are definitely betting patterns superior to risking everything -- but only if you're allowed to play a very large number of rounds

If the pharmacist is only willing to play 10 rounds, then there's no way to beat betting everything every round

As the number of rounds you play approaches infinity (assuming you can bet fractions of a coin), your chance of saving everyone approaches 1. But this takes a huge number of rounds. Even at 10,000 rounds, the best possible strategy only gives each person around a 1 in 20 chance of survival

I buy that… so many of the folks funded by Emergent Ventures are EAs that directly arguing against AI risk might alienate his audience

Still, this Straussian approach is a terrible way to have a productive argument

3Bill Benzon
FWIW, Cowen rarely has arguments. He'll state strong positions on any number of things in MR, but he (almost) never engages with comments at MR. If you want an actual back-and-forth discussion, the most likely way to get it is in conversation in some forum.

Many thanks for the update… and if it’s true that you could write the very best primer, that sounds like a high value activity

I don’t understand the asteroid analogy though. Does this assume the impact is inevitable? If so, I agree with taking no action. But in any other case, doing everything you can to prevent it seems like the single most important way to spend your days

1Artir
The asteroid case - it wouldn't be inevitable; it's just the knowledge that there are people out there substantially more motivated than me (and better positioned) to deal with it. For some activities where I'm really good (like... writing blogposts) and where I expect my actions to make more of an impact relative to what others would be doing, I could end up writing a blogpost about 'what you guys should do' and emailing it to some other relevant people. Also, you can edit your post accordingly to reflect my update!

Many thanks! It looks like EA was the right angle... found some very active English-speaking EA groups right next to where I'll be

I bet you're right that a perceived lack of policy options is a key reason people don't write about this to mainstream audiences

Still, I think policy options exist

The easiest one is adding the right types of AI capabilities research to the US Munitions List, so they're covered under ITAR laws. These are mind-bogglingly burdensome to comply with (so it's effectively a tax on capabilities research). They also make it illegal to share certain parts of your research publicly

It's not quite the secrecy regime that Eliezer is looking for, but it's a big step in that direction

I think 2, 3, and 8 are true but pretty easy to overcome. Just get someone knowledgeable to help you

4 (low demand for these essays) seems like a calibration question. Most writers probably would lose their audience if they wrote about it as often as Holden. But more than zero is probably ok. Scott Alexander seems to be following that rule, when he said that he was summarizing the 2021 MIRI conversations at a steady drip so as not to alienate the part of his audience that doesn’t want to see that

I think 6 (look weird) used to be true, but it’s not any more. It’s hard to know for sure without talking to Kelsey Piper or Ezra Klein, but I suspect they didn’t lose any status for their Vox/NYT statements

2Davidmanheim
I think that you're grossly underestimating the difficulty of developing and communicating a useful understanding, and the value and scarcity of expert time. I'm sure Kelsey or someone similar can get a couple of hours of time from one of the leading researchers to ensure they understand and aren't miscommunicating, if they really wanted to call in a favor - but they can't do it often, and most bloggers can't do it at all.  Holden has the advantage of deep engagement in the issues as part of his job, working directly with tons of people who are involved in the research, and getting to have conversations as a funder - none of which are true for most writers.

I agree that it's hard, but there are all sorts of possible moves (like LessWrong folks choosing to work at this future regulatory agency, or putting massive amounts of lobbying funds into making sure the rules are strict)

If the alternative (solving alignment) seems impossible given 30 years and massive amounts of money, then even a really hard policy seems easy by comparison

9ChristianKl
Given the lack of available moves that are promising, attempting to influence policy is a reasonable move. It's part of the 80,000 Hours career suggestions. On the other hand it's a long shot, and I see no reason to expect a high likelihood of success.

How about if you solve a ban on gain-of-function research first, and then move on to much harder problems like AGI?  A victory on this relatively easy case would result in a lot of valuable gained experience, or, alternatively, allow foolish optimists to have their dangerous optimism broken over shorter time horizons.

Eliezer gives alignment a 0% chance of succeeding. I think policy, if tried seriously, has >50%. So it's a giant opportunity that's gotten way too little attention

I'm optimistic about policy for big companies in particular. They have a lot to lose from breaking the law, they're easy to inspect (because there's so few), and there's lots of precedent (ITAR already covers some software). Right now, serious AI capabilities research just isn't profitable outside of the big tech companies

Voluntary compliance is also a very real thing. Lots of AI researchers a... (read more)

7otto.barten
This is exactly what we have piloted at the Existential Risk Observatory, a Dutch nonprofit founded last year. I'd say we're fairly successful so far. Our aim is to reduce human extinction risk (especially from AGI) by informing the public debate. Concretely, what we've done in the past year in the Netherlands is (I'm including the detailed description so others can copy our approach - I think they should):

1. We have set up a good-looking website, found a board, set up a legal entity.
2. Asked and obtained endorsement from academics already familiar with existential risk.
3. Found a freelance, well-known ex-journalist and ex-parliamentarian to work with us as a media strategist.
4. Wrote op-eds warning about AGI existential risk, as explicitly as possible, but heeding the media strategist's advice. Sometimes we used academic co-authors. Four out of six of our op-eds were published in leading newspapers in print.
5. Organized drinks, networked with journalists, introduced them to others who are into AGI existential risk (e.g. EAs).

Our most recent result (last weekend) is that a prominent columnist who is agenda-setting on tech and privacy issues in NRC Handelsblad, the Dutch equivalent of the New York Times, wrote a piece where he talked about AGI existential risk as an actual thing. We've also had a meeting with the chairwoman of the Dutch parliamentary committee on digitization (the line between a published article and a policy meeting is direct), and a debate about AGI xrisk in the leading debate centre now seems fairly likely.

We're not there yet, but we've only done this for less than a year, we're tiny, we don't have anyone with a significant profile, and we were self-funded (we recently got our first funding from SFF - thanks guys!). I don't see any reason why our approach wouldn't translate to other countries, including the US. If you do this for a few years, consistently, and in a coordinated and funded way, I would be very surprised if you cannot

Look at gain of function research for the result of a government moratorium on research. At first Baric feared that the moratorium would end his research. Then the NIH declared that his research isn't officially gain of function and continued funding him. 

Regulating gain of function research away is essentially easy mode compared to AI.

A real Butlerian jihad would be much harder.

It sounds like Eliezer is confident that alignment will fail. If so, the way out is to make sure AGI isn’t built. I think that’s more realistic than it sounds

1. LessWrong is influential enough to achieve policy goals

Right now, the Yann LeCun view of AI is probably more mainstream, but that can change fast.

LessWrong is upstream of influential thinkers. For example:
- Zvi and Scott Alexander read LessWrong. Let’s call folks like them Filter #1
- Tyler Cowen reads Zvi and Scott Alexander. (Filter #2)
- Malcolm Gladwell, a mainstream influencer, reads Tyler Cowen... (read more)

8otto.barten
I think you have to specify which policy you mean. First, let's for now focus on regulation that's really aiming to stop AGI, at least until safety is proven (if possible), not on regulation that's only focusing on slowing down (incremental progress). I see roughly three options: software/research, hardware, and data. All of these options would likely need to be global to be effective (that's complicating things, but perhaps a few powerful states can enforce regulation on others - not necessarily unrealistic).

Most people who talk about AGI regulation seem to mean software or research regulation. An example is the national review board proposed by Musk. A large downside of this method is that, if it turns out that scaling up current approaches is mostly all that's needed, Yudkowsky's argument that a few years later, anyone can build AGI in their basement (unregulatable) because of hardware progress seems like a real risk.

A second option not suffering from this issue is hardware regulation. The thought experiment of Yudkowsky that an AGI might destroy all CPUs in order to block competitors is perhaps its most extreme form. One notch less extreme, chip capability could be forcibly held at either today's capability level, or even at a level of some safe point in the past. This could be regulated at the fabs, which are few and not easy to hide. Regulating compute has also been proposed by Jaan Tallinn in a Politico newsletter, where he proposes regulating flops/km2.

Finally, an option could be to regulate data access. I can't recall a concrete proposal but it should be possible in principle.

I think a paper should urgently be written about which options we have, and especially what the least economically damaging, but still reliable and enforceable, regulation method is. I think we should move beyond the position that no regulation could do this - there are clearly options with >0% chance (depending strongly on coordination and communication) and we can't afford to w

I tend to agree that Eliezer (among others) underestimates the potential value of US federal policy. But on the other hand, note No Fire Alarm, which I mostly disagree with but which has some great points and is good for understanding Eliezer's perspective. Also note (among other reasons) that policy preventing AGI is hard because it needs to stop every potentially feasible AGI project but: (1) defining 'AGI research' in a sufficient manner is hard, especially when (2) at least some companies naturally want to get around such regulations, and (3) at least ... (read more)

Is there a good write-up of the case against rapid tests? I see Tom Frieden’s statement that rapid tests don’t correlate with infectivity, but I can’t imagine what that’s based on

In other words, there’s got to be a good reason why so many smart people oppose using rapid tests to make isolation decisions

Could you spell out your objection? It’s a big ask, having to read a book just to find out what you mean!

Short summary: Biological anchors are a bad way to predict AGI. It’s a case of “argument from comparable resource consumption.” Analogy: human brains use 20 Watts. Therefore, when we have computers with 20 Watts, we’ll have AGI! The 2020 OpenPhil estimate of 2050 is based on a biological anchor, so we should ignore it.

Longer summary:

Lots of folks made bad AGI predictions by asking: 

  1. How much compute is needed for AGI?
  2. When will that compute be available?

To find (1), they use a “biological anchor,” like the computing power of the human brain, or the tota... (read more)

6Sammy Martin
Holden also mentions something a bit like Eliezer's criticism in his own write-up. When Holden talks about 'ingenuity' methods, that seems consistent with Eliezer's. I.e. if you wanted to fold this consideration into OpenAI's estimate, you'd have to do it by having a giant, incredibly uncertain, free-floating variable for 'speedup factor', because you'd be nonsensically trying to estimate the 'speed-up' to brain processing applied from using some completely non-Deep Learning or non-brainlike algorithm for intelligence. All your uncertainty just gets moved into that one factor, and you're back where you started.

It's possible that Eliezer is confident in this objection partly because of his 'core of generality' model of intelligence - i.e. he's implicitly imagining enormous numbers of varied paths to improvement that end up practically in the same place, while 'stack more layers in a brainlike DL model' is just one of those paths (and one that probably won't even work), so he naturally thinks estimating the difficulty of this one path we definitely won't take (and which probably wouldn't work even if we did try it) out of the huge numbers of varied paths to generality is useless. However, if you don't have this model, then perhaps you can be more confident that what we're likely to build will look at least somewhat like a compute-limited DL system and that these other paths will have to share some properties of this path.

Relatedly, it's an implication of the model that there's some imaginable (and not e.g. galaxy-sized) model we could build right now that would be an AGI, which I think Eliezer disputes?

What particular counterproductive actions by the public are we hoping to avoid?

I should’ve been more clear…export controls don’t just apply to physical items. Depending on the specific controls, it can be illegal to publicly share technical data, including source code, drawings, and sometimes even technical concepts

This makes it really hard to publish papers, and it stops you from putting source code or instructions online

Why isn’t there a persuasive write-up of the “current alignment research efforts are doomed” theory?

EY wrote hundreds of thousands of words to show that alignment is a hard and important problem. And it worked! Lots of people listened and started researching this

But that discussion now claims these efforts are no good. And I can’t find good evidence, other than folks talking past each other

I agree with everything in your comment except the value of showing EY’s claim to be wrong:

  • Believing a problem is harder than it is can stop you from finding creative
... (read more)
9Martin Randall
I think by impending doom you mean AI doom after a few years or decades, so "impending" from a civilizational perspective, not from an individual human perspective. If I misinterpret you, please disregard this post.

I disagree on your mental health point. Main lines of argument: people who lose belief in heaven seem to be fine, cultures that believe in oblivion seem to be fine, old people seem to be fine, etc. Also, we evolved to be mortal, so we should be surprised if evolution has left us mentally ill-prepared for our mortality.

However, I discovered/remembered that depression is a common side-effect of terminal illness. See Living with a Terminal Illness. Perhaps that is where you are coming from? There is also Death row phenomenon, but that seems to be more about extended solitary confinement than impending doom.

I don't think this is closely analogous to AI doom. A terminal illness might mean a life expectancy measured in months, whereas we probably have a few years or decades. Also our lives will probably continue to improve in the lead up to AI doom, where terminal illnesses come with a side order of pain and disability. On the other hand, a terminal illness doesn't include the destruction of everything we value. Overall, I think that belief in AI doom is a closer match to belief in oblivion than belief in cancer and don't expect it to cause mental health issues until it is much closer.

On a personal note, I've placed > 50% probability on AI doom for a few years now, and my mental health has been fine as far as I can tell. However, belief in your impending doom, when combined with belief that "Belief in your impending doom is terrible for your mental health", is probably terrible for your mental health. Also, belief that "Belief in your impending doom is terrible for your mental health" could cause motivated reasoning that makes it harder to salvage value in the face of impending doom.
5Grant Demaree
Zvi just posted EY's model

I agree. This wasn’t meant as an object-level discussion of whether the “alignment is doomed” claim is true. What I’d hoped to convey is that, even if the research is on the wrong track, we can still massively increase the chances of a good outcome, using some of the options I described

That said, I don’t think Starship is a good analogy. We already knew that such a rocket can work in theory, so it was a matter of engineering, experimentation, and making a big organization work. What if a closer analogy to seeing alignment solved was seeing a proof of P=NP this year?

In fact, what I’d really like to see from this is Leverage and CFAR’s actual research, including negative results

What experiments did they try? Is there anything true and surprising that came out of this? What dead ends did they discover (plus the evidence that these are truly dead ends)?

It’d be especially interesting if someone annotated Geoff’s giant agenda flowchart with what they were thinking at the time and what, if anything, they actually tried

Also interested in the root causes of the harms that came to Zoe et al. Is this an inevitable consequence of Leverage’s beliefs? Or do the particular beliefs not really matter, and it’s really about the social dynamics in their group house?

8Viliam
Probably not what you wanted, but you can read CFAR's handbook and updates (where they also reflect on some screwups). I am not aware of Leverage having anything equivalent publicly available.

I don’t agree with the characterization of this topic as self-obsessed community gossip. For context, I’m quite new and don’t have a dog in the fight. But I drew memorable conclusions from this that I couldn’t have gotten from more traditional posts

First, experimenting with our own psychology is tempting and really dangerous. Next time, I’d turn up the caution dial way higher than Leverage did

Second, a lot of us (probably including me) have an exploitable weakness brought on by high scrupulosity combined with openness to crazy-sounding ideas. Next time, I’d b... (read more)

So is this an accurate summary of your thinking?

  1. You agree with FDT on some issues. The goal of decision theory is to determine what kind of agent you should be. The kind of agent you are (your "source code") affects other agents' decisions
  2. FDT requires you to construct counterfactual worlds. For example, if I'm faced with Newcomb's problem, I have to imagine a counterfactual world in which I'm a two-boxer
  3. We don't know how to construct counterfactual worlds. Imagining a consistent world in which I'm a two-boxer is just as hard as imagining one where object
... (read more)
2Chris_Leong
"The goal of decision theory is to determine what kind of agent you should be" I'll answer this with a stream of thought: I guess my position on this is slightly complex. I did say that the reason for preferring one notion of counterfactual over another must be rooted in the fact that agents adopting these counterfactuals do better over a particular set of worlds. And maybe that reduces to what you said, although maybe it isn't quite as straightforward as that because I content "possible" is not in the territory.  This opens the door to there being multiple notions of possible and hence counterfactuals being formed by merging lessons from the various notions. And it seems that we could merge these lessons either at the individual decision level or at the level of properties about agent or at the level of agents. Or at least that's how I would like my claims in this post to be understood. That said, the lesson from my post The Counterfactual Prisoner's Dilemma is that merging at the decision-level doesn't seem viable. "FDT requires you to construct counterfactual worlds" I highly doubt that Eliezer embraces David Lewis' view of counterfactuals, especially given his post Probability is in the Mind. However, the way FDT is framed sometimes gives the impression that there's a true definition we're just looking for. Admittedly, if you're just looking for something that works such as in Newcomb's and Regret of Rationality then that avoids this mistake. And I guess if you look at how MIRI has investigated this, which is much more mathematical than philosophical that the do seem to be following this pragmatism principle. I would like to suggest though that this can only get you so far.  "We don't know how to construct counterfactual worlds. You get around this..." I'm not endorsing counterfactual models out of lack of knowledge of how to construct counterfactual worlds, but because I don't think - contra Lewis - that there are strong reasons for asserting such worlds

Really enjoyed this. I’m skeptical, because (1) a huge number of things have to go right, and (2) some of them depend on the goodwill of people who are disincentivized to help

Most likely: the Vacated Territory flounders, much like Birobidzhan (which is a really fun story, by the way. In the 1930s, the Soviet Union created a mostly-autonomous colony for its Jews in Siberia. Masha Gessen tells the story here)

Best case:

In September 2021, the first 10,000 Siuslaw Syrians touched down in Siuslaw National Forest, land that was previously part of Oregon.

It was a... (read more)

1DanB
Thanks for the positive feedback and interesting scenario. I'd never heard of Birobidzhan.