Eli Tyre

Comments (sorted by newest)
Tomás B.'s Shortform
Eli Tyre · 3d

But the idea that the shape of one's life may be, in part, an unconscious treatment for mental flaws is a disquieting one.

I find this not at all disquieting? It seems like all of the badness comes from labeling your preferences a "mental flaw".

Is the implicit claim that you would have an overall better life if you pushed yourself to change or grow (another framing word) along this dimension? This is at least not obvious to me.

Bending The Curve
Eli Tyre · 3d

Somehow not to Zvi!

Plans A, B, C, and D for misalignment risk
Eli Tyre · 3d

It seems like I misunderstood your reading of Ray's claim.

I read Ray as saying "a large fraction of the benefits of advanced AI are only in the biotech sector, and so we could get a large fraction of the benefits by pushing forward on only AI for biotech."

It sounds like you're pointing at a somewhat different axis, in response, saying "we won't get anything close to the benefits of advanced AI agents with only narrow AI systems, because narrow AI systems are just much less helpful." 

(And implicitly, the biotech AIs are either narrow AIs (and therefore not very helpful), or they're general AIs that are specialized on biotech, in which case you're not getting the safety benefits you're imagining getting by only focusing on biotech.)

Plans A, B, C, and D for misalignment risk
Eli Tyre · 4d

Huh? No, it doesn't capture much of the benefits. I would have guessed it captures a tiny fraction of the benefits of advanced AI, even for AIs around human level, where you might want to pause.

Where do you think that most of the benefits come from?

Edit: My personal consumption patterns are mostly not relevant to this question, so I moved what was formerly the rest of this comment to a footnote.[1]

 

  1. ^

    Perhaps I am dumb or my personal priorities are different than most people's, but I expect a large share of the benefits from AI to my life, personally, are going to be biotech advances that, e.g., could extend my life or make me smarter.

    Like basically the things that could make my life better are 1) somehow being introduced to a compatible romantic partner, 2) cheaper housing, 3) biotech stuff. There isn't much else.

    I guess self-driving cars might make travel easier? But most of the cost of travel is housing.

    I care a lot about ending factory farming, but that's biotechnology again.

    I guess AI, if it was trustworthy, could also substantially improve governance, which could have huge benefits to society.

Plans A, B, C, and D for misalignment risk
Eli Tyre · 4d

I think the current CCP having control over most/all of the universe seems like 50% as bad as AI takeover in my lights

This is a wild claim to me. 

Can you elaborate on why you think this?

 

Plans A, B, C, and D for misalignment risk
Eli Tyre · 4d

Thank you for writing this!

Some important things that I learned / clarified for myself from this comment:

  • Many plans depend on preserving the political will to maintain a geopolitical regime that isn't the Nash equilibrium, for years or decades. A key consideration for those plans is "how much of the benefit of this plan will we have gotten if the controlled regime breaks down early?"
    • Plans that depend on having human-level AIs do alignment work (if those plans work at all) don't have linear payoff in time spent working, but they are much closer to linear than plans that depend on genetically engineered super-geniuses doing the alignment work.
      • In the AI alignment researcher plan, the AIs can be making progress as soon as they're developed. In the super-genius plan, we need to develop the genetic engineering techniques and (potentially) have the super-geniuses grow up before they can get to work. The benefits to super-geniuses are backloaded, instead of linear.
      • (I don't want to overstate this difference however, because if the plan of automating alignment research is just fundamentally unworkable, it doesn't matter that the returns to automated alignment research would be closer to linear in time, if it did work. The more important crux is "could this work at all?")
  • The complexity of "controlled takeoff" is in setting up the situation so that things are actually being done responsibly and safely, instead of only seeming so to people that aren't equipped to judge. The complexity of "shut it all down" is in setting up an off-ramp. If "shut it all down" is also including "genetically engineer super-geniuses" as part of the plan, then it's not clearly simpler than "controlled takeoff."
     
You Should Get a Reusable Mask
Eli Tyre · 5d

Hell yeah.

I bought one, plus some filters.

CFAR update, and New CFAR workshops
Eli Tyre · 5d

FWIW, this broadly matches my own experience of working with Anna and participating in CFAR workshops.

There were tensions in how to relate to participants at AIRCS workshops, in particular. 

These workshops were explicitly recruitment programs for MIRI: this was stated on the website, and I believe (though Buck could confirm) that all or most of the participants did a technical interview before they were invited to a workshop.

The workshops were part of an extended interview process. It was a combination of 1) the staff assessing the participants, 2) the participants assessing the staff, and 3) (to some extent) enculturating the participants into MIRI/rationalist culture. 

However, the environment was dramatically less formal and more vulnerable than most job interviews: about a fourth of the content of the workshops was Circling, for instance.

This meant that workshop staff were both assessing the participants and their fit to the culture, while also aiming to be helpful to them and their personal development by their own lights, including helping them untangle philosophical confusions or internal conflicts.

These intentions were not incompatible, but they were sometimes in tension. It could feel callous to spend a few days having deep personal conversations with someone, talking with them and trying to support them, and then later, in a staff meeting, come relatively quickly to a judgement about them: evaluating that they didn't make the cut.

This was a tension that we were aware of and discussed at the time. I think we overall did a good job of navigating it.

This was a very weird environment by normal professional standards. But to my knowledge, there was no incident in which we failed to do right by an AIRCS participant, exploited them, or treated them badly.

The majority of people who came had a good experience, regardless of whether they eventually got hired by MIRI. Of those that did not have a good experience, I believe this was predominantly (possibly entirely?) people who felt that the workshop was a waste of time, rather than that they had actively been harmed.

I would welcome any specific information to the contrary. I could totally believe that there was stuff that I was unaware of, or subtle dynamics that I wasn't tracking at the time, but I would conclude were fucked up on reflection. 

But as it is, I don't think we failed to discharge our deontological duty towards any participants.

CFAR update, and New CFAR workshops
Eli Tyre · 5d

FWIW, my guess is that there would be more benefit to the social fabric if you went into (some of) the details of what you observed, instead of making relatively high-level statements and asking people to put weight on them to the extent that they respect your reputation.

(Hearing object-level details at least makes it easier for people to make up their minds. E.g., there might be various specific behaviors that you think were irresponsible that some readers would think were non-issues if they heard the details.

Further, sharing observations allows for specific points to be addressed and resolved. The alternative is an unanswerable miasma that hangs over the org forever. "Duncan, one of the people who used to work at CFAR, explicitly disendorsed the org, but wouldn't give details." is the kind of thing that people can gossip about for years, but it doesn't add gears to people's models, and there's nothing that anyone can say that can address the latent objections, because they're unstated.)

However, I also acknowledge that for things like this, there may be a bunch of private details that you are either reluctant to share or not at liberty to share, and there might be other reasons besides to say less.

But, insofar as you're willing to make this much of a callout post, I guess it would be better to be as specific as possible, especially as regards "malpractice" that you observed at CFAR workshops.

Why Corrigibility is Hard and Important (i.e. "Whence the high MIRI confidence in alignment difficulty?")
Eli Tyre · 6d

Is that true?

Name three.

It doesn't seem like a formalism like VNM makes predictions the way, e.g., the law of gravity does. You can apply the VNM formalism to model agents, and that model sometimes seems more or less applicable. But what observation could I see that would undermine or falsify the VNM formalism, as opposed to learning that some particular agent doesn't obey the VNM axioms?
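
As a minimal sketch of that distinction, using the standard statement of the independence axiom and the classic Allais lotteries (neither of which is spelled out above): the Allais preference pattern contradicts independence, which tells us a particular agent falls outside the VNM model rather than falsifying the theorem itself.

$$
\text{Independence: } A \succ B \;\Rightarrow\; pA + (1-p)C \;\succ\; pB + (1-p)C \quad \text{for every lottery } C \text{ and } p \in (0,1].
$$

$$
\$1\text{M for sure} \;\succ\; (0.10{:}\,\$5\text{M},\ 0.89{:}\,\$1\text{M},\ 0.01{:}\,\$0)
\quad \text{and} \quad
(0.10{:}\,\$5\text{M},\ 0.90{:}\,\$0) \;\succ\; (0.11{:}\,\$1\text{M},\ 0.89{:}\,\$0)
$$

These two preferences jointly violate independence, so an agent who holds both is simply not a VNM agent; no observation of this kind bears on the formalism as such.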

Posts

  • Eli's shortform feed (29 points · 6y · 324 comments)
  • Evolution did a surprising good job at aligning humans...to social status (23 points · 2y · 37 comments)
  • On the lethality of biased human reward ratings (48 points · 2y · 10 comments)
  • Smart Sessions - Finally a (kinda) window-centric session manager (14 points · 2y · 3 comments)
  • Unpacking the dynamics of AGI conflict that suggest the necessity of a premptive pivotal act (63 points · 2y · 2 comments)
  • Briefly thinking through some analogs of debate (20 points · 3y · 3 comments)
  • Public beliefs vs. Private beliefs (146 points · 3y · 30 comments)
  • Twitter thread on postrationalists (147 points · 4y · 33 comments)
  • [Question] What are some good pieces on civilizational decay / civilizational collapse / weakening of societal fabric? (22 points · 4y · 8 comments)
  • [Question] What are some triggers that prompt you to do a Fermi estimate, or to pull up a spreadsheet and make a simple/rough quantitative model? (38 points · 4y · 16 comments)
  • I’m no longer sure that I buy dutch book arguments and this makes me skeptical of the "utility function" abstraction (42 points · 4y · 29 comments)
Wikitag Contributions

  • Center For AI Policy (2 years ago)
  • Blame Avoidance (3 years ago)
  • Hyperbolic Discounting (3 years ago)