After having chosen a utility function to maximize, how would it actually maximize it? I'm thinking that the search/planning process for finding good policies naturally introduces mesa-optimizers, regardless of everything that came before in the rest of PreDCA (detecting precursors and extrapolating their utility function).
It seems like the AI risk mitigation solutions you've listed aren't mutually exclusive; we'll likely have to use a combination of them to succeed. While I agree that it would be ideal for us to end up with a FAS, the pathway toward that outcome would likely involve "sponge coordination" and "pivotal acts" as mechanisms by which our civilization can buy some time before FAS arrives.
A possible scenario in a world where FAS takes some time to arrive (in chronological order):
Of course, we wouldn't need all of this if FAS happened to be the first capable AGI to be developed (which seems unlikely in my model). I would like to know which scenarios you think are most likely to happen (or that we should aim towards), or whether I've overlooked any other pathways. (also relevant)
It seems like the exact model the AI will adopt is somewhat confounding my picture when I try to imagine what an "existentially secure" world looks like. I'm currently thinking there are two possible existentially secure worlds:
The obvious one is where all human dependence is removed from setting/modifying the AI's value system (like CEV, fully value-aligned); this world would look much more unipolar.
The alternative is for a well-intentioned and well-coordinated group to use a corrigible AI that is aligned with its human instructor. To me, whether this scenario looks existentially secure probably depends on whether small differences in capability can magnify into great differences in power. If they can't, it would be much easier for capable groups to defect, build their own corrigible AIs, and push agendas that may not be in humanity's interest (hence not so existentially secure). If they can, then the world would again be more unipolar, and its existential security would depend on how value-aligned the humans operating the corrigible AI are (I'm guessing this is your offense-defense balance example?)
So it seems to me that the ideal endgame is for humanity to end up with a value-aligned AI, either by starting with one or by somehow getting through the "dangerous period" of multipolar corrigible AIs and transitioning to a value-aligned one. Possible pathways (non-exhaustive):
I'm not sure whether this is a good framing at all (it probably isn't), but simply counting the number of dependencies (without taking into account how plausible each dependency is), it seems to me that humanity's chances would be better in a unipolar takeover scenario: either using a value-aligned AI from the start or transitioning into one after a pivotal act.
Is it even possible for anything short of a pivotal act to achieve existential security? Even if we maxed out AI lab communication and had awesome interpretability, that wouldn't help in the long run, given that the minimum resources required to build a misaligned AGI will probably keep dropping.
Thanks, I found your post very helpful, and I think this community would benefit from more posts like it.
I agree that we would need a clear categorization. Ideally, it would give us a way to explicitly quantify and make legible the claims of various proposals, e.g. "my proposal, under these assumptions about the world, may give us X years of time, changes the world in these ways, and interacts with proposals A, B, and C in these ways."
The lack of such a categorization is perhaps one of the reasons why I feel the pivotal act framing is still necessary. It seems to me that, while proposals closer to the "gradual steering" end of the spectrum (e.g. regulation, culture change, AI lab communication) are usually aimed at giving humanity a couple more months or years of extra time, they fail to make legible claims like the above, and yet (I might be wrong) proceed to implicitly claim, "therefore, if we do a lot of these, we're safe, even without any pivotal acts!"
(Of course, pivotal acts aren't guilt-free and many of their details are hand-wavy, but their claims of impact and their assumptions about the world seem pretty straightforward. Are there non-pivotal-act proposals like that?)
well crap, that was fast. does anyone know what karma threshold the button was pressed at?
In my model this isn't a capabilities failure, because there are demons in imperfect search: what you would get out of a heuristic search that approximates the best policy wouldn't only be something close to the global optimum, but something that has also been optimized by whatever demons (which don't even have to be "optimizers", necessarily) emerged through the selection pressures.
Maybe I'm still misunderstanding PreDCA and it somehow rules out this possibility, but afaik it only seems to do so in the limit of perfect search.
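As a toy sketch of a much weaker version of this point (something I made up purely for illustration; it has nothing to do with PreDCA's internals or a real training setup): a greedy hill-climber on a deliberately deceptive objective ends up wherever the local selection pressure points, not at the global optimum. The demon worry is about much richer, self-reinforcing structure in much larger search spaces, but even here the output is a product of the search dynamics rather than just of the objective.

```python
# Toy illustration only: greedy local search on a deceptive objective.
# The global optimum is the all-zeros string, but from a typical random
# start every accepted single-bit improvement adds a one, so the search
# converges to all-ones instead.
import random

random.seed(0)
N = 40  # length of a toy "policy", represented as a bit-string

def trap_utility(policy):
    # Global optimum: all zeros, scoring N + 1. Everywhere else the score is
    # just the number of ones, so (away from the all-zeros neighborhood)
    # local improvements push toward all-ones.
    ones = sum(policy)
    return N + 1 if ones == 0 else ones

def hill_climb(policy, steps=5000):
    # Imperfect search: random single-bit flips, keeping a flip only if the
    # score strictly improves. No lookahead, no restarts, no annealing.
    score = trap_utility(policy)
    for _ in range(steps):
        i = random.randrange(N)
        candidate = policy.copy()
        candidate[i] ^= 1
        s = trap_utility(candidate)
        if s > score:
            policy, score = candidate, s
    return policy, score

start = [random.randint(0, 1) for _ in range(N)]
found, found_score = hill_climb(start)
print("hill-climbing result:", found_score)            # 40 (all-ones, local optimum)
print("global optimum:      ", trap_utility([0] * N))  # 41 (all-zeros)
```

Again, this only shows that an imperfect search's output is determined by the search process itself; it doesn't demonstrate mesa-optimization, just the direction of my worry.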