Ebenezer Dukakis

Comments

Technically the point of going to college is to help you thrive in the rest of your life after college. If you believe in AI 2027, the most important thing for the rest of your life is for AI to be developed responsibly. So, maybe work on that instead of college?

I think the EU could actually be a good place to protest for an AI pause. Because the EU doesn't have national AI ambitions, and the EU is increasingly skeptical of the US, it seems to me that a bit of protesting could do a lot to raise awareness of the reckless path that the US is taking. That, in turn, could motivate the EU to apply leverage via ASML, sanctions, etc.

The only thing I'm worried about is that EU criticism of the US could create anti-EU polarization among the GOP, motivating them to be even more reckless on AI. This question seems worth a lot more study.

If people start losing jobs from automation, that could finally build political momentum for serious regulation.

As I suggested in Zvi's Substack comments the other month (22 likes):

The real problem here is that AI safety feels completely theoretical right now. Climate folks can at least point to hurricanes and wildfires (even if connecting those dots requires some fancy statistical footwork). But AI safety advocates are stuck making arguments about hypothetical future scenarios that sound like sci-fi to most people. It's hard to build political momentum around "trust us, this could be really bad, look at this scenario I wrote that will remind you of a James Cameron movie".

Here's the thing though - the e/acc crowd might accidentally end up doing AI safety advocates a huge favor. They want to race ahead with AI development, no guardrails, full speed ahead. That could actually force the issue. Once AI starts really replacing human workers - not just a few translators here and there, but entire professions getting automated away - suddenly everyone's going to start paying attention. Nothing gets politicians moving like angry constituents who just lost their jobs.

Here's a wild thought: instead of focusing on theoretical safety frameworks that nobody seems to care about, maybe we should be working on dramatically accelerating workplace automation. Build the systems that will make it crystal clear just how transformative AI can be. It feels counterintuitive - like we're playing into the e/acc playbook. But just as extreme weather events create space to talk about carbon emissions, widespread job displacement could finally get people to take AI governance seriously. The trick is making sure this wake-up call happens before it's too late to do anything about the bigger risks lurking around the corner.

Source: https://thezvi.substack.com/p/the-paris-ai-anti-safety-summit/comment/92963364

Just skimming the thread, I didn't see anyone offer a serious attempt at counterargument, either.

I am optimistic that further thinking on automation prospects could identify other automation-tractable areas of alignment and control (e.g. see here for previous work).

This tag might be helpful: https://www.lesswrong.com/w/ai-assisted-alignment

Here's a recent shortform on the topic: https://www.lesswrong.com/posts/mKgbawbJBxEmQaLSJ/davekasten-s-shortform?commentId=32jReMrHDd5vkDBwt

I wonder about getting an LLM to process LW archive posts, and tag posts which contain alignment ideas that seem automatable.
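
A minimal sketch of how that tagging pass might work, assuming the OpenAI Python client, a placeholder model name, and a hypothetical local folder of archive posts saved as text files (the prompt wording and file layout are mine, not anything that exists today):

```python
# Sketch: tag LW archive posts whose alignment ideas look automatable.
# Assumes the OpenAI Python client, a placeholder model name, and a
# hypothetical local dump of posts in ./lw_archive/*.txt.
import json
from pathlib import Path

from openai import OpenAI

client = OpenAI()

PROMPT = (
    "You will be given a LessWrong post. Answer with JSON: "
    '{"automatable": true/false, "idea": "<one-sentence summary>"}, '
    "where 'automatable' means the post contains an alignment or control idea "
    "that current AI tooling could plausibly help implement."
)

def tag_post(text: str) -> dict:
    """Ask the model whether a post contains an automatable alignment idea."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": text[:20000]},  # truncate very long posts
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    for path in Path("lw_archive").glob("*.txt"):  # hypothetical local dump
        result = tag_post(path.read_text())
        if result.get("automatable"):
            print(f"{path.name}: {result.get('idea')}")
```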

it will also set off the enemy rhetoric detectors among liberals

I'm not sure about that. Does Bernie Sanders' rhetoric set off that detector?

I think the way the issue is framed matters a lot. If it's a "populist" framing ("elites are in it for themselves, they can't be trusted"), that frame seems to have resonated with a segment of the right lately. Climate change has a sanctimonious frame in American politics that conservatives hate.

It looks like the comedian whose clip you linked has a podcast:

https://www.joshjohnsoncomedy.com/podcasts

I don't see any guests in his podcast history, but maybe someone could invite him onto a different podcast? His website lists appearances on other podcasts. I figure it's worth trying stuff like this for the VoI (value of information).

I think people should put more emphasis on the rate of improvement in this technology. It's analogous to the early days of COVID -- it's not where we are that's worrisome; it's where we're headed.

For humans acting very much not alone, like big AGI research companies, yeah that's clearly a big problem.

How about a group of superbabies that find and befriend each other? Then they're no longer acting alone.

I don't think the problem is about any of the people you listed having too much brainpower.

I don't think problems caused by superbabies would look distinctively like "having too much brainpower". They would look more like the ordinary problems humans have with each other. Brainpower would be a force multiplier.

(I feel we're somewhat talking past each other, but I appreciate the conversation and still want to get where you're coming from.)

Thanks. I mostly just want people to pay attention to this problem. I don't feel like I have unique insight. I'll probably stop commenting soon, since I think I'm hitting the point of diminishing returns.

I think this project should receive more red-teaming before it gets funded.

Naively, it would seem that the "second species argument" applies much more strongly to the creation of a hypothetical Homo supersapiens than it does to AGI.

We've observed many warning shots regarding catastrophic human misalignment. The human alignment problem isn't easy. And "intelligence" seems to be a key part of the human alignment picture. Humans often lack respect or compassion for other animals that they deem intellectually inferior -- e.g. arguing that because those other animals lack cognitive capabilities we have, they shouldn't be considered morally relevant. There's a decent chance that Homo supersapiens would think along similar lines, and repeat our species' grim history of mistreating those we consider our intellectual inferiors.

It feels like people are deferring to Eliezer a lot here, which seems unjustified given how much strategic influence Eliezer had before AI became a big thing, and how poorly things have gone (by Eliezer's own lights!) since then. There's been very little reasoning transparency in Eliezer's push for genetic enhancement. I just don't see why we're deferring to Eliezer so much as a strategist, when I struggle to name a single major strategic success of his.

There's a good chance their carbon children would have about the same attitude towards AI development as they do. So I suspect you'd end up ruled by their silicon grandchildren.

These are incredibly small peanuts compared to AGI omnicide.

The jailbreakability and other alignment failures of current AI systems are also incredibly small peanuts compared to AGI omnicide. Yet they're still informative. Small-scale failures give us data about possible large-scale failures.

You're somehow leaving out all the people who are smarter than those people, and who were great for the people around them and humanity? You've got like 99% actually alignment or something

Are you thinking of people such as Sam Altman, Demis Hassabis, Elon Musk, and Dario Amodei? If humans are 99% aligned, how is it that we ended up in a situation where major lab leaders look so unaligned? MIRI and friends had a fair amount of influence to shape this situation and align lab leaders, yet they appear to have failed by their own lights. Why?

When it comes to AI alignment, everyone on this site understands that if a "boxed" AI acts nice, that's not a strong signal of actual friendliness. The true test of an AI's alignment is what it does when it has lots of power and little accountability.

Maybe something similar is going on for humans. We're nice when we're powerless, because we have to be. But giving humans lots of power with little accountability doesn't tend to go well.

Looking around you, you mostly see nice humans. That could be because humans are inherently nice. It could also be because most of the people around you haven't been given lots of power with little accountability.

Dramatic genetic enhancement could give enhanced humans lots of power with little accountability, relative to the rest of us.

[Note also, the humans you see while looking around are strongly selected for, which becomes quite relevant if the enhancement technology is widespread. How do you think you'd feel about humanity if you lived in Ukraine right now?]

Which, yes, we should think about this, and prepare and plan and prevent, but it's just a totally totally different calculus from AGI.

I want to see actual, detailed calculations of p(doom) from supersmart humans vs supersmart AI, conditional on each technology being developed. Before charging ahead on this, I want a superforecaster-type person to sit down, spend a few hours, generate some probability estimates, publish a post, and request that others red-team their work. I don't feel like that is a lot to ask.
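
A stripped-down sketch of the shape of the comparison I have in mind (the decomposition and conditioning events are my own illustrative choice, not an established formula):

```latex
% Illustrative only: score both technologies on the same decomposition.
P(\mathrm{doom} \mid \mathrm{superbabies\ developed}) \approx
    P(\mathrm{badly\ misaligned\ values}) \times
    P(\mathrm{decisive\ power\ advantage} \mid \mathrm{misaligned}) \times
    P(\mathrm{catastrophe} \mid \mathrm{power\ advantage})

P(\mathrm{doom} \mid \mathrm{superintelligent\ AI\ developed}) \approx
    P(\mathrm{misaligned\ goals}) \times
    P(\mathrm{decisive\ power\ advantage} \mid \mathrm{misaligned}) \times
    P(\mathrm{catastrophe} \mid \mathrm{power\ advantage})
```

The point of the exercise would be for a superforecaster-type person to put numbers (with error bars) on each factor for each technology and check whether the products really differ by as much as people seem to assume.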
