Prometheus

Why do so many think deception in AI is important?

I see a lot of energy and interest being devoted toward detecting deception in AIs, trying to make AIs less deceptive, making AIs honest, etc. But I keep trying to figure out why so many think this is very important. For less-than-human intelligence, deceptive tactics will likely be caught by...

Jan 13, 2024 · 24

Back to the Past to the Future

You are sitting in your bedroom, dorm, tent, sidewalk, wherever you are staying right now, reading these words, when a dark void appears in your peripheral vision. A figure walks out of the void. They look a bit different, but also familiar. Surprise! It’s you from the future! And you...

Oct 18, 2023 · 5

Why aren't more people in AIS familiar with PDP?

Why haven't more people who work on alignment read Parallel Distributed Processing or even seem at all familiar with Rumelhart's work? This is the fundamental model of cognition and behaviour that all of modern AI is built on, the work that Hinton used for most of his insights. The model...

Sep 1, 2023 · 12

Why Is No One Trying To Align Profit Incentives With Alignment Research?

A whole lot of Alignment work seems to be resource-constrained. Many funders have talked about how they were only able to give grants to a small percentage of projects and work they found promising. Many researchers also receive a small fraction of what they could make in the for-profit sector...

Aug 23, 2023 · 51

Slaying the Hydra: toward a new game board for AI

AI Timelines as a Hydra. Think of current timelines as a giant hydra. You can’t exactly see where the head is, and you don’t know exactly if you’re on the neck of the beast or the body. But you do have some sense of what a hydra is, and the...

Jun 23, 2023 · 0

Lightning Post: Things people in AI Safety should stop talking about

This is an experiment with a new kind of post, meant to convey a lot of ideas very quickly without going into much detail for each. Things I wish people in AI Safety would stop talking about: A list of topics people concerned about x-risk from AI spend, in...

Jun 20, 2023 · 23

Aligned Objectives Prize Competition

The goal of this prize is to generate ideas and plans for a long-term, positive future. The starting prize pool is $500 USD, though others are welcome to add to it. We are going to pretend for this exercise that the technical part of AI Alignment is already solved. Your...

Jun 15, 2023 · 8

LESSWRONG

Why Is No One Trying To Align Profit Incentives With Alignment Research?

Humans are not prepared to operate outside their moral training distribution

Why do so many think deception in AI is important?

Lightning Post: Things people in AI Safety should stop talking about
