Ram Rachum

Replying to12 Angry Agents, or: A Plan for AI Empathy

12 Angry Agents, or: A Plan for AI Empathy

Good observation. I would bet that the trade-off you suggest is indeed the original reason, however, our social structures have evolved around our pain, physical and emotional. One way to put it is that degraded performance is a bug on the individual level but a feature on the social level. It would be difficult to have a friend who doesn't have insecurities.

-1

12 Angry Agents, or: A Plan for AI Empathy

Ram Rachum

Ram Rachum, Davidmanheim

4mo

In the previous two posts (first, second) we laid out our take on AI alignment, which involves conservative philosophy and the political school of thought of Agonistic Democracy. We also suggested an approach to AI alignment in which the conflicts between multiple agents lead to an AI system that has a sufficiently evolved understanding of right and wrong, and therefore may develop into a deserving steward for the future of humankind.

In this post we'll flesh out the mechanics of why we think conflicts between multiple AI agents could create an AI system that has a moral understanding, empathy, or a conscience. We'll do that by taking as a case study the classic... (read 3327 more words →)

Messy on Purpose: Part 2 of A Conservative Vision for the Future

Davidmanheim

Davidmanheim, Ram Rachum

4mo

In our previous post, we outlined a view of AI alignment we disagree with as a central assumption in current discussions of AI alignment, and suggested that it might be useful to push in a different direction, which we started to outline. Here, we’ll point out that we think alignment itself is the wrong goal, and leads to a suboptimal future. However, we think that if control is handed over to AI systems, we’ll have problems anyways, and we should look in other directions.

A key problem we implicitly noted with traditional AI alignment is that any AI designed to serve everyone risks, or guarantees, dissolving traditions, deviance, or resistance. These “values” differ... (read 3427 more words →)

A Conservative Vision For AI Alignment

Davidmanheim

Davidmanheim, Ram Rachum

6mo

Current plans for AI alignment (examples) come from a narrow, implicitly filtered, and often (intellectually, politically, and socially) liberal standpoint. This makes sense, as the vast majority of the community, and hence alignment researchers, have those views. We, the authors of this post, belong to the minority of AI Alignment researchers who have more conservative beliefs and lifestyles, and believe that there is something important to our views that may contribute to the project of figuring out how to make future AGI be a net benefit to humanity. In this post, and hopefully series of posts, we want to lay out an argument we haven’t seen for what a conservative view of AI alignment... (read 3314 more words →)

Can AI agents learn to be good?

Ram Rachum

Hi everyone!

My name is Ram Rachum and I'm working on AI Safety research. I want to elicit social behavior in RL agents and use it to achieve AI Safety goals such as alignment, interpretability and corrigibility.

I made a guest post on the Future of Life Institute's blog: https://futureoflife.org/ai-research/can-ai-agents-learn-to-be-good/

This isn't specifically about my research, as it's mostly geared towards the public so it's pretty basic. I do have a plug for my latest paper at the bottom. This is my first public writing on AI Safety, so I'd appreciate any comments or corrections.

I'm currently raising funding for my research. If you know of relevant funders, I'd appreciate a connection.

RaD-AI workshop

Ram Rachum

Hi everyone!

I'm co-organizing a workshop on a really interesting topic that's very relevant for AI safety. We call it "Rebellion and Disobedience in AI".

Full text below:

RaD-AI agents are artificial agents (virtual or robots) that reason intelligently about why, when, and how to rebel and disobey their given commands. The need for agents to disobey contrasts with most existing research on collaborative robots and agents, where the definition of a “good” agent is one that complies with the commands it is given, and that works in a predictable manner under the consent of the human it serves. However, as exemplified in Issac Asimov’s Second Law of Robotics, this compliance is not always desired,... (read 233 more words →)

Second call: CFP for Rebellion and Disobedience in AI workshop

Ram Rachum

Hi everyone!

I'm co-organizing a workshop on a really interesting topic that's very relevant for AI safety. We call it "Rebellion and Disobedience in AI". If you're doing work that could be relevant for us, please submit it! If you have questions or want to discuss the scope of this workshop, feel free to ask on this thread and I'll try to answer.

Dear colleagues,

This is a second call for papers for the workshop on Rebellion and Disobedience in AI (RaD-AI) which will take place on May 30, 2023, as part of the AAMAS workshop program.

This call contains an extended submission deadline (February 13) and a list of confirmed speakers.

More details can be found... (read 327 more words →)

CFP for Rebellion and Disobedience in AI workshop

Ram Rachum

Hi everyone!

Full CFP below:

Call for Participation: Workshop on Rebellion and Disobedience at AAMAS’23

This workshop will take place on May 29 or 30, 2023, as part of the AAMAS workshop program.

More details can be found on the workshop’s website:

https://sites.google.com/view/rad-ai/home

RaD-AI agents are artificial agents (virtual or robots) that reason intelligently about why, when, and... (read 281 more words →)

Replying toI there a demo of "You can't fetch the coffee if you're dead"?

Ram Rachum3y

I there a demo of "You can't fetch the coffee if you're dead"?

Thank you Koen. The video by Stuart Armstrong linked in the DeepMind paper is pretty close to what I wanted to do :( The DeepMind paper also does similar things.

While I might be able to improve a bit on these examples, I'm thinking that this probably isn't the best place for me to invest my efforts. Thanks for letting me know about these.

I'm interested in your solutions, I'll send an email to you privately about it.

Replying toI there a demo of "You can't fetch the coffee if you're dead"?

Ram Rachum3y

I there a demo of "You can't fetch the coffee if you're dead"?

I'll check that out, thank you.

Could you please expand on the hot take, please? Consider that a big part of the appeal for me is just being able to display the problem and make it relatable for people who aren't from the field.

Also, what kind of richness do you think makes the qualitative difference that you allude to? If the world was 3D or had continuous action or had more game mechanics, would that have made the difference for you?

I there a demo of "You can't fetch the coffee if you're dead"?

Ram Rachum

Hi everyone! My name is Ram Rachum, and this is my first post here :)

I'm an ex-Google software engineer turned MARL researcher. I want to do MARL research that promotes AI safety. You can read more about my research here and sign up for monthly updates.

I had an idea for a project I could do, and I want you to tell me whether it's been done before.

I want to create a demo of Stuart Russell's "You can't fetch the coffee if you're dead" scenario. I'm imagining a MARL environment where agent 1 can "turn on" agent 2 to prepare coffee for agent 1, and then agent 2 at some point understands how... (read more)

LESSWRONG
LW

LESSWRONG
LW

A Conservative Vision For AI Alignment

12 Angry Agents, or: A Plan for AI Empathy

Messy on Purpose: Part 2 of A Conservative Vision for the Future

CFP for Rebellion and Disobedience in AI workshop

Ram Rachum

Ram Rachum

12 Angry Agents, or: A Plan for AI Empathy

Messy on Purpose: Part 2 of A Conservative Vision for the Future

A Conservative Vision For AI Alignment

Can AI agents learn to be good?

Second call: CFP for Rebellion and Disobedience in AI workshop

CFP for Rebellion and Disobedience in AI workshop

I there a demo of "You can't fetch the coffee if you're dead"?

Ram Rachum

A Conservative Vision For AI Alignment

12 Angry Agents, or: A Plan for AI Empathy

Messy on Purpose: Part 2 of A Conservative Vision for the Future

CFP for Rebellion and Disobedience in AI workshop

Ram Rachum

Ram Rachum

12 Angry Agents, or: A Plan for AI Empathy

Messy on Purpose: Part 2 of A Conservative Vision for the Future

A Conservative Vision For AI Alignment

Can AI agents learn to be good?

Second call: CFP for Rebellion and Disobedience in AI workshop

CFP for Rebellion and Disobedience in AI workshop

I there a demo of "You can't fetch the coffee if you're dead"?