When you start trying to make an agent, you realize how much your feedback, rerolls, etc. are what make chat-based LLMs useful.
With a chat-based LLM, the error-correction mechanism is you, and in the absence of that, it's quite easy for agents to get off track.
You can of course add error-correction mechanisms like multiple LLMs checking each other, multiple chains of thought, etc., but the cost can quickly get out of hand.
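To give a sense of why the cost adds up, here's a minimal sketch of the simplest version of that pattern, where one model proposes a step and a second model vetoes it. `call`-style LLM functions here are placeholders for whatever model API you actually use, not any specific library: every accepted step costs at least two calls, and every rejection adds two more.

```python
from typing import Callable

# Placeholder type: any function that takes a prompt and returns model text.
LLM = Callable[[str], str]

def checked_step(task: str, proposer: LLM, checker: LLM, max_retries: int = 3) -> str:
    """One model proposes an action, a second model accepts or rejects it.

    Each accepted step costs 2 LLM calls; each rejection adds 2 more.
    """
    feedback = ""
    for _ in range(max_retries):
        proposal = proposer(
            f"Task: {task}\nPrevious feedback: {feedback}\nPropose the next action."
        )
        verdict = checker(
            f"Task: {task}\nProposed action: {proposal}\n"
            "Reply OK if this is a safe, sensible next step, otherwise explain the problem."
        )
        if verdict.strip().upper().startswith("OK"):
            return proposal
        feedback = verdict  # feed the critique back and retry
    raise RuntimeError("checker kept rejecting; giving up")
```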
I have an AI agent that I wrote myself; I've used it on average 5x per week over the last 6 months. I think it's moderately useful. I mostly use it for simple shell tasks that would otherwise require copy-pasting back and forth with claude.ai.
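The loop for that kind of tool is tiny. A rough sketch of the shape I mean (not my actual code; `ask_llm` is just a stand-in for whatever model API you call):

```python
import subprocess

def ask_llm(prompt: str) -> str:
    """Placeholder for whatever model API you call (hypothetical)."""
    raise NotImplementedError

def shell_task(request: str) -> str:
    """Turn a natural-language request into a shell command, confirm, then run it."""
    command = ask_llm(
        "Write a single POSIX shell command for this task. "
        "Reply with the command only.\nTask: " + request
    )
    print(f"Proposed command: {command}")
    if input("Run it? [y/N] ").lower() != "y":
        return "skipped"
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr
```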
My guess is that the big AI companies don't think the market for this is big enough to be worth making a product out of it.
Anthropic's computer use model and Google's Deep Research both do this. Training systems like this to work reliably has been a bottleneck to releasing them.
I can't help but wonder if part of the answer is that they seem dangerous and people are selecting out of producing them.
Like, I'm not an expert, but creating AI agents seems extremely fun and appealing, and I'm intentionally not working on them because it seems safer not to build them. (Whether you think my contributions to trying to build them would matter or not is another question.)
Intuitively, the AutoGPT concept sounds like it should be useful if a company invests in it. Yet all the big publicly available systems seem to be chat interfaces where the human writes a message and then the computer writes another message.
Even if AutoGPT driven by an LLM alone wouldn't achieve all ends, a combination where a human could oversee the steps and shepherd AutoGPT could likely be very productive.
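Concretely, the shepherding could be as simple as pausing on every proposed step. A sketch under assumed placeholders (`plan_next_step` and `execute` stand in for the underlying planner call and tool execution; they are not any specific product's API):

```python
def plan_next_step(goal: str, history: list[str]) -> str:
    """Placeholder for the LLM planner call (hypothetical)."""
    raise NotImplementedError

def execute(step: str) -> str:
    """Placeholder for carrying out a step (shell, browser, API, ...)."""
    raise NotImplementedError

def shepherded_run(goal: str, max_steps: int = 20) -> list[str]:
    """AutoGPT-style loop, but every step waits for human approval or an edit."""
    history: list[str] = []
    for _ in range(max_steps):
        step = plan_next_step(goal, history)
        choice = input(f"Next step: {step}\n[r]un / [e]dit / [s]top? ").lower()
        if choice == "s":
            break
        if choice == "e":
            step = input("Revised step: ")
        history.append(f"{step} -> {execute(step)}")
    return history
```

The human stays in the loop as the error-correction mechanism, which is exactly what the chat interface already provides, just applied to actions instead of messages.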
The idea sounds to me like it's simple enough that people at big companies should have considered it. Why isn't something like that deployed?