We often talk about the dangers and challenges of AI and self-improving agents, but I'm curious what you view as potential beneficial applications of AI - if any! As an ML researcher I encounter a lot of positivity and hype in the field, so the very different perspective of the rationality community would be very interesting.

Specifically I'd like to focus on reinforcement learning, because this most closely matches the concept of an AI that has agency and makes decisions. A reinforcement learning agent is usually defined as a program that interacts with an environment, maximising the sum of rewards it receives.

The environment represents the problem to be solved, and the rewards are a measure of how good the solution is. For some problems - a board game, 3-SAT - assessing a solution (giving a reward) is easy; for others, computing a reward may be as difficult as solving the problem in the first place. The latter are likely not good candidates for RL ;)
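
To make the formalism concrete, here is a minimal sketch of that agent-environment loop using the Gymnasium API; the CartPole environment and the random action choice are just stand-ins for a real problem and a real policy:

```python
import gymnasium as gym

# Minimal agent-environment loop: the agent acts, the environment
# returns an observation and a reward, and the return is the sum of rewards.
env = gym.make("CartPole-v1")  # stand-in environment
obs, info = env.reset(seed=0)

total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()  # a real agent would pick actions here
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated

print(f"Return (sum of rewards): {total_reward}")
```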

To facilitate discussion, I would suggest one top level reply per problem, specifying:

  • a short description of the problem
  • the action space - how does the agent interact with the problem / environment
  • the reward - how do we know the agent has done well?

Disclaimer: I work on RL, so if you make suggestions that are feasible and would have substantial positive impact on the world, I may pursue them.

2 Answers

Daniel Kokotajlo

The problem: Winning Diplomacy games against humans.

The action space: You control an empire in early twentieth-century Europe. You can write orders for your armies and fleets to move, and for your cities to build more armies and fleets. Crucially, you can also exchange text messages with the other empires. When a timer runs out, the orders are all carried out simultaneously.

The reward: 0 for losing, 1 for winning by yourself, some sort of fractional reward for victories split with other players.
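
For concreteness, here is one way to write that reward down; splitting the reward equally among the players sharing a victory is just one possible convention:

```python
def diplomacy_reward(outcome: str, num_winners: int = 1) -> float:
    """Terminal reward: 0 for a loss, 1 for a solo win, and an
    equal split for a shared victory (one possible convention)."""
    if outcome == "loss":
        return 0.0
    if outcome == "solo_win":
        return 1.0
    if outcome == "shared_win":
        return 1.0 / num_winners
    raise ValueError(f"unknown outcome: {outcome}")
```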

I'm no AI scientist, so I might be totally wrong, but I suspect that you could fine-tune a language model like GPT-2 on a corpus of online diplomacy game chat logs, and then use that model somehow as a component of the RL agent that you train to play the game.
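
A minimal sketch of the fine-tuning half of that idea, using Hugging Face transformers; `diplomacy_chats` is a hypothetical corpus of game chat logs:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()

diplomacy_chats: list[str] = []  # hypothetical corpus of Diplomacy chat logs

for text in diplomacy_chats:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # For causal LM fine-tuning, the labels are the input ids themselves.
    outputs = model(**inputs, labels=inputs["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

How to then wire the fine-tuned model into the RL agent is the open part of the suggestion.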

Not sure this would be good or bad for the world, just thought it would be interesting. I sure would love to see it, haha.

You might be interested in some recently announced work on training agents with reinforcement learning to play "no-press" Diplomacy.

Filipe Marchesini

Problem: Automatic planting

Action space: the agent obtains data from the sensors and decides how to use the actuators (temperature, humidity, sunlight exposure, and other modifiers) to maximize specific crop characteristics.

The reward: the agent is performing better when it minimizes the time the plants need to reach specific characteristics. For example, when trying to minimize the time required for three plants to reach a height of 0.2m, a higher score would go to the action policy that moved the plants through the checkpoints (0.02, 0.05, 0.10, 0.15, 0.20)m fastest. Or take a watermelon plantation: the policy mapping conditions of temperature, humidity, etc. to actions that produced the largest watermelon (above a given threshold) in the shortest time would earn the agent the highest score.
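
Here is a sketch of one way to encode that time-to-target idea as a scalar reward; the step penalty and bonus values are illustrative choices, not from the answer above:

```python
def growth_reward(heights_m: list[float], target_m: float = 0.2,
                  step_penalty: float = 0.01) -> float:
    """Reward a policy for reaching the target plant height quickly.

    heights_m: the plant's height measured at each timestep.
    Returns a bonus for reaching the target minus a small penalty
    per timestep elapsed, so faster growth scores higher.
    """
    for t, height in enumerate(heights_m):
        if height >= target_m:
            return 1.0 - step_penalty * t  # reached the target at step t
    return -step_penalty * len(heights_m)  # never reached the target
```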

If RL agents controlling cheap sensors on a simple wooden box with cheap inputs (soil, seeds, water) can achieve high efficiency in food production, we could mass-produce these boxes and distribute them, with the embedded agent and a few instructions, to end users. The food a user grows would pay for the cost of the system itself. Users could buy more boxes by selling the surplus, and could pass boxes on to neighbours, providing a substantial positive impact on the world.

I really believe we should decentralize food production, and that would be easier with low-cost systems that automate practically the whole process, leaving only simple tasks to the user. People would get healthier food, spend less money on it (leaving more to invest in other needs), and develop fewer of the diseases associated with highly industrialized products or products carrying high amounts of herbicides.

1 comment

I'm surprised this hasn't got more comments. Julian, I've been incredibly impressed by your work in RL so far, and I'm super excited to see what you end up working on next.

I hope folks will forgive me just putting down some opinions about what problems in RL to work on:

I think I want us (RL, as a field) to move past games -- board games, video games, etc -- and into more real-world problems.

Where to go looking for problems?

These are much harder to make tractable! Most of the unsolved problems are very hard. I like referencing the NAE's Engineering Grand Challenges and the UN's Sustainable/Millennium Development Goals when I want to think about global challenges. Each one is much bigger than a research project, but I find them "food for thought" when I think about problems to work on.

What characteristics probably make for good problems for deep RL?

1. Outside of human factors -- either too big for humans, or too small, or too fast, or too precise, etc.

2. Episodic/resettable -- has some sort of short periodicity, giving bounds on long-term credit assignment

3. Already connected to computers -- solving a task with RL in a domain that isn't already hooked up to software/sensors/computers is going to be 99% setup and 1% RL

4. Supervised/Unsupervised failed -- I think in general it makes sense to try RL only after we've tried the simpler methods and they've failed to work (perhaps there is too little data, or the labels are too weak/noisy)

What are candidate problem domains?

Robotics is usually the first thing people say, so best to get it out of the way first. I think this is exactly right, but the robots we have access to today are terrible, so it turns into mostly a robot design problem with a comparatively smaller ML problem on top. (After working with robots & RL for years I have hours of opinions on this, but I'm saving them for another time.)

Automatic control systems are underrated as a domain. Many manufacturing processes involve all sorts of small/strange adjustments to settings like "feed rate", "rotor speed", "head pressure", etc. Often these are tuned by operators who build up intuition over time and then transfer it to other humans. I expect RL could learn to "play" these machines better and faster than any human. (Machines include CNC machines, chemical processing steps, textile manufacturing machines, and so on.)
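
As an illustration, the interface for such a machine-tuning problem might look like the Gymnasium environment sketched below; the parameter names, ranges, and reward shape are all made up for the sketch:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class MachineTuningEnv(gym.Env):
    """Toy stand-in for tuning a manufacturing machine with RL.

    The agent nudges feed rate, rotor speed, and head pressure; the
    (made-up) reward peaks at a hidden sweet spot, standing in for a
    real measurement like throughput minus a defect penalty.
    """

    def __init__(self):
        # Sensor readings: normalized feed rate, rotor speed, head pressure.
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(3,))
        # Actions: small up/down adjustments to each setting.
        self.action_space = spaces.Box(low=-0.05, high=0.05, shape=(3,))
        self.state = np.full(3, 0.5, dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.state = np.full(3, 0.5, dtype=np.float32)
        return self.state.copy(), {}

    def step(self, action):
        self.state = np.clip(self.state + action, 0.0, 1.0).astype(np.float32)
        # Made-up objective: settings closest to the sweet spot score best.
        sweet_spot = np.array([0.7, 0.4, 0.6], dtype=np.float32)
        reward = -float(np.sum((self.state - sweet_spot) ** 2))
        return self.state.copy(), reward, False, False, {}
```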

Language models have been very exciting to me lately, and I really like this approach to RL with language models: https://openai.com/blog/fine-tuning-gpt-2/ I think the large language models are a really great substrate to work with (so far much better than robots!) but specializing them to particular purposes remains difficult. I think having much better RL science here would be really great.
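
The core of that linked approach is to learn a reward model from human preference comparisons and then fine-tune the policy against it. Here is a sketch of the pairwise (Bradley-Terry style) reward-model loss, with the reward model itself left abstract:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, preferred, rejected):
    """Pairwise preference loss for training a reward model.

    reward_model maps a batch of samples to scalar scores; minimizing
    -log sigmoid(r_pref - r_rej) pushes the human-preferred sample's
    score above the rejected one's.
    """
    r_pref = reward_model(preferred)  # shape: (batch,)
    r_rej = reward_model(rejected)    # shape: (batch,)
    return -F.logsigmoid(r_pref - r_rej).mean()
```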

Some 'basic research' topics

Fundamental research into RL scaling. It seems to me that we still don't really have a great understanding of the science of RL. Compared to other domains, RL is hard to predict, and its scaling laws (in model size, batch size, etc.) are much less well understood. https://arxiv.org/abs/2001.08361 is a great example of the sort of thing I'd like to have for RL.
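
For reference, the headline fits in that linked paper are simple power laws, e.g. test loss as a function of model size; the hope would be to find analogous laws for RL. A sketch of the form, where N_c and \alpha_N are fitted constants from the paper:

```latex
% Power-law scaling of test loss with parameter count N
% (Kaplan et al., 2020, https://arxiv.org/abs/2001.08361)
L(N) = \left(\frac{N_c}{N}\right)^{\alpha_N}
```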

Multi-objective RL. In general if you ask RL people about multi-objective RL, you'll get a "why don't you just combine them into a single objective" or "just use one as an aux goal", but it's much more complex than that in the deep RL case, where the objective changes the exploration distribution. I think having multiple objectives is a much more natural way of expressing what we want systems to do. I'm very excited about someday having transferable objectives (since there are many things we want many systems to do, like "don't run into the human" and "don't knock over the shelf", in addition to whatever the specific goal is).
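
To make the "just combine them" baseline concrete, here is the standard linear scalarization of a vector-valued reward; the point above is that in deep RL the choice of weights also reshapes the exploration distribution, so a fixed weighting is often not enough:

```python
import numpy as np

def scalarize(rewards: np.ndarray, weights: np.ndarray) -> float:
    """Linear scalarization: collapse per-objective rewards into one scalar.

    rewards: per-objective rewards at one timestep, shape (k,).
    weights: trade-off weights, shape (k,), typically summing to 1.
    """
    return float(np.dot(weights, rewards))

# e.g. task progress, not hitting the human, not knocking over the shelf
r = np.array([1.0, -0.2, 0.0])
w = np.array([0.6, 0.3, 0.1])
print(scalarize(r, w))  # 0.54
```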

Trying to find some concrete examples, I'm coming up short.

I'm sorry I didn't follow the suggested format for replies, but I'm glad to have put something down here. I think this question has far too few replies for its importance.