Drive-less AIs and experimentation

One of the things I've been thinking about is how to safely explore the nature of intelligence. I'm unconvinced of FOOMing, and I would rather we didn't avoid AI entirely if we can't solve Yudkowsky-style Friendliness. So some method of experimentation is needed to determine how powerful intelligence actually is.

So can we create an AI that has very limited scope? That is, can we try to avoid the drives by setting goals such as avoiding changing the world and turning itself off after achieving a small goal?

Let us say the goal is to change the colour of a ball from green to red. You can leave paint, paint brushes, and a robot around to make it easy, but the AI might determine that the best (least world-changing) way is to create a dye-manufacturing bacterium instead. How well it did on the test would also let you gauge the optimising power of the system, and so tell whether we need "first mover/winner take all" style friendliness or societal friendliness for many AIs.

Creating AIs without drives seems easier than creating ones that do have goals to shape the rest of human history. What do other people think?

The strongest counterargument offered was that a scope-limited AI doesn't stop rogue unfriendly AIs from arising and destroying the world.

I don't quite understand that argument; maybe someone could elaborate.

If there is a rule that says 'optimize X for X seconds' why would an AGI make a difference between 'optimize X' and 'for X seconds'? In other words, why is it assumed that we can succeed in creating a paperclip maximizer that cares strongly enough about the design parameters of paperclips to consume the universe (why would it do that unless it is told to?) but that somehow ignores all design parameters that have to do with spatio-temporal scope boundaries or resource limitations?

I see that there is a subset of unfriendly AGI designs that would never halt, or that would destroy humanity while pursuing their goals. But how large is that subset, and how many designs actually halt or proceed very slowly?

(I wrote this before seeing timtyler's post.)

If there is a rule that says 'optimize X for X seconds' why would an AGI make a difference between 'optimize X' and 'for X seconds'?

It does seem like you misinterpreted the argument, but one possible failure there is if the most effective way to maximize paperclips within the time period is to build paperclip-making von Neumann machines. If it designs the machines from scratch, it won't build a time limit into them, because that won't increase the production of paperclips within the period of time it cares about.
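To make that failure mode concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the names `simulate` and `objective`, the doubling rate, and the idea that a machine either carries a shutdown timer or does not. It is not a model of any real system, only of the point that an objective which counts paperclips made inside the window cannot see what the machines do afterwards.

```python
def simulate(horizon, total_steps, design_steps, has_timer):
    """Toy model of self-replicating paperclip machines (illustrative only).

    Each step after the design phase, every machine makes one paperclip
    and one copy of itself. Returns (clips made before the horizon,
    clips made at or after the horizon).
    """
    machines = 0
    clips_within = 0
    clips_after = 0
    for t in range(total_steps):
        if t < design_steps:
            continue  # still designing, nothing is running yet
        if t == design_steps:
            machines = 1  # the first machine comes online
        if has_timer and t >= horizon:
            break  # machines with a built-in time limit shut down
        if t < horizon:
            clips_within += machines
        else:
            clips_after += machines
        machines *= 2  # every machine builds one copy of itself
    return clips_within, clips_after


def objective(outcome):
    """'Optimize paperclips for `horizon` steps': only the window counts."""
    clips_within, _clips_after = outcome
    return clips_within


# Assumed scenario: a 10-step window, observed for 15 steps in total.
no_timer = simulate(horizon=10, total_steps=15, design_steps=1, has_timer=False)
free_timer = simulate(horizon=10, total_steps=15, design_steps=1, has_timer=True)
costly_timer = simulate(horizon=10, total_steps=15, design_steps=2, has_timer=True)
```

If the timer costs nothing to design, the objective scores both designs identically (511 paperclips inside the window), so it gives no reason to include the timer, even though the untimed machines go on to make 15872 more paperclips after the window closes. If the timer costs even one extra design step (again, an assumed cost), the timed design scores 255 and the objective strictly prefers leaving the limit out.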

timtyler, 15y ago:

I discuss the associated problems here: [...]

[anonymous], 15y ago:

I think the idea is that if I make a perfectly safe AI by constraining it in some way, that doesn't prevent someone else from making an unsafe AI and killing us all.
