One of the things I've been thinking about is how to safely explore the nature of intelligence. I'm unconvinced that FOOM is likely, and I'd rather we not abandon AI entirely just because we can't solve Yudkowsky-style Friendliness. So we need some method of experimentation to determine how powerful intelligence actually is.
So can we create an AI that has a very limited scope? That is, can we avoid the drives by setting goals such as minimizing change to the world and turning itself off after having achieved a small goal?
Let us say the goal is to change the colour of a ball from green to red. You can leave paint, paint brushes, and a robot around to make it easy, but the AI might determine that the best way (the least world-changing one) is to create a dye-manufacturing bacterium instead. How well it did on the test would also let you gauge the system's optimising power, and so whether we need "first mover / winner take all" style Friendliness or a societal Friendliness for many AIs.
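The idea above can be sketched as a toy decision rule (this is my illustration, not anything from the original post): the agent scores each candidate plan by goal achievement minus a weighted "impact on the world" penalty, picks the best, and halts. The plan names, scores, and impact numbers are all made up; the hard, unsolved part is exactly the impact measure, since with a different measure the bacteria plan could come out "least world-changing" instead.

```python
# Toy sketch of a limited-scope agent: maximize goal achievement minus
# an impact penalty, then stop. All plans and numbers are hypothetical.

def choose_plan(plans, impact_weight=10.0):
    """Return the plan with the highest goal score after penalizing impact."""
    return max(plans, key=lambda p: p["goal_score"] - impact_weight * p["impact"])

plans = [
    {"name": "paint ball with brush",        "goal_score": 1.0, "impact": 0.01},
    {"name": "engineer dye-making bacteria", "goal_score": 1.0, "impact": 0.50},
    {"name": "do nothing",                   "goal_score": 0.0, "impact": 0.00},
]

best = choose_plan(plans)
print(best["name"])  # with these particular numbers, the brush plan wins
```

Everything interesting is hidden in the `impact` column: the designer has to specify what counts as "changing the world", which is precisely the detail the post says is hard to pin down.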
Creating AIs without drives seems easier than creating ones with goals to shape the rest of human history. What do other people think?
With two differences: CEV tries to correct any mistakes in the initial formulation of the wish (aiming for an attractor), and it doesn't force the designers to specify details like whether making bacteria is OK or not.
It's the difference between painting a specific scene and building an auto-focus camera.
I do currently think it is possible to create a powerful cross-domain optimizer that is not a person, and that will not create persons, unbox itself, look at our universe, tile the universe with anything, or make AI that doesn't comply with these constraints. But I approach this line of thought with extreme caution, and really only as a way to accelerate whatever it takes to get to CEV, because an AI can't safely make changes to the real world without some knowledge of human volition, even if it wants to.
What if I missed something on the scale of the nonperson predicate? My AI works and creatively paints the apple, but somehow its solution is morally awful. Even staying within pure math could be bad for unforeseen reasons.