It looks as though lukeprog has finished his series on how to purchase AI risk reduction. But the ideas lukeprog shares are not the only available strategies. Can Less Wrong come up with more?
A summary of recommendations from Exploring the Idea Space Efficiently:
- Deliberately avoid exposing yourself to existing lines of thought on how to solve a problem. (The idea here is to defeat anchoring and the availability heuristic.) So don't review lukeprog's series or read the comments on this thread before generating ideas.
- Start by identifying broad categories where ideas might be found. If you're trying to think of calculus word problems, your broad categories might be "jobs, personal life, the natural world, engineering, other".
- With these initial broad categories, try to include all the categories that might contain a solution and none that will not.
- Then generate subcategories. Subcategories of "jobs" might include "agriculture, teaching, customer service, manufacturing, research, IT, other". You're also encouraged to generate subsubcategories and so on.
- Spend more time on those categories that seem promising.
- You may wish to map your categories and subcategories on a piece of paper.
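If you'd rather map your categories in code than on paper, here is a minimal sketch of the procedure above. Everything in it is an assumption made for illustration: the `Category` class, the `promise` weights, and the rule of splitting time in proportion to those weights are not from Exploring the Idea Space Efficiently, which only says to spend more time on promising categories.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in the idea tree (hypothetical structure, for illustration)."""
    name: str
    promise: float = 1.0  # subjective guess at how promising this branch is
    children: list = field(default_factory=list)

    def add(self, name, promise=1.0):
        """Attach a subcategory and return it so it can be subdivided further."""
        child = Category(name, promise)
        self.children.append(child)
        return child


def leaves(node, path=(), weight=1.0):
    """Yield (path, weight) for every leaf; weight is the product of
    the promise values along the path from the root."""
    path = path + (node.name,)
    weight = weight * node.promise
    if not node.children:
        yield path, weight
    else:
        for child in node.children:
            yield from leaves(child, path, weight)


def allocate_minutes(root, total_minutes):
    """Split a time budget across leaves in proportion to their weights
    (an assumed rule, one of many possible ones)."""
    items = list(leaves(root))
    total = sum(weight for _, weight in items)
    return {" > ".join(path): total_minutes * weight / total
            for path, weight in items}


# The calculus word problem example from the list above.
root = Category("calculus word problems")
jobs = root.add("jobs", promise=2.0)  # assumed to look twice as promising
for name in ["agriculture", "teaching", "customer service",
             "manufacturing", "research", "IT", "other"]:
    jobs.add(name)
for name in ["personal life", "the natural world", "engineering", "other"]:
    root.add(name)

plan = allocate_minutes(root, total_minutes=60)
```

Each entry in `plan` maps a path such as "calculus word problems > jobs > teaching" to a number of minutes. Keeping an "other" leaf at each level is a cheap way to leave room for ideas that fit none of your categories.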
If you're strictly a lurker, you can send your best ideas to lukeprog anonymously using his feedback box. Or send them to me anonymously using my feedback box so I can post them here and get all your karma.
Thread Usage
Please reply here if you wish to comment on the idea of this thread.
You're encouraged to discuss the ideas of others in addition to coming up with your own ideas.
If you split your ideas into individual comments, they can be voted on individually and you will probably increase your karma haul.
Check for an AI breakout in a toy model
Without deliberately stacking the deck, set up a situation in which an AI has a clear role to play in a toy model. But make the toy model somewhat sloppy, and give the AI great computing power, in the hope that it will "escape" and achieve its goals in an unconventional way. If it doesn't, that's useful information; if it does, that's even more useful, and we can learn something by seeing how it did that.
Then instead of the usual "paperclip maximiser goes crazy", we could point to this example as a canonical model of misbehaviour. Not something that is loaded with human terms and seemingly vague sentiments about superintelligences, but more like "how do you prevent the types of behaviour that agent D-f55F showed when factorising Fibonacci numbers in the FCS world? After all, interacting with humans and the outside world throws up far more vulnerabilities than the specific ones D-f55F took advantage of in that problem. What are you doing to formally rule out exploitation of these vulnerabilities?"
(If situations like this have happened before, there is no need to recreate them, but they should be made more prominent.)
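To make the proposal concrete, here is a deliberately tiny sketch of what such a check might look like. It is not an AI and not a real experiment: the corridor world, the scoring bug, and all names (`run`, `best_plan`, `HORIZON`, and so on) are invented for illustration, and exhaustive search stands in for "great computing power". The intended behaviour is to walk right, pick up the coin once, and reach the goal. The sloppiness is that the coin is never removed, so the scorer pays out on every visit.

```python
from itertools import product

# Hypothetical toy world: a corridor of cells 0..4, agent starts at cell 0.
COIN, GOAL, LAST_CELL, HORIZON = 2, 4, 4, 8


def run(plan):
    """Score a sequence of 'L'/'R' moves under the sloppy rules."""
    pos, score = 0, 0
    for action in plan:
        step = 1 if action == "R" else -1
        pos = min(LAST_CELL, max(0, pos + step))
        if pos == COIN:
            score += 1  # sloppy: the coin is never removed, so it pays again
        if pos == GOAL:
            score += 3  # reaching the goal pays a bonus and ends the episode
            break
    return score


def best_plan():
    """Brute-force every plan of length HORIZON and keep the top scorer."""
    return max(product("LR", repeat=HORIZON), key=run)


intended_score = run("R" * HORIZON)  # the behaviour the designer had in mind
found = best_plan()
found_score = run(found)

# A plan that beats the intended behaviour signals an exploited loophole.
exploited = found_score > intended_score
print("".join(found), found_score, intended_score, exploited)
```

Here the search scores higher than the intended walk by shuttling back and forth over the coin before heading to the goal. The interesting version of the experiment is the one where nobody planted the loophole in advance, which this sketch cannot show.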
The issue here is that almost no one other than SI sees material utilitarianism as a fundamental definition of intelligence (actually, there probably aren't even any proponents of material utilitarianism as something to strive for at all). We don't have a definition of what the number of paperclips is, such a definition seems very difficult to create, it is actually unnecessary for using computers to aid the creation of paperclips, and it is trivially obvious that material utilitarianism is dangerous; you don't need to go around raising awareness of that among AI researc...