Eliezer_Yudkowsky comments on Friendly AI Research and Taskification - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (44)
I know I've written previously about how the next step at the top of my basic theoretical To-Do List is to come up with a reflective decision theory: one that can talk about modifying the part of itself that does the self-modification. Can someone link to that?
The beginning of your post My Kind of Reflection seems to talk about that. Couldn't find anything more direct.
FWIW, dealing with self-modification isn't very high on my to-do list, because for now I've shifted to thinking of AI as a one-action construction. This approach handles goal stability pretty much automatically, but I'm not sure if it satisfies your needs.
Nesov's reply sounds right to me. It doesn't handle goal stability automatically; it sweeps under the carpet an issue that you confess you don't understand, and hopes the AI handles it, in a case where you haven't described an algorithm that you know will handle it, or explained why it would.
Thanks. I don't understand your reply yet (and about half of Nesov's points are also unparseable to me as usual), but will think more.
This sounds sort of interesting but seems to fall under the rubric of general artificial intelligence research rather than addressing the taskification of Friendly AI research. Note that a research program is only as strong as its weakest link.
You ask after the Friendliness To-Do list, but discount the #1 item according to Eliezer Yudkowsky?!?
Yudkowsky explains why he thinks this area is an important one - e.g. in "Strong AI and Recursive Self-Improvement" - http://video.google.com/videoplay?docid=-821191370462819511
FWIW, IMO, such an approach would make little sense if your plan was just to build a machine intelligence.
We already have a decision theory good enough to automate 99% of jobs on the planet, if only we knew how to implement it. A pure machine intelligence project would likely focus on those implementation details, not on trying to adjust decision theory to better handle proofs about the dynamics of self-improving systems.