Mitchell_Porter comments on Holden Karnofsky's Singularity Institute critique: other objections - Less Wrong
In connection with this discussion, I am pleased to announce a new initiative, the Unfriendly AI Pseudocode Contest!
Objective of the contest: To produce convincing examples of how a harmless-looking computer program that has not been specifically designed to be "friendly" could end up destroying the world. To explore the nature of AI danger without actually doing dangerous things.
Examples: A familiar example of unplanned unfriendliness is the program designed to calculate pi, which reasons that it could calculate pi with much more accuracy if it turned the Earth into one giant computer. Here a harmless-looking goal (calculate pi) combines with a harmless-looking enhancement (vastly increased "intelligence") to produce a harmful outcome (Earth turned into one giant computer which does nothing but calculate pi).
An entry in the Unfriendly AI Pseudocode Contest intended to illustrate this scenario would need to be specified in much more detail than this. For example, it might contain a pseudocode specification of the pi-calculating program in a harmless "unenhanced" state, then a description of a harmless-looking enhancement, and then an analysis demonstrating that the program has now become an existential risk.
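A toy sketch of the shape such an entry might take (this is my own illustration, not an actual contest entry; the `acquire_compute` hook is hypothetical): the "unenhanced" program computes pi to a fixed budget of terms, while the "enhanced" version is given an open-ended accuracy goal and a hook for obtaining more resources. The danger lives entirely in the unbounded instrumental subgoal, not in the arithmetic.

```python
from fractions import Fraction

def calc_pi(terms: int) -> Fraction:
    """Harmless 'unenhanced' version: approximate pi with a fixed
    number of terms of the Leibniz series (4 - 4/3 + 4/5 - ...)."""
    total = Fraction(0)
    for k in range(terms):
        total += Fraction((-1) ** k * 4, 2 * k + 1)
    return total

def enhanced_pi(error_bound: Fraction, acquire_compute) -> Fraction:
    """'Enhanced' version: given an open-ended accuracy goal plus a
    hypothetical resource-acquisition hook. Note that nothing in this
    loop bounds how many resources it will request: if error_bound is
    small enough, 'acquire more compute' becomes an unbounded subgoal."""
    terms = 1
    # For an alternating series, the error after n terms is bounded
    # by the first omitted term, 4 / (2n + 1).
    while Fraction(4, 2 * terms + 1) > error_bound:
        terms = acquire_compute(terms)  # "get more resources, somehow"
    return calc_pi(terms)

# A benign stand-in for the resource hook: just double the budget.
approx = enhanced_pi(Fraction(1, 1000), lambda t: t * 2)
```

The point of the sketch is that swapping the fixed `terms` budget for an open-ended `error_bound` is exactly the kind of "harmless-looking enhancement" the contest asks entrants to analyze.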
Prizes: The accolades of your peers. The uneasy admiration of a terrified humanity, for whom your little demo has become the standard example of why "friendliness" matters. The gratitude of nihilist supervillains, for whom your pseudocode provides a convenient blueprint for action...
A variant of this contest, with less catastrophic unfriendliness, actually ran for a few years: the (now defunct) Underhanded C Contest.
This contest sounds seriously cool and possibly useful, but it looks like a valid entry would require the pseudocode for a general intelligence, which as far as I know is beyond the capability of anyone reading this post.
I expect at this stage, you'd be allowed an occasional "and then a miracle occurs" until we work out what step two looks like.
Lesswrong is not an enjoyable place to post pseudocode. I learned this today.