Richard_Loosemore comments on The genie knows, but doesn't care - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Richard: I'll stick with your original example. In your hypothetical, I gather, programmers build a seed AI (a not-yet-superintelligent AGI that will recursively self-modify to become superintelligent after many stages) that includes, among other things, a large block of code I'll call X.
The programmers think of this block of code as an algorithm that will make the seed AI and its descendents maximize human pleasure. But they don't actually know for sure that X will maximize human pleasure — as you note, 'human pleasure' is an unbelievably complex concept, so no human could be expected to actually code it into a machine without making any mistakes. And writing 'this algorithm is supposed to maximize human pleasure' into the source code as a comment is not going to change that. (See the first few paragraphs of Truly Part of You.)
Now, why exactly should we expect the superintelligence that grows out of the seed to value what we really mean by 'pleasure', when all we programmed it to do was X, our probably-failed attempt at summarizing our values? We didn't program it to rewrite its source code to better approximate our True Intentions, or the True Meaning of our in-code comments. And if we did attempt to code it to make either of those self-modifications, that would just produce a new hugely complex block Y which might fail in its own host of ways, given the enormous complexity of what we really mean by 'True Intentions' and 'True Meaning'. So where exactly is the easy, low-hanging fruit that should make us less worried a superintelligence will (because of mistakes we made in its utility function, not mistakes in its factual understanding of the world) hook us up to dopamine drips? All of this seems crucial to your original point in 'The Fallacy of Dumb Superintelligence':
It seems to me that you've already gone astray in the second paragraph. On any charitable reading (see the New Yorker article), it should be clear that what's being discussed is the gap between the programmer's intended code and the actual code (and therefore actual behaviors) of the AGI. The gap isn't between the AGI's intended behavior and the set of things it's smart enough to figure out how to do. (Nowhere does the article discuss how hard it is for AIs to do the things they desire to do. What it discusses, over and over again, is the difficulty of programming AIs to do what we want them to do, e.g., Asimov's Three Laws.)
So all the points I make above seem very relevant to your 'Fallacy of Dumb Superintelligence', as originally presented. If you were mixing those two gaps up, though, that might help explain why you spent so much time accusing SIAI/MIRI of making this mistake, even though it's the former gap and not the latter that SIAI/MIRI advocates appeal to.
Maybe it would help if you provided examples of someone actually committing this fallacy, and explained why you think those are examples of the error you mentioned and not of the reasonable fact/value gap I've sketched out here?
Rob,
This afternoon I spent some time writing a detailed, carefully constructed reply to your essay. I had trouble posting it due to an internet glitch while I was at work, but now that I am home I was about to submit it when I discovered that my friends were warning me about the following comment posted to the thread:
Comment author: Eliezer_Yudkowsky 05 September 2013 07:30:56PM
Warning: Richard Loosemore is a known permanent idiot, ponder carefully before deciding to spend much time arguing with him.
(If you're fishing for really clear quotes to illustrate the fallacy, that may make sense.)
--
So. I will not be posting my reply after all.
I will not waste any more of my time in a context controlled by an abusive idiot.
If you want to discuss the topic (and I had many positive, constructive thoughts to contribute), feel free to suggest an alternative venue where we can engage in a debate without trolls interfering with the discussion.
Sincerely,
Richard Loosemore
Mathematical and Physical Sciences, Wells College
Aurora, NY 13026 USA