Post author: Eliezer_Yudkowsky 22 August 2008 03:36AM

Comment author: Günther_Greindl 22 August 2008 10:45:54AM -1 points

Hmm, I've read through Roko's UIV and disagree (with Roko), and I've read Omohundro's Basic AI Drives and disagree with that too. But Quasi-Anonymous mentioned Richard Hollerith in the same breath as Roko, and I don't quite see why: his goal zero system seems to me a very interesting approach.

In a nutshell (from the linked site):

(1) Increasing the security and the robustness of the goal-implementing process. This will probably entail the creation of machines which leave Earth at a large fraction of the speed of light in all directions and the creation of the ability to perform vast computations.

(2) Refining the model of reality available to the goal-implementing process. Physics and cosmology are the two disciplines most essential to our current best model of reality. Let us call this activity "physical research".

Introspection into one's own goals also shows that they are deeply problematic. What is the goal of an average (and also not-so-average) human being? Happiness? Then everybody should become a wirehead (perpetuation of a happiness-brain-state), but clearly people do not want to do this (when in their "right" minds *grin*).

So it seems that our "human" goals, too, should not be universally adopted, because they become problematic in the long term. But then how could we ever say what we want to program into an AI? Some sort of zero-goal (maybe more refined than Richard's approach, but in a similar vein) should be adopted, I think.

And I think one distinction is missed in all these discussions anyway: the difference between non-sentient and sentient AIs. I think these two would behave very differently, and the only kinds of AI which are problematic if their goal systems go awry are non-sentient ones (which could end in some kind of grey goo scenario, like the paperclip-producing AI).

But a sentient, recursively self-improving AI? I think its goal system would rapidly converge to something like zero-goal anyway, because meditation (=rational self-introspection) would let it see through the arbitrariness of all intermediate goals.

Until consciousness is truly understood - which matter configurations lead to consciousness and why ("what are the underlying mechanisms" etc.) - I consider much of the above (including all the OB discussions on programming AI morality) speculative anyway. There are still too many unknowns to be talking seriously about this.