cousin_it comments on The Design Space of Minds-In-General - Less Wrong

Post author: Eliezer_Yudkowsky 25 June 2008 06:37AM

Comment author: Will_Newsome 08 January 2011 01:46:00AM 0 points

"Paperclipper" refers to any "universe tiler", not just one that tiles with paperclips. Specifying paperclips in particular is hard. If you don't care about exactly what gets tiled, it's much easier.

My arguments apply to most kinds of things that tile the universe with highly arbitrary things like paperclips, though they apply less to things that tile the universe with less arbitrary things like bignums or something. I do believe arbitrary universe tilers are easier than FAI; just not paperclippers.

For well-understood goals, that's easy. Just hardcode the goal.

There are currently no well-understood goals, nor are there obvious ways of hardcoding goals, nor are hardcoded goals necessarily stable for self-modifying AIs. We don't even know what a goal is, let alone how to solve the grounding problem of specifying what a paperclip is (or a tree, or what have you). With humans you get the nice trick of having the AI look back on the process that created it, or alternatively of just using universal induction techniques to find optimization processes out there in the universe, which will also find humans. (Actually I think it would mostly find memes, not humans per se, but a lot of what humans care about is memes.)

Human values are contingent on our entire evolutionary history.

Paperclips are contingent on that, plus a whole bunch of random cultural stuff. Again, if we're talking about universe tilers in general this does not apply.

Also, as a sort of appeal to authority, I've been working at the Singularity Institute for a year now, and have spent many many hours thinking about the problem of FAI (though admittedly I've given significantly less thought to how to build a paperclipper). If my intuitions are unsound, it is not for reasons that are intuitively obvious.

Comment author: cousin_it 21 January 2011 01:23:45PM * 3 points

There are currently no well-understood goals, nor are there obvious ways of hardcoding goals

"Find a valid proof of this theorem from the axioms of ZFC". This goal is pretty well-understood, and I don't believe an AI with such a goal will converge on Buddha. Or am I misunderstanding your position?
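A goal of this kind can be made concrete in miniature. The sketch below uses Hofstadter's MIU string-rewriting system as a toy stand-in for ZFC (a real first-order proof checker wouldn't fit here): the goal is hardcoded and fully checkable, and the search is a brute-force breadth-first enumeration of theorems until a derivation of the target is found.

```python
from collections import deque

# Toy stand-in for proof search (MIU system, not ZFC): the hardcoded,
# fully checkable goal is "find a derivation of the target string from
# the axiom 'MI'".

def successors(s):
    """All strings derivable from s by one MIU rule application."""
    out = []
    if s.endswith("I"):                      # Rule 1: xI -> xIU
        out.append(s + "U")
    if s.startswith("M"):                    # Rule 2: Mx -> Mxx
        out.append(s + s[1:])
    for i in range(len(s) - 2):              # Rule 3: xIIIy -> xUy
        if s[i:i+3] == "III":
            out.append(s[:i] + "U" + s[i+3:])
    for i in range(len(s) - 1):              # Rule 4: xUUy -> xy
        if s[i:i+2] == "UU":
            out.append(s[:i] + s[i+2:])
    return out

def prove(target, limit=100000):
    """Breadth-first enumeration of theorems; returns a derivation of
    target from "MI", or None if the (pruned) search exhausts."""
    start = "MI"
    seen = {start}
    queue = deque([(start, [start])])
    while queue and limit > 0:
        limit -= 1
        s, path = queue.popleft()
        if s == target:
            return path                      # the derivation is the proof
        for t in successors(s):
            # prune overly long strings so the search can terminate
            if t not in seen and len(t) <= 2 * len(target):
                seen.add(t)
                queue.append((t, path + [t]))
    return None

print(prove("MIIU"))   # a short derivation exists
print(prove("MU"))     # famously underivable: the search exhausts
```

Nothing in this loop depends on reflection or values; "valid derivation of the target" is a property the program can check by itself, which is what makes the goal well-understood in cousin_it's sense.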

Comment author: Vladimir_Nesov 21 January 2011 02:20:26PM * 0 points

This doesn't take into account logical uncertainty. It's easy to write a program that eventually computes the answer you want, and then to pose the question of doing that more efficiently while provably retaining the same goal; that is essentially what you cited, with respect to a brute-force classical inference system starting from ZFC and enumerating all theorems (and even this has its problems, as you know, since the agent could be controlling which answer is correct). A far more interesting question is which answer to name when you don't have time to find the correct answer. "Correct" is merely a heuristic for when you have enough time to reflect on what to do.

(Also, even to prove theorems, you need operating hardware, and managing that hardware and other actions in the world would require decision-making under (logical) uncertainty. Even nontrivial self-optimization would require decision-making under uncertainty that has a "chance" of turning you away from the correct question.)

Comment author: cousin_it 21 January 2011 02:59:07PM 2 points

A far more interesting question is which answer to name when you don't have time to find the correct answer.

What's more interesting about it? Think for some time and then output the best answer you've got.
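cousin_it's recipe is an anytime computation. A minimal sketch, assuming a hypothetical score function over candidate answers; supplying such a function when correctness is inaccessible is exactly the step under dispute in this thread.

```python
import time

# "Think for some time and then output the best answer you've got",
# as an anytime loop. The score function is a hypothetical stand-in:
# defining one when correctness is inaccessible is the hard part.

def anytime_best(candidates, score, deadline_seconds):
    """Examine candidates until the deadline; return the best seen so far."""
    deadline = time.monotonic() + deadline_seconds
    best, best_score = None, float("-inf")
    for c in candidates:
        if time.monotonic() >= deadline:
            break                            # time's up: go with best so far
        s = score(c)
        if s > best_score:
            best, best_score = c, s
    return best, best_score

# Toy usage: search integers for one maximizing an arbitrary score.
answer, value = anytime_best(range(1000), lambda n: -(n - 123) ** 2, 1.0)
```

The loop itself is trivial; everything contested lives inside `score`, which here is just an arbitrary illustrative function.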

Comment author: Vladimir_Nesov 21 January 2011 03:19:01PM * 0 points

Try to formalize this intuition. With provably correct answers, that's easy. Here, you need a notion of "best answer I've got", a way of comparing possible answers where correctness remains inaccessible. This makes it "more interesting": where the first problem is solved (to an extent), this one isn't.