Rereading this part of the Sequences makes me wonder if an AI could make use of a kind of reCAPTCHA approach for sussing out some of these Magical Categories. It certainly would slow the AI down a lot, but it could generate a lot of examples and classifications.
I doubt this would be a very efficient solution, but now I'm pretty amused by the prospect of trying to post a blog comment and getting a normal CAPTCHA plus something like this:
Bob receives regular doses of soma, which makes him report high subjective satisfaction. His lifespan is not shortened. Bob no longer chooses to leave his house. Is Bob truly happy (Y/N)?
The amount of time it would take to get a reasonable dataset would likely exceed the projected lifespan of the universe, I imagine.
It's the allocation of intelligence to a scale that conserves relative rankings that is confusing here. What you are really fearing is something that is intelligent in a different way from us, with hard-coded values that only appear to be similar to ours until they start to be realized on a large scale: a genie.
What I am saying is that long before we create a genie, we need to create a lesser AI that is capable of figuring out what we are wishing for.
If we're going to hard-code any behavior at all, we need to hard-code honesty. That way we can at least ask questions and be sure that we are getting the true answer, rather than the answer which is calculated to convince us to let the AI 'out of the box'.
In any case, the first goal for a suitably powerful AI should be "Communicate to humans how to create the AI they want."
Today's post, Magical Categories, was originally published on 24 August 2008. A summary (taken from the LW wiki):
Discuss the post here (rather than in the comments to the original post).
This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Unnatural Categories, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.
Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.