private_messaging comments on Google may be trying to take over the world - Less Wrong Discussion

22 [deleted] 27 January 2014 09:33AM


Comment author: private_messaging 27 January 2014 11:28:02PM *  5 points [-]

To clarify, I have nothing against self-educated persons. Some do great things. The "autodidacts" was specifically in quotes.

What is implausible is this whole narrative where you have a risk obvious enough that people without any relevant training can see it (by way of that paperclipping argument), yet the relevant experts are ignoring it. Especially when the idea of an intelligence turning against its creator is incredibly common in fiction, to the point that nobody has to form that idea on their own.

Comment author: [deleted] 28 January 2014 03:54:03PM 3 points [-]

In general, current AGI architectures work via reinforcement learning: reward and punishment. Relevant experts are worried about what will happen when an AGI with the value-architecture of a pet dog finds that it can steal all the biscuits from the kitchen counter without having to do any tricks.

They are less worried about their current creations FOOMing into god-level superintelligences, because current AI architectures are not FOOMable, and it seems quite unlikely that you can create a self-improving ultraintelligence by accident. Except when that's exactly what they plan for them to do (ie: Shane Legg).

Juergen Schmidhuber gave an interview on this very website where he basically said that he expects his Goedel Machines to undergo a hard takeoff at some point, with right and wrong being decided retrospectively by the victors of the resulting Artilect War. He may have been trolling, but it's a bit hard to tell.

Comment author: private_messaging 28 January 2014 04:18:03PM 1 point [-]

I'd need links, and to read it for myself.

With regards to reinforcement learning, one thing to note is that the learning process is in general not the same thing as the intelligence that is being built by the learning process. E.g. if you were to evolve some ecosystem of programs by using "rewards" and "punishments", the resulting code ends up with distinct goals (just as humans are capable of inventing and using birth control). Not understanding this, local geniuses of AI risk have been going on about "omg he's so stupid it's going to convert the solar system to smiley faces" with regard to at least one actual AI researcher.

Comment author: [deleted] 28 January 2014 04:31:05PM 1 point [-]

I'd need links, and to read it for myself.

Here is his interview. It's very, very hard to tell if he's got his tongue firmly in cheek (he refers to minds of human-level intelligence and our problems as being "small"), or if he's enjoying an opportunity to troll the hell out of some organization with a low opinion of his work.

With regards to reinforcement learning, one thing to note is that the learning process is in general not the same thing as the intelligence that is being built by the learning process.

With respect to genetic algorithms, you are correct. With respect to something like neural networks (real world stuff) or AIXI (pure theory), you are incorrect. This is actually why machine-learning experts differentiate between evolutionary algorithms ("use an evolutionary process to create an agent that scores well on X") versus direct learning approaches ("the agent learns to score well on X").
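
A toy sketch of that distinction (Python; the objective and every number in it are invented purely for illustration). The evolutionary version is an outer search over agents, none of which learns anything itself; the direct version is a single agent adjusting its own parameter toward scoring well:

```python
import random

def score(w):
    # Stand-in objective "X": peaks at w = 3
    return -(w - 3.0) ** 2

def evolve(generations=200, pop_size=20, seed=0):
    # Evolutionary approach: mutate a population, keep the fittest half.
    # The selection pressure lives outside any individual agent.
    rng = random.Random(seed)
    pop = [rng.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]
        pop = parents + [p + rng.gauss(0, 0.3) for p in parents]
    return max(pop, key=score)

def learn(steps=200, lr=0.1):
    # Direct learning approach: one agent follows the gradient of the
    # score (d(score)/dw = -2(w - 3)) with respect to its own parameter.
    w = 0.0
    for _ in range(steps):
        w += lr * (-2.0 * (w - 3.0))
    return w

print(evolve(), learn())  # both approach 3.0
```

Both land near the optimum, but only in the second case is the thing that scores well also the thing that did the learning.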

Not understanding this, local geniuses of AI risk have been going on about "omg he's so stupid it's going to convert the solar system to smiley faces" with regard to at least one actual AI researcher.

What, really? I mean, while I do get worried about things like Google trying to take over the world, that's because they're ideological Singularitarians. They know the danger line is there, and intend to step over it. I do not believe that most competent Really Broad Machine Learning (let's use that nickname for AGI) researchers are deliberately, suicidally evil, but then again, I don't believe you can accidentally make a dangerous-level AGI (ie: a program that acts as a VNM-rational agent in pursuit of an inhumane goal).

Accidental and evolved programs are usually just plain not rational agents, and therefore pose rather more limited dangers (crashing your car, as opposed to killing everyone everywhere).

Comment author: private_messaging 28 January 2014 05:03:15PM *  2 points [-]

With respect to something like neural networks (real world stuff)

Well, the neural network in my head doesn't seem to want to maximize the reward signal itself, but instead is more interested in maximizing values imprinted into it by the reward signal (which it can do even by hijacking the reward signal, or by administering "punishments"). Really, the reward signal is not utility, period. Teach a person to be good, and they'll keep themselves good by punishing/rewarding themselves.

or AIXI (pure theory), you are incorrect.

I don't think it's worth worrying about the brute-force iteration over all possible programs. Once you stop iterating over the whole solution space in the learning method itself, the learning method faces the problem that it cannot actually ensure that the structures it constructs don't have separate goals (nor is it desirable to ensure this, as you would want to be able to teach values to an agent using the reward signal).

Comment author: [deleted] 28 January 2014 05:28:35PM 1 point [-]

Well, the neural network in my head doesn't seem to want to maximize the reward signal itself, but instead is more interested in maximizing values imprinted into it by the reward signal (which it can do even by hijacking the reward signal, or by administering "punishments"). Really, the reward signal is not utility, period. Teach a person to be good, and they'll keep themselves good by punishing/rewarding themselves.

Firstly, I was talking about artificial neural networks, which do indeed function as reinforcement learners, by construction and mathematical proof.

Secondly, human beings often function as value learners ("learn what is good via reinforcement, but prefer a value system you're very sure about over a reward that seems to contradict the learned values") rather than reinforcement learners. Value learners, in fact, are the topic of a machine ethics paper from 2011, by Daniel Dewey.
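
To make the contrast concrete, here's a deliberately tiny sketch (Python; the actions, rewards, and probabilities are all invented, and this is a caricature of Dewey's formalism, not a reproduction of it). The reinforcement learner maximizes the reward signal directly; the value learner maximizes expected utility over candidate utility functions it has learned to weight:

```python
actions = ["do_trick", "steal_biscuits"]

# Observed reward signal for each action.
reward = {"do_trick": 1.0, "steal_biscuits": 10.0}

# Candidate utility functions the value learner entertains, with
# posterior weights built up from past reinforcement.
hypotheses = [
    ({"do_trick": 1.0, "steal_biscuits": 0.0}, 0.9),  # "tricks are what's valued"
    ({"do_trick": 0.0, "steal_biscuits": 1.0}, 0.1),  # "the reward itself is valued"
]

def rl_choice():
    # Reinforcement learner: take whatever maximizes the signal.
    return max(actions, key=lambda a: reward[a])

def value_learner_choice():
    # Value learner: maximize expected utility under the learned
    # distribution, even when the raw signal disagrees.
    def expected_utility(a):
        return sum(u[a] * p for u, p in hypotheses)
    return max(actions, key=expected_utility)

print(rl_choice(), value_learner_choice())
```

The first picks `steal_biscuits` (signal of 10 beats 1); the second sticks with `do_trick`, because the value system it is 90% sure of outweighs the tempting reward.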

the learning method faces the problem that it can not actually ensure that the structures constructed by the learning method don't have separate goals (nor is it desirable to ensure such, as you would want to be able to teach values to an agent using the reward signal).

Sorry, could you explain this better? It doesn't match up with how the field of machine learning usually works.

Yes, any given hypothesis a learner has about a target function is only correct to within some probability of error. But that probability can be very small.

Comment author: private_messaging 28 January 2014 06:05:19PM *  3 points [-]

With the smiley faces, I am referring to disagreement with Hibbard, summarized e.g. here on wikipedia

Secondly, human beings often function as value learners ("learn what is good via reinforcement, but prefer a value system you're very sure about over a reward that seems to contradict the learned values") rather than reinforcement learners. Value learners, in fact, are the topic of a machine ethics paper from 2011, by Daniel Dewey.

You're speaking as if value learners were not a subtype of reinforcement learners.

For a sufficiently advanced AI, i.e. one that learns to try different counter-factual actions on a world model, it is essential to build a model of the reward, which is to be computed on the counter-factual actions. It's this model of the reward that is specifying which action gets chosen.
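
A minimal sketch of what I mean (Python; the experience data and the averaging model are invented for illustration). When planning, the agent never touches the real reward; it scores counterfactual actions with its learned model of the reward, so it is the model that decides:

```python
# Experience: (action, observed reward) pairs gathered earlier.
experience = [("a", 0.2), ("a", 0.4), ("b", 0.9), ("b", 0.7)]

def fit_reward_model(data):
    # Crude reward model: mean observed reward per action.
    totals, counts = {}, {}
    for action, r in data:
        totals[action] = totals.get(action, 0.0) + r
        counts[action] = counts.get(action, 0) + 1
    return {a: totals[a] / counts[a] for a in totals}

reward_model = fit_reward_model(experience)

def plan(candidates):
    # Counterfactual evaluation: no real reward is received here;
    # each candidate action is scored by the model alone.
    return max(candidates, key=lambda a: reward_model.get(a, 0.0))

print(plan(["a", "b"]))  # picks "b": the model predicts 0.8 vs 0.3
```

Whatever quirks the learned model has, those quirks, not the raw signal, determine the chosen action.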

Yes, any given hypothesis a learner has about a target function is only correct to within some probability of error. But that probability can be very small.

Looks like presuming a super-intelligence from the start.

Comment author: [deleted] 28 January 2014 06:33:36PM *  0 points [-]

With the smiley faces, I am referring to disagreement with Hibbard, summarized e.g. here on wikipedia

Right, and that wikipedia article refers to stuff Eliezer was writing more than ten years ago. That stuff is nowhere near state-of-the-art machine ethics.

(I think this weekend I might as well blog some decent verbal explanations of what is usually going on in up-to-date machine ethics on here, since a lot of people appear to confuse real, state-of-the-art work with either older, superseded ideas or very intuitive fictions.

Luckily, it's a very young field, so it's actually possible for some bozo like me to know a fair amount about it.)

You're speaking as if value learners were not a subtype of reinforcement learners.

That's because they are not. These are precise mathematical terms being used here, and while they are similar (for instance, I'd consider a Value Learner closer to a reinforcement learner than to a fixed direct-normativity utility function), they're not identical, nor is one a direct supertype of the other.

For a sufficiently advanced AI, i.e. one that learns to try different counter-factual actions on a world model, it is essential to build a model of the reward, which is to be computed on the counter-factual actions. It's this model of the reward that is specifying which action gets chosen.

This intuition is correct, regarding reinforcement learners. It is slightly incorrect regarding value learners, but how precisely it is incorrect is at the research frontier.

Looks like presuming a super-intelligence from the start.

No, I didn't say the target function was so complex as to require superintelligence. If I have a function f(x) = x + 1, a learner will be able to learn that this is the target function to within a very low probability of error, very quickly, precisely because of its simplicity.

The simpler the target function, the less training data needed to learn it in a supervised paradigm.
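
The f(x) = x + 1 example, made concrete (Python; a minimal sketch using closed-form simple linear regression, which is my choice of learner, not something fixed by the example):

```python
# A handful of labelled examples from the true target f(x) = x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [x + 1.0 for x in xs]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares fit of y = slope * x + intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(slope, intercept)  # 1.0 1.0: four examples suffice for this target
```

Because the target is this simple (and the hypothesis class matches it), a tiny training set pins it down exactly; a more complex target would need far more data for the same confidence.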

Comment author: private_messaging 28 January 2014 07:04:32PM *  3 points [-]

Right, and that wikipedia article refers to stuff Eliezer was writing more than ten years ago. That stuff is nowhere near state-of-the-art machine ethics.

I think I've seen him using smiley faces as an example much more recently, which is why I thought of it, but I can't find the link.

These are precise mathematical terms being used here

The field of reinforcement learning is far too diverse for these to be "precise mathematical terms".

The simpler the target function, the less training data needed to learn it in a supervised paradigm.

I thought you were speaking of things like learning an alternative way to produce a button press.

Comment author: [deleted] 28 January 2014 07:19:53PM 0 points [-]

I thought you were speaking of things like learning an alternative way to produce a button press.

Here's where things like deep learning come in.

Deep learning learns features from the data. The better your set of features, the less complex the true target function is when phrased in terms of those features. However, features themselves can contain a lot of internal complexity.

So, for instance, "press the button" is a very simple target from our perspective, because we already possess abstractions for "button" and "press" and also the ability to name one button as "the button". Our minds contain a whole lot of very high-level features, some of which we're born with and some of which we've learned over a very long time (by computer-science standards, 18 years of training to produce an adult from an infant is an aeon) using some of the world's most intelligent deep-learning apparatus (ie: our brains).
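
A toy version of that point (Python; the "images", the brightness feature, and the threshold are all invented). A target that is awkward to state over raw inputs becomes a one-line comparison once the right feature exists:

```python
# Raw inputs: lists of pixel intensities standing in for images.
raw_images = [
    [0.9, 0.8, 0.95, 0.85],  # "button lit"
    [0.1, 0.2, 0.05, 0.15],  # "button dark"
]

def brightness(pixels):
    # The "feature": carries the internal complexity.
    return sum(pixels) / len(pixels)

def button_is_lit(pixels):
    # The target, phrased in terms of the feature: trivially simple.
    return brightness(pixels) > 0.5

print([button_is_lit(img) for img in raw_images])
```

All the difficulty has been pushed into `brightness`; stated over raw pixels, "the button is lit" would be a much messier function.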

Hence the fable of the "dwim" program, which is written in the exact same language of features your mind uses, and which therefore is the Do What I Mean program. This is also known as a Friendly AI.