Comment author: jacob_cannell 22 June 2015 08:34:19PM *  2 points [-]

I am not sure why you say I am hung up on RL: you quoted that as the only mechanism to be discussed in the context, so I went with that.

Upon consideration, I changed my own usage of "Universal Reinforcement Learning Machine" to "Universal Learning Machine".

The several remaining uses of "reinforcement learning" are contained now to the context of the BG and the reward circuitry.

And you are (like many people) not correct to say that RL is the most general framework,

Again we are probably talking about very different RL conceptions. So to be clear, I summarized my general viewpoint of an ULM. I believe it is an extremely general model that basically covers any kind of universal learning agent. The agent optimizes/steers the future according to some sort of utility function (which is extremely general), and self-optimization emerges naturally just by including the agent itself as part of the system to optimize.

Do you have a conception of a learning agent which does not fit into that framework?

or that there is good evidence for RL in the brain. That is a myth: the evidence is very poor indeed.

The evidence for RL in the brain - of the extremely general form I described - is indeed very strong, simply because any type of learning is just a special case of universal learning. Taboo 'reinforcement' if you want, and just replace with "utility driven learning".

AIXI specifically has a special reward channel, and perhaps you are thinking of that specific type of RL, which is much more specific than universal learning. I should perhaps clarify and/or remove the mention of AIXI.

A ULM - as I described - does not have a reward channel like AIXI. It just has a value and/or utility function, initially defined by some arbitrary function that conceptually takes the whole brain/model as input. In the case of the brain, the utility function is conceptual; in practice it is more directly encoded as a value function.
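As a toy sketch of that distinction (everything below is invented for illustration, not taken from the article): the conceptual utility is just an arbitrary function scoring states, with no AIXI-style external reward channel, and the learned value function is a practical cache that converges toward it.

```python
# Toy sketch only; all names and numbers are invented for illustration.
# The "utility" is an arbitrary function of the system's state -- there is
# no external reward channel as in AIXI. The learned value function is a
# practical cache that converges toward the conceptual utility.

def utility(state):
    """Conceptual utility: any arbitrary function over complete states."""
    return -abs(state - 10)       # toy preference for states near 10

value = {}                        # the learned value function (a cache)

def learn_value(states, alpha=0.5):
    for s in states:
        # TD-style update: the cached value tracks the conceptual utility
        value[s] = value.get(s, 0.0) + alpha * (utility(s) - value.get(s, 0.0))

learn_value([8, 9, 10, 11] * 10)
best = max(value, key=value.get)  # the agent steers toward high value: 10
```

The point of the sketch is only that "utility-driven learning" needs nothing resembling a reward channel: the scoring function can be internal to the system being learned.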

Comment author: Richard_Loosemore 23 June 2015 02:41:54AM 5 points [-]

About the universality or otherwise of RL. Big topic.

There's no need to taboo "RL" because switching to utility-based learning does not solve the issue (and the issue I have in mind covers both).

See, this is the problem. It is hard for me to fight the idea that RL (or utility-driven learning) works, because I am forced to fight a negative; a space where something should be, but which is empty ....... namely, the empirical fact that Reinforcement Learning has never been made to work in the absence of some surrounding machinery that prepares or simplifies the ground for the RL mechanism.

It is a naked fact about traditional AI that it puts such an emphasis on the concept of expected utility calculations without any guarantees that a utility function can be laid on the world in such a way that all and only the intelligent actions in that world are captured by a maximization of that quantity. It is a scandalously unjustified assumption, made very hard to attack by the fact that it is repeated so frequently that everyone believes it to be true just because everyone else believes it.

If anyone ever produced a proof why it should work, there would be a there there, and I could undermine it. But .... not so much!

About AIXI and my conversation with Marcus: that was actually about the general concept of RL and utility-driven systems, not anything specific to AIXI. We circled around until we reached the final crux of the matter, and his last stand (before we went to the conference banquet) was "Yes, it all comes down to whether you believe in the intrinsic reasonableness of the idea that there exists a utility function which, when maximized, yields intelligent behavior .......... but that IS reasonable, .... isn't it?"

My response was "So you do agree that that is where the buck stops: I have to buy the reasonableness of that idea, and there is no proof on the table for why I SHOULD buy it, no?"

Hutter: "Yes."

Me: "No matter how reasonable it seems, I don't buy it"

His answer was to laugh and spread his arms wide. And at that point we went to the dinner and changed to small talk. :-)

Comment author: jacob_cannell 22 June 2015 05:55:09PM 3 points [-]

The idea that the cortex or cerebellum, for example, can be described as "general purpose re-programmable hardware" is lacking in both clarity and support.

"General purpose learning hardware" is perhaps better. I used "re-programmable" as an analogy to an FPGA.

However, in a literal sense the brain can learn to use simple paper-and-pencil tools as an extended memory, and can learn to emulate a Turing machine. Given huge amounts of time, the brain could literally run Windows.

And more to the point, programmers ultimately rely on the ability of our brain to simulate/run little sections of code. So in a more practical literal sense, all of the code of Windows first ran on human brains.

You seem to be saying that the cortex is a universal reinforcement learning machine

You seem to be hung up on reinforcement learning. I use some of that terminology to define a ULM because it is just the most general framework - utility/value functions, etc. Also, there is some pretty strong evidence for RL in the brain, but the brain's learning mechanisms are complex - more so than any current ML system. I hope I conveyed that in the article.

Learning in the lower sensory cortices in particular can also be modeled well by unsupervised learning, and I linked to some articles showing how UL models can reproduce sensory cortex features. UL can be viewed as a potentially reasonable way to approximate the ideal target update, especially for lower sensory cortex that is far (in a network depth sense) from any top down signals from the reward system. The papers I linked to about approximate Bayesian learning and target propagation in particular can help put it all into perspective.

clear evidence that we have found evidence for a reinforcement learning machine in the brain already.

Well, the article summarizes the considerable evidence that the brain is some sort of approximate universal learning machine. I suspect that you have a particular idea of RL that is less than fully general.

Comment author: Richard_Loosemore 22 June 2015 08:18:21PM 1 point [-]

You are right to say that, seen from a high enough level, the brain does general purpose learning .... but the claim becomes diluted if you take it right up to the top level, where it holds trivially.

For example, the brain could be 99.999% hardwired, with no flexibility at all except for a large RAM memory, and it would be consistent with the brain as you just described it (able to learn anything). And yet that wasn't the type of claim you were making in the essay, and it isn't what most people mean when they refer to "general purpose learning". You (and they) seem to be pointing to an architectural flexibility that allows the system to grow up to be a very specific, clever sort of understanding system without all the details being programmed ahead of time.

I am not sure why you say I am hung up on RL: you quoted that as the only mechanism to be discussed in the context, so I went with that.

And you are (like many people) not correct to say that RL is the most general framework, or that there is good evidence for RL in the brain. That is a myth: the evidence is very poor indeed.

RL is not "fully general" -- that was precisely my point earlier. If you can point me to a rigorous proof that it is, one which does not have an "and then some magic happens" step in it, I will eat my hat :-)

(Already had a long discussion with Marcus Hutter about this btw, and he agreed in the end that his appeal to RL was based on nothing but the assumption that it works.)

Comment author: jacob_cannell 21 June 2015 10:48:10PM 4 points [-]

The problem with this is that the "engineering diagram" of the brain is really only a hardware wiring diagram, and the status of speculations about how the hardware modules (really just areas) relate to functional modules is ... well, just that, speculation.

Yes the engineering diagram is a hardware wiring diagram, which I hope I made clear.

In general one of my main points was that most of the big systems (cortex, cerebellum) are general purpose re-programmable hardware - they don't come pre-equipped with software. So the actual functionality of each module arises from the learning system slowly figuring out the appropriate software during development.

I provided some links to the key evidence for the overall hypothesis, I think it is well beyond speculation at this point. (the article certainly contains some speculations, but I labeled them as such)

There are good reasons to suspect that the functional diagram would look completely different

Well of course, because the functional diagram is learned software, and thus can vary substantially from human to human. For example the functional software diagram for the cortex of a blind echolocator looks very different than that of a neurotypical.

Comment author: Richard_Loosemore 22 June 2015 05:20:31PM 0 points [-]

There are serious problems with the claims you are making.

The idea that the cortex or cerebellum, for example, can be described as "general purpose re-programmable hardware" is lacking in both clarity and support.

Clarity. In what sense "generally re-programmable"? So much that it could run Microsoft Word? I have never seen anyone try to go that far, so clearly you must mean something less general. But it is very unclear what exactly is the sense in which you mean the words "general purpose re-programmable hardware".

Support. There are no generally accepted theories for what the function of the cortex actually is. Can you be clearer about what you think the evidence is, in a nutshell?

You seem to be saying that the cortex is a universal reinforcement learning machine. But the kind of evidence that you present is (if you will forgive an extreme oversimplification for the purposes of clarity) the observation that the basal ganglia plays a role that resembles a global packet-switching router, and since a global packet-switching router would be expected to be seen in a reinforcement learning machine, QED.

Now, don't get me wrong, I am sympathetic to much of the general spirit that you convey here, but my problem is that my research has gone down this road for a long time already, and while we agree on the general spirit, you have jumped forward several steps and come to (what I see as) a premature conclusion about functionality. To be specific, the concept of a "reinforcement learning machine" is ghastly (it contains "And then some magic happens..." steps), and I believe it would be a terrible mistake to say that there is any clear evidence that we have found evidence for a reinforcement learning machine in the brain already.

I agree with the general interpretation of what those hippocampal and BG loops might be doing, but there are MANY other interpretations beside seeing them as a component of a reinforcement learning machine.

This is a difficult topic to discuss in these narrow confines, alas. I think you have done a service by pointing to the idea of a general learning mechanism, but I think you have just run on ahead too quickly and shackled that idea to something too speculative (the RL notion).

Comment author: Richard_Loosemore 21 June 2015 10:32:44PM 4 points [-]

The problem with this is that the "engineering diagram" of the brain is really only a hardware wiring diagram, and the status of speculations about how the hardware modules (really just areas) relate to functional modules is ... well, just that, speculation.

There are good reasons to suspect that the functional diagram would look completely different (reasons based in psychological data) and the current state of the art there is poor.

Except perhaps in certain quarters.

Comment author: Houshalter 29 May 2015 08:25:23AM 1 point [-]

[believes that benevolence toward humanity might involve forcing human beings to do something violently against their will.]

But you didn't ask the AI to maximize the value that humans call "benevolence". You asked it to "maximize happiness". And so the AI went out and mass produced the most happy humans possible.

The point of the thought experiment is to show how easy it is to give an AI a bad goal. Of course ideally you could just tell it to "be benevolent", and it would understand you and do it. But getting that to work is an entirely different problem. (The AI understands the words you say, but how do you get it to care, to actually follow your instructions?)

Comment author: Richard_Loosemore 29 May 2015 01:34:00PM 0 points [-]

Alas, the article was a long, detailed analysis of precisely the claim that you made right there, and the "point of the thought experiment" was shown to be a meaningless fantasy about a type of AI that would be so broken that it would not be capable of serious intelligence at all.

Comment author: misterbailey 20 May 2015 03:58:21PM *  1 point [-]

My bizarre question was just an illustrative example. It seems neither you nor I believe that would be an adequate criterion (though perhaps for different reasons).

If I may translate what you're saying into my own terms, you're saying that for a problem like "shoot first or ask first?" the criteria (i.e., constraints) would be highly complex and highly contextual. Ok. I'll grant that's a defensible design choice.

Earlier in the thread you said

the AI is supposed to take an action in spite of the fact that it is getting '''massive feedback''' from all the humans on the planet, that they do not want this action to be executed.

This is why I have homed in on scenarios where the AI has not yet received feedback on its plan. In these scenarios, the AI presumably must decide (even if the decision is only implicit) whether to consult humans about its plan first, or to go ahead with its plan first (and halt or change course in response to human feedback). To lay my cards on the table, I want to consider three possible policies the AI could have regarding this choice.

  1. Always (or usually) consult first. We can rule this out as impractical, if the AI is performing a large number of atomic actions.
  2. Always (or usually) shoot first, and see what the response is. Unless the AI only makes friendly plans, I think this policy is catastrophic, since I believe there are many scenarios where an AI could initiate a plan and before we know what hit us we're in an unrecoverably bad situation. Therefore, implementing this policy in a non-catastrophic way is FAI-complete.
  3. Have some good criteria for picking between "shoot first" or "ask first" on any given chunk of planning. This is what you seem to be favoring in your answer above. (Correct me if I'm wrong.) These criteria will tend to be complex, and not necessarily formulated internally in an axiomatic way. Regardless, I fear making good choices between "shoot first" or "ask first" is hard, even FAI-complete. Screw up once, and you are in a catastrophe like in case 2.

Can you let me know: have I understood you correctly? More importantly, do you agree with my framing of the dilemma for the AI? Do you agree with my assessment of the pitfalls of each of the 3 policies?

Comment author: Richard_Loosemore 20 May 2015 07:22:25PM 1 point [-]

I am with you on your rejection of 1 and 2, if only because they are both framed as absolutes which ignore context.

And, yes, I do favor 3. However, you insert some extra wording that I don't necessarily buy....

These criteria will tend to be complex, and not necessarily formulated internally in an axiomatic way.

You see, hidden in these words seems to be an understanding of how the AI is working, that might lead you to see a huge problem, and me to see something very different. I don't know if this is really what you are thinking, but bear with me while I run with this for a moment.

Trying to formulate criteria for something, in an objective, 'codified' way, can sometimes be incredibly hard even when most people would say their internal 'judgement' allows them to make a ruling very easily: the standard saw being "I cannot define what 'pornography' is, but I know it when I see it." And (stepping quickly away from that example because I don't want to get into that quagmire) there is a much more concrete example in the old interactive activation (IAC) model of word recognition, which is a simple constraint system. In IAC, word recognition is remarkably robust in the face of noise, whereas attempts to write symbolic programs to deal with all the different kinds of noisy corruption of the image turn out to be horribly complex and faulty.

As you can see, I am once again pointing to the fact that Swarm Relaxation systems (understood in the very broad sense that allows all varieties of neural net to be included) can make criterial decisions seem easy, where explicit codification of the decision is a nightmare.
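For concreteness, here is a minimal relaxation sketch (toy weights and units, invented for this comment; it is not the IAC model itself) showing the basic mechanism: each weight is a weak constraint between units, and a noisy input still settles into a coherent low-energy interpretation.

```python
# Toy weak-constraint relaxation sketch (invented example, not IAC itself).
# Each weight is a weak constraint between two +/-1 units; positive means
# "these units should agree", negative means "they should disagree".
weights = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0, (2, 3): -1.0}

def energy(state):
    """Lower energy = more weak constraints satisfied."""
    return -sum(w * state[i] * state[j] for (i, j), w in weights.items())

def settle(state, sweeps=5):
    """Flip any unit whose flip lowers the energy, until stable."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            flipped = state[:]
            flipped[i] = -flipped[i]
            if energy(flipped) < energy(state):
                state = flipped
    return state

# A corrupted input (unit 1 flipped) still settles into the coherent
# interpretation that satisfies all four constraints:
print(settle([1, -1, 1, 1]))      # -> [1, 1, 1, -1]
```

No individual constraint is decisive; the settled state is whatever best satisfies the collection, which is why one corrupted unit cannot force a bad "interpretation".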

So, where does that lead to? Well, you go on to say:

Regardless, I fear making good choices between "shoot first" or "ask first" is hard, even FAI-complete. Screw up once, and you are in a catastrophe like in case 2.

The key phrase here is "Screw up once, and...". In a constraint system it is impossible for one screw-up (one faulty constraint) to unbalance the whole system. That is the whole raison d'être of constraint systems.

Also, you say that the problem of making good choices might be FAI-complete. Now, I have some substantial quibbles with that whole "FAI-complete" idea, but in this case I will just ask a question: are you trying to say that in order to DESIGN the motivation system of the AI in such a way that it will not make one catastrophic choice between shoot-first and ask-first, we must FIRST build an FAI, because that is the only way we can get enough intelligence-horsepower applied to the problem? If so, why exactly would we need to? If the constraint system just cannot allow single failures to get out of control, we don't need to specify every possible criterial decision in advance, we simply rely on context to do the heavy lifting, in perpetuity.

Put another way: the constraint-based AI IS the FAI already, and the reasons for thinking that it can deal with all the potentially troublesome cases have nothing to do with us anticipating every potential troublesome case, ahead of time.

--

Stepping back a moment, consider the following three kinds of case where the AI might have to make a decision.

1) An interstellar asteroid appears from nowhere, travelling at unthinkable speed, and it is going to make a direct hit on the Earth in one hour, with no possibility of survivors. The AI considers a plan in which it quietly euthanizes all life, on the grounds that any other option would lead to one hour of horror, followed by certain death.

2) The AI considers the Dopamine Drip plan.

3) The AI suddenly becomes aware that a rare, precious species of bird has become endangered and the only surviving pair is on a nature trail that is about to be filled with a gang of humans who have been planning a holiday on that trail for months. The gang is approaching the pair right now and one of the birds will die if frightened because it has a heart condition. One plan is to block the humans without explaining (until later), which will inconvenience them.

In all three cases there is a great deal of background information (constraints) that could be brought to bear, and if the AI is constraint-based, it will consider that information. People do this all the time.

In no case is there ONLY a small number of constraints (like, 2 or 3) that are relevant. Where the number of constraints is tiny, there is a chance for a "bad choice" to be made. In fact, I would argue that it is inconceivable that a decision would take place in a near-vacuum of constraints. The more significant the decision, the greater the number of constraints. The bird situation is without doubt the one that has the fewest, but it still involves a fistful of considerations. For this reason, we would expect that all major decisions -- and especially the existential threat ones like 1 and 2 -- would involve a very large number of constraints indeed. It is this mass effect that is at the heart of claims that the constraint approach leads to AI that cannot get into bizarre reasoning episodes.
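This "mass effect" can be made concrete with a toy calculation (all numbers invented): when a decision aggregates a thousand weak constraints, the failure of any single one leaves the outcome untouched.

```python
# Toy illustration of a decision backed by many weak constraints
# (all numbers invented for this comment).
import random

random.seed(0)
# 1000 weak constraints, each voting mildly for (+) or against (-) a plan;
# roughly 90% of them support it.
constraints = [0.01 if random.random() < 0.9 else -0.01 for _ in range(1000)]

def decide(cs):
    return "approve" if sum(cs) > 0 else "reject"

original = decide(constraints)     # "approve"
constraints[0] = -constraints[0]   # one constraint fails and flips its vote
assert decide(constraints) == original   # aggregate decision is unchanged
```

A single flipped vote can move the total by at most 0.02 against a margin of several whole units, which is the arithmetic behind the claim that one faulty constraint cannot trigger a bizarre reasoning episode.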

Finally, notice that in case 1, we are in a situation where (unlike case 2) many humans would say that there is no good decision.

Comment author: misterbailey 19 May 2015 09:08:45AM 1 point [-]

With respect, your first point doesn't answer my question. My question was, what criteria would cause the AI to submit a given proposed action or plan for human approval? You might say that the AI submits every proposed atomic action for approval (in this case, the criterion is the trivial one, "always submit proposal"), but this seems unlikely. Regardless, it doesn't make sense to say the humans have already heard of the plan about which the AI is just now deciding whether to tell them.

In your second point you seem to be suggesting an answer to my question. (Correct me if I'm wrong.) You seem to be suggesting "context." I'm not sure what is meant by this. Is it reasonable to suppose that the AI would make the decision about whether to "shoot first" or "ask first" based on things like, e.g., the lower end of its 99% confidence interval for how satisfied its supervisors will be?

Comment author: Richard_Loosemore 19 May 2015 02:16:56PM 2 points [-]

As you wrote, the second point filled in the missing part from the first: it uses its background contextual knowledge.

You say you are unsure what this means. That leaves me a little baffled, but here goes anyway. Suppose I asked a person, today, to write a book for me on the subject: "What counts as an action that is significant enough that, if you did that action in a way that would affect people, it would rise above some level of 'nontrivialness' and you should consult them first? Include in your answer a long discussion of the kind of thought processes you went through to come up with your answers." I know many articulate people who could, if they had the time, write a massive book on that subject.

Now, that book would contain a huge number of constraints (little factoids about the situation) about "significant actions", and the SOURCE of that long list of constraints would be .... the background knowledge of the person who wrote the book. They would call upon a massive body of knowledge about many aspects of life, to organize their thoughts and come up with the book.

If we could look into the head of the person who wrote the book we could find that background knowledge. It would be similar in size to the number of constraints mentioned in the book, or it would be larger.

That background knowledge -- both its content AND its structure -- is what I refer to when I talk about the AI using contextual information or background knowledge to assess the degree of significance of an action.

You go on to ask a bizarre question:

Is it reasonable to suppose that the AI would make the decision about whether to "shoot first" or "ask first" based on things like, eg., the lower end of its 99% confidence interval for how satisfied its supervisors will be?

This would be an example of an intelligent system sitting there with that massive array of contextual/background knowledge that could be deployed ...... but instead of using that knowledge to make a preliminary assessment of whether "shooting first" would be a good idea, it ignores ALL OF IT and substitutes one single constraint taken from its knowledge base or its goal system:

"Does this satisfy my criteria for how satisfied my supervisors will be?"

It would entirely defeat the object of using large numbers of constraints in the system, to use only one constraint. The system design is (assumed to be) such that this is impossible. That is the whole point of the Swarm Relaxation design that I talked about.

Comment author: misterbailey 19 May 2015 09:18:18AM *  3 points [-]

I understand your desire to stick to an exegesis of your own essay, but part of a critical examination of your essay is seeing whether or not it is on point, so these sorts of questions really are "about" your essay.

Regarding your preliminary answer: by "correct" I assume you mean "correctly reflecting the desires of the human supervisors"? (In which case, this discussion feeds into our other thread.)

Comment author: Richard_Loosemore 19 May 2015 01:53:11PM 2 points [-]

With the best will in the world, I have to focus on one topic at a time: I do not have the bandwidth to wander across the whole of this enormous landscape.

As to your question: I was using "correct" as a verb, and the meaning was "self-correct" in the sense of bringing the system back to the previously specified course.

In this case this would be about the AI perceiving some aspects of its design that it noticed might cause it to depart from what its goal was nominally supposed to be. In that case it would suggest modifications to correct the problem.

Comment author: OrphanWilde 18 May 2015 08:49:17PM 3 points [-]

An elementary error. The constraints in question are referred to in the literature as "weak" constraints (and I believe I used that qualifier in the paper: I almost always do). Weak constraints never need to be ALL satisfied at once. No AI could ever be designed that way, and no-one ever suggested that it would. See the reference to McClelland, J.L., Rumelhart, D.E. & Hinton, G.E. (1986) in the paper: that gives a pretty good explanation of weak constraints.

I understand the concept.

How exactly do you propose that the AI "weighs contextual constraints incorrectly" when the process of weighing constraints requires most of the constraints involved (probably thousands of them) to all suffer a simultaneous, INDEPENDENT 'failure' for this to occur?

I'd hazard a guess that, for any given position, less than 70% of humans will agree without reservation. The issue isn't that thousands of failures occur. The issue is that thousands of failures -always- occur.

Assuming this isn't more of the same, what you are saying here is isomorphic to the statement that somehow, a neural net might figure out the correct weighting for all the connections so that it produces the correctly trained output for a given input. That problem was solved in so many different NN systems that most NN people, these days, would consider your statement puzzling.

The problem is solved only for well-understood (and very limited) problem domains with comprehensive training sets.

A trivial variant of your second failure mode. The AI is calculating the constraints correctly, according to you, but at the same time you suggest that it has somehow NOT included any of the constraints that relate to the ethics of forced sterilization, etc. etc. You offer no explanation of why all of those constraints were not counted by your proposed AI, you just state that they weren't.

They were counted. They are, however, weak constraints. The constraints which required human extinction outweighed them, as they do for countless human beings. Fortunately for us in this imagined scenario, the constraints against killing people counted for more.

This is identical to your third failure mode, but here you produce a different list of constraints that were ignored. Again, with no explanation of why a massive collection of constraints suddenly disappeared.

Again, they weren't ignored. They are, as you say, weak constraints. Other constraints overrode them.

Another insult, and putting words into my mouth, and showing no understanding of what a weak constraint system actually is.

The issue here isn't my lack of understanding. The issue here is that you are implicitly privileging some constraints over others without any justification.

Every single conclusion I reached here is one that humans - including very intelligence humans - have reached. By dismissing them as possible conclusions an AI could reach, you're implicitly rejecting every argument pushed for each of these positions without first considering them. The "weak constraints" prevent them.

I didn't choose -wrong- conclusions, you see, I just chose -unpopular- conclusions, conclusions I knew you'd find objectionable. You should have noticed that; you didn't, because you were too concerned with proving that AI wouldn't do them. You were too concerned with your destination, and didn't pay any attention to your travel route.

If doing nothing is the correct conclusion, your AI should do nothing. If human extinction is the correct conclusion, your AI should choose human extinction. If sterilizing people with unhealthy genes is the correct conclusion, your AI should sterilize people with unhealthy genes (you didn't notice that humans didn't necessarily go extinct in that scenario). If rewriting minds is the correct conclusion, your AI should rewrite minds.

And if your constraints prevent the AI from undertaking the correct conclusion?

Then your constraints have made your AI stupid, for some value of "stupid".

The issue, of course, is that you have decided that you know better what is or is not the correct conclusion than an intelligence you are supposedly creating to know things better than you.

And that sums up the issue.

Comment author: Richard_Loosemore 18 May 2015 09:55:30PM 0 points [-]

I said:

How exactly do you propose that the AI "weighs contextual constraints incorrectly" when the process of weighing constraints requires most of the constraints involved (probably thousands of them) to all suffer a simultaneous, INDEPENDENT 'failure' for this to occur?

And your reply was:

I'd hazard a guess that, for any given position, less than 70% of humans will agree without reservation. The issue isn't that thousands of failures occur. The issue is that thousands of failures -always- occur.

This reveals that you are really not understanding what a weak constraint system is, and where the system is located.

When the human mind looks at a scene and uses a thousand clues in the scene to constrain the interpretation of it, those thousand clues all, when the network settles, relax into a state in which most or all of them agree about what is being seen. You don't get "less than 70%" agreement on the interpretation of the scene! If even one element of the scene violates a constraint in a strong way, the mind orients toward the violation extremely rapidly.

The same story applies to countless other examples of weak constraint relaxation systems dropping down into energy minima.

Let me know when you do understand what you are talking about, and we can resume.

Comment author: TheAncientGeek 18 May 2015 09:11:03PM 1 point [-]

Understood, and the bottom line is that the distinction between "terminal" and "instrumental" goals is actually pretty artificial, so if the problem with "maximize friendliness" is supposed to apply ONLY if it is terminal, it is a trivial fix to rewrite the actual terminal goals to make that one become instrumental.

What would you choose as a replacement terminal goal, or would you not use one?

Comment author: Richard_Loosemore 18 May 2015 09:41:30PM 1 point [-]

Well, I guess you would write the terminal goal as quite a long statement, which would summarize the things involved in friendliness, but also include language about not going to extremes, laissez-faire, and so on. It would be vague and generous. And as part of the goal system there would be a stipulation that the friendliness instrumental goal should trump all other instrumentals.

I'm having a bit of a problem answering because there are peripheral assumptions about how such an AI would be made to function, which I don't want to accidentally buy into, because I don't think goals expressed in language statements work anyway. So I am treading on eggshells here.

A simpler solution would simply be to scrap the idea of exceptional status for the terminal goal, and instead include massive contextual constraints as your guard against drift.
