Richard_Loosemore comments on The Brain as a Universal Learning Machine - Less Wrong

82 Post author: jacob_cannell 24 June 2015 09:45PM


Comment author: Richard_Loosemore 22 June 2015 05:20:31PM 0 points [-]

There are serious problems with the claims you are making.

The idea that the cortex or cerebellum, for example, can be described as "general purpose re-programmable hardware" is lacking in both clarity and support.

Clarity. In what sense "generally re-programmable"? So much so that it could run Microsoft Word? I have never seen anyone try to go that far, so clearly you must mean something less general. But it is very unclear exactly in what sense you mean the phrase "general purpose re-programmable hardware".

Support. There are no generally accepted theories for what the function of the cortex actually is. Can you be clearer about what you think the evidence is, in a nutshell?

You seem to be saying that the cortex is a universal reinforcement learning machine. But the kind of evidence that you present is (if you will forgive an extreme oversimplification for the purposes of clarity) the observation that the basal ganglia plays a role that resembles a global packet-switching router, and since a global packet-switching router would be expected to be seen in a reinforcement learning machine, QED.

Now, don't get me wrong, I am sympathetic to much of the general spirit that you convey here, but my problem is that my research has gone down this road for a long time already, and while we agree on the general spirit, you have jumped forward several steps and come to (what I see as) a premature conclusion about functionality. To be specific, the concept of a "reinforcement learning machine" is ghastly (it contains "And then some magic happens..." steps), and I believe it would be a terrible mistake to say that there is any clear evidence that we have already found a reinforcement learning machine in the brain.

I agree with the general interpretation of what those hippocampal and BG loops might be doing, but there are MANY other interpretations beside seeing them as a component of a reinforcement learning machine.

This is a difficult topic to discuss in these narrow confines, alas. I think you have done a service by pointing to the idea of a general learning mechanism, but I think you have just run on ahead too quickly and shackled that idea to something too speculative (the RL notion).

Comment author: jacob_cannell 22 June 2015 05:55:09PM 3 points [-]

The idea that the cortex or cerebellum, for example, can be described as "general purpose re-programmable hardware" is lacking in both clarity and support.

"General purpose learning hardware" is perhaps better. I used "re-programmable" as an analogy to an FPGA.

However, in a literal sense the brain can learn to use simple paper + pencil tools as an extended memory, and can learn to emulate a Turing machine. Given huge amounts of time, the brain could literally run Windows.

And more to the point, programmers ultimately rely on the ability of our brains to simulate/run little sections of code. So in a more practical literal sense, all of the code of Windows first ran on human brains.

You seem to be saying that the cortex is a universal reinforcement learning machine

You seem to be hung up on reinforcement learning. I use some of that terminology to define a ULM because it is just the most general framework - utility/value functions, etc. Also, there is some pretty strong evidence for RL in the brain, but the brain's learning mechanisms are complex - more so than any current ML system. I hope I conveyed that in the article.

Learning in the lower sensory cortices in particular can also be modeled well by unsupervised learning, and I linked to some articles showing how UL models can reproduce sensory cortex features. UL can be viewed as a potentially reasonable way to approximate the ideal target update, especially for lower sensory cortex that is far (in a network depth sense) from any top down signals from the reward system. The papers I linked to about approximate bayesian learning and target propagation in particular can help put it all into perspective.
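For intuition, the flavor of unsupervised learning referred to here can be sketched with a single linear neuron trained by Oja's rule - a classic textbook UL model, chosen here for brevity and not drawn from the linked papers. With no labels and no reward, the weight vector drifts toward the dominant direction of the input statistics, the kind of result by which UL models recover cortex-like features:

```python
import math
import random

# Toy stand-in for the unsupervised-learning point above: one linear
# neuron trained with Oja's rule. No labels, no reward -- the weights
# simply track the dominant direction of the input statistics.
random.seed(1)
w = [0.1, 0.1]      # synaptic weights
lr = 0.02           # learning rate
for _ in range(4000):
    s = random.gauss(0, 1)
    x = [s * 1.0, s * 0.5]          # inputs correlated along (1, 0.5)
    y = w[0] * x[0] + w[1] * x[1]   # neuron output
    # Oja's rule: Hebbian growth (lr * y * x) with implicit normalization.
    w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

norm = math.sqrt(w[0] ** 2 + w[1] ** 2)
direction = [w[0] / norm, w[1] / norm]
# direction ends up near the normalized input direction (1, 0.5), up to noise
```

The sensory-cortex models in the linked papers are of course far richer (sparse coding, deep hierarchies), but the principle is the same: structure is extracted from input statistics alone, with top-down reward signals unnecessary.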

clear evidence that we have already found a reinforcement learning machine in the brain.

Well, the article summarizes the considerable evidence that the brain is some sort of approximate universal learning machine. I suspect that you have a particular idea of RL that is less than fully general.

Comment author: Richard_Loosemore 22 June 2015 08:18:21PM 1 point [-]

You are right to say that, seen from a high enough level, the brain does general purpose learning .... but the claim becomes diluted if you take it right up to the top level, where it trivially holds.

For example, the brain could be 99.999% hardwired, with no flexibility at all except for a large RAM memory, and it would be consistent with the brain as you just described it (able to learn anything). And yet that wasn't the type of claim you were making in the essay, and it isn't what most people mean when they refer to "general purpose learning". You (and they) seem to be pointing to an architectural flexibility that allows the system to grow up to be a very specific, clever sort of understanding system without all the details being programmed ahead of time.

I am not sure why you say I am hung up on RL: you quoted that as the only mechanism to be discussed in the context, so I went with that.

And you are (like many people) not correct to say that RL is the most general framework, or that there is good evidence for RL in the brain. That is a myth: the evidence is very poor indeed.

RL is not "fully general" -- that was precisely my point earlier. If you can point me to a rigorous proof that it is, one which does not have an "and then some magic happens" step in it, I will eat my hat :-)

(Already had a long discussion with Marcus Hutter about this btw, and he agreed in the end that his appeal to RL was based on nothing but the assumption that it works.)

Comment author: jacob_cannell 22 June 2015 08:34:19PM *  2 points [-]

I am not sure why you say I am hung up on RL: you quoted that as the only mechanism to be discussed in the context, so I went with that.

Upon consideration, I changed my own usage of "Universal Reinforcement Learning Machine" to "Universal Learning Machine".

The several remaining uses of "reinforcement learning" are contained now to the context of the BG and the reward circuitry.

And you are (like many people) not correct to say that RL is the most general framework,

Again we are probably talking about very different RL conceptions. So to be clear, I summarized my general viewpoint of a ULM. I believe it is an extremely general model that basically covers any kind of universal learning agent. The agent optimizes/steers the future according to some sort of utility function (which is extremely general), and self-optimization emerges naturally just by including the agent itself as part of the system to optimize.

Do you have a conception of a learning agent which does not fit into that framework?

or that there is good evidence for RL in the brain. That is a myth: the evidence is very poor indeed.

The evidence for RL in the brain - of the extremely general form I described - is indeed very strong, simply because any type of learning is just a special case of universal learning. Taboo 'reinforcement' if you want, and just replace with "utility driven learning".

AIXI specifically has a special reward channel, and perhaps you are thinking of that specific type of RL, which is much more specific than universal learning. I should perhaps clarify and/or remove the mention of AIXI.

A ULM - as I described - does not have a reward channel like AIXI. It just conceptually has a value and/or utility function, initially defined by some arbitrary function that conceptually takes the whole brain/model as input. In the case of the brain, the utility function is conceptual; in practice it is more directly encoded as a value function.
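The distinction can be made concrete with a toy sketch - all names here are illustrative stand-ins, not from any real library or from the article:

```python
# Hypothetical sketch of the distinction drawn above: an AIXI-style agent
# reads utility off a dedicated reward channel in its percept, while a
# ULM-style agent's utility is an arbitrary function of its entire
# internal state/world-model. Names are illustrative only.

def aixi_style_utility(percept):
    # Reward arrives as a special, separately delivered scalar.
    return percept["reward_channel"]

def ulm_style_utility(whole_state):
    # Utility is some arbitrary function of the complete internal state;
    # this weighted sum is just a stand-in -- any function would do.
    return 2.0 * whole_state["satiety"] + 1.0 * whole_state["safety"]

percept = {"reward_channel": 0.3, "pixels": [0, 1, 0]}
state = {"satiety": 0.8, "safety": 0.5}
# aixi_style_utility(percept) ignores everything but the reward channel;
# ulm_style_utility(state) can depend on the whole model.
```

The point of the sketch: nothing in the ULM formulation requires a privileged external scalar, which is why "taboo reinforcement, keep utility-driven learning" loses nothing.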

Comment author: Richard_Loosemore 23 June 2015 02:41:54AM 5 points [-]

About the universality or otherwise of RL. Big topic.

There's no need to taboo "RL" because switching to utility-based learning does not solve the issue (and the issue I have in mind covers both).

See, this is the problem. It is hard for me to fight the idea that RL (or utility-driven learning) works, because I am forced to fight a negative; a space where something should be, but which is empty ....... namely, the empirical fact that Reinforcement Learning has never been made to work in the absence of some surrounding machinery that prepares or simplifies the ground for the RL mechanism.

It is a naked fact about traditional AI that it puts such an emphasis on the concept of expected utility calculations without any guarantees that a utility function can be laid on the world in such a way that all and only the intelligent actions in that world are captured by a maximization of that quantity. It is a scandalously unjustified assumption, made very hard to attack by the fact that it is repeated so frequently that everyone believes it to be true just because everyone else believes it.

If anyone ever produced a proof why it should work, there would be a there there, and I could undermine it. But .... not so much!

About AIXI and my conversation with Marcus: that was actually about the general concept of RL and utility-driven systems, not anything specific to AIXI. We circled around until we reached the final crux of the matter, and his last stand (before we went to the conference banquet) was "Yes, it all comes down to whether you believe in the intrinsic reasonableness of the idea that there exists a utility function which, when maximized, yields intelligent behavior .......... but that IS reasonable, .... isn't it?"

My response was "So you do agree that that is where the buck stops: I have to buy the reasonableness of that idea, and there is no proof on the table for why I SHOULD buy it, no?"

Hutter: "Yes."

Me: "No matter how reasonable it seems, I don't buy it"

His answer was to laugh and spread his arms wide. And at that point we went to the dinner and changed to small talk. :-)

Comment author: TheAncientGeek 26 June 2015 10:27:55AM 3 points [-]

It is a scandalously unjustified assumption, made very hard to attack by the fact that it is repeated so frequently that everyone believes it to be true just because everyone else believes it.

I don't think that is an overstatement. If MIRI is basically wrong about UFs, then most of its case unravels. Why isn't the issue being treated as a matter of urgency?

Comment author: Richard_Loosemore 26 June 2015 04:31:16PM *  1 point [-]

A very good question indeed. Although ... there is a depressing answer.

This is a core-belief issue. For some people (like Yudkowsky and almost everyone in MIRI) artificial intelligence must be about the mathematics of artificial intelligence, but without the utility-function approach, that entire paradigm collapses. Seriously: it all comes down like a house of cards.

So, this is a textbook case of a Kuhn / Feyerabend - style clash of paradigms. It isn't a matter of "Okay, so utility functions might not be the best approach: so let's search for a better way to do it" .... it is more a matter of "Anyone who thinks that an AI cannot be built using utility functions is a crackpot." It is a core belief in the sense that it is not allowed to be false. It is unthinkable, so rather than try to defend it, those who deny it have to be personally attacked. (I don't say this because of personal experience, I say it because that kind of thing has been observed over and over when paradigms come into conflict).

Here, for example, is a message sent to the SL4 mailing list by Yudkowsky in August 2006:

Dear Richard Loosemore:

When someone doesn't have anything concrete to say, of course they always trot out the "paradigm" excuse.

Sincerely, Eliezer Yudkowsky.

So the immediate answer to your question is that it will never be treated as a matter of urgency, it will be denied until all the deniers drop dead.

Meanwhile, I went beyond that problem and outlined a solution, soon after I started working in this field in the mid-80s. And by 2006 I had clarified my ideas enough to present them at the AGIRI workshop held in Bethesda that year. The MIRI (then called SIAI) crowd were there, along with a good number of other professional AI people.

The response was interesting. During my presentation the SIAI/MIRI bunch repeatedly interrupted with rude questions or pointed, very loud, laughter. Insulting laughter. Loud enough to make the other participants look over and wonder what the heck was going on.

That's your answer, again, right there.

But if you want to know what to do about it, the paper I published after the workshop is a good place to start.

Comment author: Kaj_Sotala 26 June 2015 07:57:14PM *  4 points [-]

it is more a matter of "Anyone who thinks that an AI cannot be built using utility functions is a crackpot." It is a core belief in the sense that it is not allowed to be false. It is unthinkable, so rather than try to defend it, those who deny it have to be personally attacked.

Won't comment about past affairs, but these days at least part of MIRI seems more open to the possibility. E.g. this thread where So8res (Nate Soares, now Executive Director of MIRI) lists some possible reasons for why it might be necessary to move beyond utility functions. (He is pretty skeptical of most, but at least he seems to be seriously considering the possibility, and gives a ~15% chance "that VNM won't cut it".)

Comment author: gjm 26 June 2015 06:24:36PM 3 points [-]

Would that be this paper?

If so, it seems to me to have rather little to do with the question of whether utility functions are necessary, helpful, neutral, unhelpful, or downright inconsistent with genuinely intelligent behaviour. It argues that intelligent minds may be "complex systems" whose behaviour is very difficult to relate to their lower-level mechanisms, but something that attempts to optimize a utility function can perfectly well have that property. (Because the utility function can itself be complex in the relevant sense; or because the world is complex, so that effective optimization of even a not-so-complex utility function turns out to be achievable only by complex systems; or because even though the utility function could be optimized by something not-complex, the particular optimizer we're looking at happens to be complex.)

My understanding of the position of EY and other people at MIRI is not that "artificial intelligence must be about the mathematics of artificial intelligence", but that if we want to make artificially intelligent systems that might be able to improve themselves rapidly, and if we want high confidence that this won't lead to an outcome we'd view as disastrous, the least-useless tools we have are mathematical ones.

Surely it's perfectly possible to hold (1) that extremely capable AI might be produced by highly non-mathematical means, but (2) that this would likely be disastrous for us, so that (3) we should think mathematically about AI in the hope of finding a way of doing it that doesn't lead to disaster. But it looks as if you are citing their belief in #3 as indicating that they violently reject #1.

So, anyway, utility functions. The following things seem to be clearly true:

  • There are functions whose maximization implies (at least) kinda-intelligence-like behaviour. For instance, maximizing games of chess won against the world champion (in circumstances where you do actually have to play the games rather than, e.g., just killing him) requires you to be able to play chess at that level. Maximizing the profits of a company requires you to do something that resembles what the best human businesspeople do. Maximizing the number of people who regard you as a friend seems like it requires you to be good at something like ordinary social interaction. Etc.
    • Some of these things could probably be gamed. E.g., maybe there's a way to make people regard you as a friend by drugging them or messing directly with their brains. If we pick difficult enough tasks, then gaming them effectively is also the kind of thing that is generally regarded as good evidence of intelligence.
  • The actually intelligent agents we know of (namely, ourselves and to a lesser extent animals and maybe some computer software) appear to have something a bit like utility functions. That is, we have preferences and to some extent we act so as to realize those preferences.
    • For real human beings in the real world, those preferences are far from being perfectly describable by any utility function. But it seems reasonable to me to describe them as being in some sense the same kind of thing as a utility function.
  • There are mathematical theorems that say that if you have preferences over outcomes, then certain kinda-reasonable assumptions (that can be handwavily described as "your preferences are consistent and sane") imply that those preferences actually must be describable by a utility function.
    • This doesn't mean that effective intelligent agents must literally have utility functions; after all, we are effective intelligent agents and we don't. But it does at least suggest that if you're trying to build an effective intelligent agent, then giving it a utility function isn't obviously a bad idea.
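The representation theorem in the last point can be illustrated in miniature - a toy sketch, since the full VNM theorem also covers preferences over lotteries, and the outcome names here are invented for the example:

```python
# Toy sketch of the representation point above: a complete, transitive
# preference order over finitely many outcomes can always be described by
# a utility function -- here simply "rank in the order". The outcomes are
# invented placeholders; the full VNM theorem extends this to lotteries.
preference_order = ["torture", "boredom", "friendship", "flourishing"]  # worst -> best

utility = {outcome: rank for rank, outcome in enumerate(preference_order)}

def prefers(a, b):
    """True iff the agent strictly prefers outcome a to outcome b."""
    return utility[a] > utility[b]

# Because utilities are real numbers, transitivity of preference comes
# for free: flourishing > friendship > torture implies flourishing > torture.
```

This is exactly the weak sense in which "having a utility function" is claimed above: a description of consistent preferences, not a claim about the agent's internal machinery.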

All of which seems to me like sufficient reason to (1) investigate AI designs that have (at least approximately) utility functions, and (2) be skeptical of any claim that having a utility function actually makes AI impossible. And it doesn't appear to me to come down to a baseless article of faith, no matter what you and Marcus Hutter may have said to one another.

Comment author: TheAncientGeek 06 July 2015 12:22:51PM 1 point [-]

My understanding of the position of EY and other people at MIRI is not that "artificial intelligence must be about the mathematics of artificial intelligence", but that if we want to make artificially intelligent systems that might be able to improve themselves rapidly, and if we want high confidence that this won't lead to an outcome we'd view as disastrous, the least-useless tools we have are mathematical ones.

But there are good reasons for thinking that, in absolute terms, many mathematical methods of AI safety are useless. The problem is that they relate to ideal rationalists, but ideal rationality is uncomputable, so they are never directly applicable to any buildable AI .... and how a real-world AI would deviate from ideal rationality is crucial to understanding the threats it would pose. Deviations from ideal rationality could pose new threats, or could counter certain classes of threat (in particular, lack of goal stability could be leveraged to provide corrigibility, which is a desirable safety feature).

Surely it's perfectly possible to hold (1) that extremely capable AI might be produced by highly non-mathematical means, but (2) that this would likely be disastrous for us, so that (3) we should think mathematically about AI in the hope of finding a way of doing it that doesn't lead to disaster. But it looks as if you are citing their belief in #3 as indicating that they violently reject #1.

There's an important difference between thinking mathematically and only thinking mathematically. Highly non-mathematical AI, cobbled together without clean overriding principles, cannot be made safe by clean mathematical principles, although it could quite conceivably be made safe by piecemeal engineering solutions such as kill switches, corrigibility and better boxing ... the kind of solution MIRI isn't interested in ... which does look as though they are neglecting a class of AI danger.

Comment author: gjm 06 July 2015 01:24:01PM 0 points [-]

many mathematical methods of AI safety are useless

If any particular mathematical approach to AI safety is useless, and if MIRI are attempting to use that approach, then they are making a mistake. But we should distinguish that from a different situation where they aren't attempting to use the useless approach but are studying it for insight. So, e.g., maybe approach X is only valid for AIs that are ideal rationalists, but they hope that some of what they discover by investigating approach X will point the way to useful approaches for not-so-ideal rationalists.

Do you have particular examples in mind? Is there good evidence telling us whether MIRI think the methods in question will be directly applicable to real AIs?

There's an important difference between thinking mathematically and only thinking mathematically.

I agree. I am not so sure I agree that cobbled-together AI can "quite conceivably be made safe by piecemeal engineering solutions", and I'm pretty sure that historically at least MIRI has thought it very unlikely that they can. It does seem plausible that any potentially-dangerous AI could be made at least a bit safer by such things, and I hope MIRI aren't advocating that no such things be done. But this is all rather reminiscent of computer security, where there are crude piecemeal things you can do that help a bit, but if you want really tight security there's no substitute for designing your system for security from the start -- and one possible danger of doing the crude piecemeal things is that they give you a false sense of safety.

Comment author: [deleted] 27 June 2015 12:06:39AM 0 points [-]

Meanwhile, I went beyond that problem and outlined a solution, soon after I started working in this field in the mid-80s. And by 2006 I had clarified my ideas enough to present them at the AGIRI workshop held in Bethesda that year.

Link?

Comment author: Richard_Loosemore 30 June 2015 05:35:09PM 1 point [-]

Sorry, was in too much of a rush to give link.....

Loosemore, R.P.W. (2007). Complex Systems, Artificial Intelligence and Theoretical Psychology. In B. Goertzel & P. Wang (Eds.), Proceedings of the 2006 AGI Workshop. IOS Press, Amsterdam.

http://richardloosemore.com/docs/2007_ComplexSystems_rpwl.pdf

Comment author: [deleted] 30 June 2015 11:37:58PM *  2 points [-]

Excuse me, but as much as I think the SIAI bunch were being rude to you, if you had presented, at a serious conference on a serious topic, a paper that waves its hands, yells "Complexity! Irreducible! Parallel!" and expected a good reception, I would have been privately snarking if not publicly. That would be me acting like a straight-up asshole, but it would also be because you never try to understand a phenomenon by declaring it un-understandable. Which is not to say that symbolic, theorem-prover, "Pure Maths are Pure Reason which will create Pure Intelligence" approaches are very good either -- they totally failed to predict that the brain is a universal learning machine, for instance.

(And so far, the "HEY NEURAL NETS LEARN WELL" approach is failing to predict a few things I think they really ought to be able to see, and endeavor to show.)

That anyone would ever try to claim a technological revolution is about to arise from either of those schools of work is what constantly discredits the field of artificial intelligence as a hype-driven fraud!

Comment author: TheAncientGeek 28 June 2015 05:51:02PM -1 points [-]

I don't see it as dogmatism so much as a verbal confusion. The ubiquity of UFs can be defended using a broad (implicit) definition, but the conclusions typically drawn about types of AI danger and methods of AI safety relate to a narrower definition, where a UF is:

  • Explicitly coded, and/or
  • Fixed and unupdateable, and/or
  • "Thick", containing detailed descriptions of goals.

Comment author: jacob_cannell 23 June 2015 04:16:43AM *  3 points [-]

Since the utility function is approximated anyway, it becomes an abstract concept - especially in the case of evolved brains. For an evolved creature, the evolutionary utility function can be linked to long term reproductive fitness, and the value function can then be defined appropriately.

For a designed agent, it's a useful abstraction. We can conceptually rate all possible futures, and then roughly use that to define a value function that optimizes towards that goal.

It's really just a mathematical abstraction of the notion of X is better than Y. It's not worth arguing about. It's also proven in the real world - agents based on utility formalizations work. Well.

Comment author: Richard_Loosemore 24 June 2015 02:28:43PM 0 points [-]

It certainly is worth discussing, and I'm sorry but you are not correct that "agents based on utility formalizations work. Well."

That topic came up at the AAAI symposium I attended last year. Specifically, we had several people there who built real-world (as opposed to academic, toy) AI systems. Utility based systems are generally not used, except as a small component of a larger mechanism.

Comment author: jacob_cannell 26 June 2015 09:53:22PM *  2 points [-]

Pretty much all of the recent ML systems are based on a utility function framework in a sense - they are trained to optimize an objective function. In terms of RL in particular, Deepmind's Atari agent works pretty well, and builds on a history of successful practical RL agents that all are trained to optimize a 'utility function'.
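The "trained to optimize an objective function" point can be shown in a minimal tabular Q-learning sketch - a standard textbook RL update on a toy corridor task, emphatically not DeepMind's actual agent:

```python
import random

# Toy tabular Q-learning on a 4-state corridor (illustrative only): the
# agent is trained purely to optimize a scalar reward signal, i.e. an
# objective/'utility' function, as described above.
N_STATES, ACTIONS = 4, [-1, +1]        # actions: step left / step right
alpha, gamma, epsilon = 0.5, 0.9, 0.3  # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)  # reward only at the goal

random.seed(0)
for _ in range(200):                   # 200 training episodes
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r = step(s, a)
        # Push Q toward the Bellman target for the observed transition.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2
# After training, the greedy policy steps right toward the reward
# from every non-terminal state.
```

Deepmind's Atari agent replaces the table with a deep network, but the optimization target has the same shape: a learned value function driven by a reward signal.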

That said, for complex AGI, we probably need something more complex than current utility function frameworks - in the sense that you can't reduce utility to an external reward score. The brain doesn't appear to have a simple VNM single-axis utility concept, which is some indication that we may eventually drop that notion for complex AI. My conception of 'utility function' is loose, and could include whatever it is the brain is doing.

Comment author: [deleted] 27 June 2015 12:06:21AM 1 point [-]

Wait wait wait. You didn't head to the dinner, drink some fine wine, and start raucously debating the same issue over again?

Bah, humbug!

Also, how do I get invited to these conferences again ;-)?

It is a scandalously unjustified assumption, made very hard to attack by the fact that it is repeated so frequently that everyone believes it to be true just because everyone else believes it.

Very true, at least regarding AI. Personally, my theory is that the brain does do reinforcement learning, but the "reward function" isn't a VNM-rational utility function, it's just something the body signals to the brain to say, "Hey, that world-state was great!" I can't imagine that Nature used something "mathematically coherent", but I can imagine it used something flagrantly incoherent but really dead simple to implement. Like, for instance, the amount of some chemical or another coming in from the body, to indicate satiety, or to relax after physical exertion, or to indicate orgasm, or something like that.

Comment author: Richard_Loosemore 30 June 2015 05:53:21PM 1 point [-]

Hey, ya pays yer money and walk in the front door :-) AGI conferences run about $400 a ticket I think. Plus the airfare to Berlin (there's one happening in a couple of weeks, so get your skates on).

Re the possibility that the human system does do reinforcement learning .... fact is, if one frames the meaning of RL in a sufficiently loose way, the human cogsys absolutely DOES do RL, no doubt about it. Just as you described above.

But if you sit down and analyze what it means to make the claim that a system uses RL, it turns out that there is a world of difference between the two positions:

The system CAN BE DESCRIBED in such a way that there is reinforcement of actions/internal constructs that lead to positive outcomes in some way,

and

The system is controlled by a mechanism that explicitly represents (A) actions/internal constructs, (B) outcomes or expected outcomes, and (C) scalar linkages between the A and B entities .... and behavior is completely dominated by a mechanism that browses the A, B and C in such a way as to modify one of the C linkages according to the cooccurrence of a B with an A.

The difference is that the second case turns the descriptive mechanism into an explicit mechanism.
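A minimal sketch of that second, explicit-mechanism reading - purely illustrative, with invented action and outcome names:

```python
# Sketch of the "explicit mechanism" reading of RL described above:
# actions (A), outcomes (B), and scalar linkages (C) are all explicitly
# represented, and learning consists of nothing but adjusting a C linkage
# whenever a B co-occurs with an A. Names are invented for illustration.
actions = ["press_lever", "groom"]                    # the A entities
outcomes = ["food", "nothing"]                        # the B entities
C = {(a, b): 0.0 for a in actions for b in outcomes}  # the C linkages

def reinforce(action, outcome, value, rate=0.1):
    # The behavior-dominating update: move the A-B linkage toward the
    # observed outcome's value each time the pair co-occurs.
    C[(action, outcome)] += rate * (value - C[(action, outcome)])

for _ in range(50):
    reinforce("press_lever", "food", value=1.0)   # lever-pressing pays off
    reinforce("groom", "nothing", value=0.0)      # grooming does not

# Behavior is then read directly off the linkages:
best = max(actions, key=lambda a: max(C[(a, b)] for b in outcomes))
```

The descriptive claim, by contrast, asserts nothing about any such table existing inside the system; it only says the system's behavior can be summarized as if such updates occurred.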

It's like Ptolemy's Epicycle model of the solar system. Was Ptolemy's fancy little wheels-within-wheels model a good descriptive model of planetary motion? You bet ya! Would it have been appropriate to elevate that model and say that the planets actually DID move on top of some epicycle-like mechanism? Heck no! As a functional model it was garbage, and it held back a scientific understanding of what was really going on for over a thousand years.

Same deal with RL. Our difficulty right now is that so many people slip back and forth between arguing for RL as a descriptive model (which is fine) and arguing for it as a functional model (which is disastrous, because that was tried in psychology for 30 years, and it never worked).