JamesAndrix comments on Dreams of AIXI - Less Wrong

-1 Post author: jacob_cannell 30 August 2010 10:15PM


Comment author: JamesAndrix 03 September 2010 12:26:15AM 0 points [-]

I think this is an open question, but certainly one approach is to follow the brain's lead and make a system that learns its ethics and high level goals dynamically, through learning.

In that type of design, the initial motivation gets imprinting cues from the parents.

This seems like a non-answer to me.

You can't just say 'learning' as if all possible minds will learn the same things from the same input, and internalize the same values from it.

There is something you have to hardcode to get it to adopt any values at all.

your algorithms converge on some asymptotic limit for the hardware.

Well, what is that limit?

It seems to me that an imaginary perfectly efficient algorithm would read, process, and output data as fast as the processor could shuffle the bits around, which is probably far faster than it could exchange data with the outside world.

Even if we take that down 1000x because this is an algorithm that's doing actual thinking, you're looking at an easy couple of million bytes per second. And that's superintelligently optimized, structured output based on preprocessed, efficient input. Because this is AGI, we don't need to count, say, raw video bandwidth, because that can be preprocessed by a system that is not generally intelligent.

So a conservatively low upper limit for my PC's intelligence is outputting a million bytes per second of compressed poetry, or viral genomes, or viral genomes that write poetry.
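As a sanity check on that estimate - the numbers below (a ~10 GB/s memory bandwidth for a 2010-era PC and the 1000x "thinking" penalty) are illustrative assumptions, not measurements:

```python
# Back-of-envelope version of the estimate above. Both figures are
# illustrative assumptions: ~10 GB/s main-memory bandwidth, and a
# 1000x slowdown for "actual thinking" vs. raw bit-shuffling.
memory_bandwidth_bps = 10 * 10**9  # bytes/second the processor can shuffle
thinking_penalty = 1000            # penalty for thinking vs. raw copying

output_rate = memory_bandwidth_bps // thinking_penalty
print(output_rate)  # 10000000 - millions of bytes per second
```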

If the first superhuman AGI is only superhuman by an order of magnitude or so, or must run on a vastly more powerful system, then you can bet that its algorithms are many orders of magnitude less efficient than they could be.

Because FOOM is just exponential growth

No.

Why couldn't your supercomputer AGI enter into a growth phase higher than exponential?

Example: If not-too-bright but technological aliens saw us take a slow general-purpose computer, and then make a chip that worked 1000 times faster, but they didn't know how to put algorithms on a chip, then it would look like our technology got 1000 times better really quickly. But that's just because they didn't already know the trick. If they learned the trick, they could make some of their dedicated software systems work 1000 times faster.

"Convert algorithm to silicon" is just one procedure for speeding things up that an agent can do, or not yet know how to do. You know it's possible, and a superintelligence would figure it out, but how do you rule out a superintelligence figuring out twelve tricks like that, each providing a 1000x speedup, in its first calendar month?

Comment author: jacob_cannell 03 September 2010 03:19:18AM *  1 point [-]

You can't just say 'learning' as if all possible minds will learn the same things from the same input, and internalize the same values from it.

There is something you have to hardcode to get it to adopt any values at all

Yes, you have to hardcode 'something', but that doesn't exactly narrow down the field much. Brains have some emotional context circuitry for reinforcing some simple behaviors (primary drives, pain avoidance, etc), but in humans these are increasingly supplanted and to some extent overridden by learned beliefs in the cortex. Human values are thus highly malleable - socially programmable. So my comment was "this is one approach - hardcode very little, and have all the values acquired later during development".

Well, what is that limit?

It seems to me that an imaginary perfectly efficient algorithm would read, process, and output data as fast as the processor could shuffle the bits around,

Unfortunately, we need to be a little more specific than imaginary algorithms.

Computational complexity theory is the branch of computer science that deals with the computational costs of different algorithms, and specifically with the optimal possible solutions.

Universal intelligence is such a problem. AIXI is an investigation into optimal universal intelligence in terms of the upper limits of intelligence (the most intelligent possible agent), but while interesting, it shows that the most intelligent agent is unusably slow.

Taking a different route, we know that a universal intelligence can never do better in any specific domain than the best known algorithm for that domain. For example, an AGI playing chess could do no better than just pausing its AGI algorithm (pausing its mind completely) and instead running the optimal chess algorithm (assuming that the AGI is running as a simulation on general hardware instead of faster special-purpose AGI hardware).

So there is probably an optimal unbiased learning algorithm, which is the core building block of a practical AGI. We don't know for sure what that algorithm is yet, but if you survey the field, there are several interesting results. The first thing you'll see is that we now have a variety of hierarchical deep learning algorithms that are all pretty good; some appear to be slightly better for certain domains, but there is not at the moment a clear universal winner. Also, the mammalian cortex uses something like this. More importantly, there is a lot of recent research but no massive breakthroughs - the big improvements are coming from simple optimization and massive datasets, not fancier algorithms. This is not definite proof, but it looks like we are approaching some sort of bound for learning algorithms - at least at the lower levels.

There is not some huge space of possible improvements; that's just not how computer science works. When you discover quicksort and radix sort, you are done with serial sorting algorithms. And then you find the optimal parallel variants, and sorting is solved. There are no possible improvements past that point.
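A minimal sketch of that point: comparison sorts like quicksort are bounded below by Ω(n log n) comparisons, while radix sort sidesteps the comparison bound and reaches O(n·k) for n integers of k digits - and past that, there is little room left to improve.

```python
# Quicksort: a comparison sort, O(n log n) on average, and no
# comparison sort can beat Omega(n log n) in the worst case.
def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot = xs[len(xs) // 2]
    return (quicksort([x for x in xs if x < pivot])
            + [x for x in xs if x == pivot]
            + quicksort([x for x in xs if x > pivot]))

# LSD radix sort for non-negative integers: O(n * k) for k-digit keys,
# sidestepping the comparison lower bound entirely.
def radix_sort(xs, base=10):
    if not xs:
        return xs
    digits = len(str(max(xs)))
    for d in range(digits):
        buckets = [[] for _ in range(base)]
        for x in xs:
            buckets[(x // base**d) % base].append(x)
        xs = [x for bucket in buckets for x in bucket]
    return xs

data = [170, 45, 75, 90, 802, 24, 2, 66]
assert quicksort(data) == radix_sort(data) == sorted(data)
```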

Computer science is not like Moore's law at all. It's more like physics. There's only so much knowledge, and so many breakthroughs, and at this point a lot of it honestly is already solved.

So it's just pure naivety to think that AGI will lead to some radical recursive breakthrough in software. Poppycock. It's reasonably likely humans will have narrowed in on the optimal learning algorithms by the time AGI comes around. Further improvements will be small optimizations for particular hardware architectures - but that's really not much different from hardware design itself, and eventually you want to just burn the universal learning algorithms into the hardware (as the brain does).

Hardware is quite different, and there is a huge train of future improvements there. But AGI's impact there will be limited by computer speeds! Because you need regular computers running compilers and simulators to build new programs and new hardware. So AGI can speed Moore's Law up some, but not dramatically - an AGI that thought 1000x faster than a human would just spend 1000x longer waiting for its code to compile.

I am a software engineer, and I spend probably about 30-50% of my day waiting on computers (compiling, transferring, etc). And I only think at human speeds.

AGIs will soon have a massive speed advantage, but ironically they will probably leverage that to become best-selling authors, do theoretical physics and math, and non-engineering work in general, where you don't need a lot of computation.

You know it's possible, and a superintelligence would figure it out, but how do you rule out a superintelligence figuring out twelve tricks like that, each providing a 1000x speedup, in its first calendar month?

Say you had an AGI that thought 10x faster. It would read and quickly learn everything about its own AGI design, software, etc etc. It would get a good idea of how much optimization slack there was in its design and come up with a bunch of ideas. It could even write the code really fast. But unfortunately it would still have to compile it and test it (adding extra complexity in that this is its brain we are talking about).

Anyway, it would only be able to get small gains from optimizing its software - unless you assume the human programmers were idiots. Maybe a 2x speed gain or something - we are just throwing numbers out, but we have huge experience with real-time software on fixed hardware in, say, the video game industry (and other industries), and this asymptotic wall is real, and complexity theory is solid.

Big gains necessarily must come from hardware improvements. This is just how software works - we find optimal algorithms and use them, and further improvement without increasing the hardware hits an asymptotic wall. You spend a few years and you get something 3x better, spend 100 more and you get another 50%, and spend 1000 more and get another 30% and so on.

EDIT: After saying all this, I do want to reiterate that I think there could be a quick (even FOOMish) transition from the first AGIs to AGIs that are 100-1000x or so faster thinking, but the constraint on progress will quickly be the speed of regular computers running all the software you need to do anything in the modern era. Specialized software already does much of the heavy lifting in engineering, and will do even more of it by the time AGI arrives.

Comment author: JamesAndrix 03 September 2010 08:09:33PM 2 points [-]

So my comment was "this is one approach - hardcode very little, and have all the values acquired later during development".

Hardcode very little?

What is the information content of what an infant feels when it is fed after being hungry?

I'm not trying to narrow the field; the field is always narrowed to whatever learning system an agent actually uses. In humans, the system that learns new values is not generic.

Using a 'generic' value learning system will give you an entity that learns morality in an alien way. I cannot begin to guess what it would learn to want.

I'd like to table the intelligence explosion portion of this discussion, I think we agree that an AI or group of AI's could quickly grow powerful enough that they could take over, if that's what they decided to do. So establishing their values is important regardless of precisely how powerful they are.

Comment author: jacob_cannell 03 September 2010 10:22:54PM *  1 point [-]

Hardcode very little?

Yes. The information in the genome, and the brain structure coding subset in particular, is a tiny tiny portion of the information in an adult brain.

What is the information content of what an infant feels when it is fed after being hungry?

An infant brain is mainly an empty canvas (randomized synaptic connections from which learning will later literally carve out a mind) combined with some much simpler, much older basic drives and a simpler control system - the old brain - that descends back to the era of reptiles or earlier.

In humans, the system that learns new values is not generic

That depends on what you mean by 'values'. If you mean linguistic concepts such as values, morality, kindness, non-cannibalism, etc etc, then yes, these are learned by the cortex, and the cortex is generic. There is a vast weight of evidence for almost overly generic learning in the cortex.

Using a 'generic' value learning system will give you an entity that learns morality in an alien way. I cannot begin to guess what it would learn to want.

Not at all. To learn alien morality, it would have to either invent alien morality from scratch, or be taught alien morality from aliens. Morality is a set of complex memetic linguistic patterns that have evolved over long periods of time. Morality is not coded in the genome and it does not spontaneously generate.

That's not to say that there are no genetic tweaks to the space of human morality - but any such understanding based on genetic factors must also factor in complex cultural adaptations.

For example, the Aztecs believed human sacrifice was noble and good. Many Spaniards truly believed that the Aztecs were not only inhuman, but actually worse than human - actively evil, and truly believed that they were righteous in converting, conquering, or eliminating them.

This mindspace is not coded in the genome.

I think we agree that an AI or group of AI's could quickly grow powerful enough that they could take over, if that's what they decided to do

Agreed.

Comment author: JamesAndrix 04 September 2010 12:33:26AM 1 point [-]

I'm not saying that all or even most of the information content of adult morality is in the genome. I'm saying that the memetic stimulus that creates it evolved with hooks specific to how humans adjust their values.

If the emotions and basic drives are different, the values learned will be different. If the compressed description of the basic drives is just 1kb, there are ~2^1024 different possible initial minds with drives that complex, most of them wildly alien.
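For scale, that count can be written out directly:

```python
# The space of 1 kb (1024-bit) drive descriptions, as a concrete number.
n_possible_minds = 2 ** 1024
print(len(str(n_possible_minds)))  # 309 - a 309-digit count of initial minds
```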

How would you know what the AI would find beautiful? Will you get all aspects of its sexuality right?

If the AI isn't comforted by physical contact, that's at least a few bytes of the drive description that's different from the description that matches our drives. That difference throws out a huge chunk of how our morality has evolved to instill itself.

We might still be able to get an alien mind to adopt all the complex values we have, but we would have to translate the actions we would normally take into actions that match alien emotions. This is a hugely complex task that we have no prior experience with.

Comment author: jacob_cannell 04 September 2010 06:40:57PM *  1 point [-]

I'm not saying that all or even most of the information content of adult morality is in the genome.

Right, so we agree on that then.

If I were going to simplify - our emotional systems and the main associated neurotransmitter feedback loops are the genetic harnesses that constrain the otherwise overly general cortex and its far more complex, dynamic memetic programs.

We have these simple reinforcement learning systems to avoid pain-causing stimuli, pleasure-reward, and so on - these are really old conserved systems from the thalamus that have maintained some level of control and shaping of the cortex as it has rapidly expanded and taken over.

You can actually disable a surprisingly large number of these older circuits (through various disorders, drugs, injuries) and still have an intact system: physical pain/pleasure, hunger, yes even sexuality.

And then there are some more complex circuits that indirectly reward/influence social behaviour. They are hooks, though; they don't have enough complexity to code for anything as complex as language concepts. They are gross, inaccurate statistical manipulators that encourage certain behaviours a priori.

If these 'things' could talk, they would be constantly telling us to: (live in groups, groups are good, socializing is good, share information, have sex, don't have sex with your family, smiles are good, laughter is good, babies are cute, protect babies, it's good when people like you, etc etc.)

Another basic drive appears to be the drive for learning itself, and it's interesting how far that alone could take you. The learning drive is crucial. Indeed the default 'universal intelligence' (something like AIXI) may just be the learning drive taken to the horizon. Of course, that default may not necessarily be good for us, and moreover it may not even be the most efficient.

However, something to ponder is that the idea of "taking the learning drive to the horizon" (maximize knowledge) is surprisingly close to the main cosmic goal of most transhumanists, extropians, singularitarians, etc etc. Something to consider: perhaps there is some universal tendency towards a universal intelligence (and a single universal goal).

Looking at it this way, scientists and academic types have a stronger than usual learning drive, closely correlated with higher-than-average intelligence. The long-standing ascetic and monastic traditions in human cultures show how memetics can sometimes override the genetic drives completely, resulting in beings who have sacrificed all genetic fitness for memetic fitness. Most scientists don't go to that extreme, but it is a different mindset - and the drives are different.

If the emotions and basic drives are different, the values learned will be different

Sure, but we don't need all the emotions and basic drives. Even if we take direct inspiration from the human brain, some are actually easy to remove - as mentioned earlier. Sexuality (as a drive) is surprisingly easy to remove (although certainly considered immoral to inflict on humans! we seem far less concerned with creating asexual AIs), along with most of the rest.

The most important is the learning drive. Some of the other more complex social drives we may want to keep, and the emotional reinforcement learning systems in general may actually just be nifty solutions to very challenging engineering problems - in which case we will keep some of them as well.

I don't find your 2^1024 analysis useful - the space of possible drives/brains created by the genome is mainly empty - almost all designs are duds, stillbirths.

We aren't going to be randomly picking random drives from a lottery. We will either be intentionally taking them from the brain, or intentionally creating new systems.

If the AI isn't comforted by physical contact, that's at least a few bytes of the drive description that's different from the description that matches our drives. That difference throws out a huge chunk of how our morality has evolved to instill itself.

There is probably a name for this as a 'disorder', but I had a deep revulsion of physical contact as a child. I grew out of this to a degree later. I don't see the connection to morality.

That difference throws out a huge chunk of how our morality has evolved to instill itself.

Part of the problem here is morality is a complex term.

The drives and the older simpler control systems in the brain do not operate at the level of complex linguistic concepts - that came much much later. They can influence our decisions and sense of right/wrongness for simple decisions especially, but they have increasingly less influence as you spend more time considering the problem and developing a more complex system of ethics.

We might still be able to get an alien mind to adopt all the complex values we have, but we would have to translate the actions we would normally take into actions that match alien emotions.

Alien mind? Who is going to create alien minds? There is the idea of running some massive parallel universe sim to evolve intelligence from scratch, but that's just silly from a computational point of view.

The most likely contender at this point is reverse engineering the brain, and to the extent that human morality has some genetic tweaked-tendencies, we can get those by reverse engineering the relevant circuits.

But remember the genetically preserved emotional circuits are influencers on behavior, but minor, and are not complex enough to cope with abstract linguistic concepts.

Again: there is nothing in the genome that tells you that slavery is wrong, or that human sacrifice is wrong, or that computers can have rights.

Those concepts operate on an entirely new plane that the genome does not participate in.

Comment author: JamesAndrix 04 September 2010 10:42:27PM 0 points [-]

I'm not talking about the genome.

1024 bits is an extremely lowball estimate of the complexity of the basic drives and emotions in your AI design. You have to create those drives out of a huge universe of possible drives. Only a tiny subset of possible designs is human-like. Most likely you will create an alien mind. Even handpicking drives: it's a small target, and we have no experience with generating drives for even near-human AI. The shape of all human-like drive sets within the space of all possible drive sets is likely to be thin and complexly twisty within the mapping of a human-designed algorithm. You won't intuitively know what you can tweak.

Also, a set of drives that yields a nice AI at human levels might yield something unfriendly once the AI is able to think harder about what it wants. (and this applies just as well to upgrading existing friendly humans.)

All intellectual arguments about complex concepts of morality stem from simpler concepts of right and wrong, which stem from basic preferences learned in childhood. But THOSE stem from emotions and drives which flag particular types of early inputs as important in the first place.

A baby will cry when you pinch it, but not when you bend a paperclip.

live in groups, groups are good, socializing is good, share information, have sex, don't have sex with your family, smiles are good, laughter is good, babies are cute, protect babies, it's good when people like you

Estimating 1 bit per character, that's 214 bits. Still a huge space.
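The 1-bit-per-character estimate can be checked directly against the drive list quoted from the parent comment:

```python
# Length of the drive list quoted above, at 1 bit per character.
drives = ("live in groups, groups are good, socializing is good, "
          "share information, have sex, don't have sex with your family, "
          "smiles are good, laughter is good, babies are cute, "
          "protect babies, it's good when people like you")
print(len(drives))  # 214 characters, i.e. ~214 bits
```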

There is probably a name for this as a 'disorder', but I had a deep revulsion of physical contact as a child. I grew out of this to a degree later. I don't see the connection to morality.

It could be that there is another mechanism that guides adoption of values, which we don't even have a word for yet.

A simpler explanation is that moral memes evolved to be robust to most of the variation in basic drives that exists within the human population. A person born with relatively little 'frowns are bad' might still be taught not to murder with a lesson that hooks into 'groups are good'.

But there just aren't many moral lessons structured around the basic drive of 'paperclips are good' (19 bits)

Comment author: jacob_cannell 05 September 2010 01:15:47AM *  0 points [-]

You have to create those drives out of a huge universe of possible drives. Only a tiny subset of possible designs is human-like. Most likely you will create an alien mind

The subset of possible designs is sparse - and almost all of the space is an empty, worthless desert. Evolution works by exploring paths in this space incrementally. Even technology evolves - each CPU design is not a random new point in the space of all possible designs - each is necessarily close to previously explored points.

All intellectual arguments about complex concepts of morality stem from simpler concepts of right and wrong, which stem from basic preferences learned in childhood.

Yes - but they are learned memetically, not genetically. The child learns what is right and wrong through largely subconscious cues in the tone of voice of the parents, explicit yes/no (some of the first words learned), and explicit punishment. It's largely a universal learning system with an imprinting system to soak up memetic knowledge from the parents. The genetics provided the underlying hardware and learning algorithm, but the content is all memetic (software/data).

Saying intellectual arguments about complex concepts such as morality relate back to genetics is like saying all arguments about computer algorithm design stem from simpler ideas, which ultimately stem from Enlightenment thinkers of three hundred years ago - or perhaps paleolithic cave dwellers inventing fire.

Part of this disagreement could stem from different underlying background assumptions - for example, I am probably less familiar with ev psych than many people on LW - partly because (to the extent I have read it) I find it grossly over-extended past any objective evidence (compared to, say, computational neuroscience). I find that ev psych has minor utility in actually understanding the brain, and is even less useful for attempting to make sense of culture.

Trying to understand culture/memetics/minds with ev psych or even neuroscience is even worse than trying to understand biology through physics. Yes it did all evolve from the big bang, but that was a long long time ago.

So basically, anything much more complex than our inner reptile brain (which is all the genome can code for) needs to be understood in memetic/cultural/social terms.

For example, in many civilizations it has been perfectly acceptable to kill or abuse slaves. In some it was acceptable for brothers and sisters to get married, for homosexual relations between teacher and pupil, and we could go on and on.

The idea that there is some universally programmed 'morality' in the genome is ... a convenient fantasy. It seems reasonable only because we are samples in the dominant Judeo-Christian memetic super-culture, which at this point has spread its influence all over the world and dominates most of it.

But there are alternate histories and worlds where that just never happened, and they are quite different.

A child's morality develops as a vast accumulation of tiny cues and triggers communicated through the parents - and these are memetic transfers, not genetic. (masturbation is bad, marriage is good, slavery is wrong, racism is wrong, etc etc etc etc)

But there just aren't many moral lessons structured around the basic drive of 'paperclips are good' (19 bits)

The basic drive 'paperclips are good' is actually a very complex thing we'd have to add to an AGI design - it's not something that would just spontaneously appear.

The easier, more practical AGI design would be a universal learning engine (inspired by the human cortex & hippocampus) and a simulation loop (the hippo-thalamic-cortical circuit), combined with just a subset of the simpler reinforcement learning circuits (the most important being learning-reinforcement itself and imprinting).

And then with imprinting you teach the developing AGI morality in the same way humans learn morality - memetically. Trying to hard-code the morality into the AGI is a massive step backwards from the human brain's design.

Comment author: JamesAndrix 05 September 2010 06:20:46AM 0 points [-]

One thing I want to make clear is that it is not the correct way to make friendly AI to try to hard code human morality into it. Correct Friendly AI learns about human morality.

MOST of my argument really really isn't about human brains at all. Really.

For a value system in an AGI to change, there must be a mechanism to change the value system. Most likely that mechanism will work off of existing values, if any. In such cases, the complexity of the initial value system is the compressed length of the modification mechanism, plus any initial values. This will almost certainly be at least a kilobit.

If the mechanism+initial values that your AI is using were really simple, then you would not need 1024 bits to describe it. The mechanism you are using is very specific. If you know you need to be that specific, then you already know that you're aiming for a target that specific.

The subset possible of designs is sparse - and almost all of the space is an empty worthless desert.

If your generic learning algorithm needs a specific class of motivation mechanisms, with 1024 bits of specificity, in order to still be intelligent, then the mechanism you made is actually part of your intelligence design. You should separate that out for clarity; an AGI should be general.

The idea that there is some universally programmed 'morality' in the genome is . ... a convenient fantasy.

Heh yeah, but I already conceded that.

Let me put it this way: emotions and drives and such are in the genome. They act as a (perhaps relatively small) function which takes various sensory feeds as arguments, and produce as output modifications to a larger system, say a neural net. If you change that function, you will change what modifications are made.

Given that we're talking about functions that also take their own output as input and do pretty detailed modifications on huge datasets, there is tons of room for different functions to go in different directions. There is no generic morality-importer.

Now there may be clusters of similar functions which all kinda converge given similar input, especially when that input is from other intelligences repeating memes evolved to cause convergence on that class of functions. But even near those clusters are functions which do not converge.

But there just aren't many moral lessons structured around the basic drive of 'paperclips are good' (19 bits)

The basic drive 'paperclips are good' is actually a very complex thing we'd have to add to an AGI design - it's not something that would just spontaneously appear.

I think it's great that you're putting the description of a paperclip in the basic drive complexity count, as that will completely blow away the kilobit for storing any of the basic human drives you've listed. Maybe the complexity of the important subset of human drives will be somewhere in the ballpark of the complexity of the reptilian brain.

Another thing I could say to describe my point: if you have a generic learning algorithm, then whatever things feed rewards or punishments to that algorithm should be seen as part of that algorithm's environment. Even if some of those things are parts of the agent as a whole, they are part of what the values-agnostic learning algorithm is going to learn to get reward from.

So if you change an internal reward generator, it's just like changing the environment of the part that just does learning. So two AIs with different internal reward generators will end up learning totally different things about their 'environment'.

To say that a different way: Everything you try to teach the AI will be filtered through the lens of its basic drives.
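A minimal sketch of that claim, assuming toy reward functions and a generic tabular learner (the action names and rewards are invented for illustration): two copies of the same learning algorithm, differing only in their internal reward generator, converge on opposite preferences in the identical environment.

```python
import random

# One generic epsilon-greedy value learner; only the reward generator varies.
def learn(reward_fn, actions, steps=2000, lr=0.1, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {a: 0.0 for a in actions}  # the learner's acquired "values"
    for _ in range(steps):
        # Mostly exploit the current values, occasionally explore.
        a = rng.choice(actions) if rng.random() < eps else max(q, key=q.get)
        q[a] += lr * (reward_fn(a) - q[a])  # update toward observed reward
    return max(q, key=q.get)  # the preference the agent settles on

actions = ["socialize", "hoard_paperclips"]
social_reward = lambda a: 1.0 if a == "socialize" else 0.0
clippy_reward = lambda a: 1.0 if a == "hoard_paperclips" else 0.0

# Same algorithm, same environment, different internal reward generators:
assert learn(social_reward, actions) == "socialize"
assert learn(clippy_reward, actions) == "hoard_paperclips"
```

Everything the learner experiences is filtered through its reward generator, so swapping that generator is indistinguishable, from the learner's side, from swapping the environment.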

Comment author: jacob_cannell 05 September 2010 06:03:12PM *  -1 points [-]

For a value system in an AGI to change, there must be a mechanism to change the value system.

I'm not convinced that an AGI needs a value system in the first place (beyond the basic value of survival) - but perhaps that is because I am taking 'value system' to mean something similar to morality - a goal evaluation mechanism.

As I discussed, the infant human brain does have a number of inbuilt simple reinforcement learning systems that do reward/punish on a very simple scale for some simple drives (pain avoidance, hunger) - and you could consider these a 'value system', but most of these drives appear to be optional.

Most of the learning an infant is doing is completely unsupervised learning in the cortex, and it has little to nothing to do with a 'value system'.

The bare-bones essentials could be just the cortical learning system itself and perhaps an imprinting mechanism.

So two AI's with different internal reward generators will end up learning totally different things about their 'environment'.

This is not necessarily true; it does not match what we know from theoretical models such as AIXI. With enough time and enough observations, two general universal intelligences will converge on the same beliefs about their environment.

Their goal/reward mechanisms may be different (i.e. what they want to accomplish), but for a given environment there is a single correct set of beliefs, a single correct simulation of that environment, that AGIs should converge to.

Of course in our world this is so complex that it could take huge amounts of time, but science is the example mechanism.