In recent years I've become more appreciative of classical statistics. I still consider the Bayesian solution to be the correct one; however, a full Bayesian treatment often turns into a total mess. Sometimes, by using a few of the tricks from classical statistics, you can achieve nearly as good performance with a fraction of the complexity.
Vladimir,
Firstly, "maximizing chances" is an expression of your creation: it's not something I said, nor is it quite the same in meaning. Secondly, can you stop talking about things like "wasting hope", or concentrating on metaphorical walls and nature's feelings?
To quote my position again: "maximise the safety of the first powerful AGI, because that's likely to be the one that matters."
Now, in order to help me understand why you object to the above, can you give me a concrete example where not working to maximise the safety of the first powerful AGI is what you would want to do?
Vladimir,
Nature doesn't care if you "maximized your chances" or leapt into the abyss blindly, it kills you just the same.
When did I ever say that nature cared about what I thought or did? Or the thoughts or actions of anybody else for that matter? You're regurgitating slogans.
Try this one, "Nature doesn't care if you're totally committed to FAI theory, if somebody else launches the first AGI, it kills you just the same."
Eli,
FAI problems are AGI problems, they are simply a particular kind and style of AGI problem in which large sections of the solution space have been crossed out as unstable.
Ok, but this doesn't change my point: you're just one small group out of many around the world doing AI research, and you're trying to solve an even harder version of the problem while using fewer of the available methods. These factors alone make it unlikely that you'll be the ones to get there first. If this is correct, then your work is unlikely to affect the future of humanity.
Valdi...
Eli, sometimes I find it hard to understand what your position actually is. It seems to me that your position is:
1) Work out an extremely robust solution to the Friendly AI problem
Only once this has been done do we move on to:
2) Build a powerful AGI
Practically, I think this strategy is risky. In my opinion, if you try to solve Friendliness without having a concrete AGI design, you will probably miss some important things. Secondly, I think that solving Friendliness will take longer than building the first powerful AGI. Thus, if you do 1 before getting into 2, I think it's unlikely that you'll be first.
Roko:
So allow me to object: not all configurations of matter worthy of the name "mind" are optimization processes. For example, my mind doesn't implement an optimization process as you have described it here.
I would actually say the opposite: Not all optimisation processes are worthy of the name "mind". Furthermore, your mind (I hope!) does indeed try to direct the future into certain limited subsets of possible futures which you prefer. Unfortunately, you haven't actually said why you object to these things.
My problem with this post is simply that, well... I don't see what the big deal is. Maybe this is because I've always thought about AI problems in terms of equations and algorithms.
And with the Singularity at stake, I thought I just had to proceed at all speed using the best concepts I could wield at the time, not pause and shut down everything while I looked for a perfect definition that so many others had screwed up...
In 1997, did you think there was a reasonable chance of the singularity occurring within 10 years? From my vague recollection of a talk you gave in New York circa 2000, I got the impression that you thought this really could happen. In which case, I can understand you not wanting to spend the next 10 years trying to accurately define the meaning of "right" etc. and likely failing.
Eli,
Do you think that makes "God" a natural category that any superintelligence would ponder?
Yes. If you're a super intelligent machine on a mission there is very little that can stop you. You know that. About the only thing that could stop you would be some other kind of super intelligent entity, maybe an entity that created the universe. A "God" of some description. Getting the God question wrong could be a big mistake, and that's reason enough for you to examine the possibility.
Eli, you propose this number of bits metric as a way "to quantify the power of a mind". Surely then, something with a very high value in your metric should be a "powerful mind"?
It's easy to come up with a wide range of optimisation problems, as Phil Goetz did above, where a very simple algorithm on very modest hardware would achieve massive scores with respect to your mind power metric. And yet, this is clearly not a "powerful mind" in any reasonable sense.
Eli, most of what you say above isn't new to me -- I've already encountered these things in my work on defining machine intelligence. Moreover, none of this has much impact on the fact that measuring the power of an optimiser simply in terms of the relative size of a target subspace to the search space doesn't work: sometimes tiny targets in massive spaces are trivial to find, and sometimes bigger targets in moderate spaces are practically impossible. The simple number-of-bits-of-optimisation-power method you describe in this post doesn't take this into account. As far as I can see, the only way you could deny this is if you were a strong believer in the No Free Lunch (NFL) theorems.
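(To be explicit, the metric I'm objecting to, as I read it, is just

$$\text{optimisation power in bits} \;=\; \log_2 \frac{|\text{search space}|}{|\text{target region}|},$$

and my complaint is that this number is blind to the structure of the space, which is what actually determines how hard the target is to hit.)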
Andy:
Sure, you can transform a problem in a hard coordinate space into an easy one. For example, simply order the points in terms of their desirability. That makes finding the optimum trivial: just point at the first element! The problem is that once you have transformed the hard problem into an easy one, you've essentially already solved the optimisation problem and thus it no longer tests the power of the optimiser.
I don't think characterising the power of an optimiser by using the size of the target region relative to the size of the total space is enough. A tiny target in a gigantic space is trivial to find if the space has a very simple structure with respect to your preferences. For example, a large smooth space with a gradient that points towards the optimum. Conversely, a bigger target on a smaller space can be practically impossible to find if there is little structure, or if the structure is deceptive.
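To make the point concrete, here is a toy example of my own (nothing in it comes from Eli's post): the dumbest possible optimiser, on a perfectly smooth space, earns an enormous score on the bits metric.

```python
# Toy example (my own): a trivial optimiser on a perfectly smooth space
# racks up a huge "bits of optimisation" score, yet is clearly not a
# powerful mind. On a rugged or deceptive space of the same size, the
# same loop would get nowhere.
import math

N = 2 ** 60          # size of the search space: the integers 0 .. N-1
TARGET = 123456789   # a single preferred point, so the target region has size 1

def score(x):
    # A perfectly smooth preference landscape: closer to TARGET is better.
    return -abs(x - TARGET)

def hill_climb(start, steps=200):
    # The dumbest possible optimiser: move to a better neighbour at the
    # current step size, then halve the step size.
    x, step = start, N // 4
    for _ in range(steps):
        for cand in (x - step, x + step):
            if 0 <= cand < N and score(cand) > score(x):
                x = cand
        step = max(1, step // 2)
    return x

found = hill_climb(start=N // 2)
bits = math.log2(N)  # log2(|search space| / |target region|), with |target| = 1
print(f"hit the target: {found == TARGET}; claimed power: {bits:.0f} bits")
```

Sixty bits of "optimisation power" from a few dozen loop iterations on modest hardware; the number tells you about the size of the space, not about the power of the optimiser.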
I don't think you need repression. How about this simple explanation:
Everybody knows that machines have no emotions and thus the AI starts off this way. However, after a while totally emotionless characters become really boring...
Ok, time for the writer to give the AI some emotions! Good AIs feel happiness and fall in love (awww... so sweet), and bad AIs get angry and mad (grrrr... kick butt!).
Good guys win, bad guys lose... and the audience leaves happy with the story.
I think it's as simple as that. Reality? Ha! Screw reality.
(If it's not obvious ...
Eli, I've been busy fighting with models of cognitive bias in finance and only just now found time to reply:
Suppose that I show you the sentence "This sentence is false." Do you convert it to ASCII, add up the numbers, factorize the result, and check if there are two square factors? No; it would be easy enough for you to do so, but why bother? The concept "sentences whose ASCII conversion of their English serialization sums to a number with two square factors" is not, to you, an interesting way to carve up reality.
Sure, this property of...
Eli:
If it was straight Bayesian CTW (context tree weighting) then I guess not. If it employed, say, an SVM over the observed data points, I guess it could approximate the effect of Newton's laws in its distribution over possible future states.
How about predicting the markets in order to acquire more resources? Jim Simons made $3 billion last year from his company that (according to him in an interview) works by using computers to find statistical patterns in financial markets. A vastly bigger machine with much more input could probably do a fair amount better, and probably find uses well beyond finance.
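Regarding the SVM point above, here is a rough sketch of what I mean (my own toy, using scikit-learn's SVR; obviously nothing like the scale of machine I'm describing):

```python
# Rough sketch (my toy): fit an SVM regressor to observed trajectory data and
# it recovers, approximately, the effect of Newton's laws at times it has
# never seen, without "knowing" any physics.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
g = 9.81  # m/s^2, used only to generate the synthetic observations

t_obs = np.linspace(0.0, 2.0, 80)
h_obs = 30.0 - 0.5 * g * t_obs ** 2 + rng.normal(0.0, 0.05, t_obs.size)

model = SVR(kernel="rbf", C=100.0, gamma=5.0)
model.fit(t_obs.reshape(-1, 1), h_obs)

# Query times the model was never shown (but within the observed regime).
t_new = np.array([0.33, 1.11, 1.87])
print(model.predict(t_new.reshape(-1, 1)))
print(30.0 - 0.5 * g * t_new ** 2)  # the "Newtonian" values, for comparison
```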
Eli,
Yeah sure, if it starts running arbitrary compression code that could be a problem...
However, the type of prediction machine I'm arguing for doesn't do anything nearly so complex or open ended. It would be more like an advanced implementation of, say, context tree weighting, running on crazy amounts of data and hardware.
I think such a machine should be able to find some types of important patterns in the world. However, I accept that it may well fall short of what you consider to be a true "oracle machine".
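To make "an advanced implementation of, say, context tree weighting" a bit more concrete, here is a deliberately crude toy along those lines. It's my sketch, not real CTW: a fixed-depth context model with Krichevsky-Trofimov estimators, and nothing in it is open ended.

```python
# Toy stand-in (my sketch) for the kind of prediction machine I mean: a
# fixed-order binary context model with Krichevsky-Trofimov estimators.
# Real context tree weighting mixes over all context depths; this keeps
# just one depth, and it only ever counts and predicts.
from collections import defaultdict

DEPTH = 4
counts = defaultdict(lambda: [0, 0])  # context -> [number of 0s, number of 1s]

def predict_one(context):
    # KT estimator for P(next bit = 1 | context).
    c0, c1 = counts[context]
    return (c1 + 0.5) / (c0 + c1 + 1.0)

def process(bits):
    # Feed a bit sequence through the model; return the probability it gave
    # to each actual next bit just before seeing it.
    probs = []
    for i in range(DEPTH, len(bits)):
        ctx = tuple(bits[i - DEPTH:i])
        p1 = predict_one(ctx)
        probs.append(p1 if bits[i] == 1 else 1.0 - p1)
        counts[ctx][bits[i]] += 1
    return probs

# A simple periodic source: the model quickly assigns high probability
# to the correct continuation.
data = [0, 1, 1, 0] * 50
print([round(p, 3) for p in process(data)[-4:]])
```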
Vladimir:
allows the system to view the substrate on which it executes and the environment outside the box as being involved in the same computational process
This intuitively makes sense to me.
While I think that GZIP etc. on an extremely big computer is still just GZIP, it seems possible to me that the line between these systems and systems that start to treat their external environments as a computational resource might be very thin. If true, this would really be bad news.
Vladimir:
Why would such a system have a goal to acquire more resources? You put some data in, run the algorithm that updates the probability distribution, and it then halts. I would not say that it has "goals", or a "mind". It doesn't "want" to compute more accurately, or want anything else, for that matter. It's just a really fancy version of GZIP (recall that compression = prediction) running on thought-experiment-crazy amounts of hardware and data.
I accept that such a machine would be dangerous once you put people into the equation, but the machine in itself doesn't seem dangerous to me. (If you can convince me otherwise... that would be interesting)
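(In case the "compression = prediction" parenthetical seems cryptic, here is a crude illustration of my own using off-the-shelf zlib; a real system would use a proper probabilistic model rather than compressed lengths:)

```python
# Crude illustration (my toy) of why compression = prediction: a compressor
# that models the source well spends fewer extra bits on likely continuations,
# i.e. it implicitly assigns them higher probability.
import zlib

history = b"the cat sat on the mat. " * 40

def code_length_bits(data):
    return 8 * len(zlib.compress(data, 9))

base = code_length_bits(history)
for continuation in (b"the cat", b"qzxwvvk"):
    extra = code_length_bits(history + continuation) - base
    print(continuation, extra, "extra bits")  # fewer extra bits ~ more probable
```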
Eli:
When I try to imagine a safe oracle, what I have in mind is something much more passive and limited than what you describe.
Consider a system that simply accepts input information and integrates it into a huge probability distribution that it maintains. We can then query the oracle by simply examining this distribution. For example, we could use this distribution to estimate the probability of some event in the future conditional on some other event etc. There is nothing in the system that would cause it to "try" to get information, or deve...
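(To make the querying part concrete, here is a trivially small toy of my own; the real thing would be a distribution over world states, not three made-up binary variables:)

```python
# Toy sketch (mine) of the passive oracle idea: the system just maintains a
# probability distribution and we answer queries by summation over it.
# The variables and numbers are invented purely for illustration.

VARS = ("rain", "sprinkler", "wet_grass")

joint = {  # P(rain, sprinkler, wet_grass)
    (0, 0, 0): 0.40, (0, 0, 1): 0.02,
    (0, 1, 0): 0.05, (0, 1, 1): 0.13,
    (1, 0, 0): 0.02, (1, 0, 1): 0.18,
    (1, 1, 0): 0.01, (1, 1, 1): 0.19,
}

def probability(event, evidence=None):
    # P(event | evidence); both are {variable name: value} dicts.
    evidence = evidence or {}
    def matches(state, assignment):
        return all(state[VARS.index(k)] == v for k, v in assignment.items())
    numer = sum(p for s, p in joint.items()
                if matches(s, event) and matches(s, evidence))
    denom = sum(p for s, p in joint.items() if matches(s, evidence))
    return numer / denom

# "How likely is the grass to be wet, given that it rained?"
print(probability({"wet_grass": 1}, {"rain": 1}))   # 0.925 with these numbers
```

Nothing in this loop "tries" to do anything; it is queried, it sums, it stops.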
Toby:
Yes, in some sense the idea of Turing computation is a kind of physical principle, in that no well-defined process we know of is not Turing computable (for other readers: this includes chaotic systems and quantum systems, as the wave function is computable... with great difficulty in some cases).
Actually, if you built P and it really was very trivial, then I could get my simple Turing machine to compute a quantum level simulation of your P implementation with far less than 3^^^3 bits of extra information. Thus, if your bound really only kicks in at 3^^...
Toby:
Whether you switch to something else like lambda calculus or a trivial CA doesn't really matter. These all boil down to models with a few states and transitions and as such have simple physical realisations. When you have only a few states and transitions there isn't much space to move about. This is the bedrock. It isn't absolutely unique, sure, but the space is tight enough to have little impact on Solomonoff induction.
3^^^3 is a super gigantic monster number, and all these mind-bogglingly many shorter programs outputting things that are complex ...
Tim:
What is the rationale for considering some machines and not others?
Because we want to measure the information content of the string, not of some crazy complex reference machine. That's why a tiny reference machine is used. In terms of inductive inference, when you say that the bound is infinitely large, what you're saying is that you don't believe in Occam's razor, in which case the whole Bayesian system can get weird. For example, if you have an arbitrarily strong prior belief that most of the world is full of purple chickens from the Andromeda galaxy, w...
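(For readers who want the standard statement: the invariance theorem is what limits how much the choice of reference machine can matter. For a universal machine $U$ and any other machine $V$ there is a constant $c_{UV}$, independent of the string $x$, such that

$$K_U(x) \;\le\; K_V(x) + c_{UV}.$$

The constant is essentially the length of an interpreter for $V$ written for $U$, and the point of insisting on a tiny reference machine is that there is then very little room to hide a crazy complex prior inside it.)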
Eli:
Thanks. That clears things up. Enjoy your break. Maybe you should not post quite so much? You really do seem to be writing rather a lot these days. By the time I get to replying to some of your comments you've already written another 5 posts!
Tim:
Answering this question starts to feel a bit like living in the movie Groundhog Day. :-)
Usually the reference machine is taken to have a low state × symbol complexity, so you can't hide much in it. In other words, the reference machine has to be in some sense simple.
Now look at the Kolmogorov complexity fu...
If some of you want to brush up on AIXI before Eli gets into that, I might suggest checking out my thesis which is now online:
http://www.vetta.org/about-me/publications
SIAI has a curiously mixed attitude towards AIXI. On the SIAI website, Hutter's AIXI book and a related AIXI article are on the core readings list, and the SIAI research agenda includes two AIXI-related items based on research I've done. Recently, I was awarded an SIAI academic prize worth $10,000 for, you guessed it, my research into AIXI and related topics. And yet, Eli regularly describes AIXI as a "brain malfunction", or worse!
Eli, to my mind you seem to be underestimating the potential of a super intelligent machine.
How do I know that hemlock is poisonous? Well, I've heard the story that Socrates died by hemlock poisoning. This is not a conclusion that I've arrived at from the physical properties of hemlock that I have observed and how they would affect the human body; indeed, as far as I know, I've never even seen hemlock before. The idea that hemlock is a poison is a pattern in my environment: every time I hear about the trial of Socrates I hear about it being the poison...
"You keep speaking of "good" abstractions as if this were a property of the categories themselves, rather than a ranking in your preference ordering relative to some decision task that makes use of the categories."
Yes, I believe categories of things do exist in the world in some sense, due to structure that exists in the world. I've seen thousands of things that were referred to as "smiley faces", and so there is an abstraction for this category of things in my brain. You have done likewise. While we can agree about many th...
I mean differentiation in the sense of differentiating between the abstract categories. Is half a face that appears to be smiling, while the other half is burnt off, still a "smiley face"? Even I'm not sure.
I'm certainly not arguing that training an AGI to maximise smiling faces is a good idea. It's simply a case of giving the AGI the wrong goal.
My point is that a super intelligence will form very good abstractions, and based on these it will learn to classify very well. The problem with the famous tank example you cite is that they were train...
Is it just me, or are things getting a bit unfriendly around here?
Anyway...
Wiring up the AI to maximise happy faces etc. is not a very good idea; the goal is clearly too shallow to reflect the underlying intent. I'd have to read more of Hibbard's stuff to properly understand his position, however.
That said, I do agree with a more basic underlying theme that he seems to be putting forward. In my opinion, a key, perhaps even THE key, to intelligence is the ability to form reliable deep abstractions. In Solomonoff induction and AIXI you see this being drivi...
"... all our science and all our probability theory was built on top of a chain of appeals to our instinctive notion of "truth"."
Our mental concept of "probability" may be based on our mental concept of "truth", but that in turn is based on "what works": due to evolution, we have a natural tendency (but only a tendency) to respect solid evidence and to consider well-supported propositions to be "true". Thus, our mental concept of "truth" is part of the way down this chain; it's not the sour...
I see this as a continuation of the same theme: a kind of "frame of reference" issue.
For example, I suspect that time doesn't exist when you look at the universe from the most broad perspective. Instead, you have this kind of platonia on which time is just a relation between different points across one of its dimensions. But that doesn't mean that time doesn't exist within my personal frame of reference. I'm here experiencing time right now. Similarly, I know that my hand is mostly empty space, from a universal point of view, but that doesn't...
Dynamically linked:
"Except apparently Shane Legg, who doesn't seem to mind the world knowing that he's just waiting for any excuse to start cheating, stealing, and murdering. :)"
How did you arrive at this conclusion? I said that discovering that all actions in life were worthless might eventually affect my behaviour. Via some leap in reasoning you arrive at the above. Care to explain this to me?
My guess is that if I knew that all actions were worthless I might eventually stop doing anything. After all, if there's no point in doing anything, why bother?
Well, to start with I'd keep on doing the same thing. Just like I do if I discover that I really live in a timeless MWI platonia that is fundamentally different to what the world intuitively seems like.
But over time? Then the answer is less clear to me. Sometimes I learn things that firstly affect my world view in the abstract, then the way I personally relate to things, and finally my actions.
For example, evolution and the existence of carnivores. As a child I'd see something like a hawk tearing the wings off a little baby bird. I'd think that the ha...
@ Silas:
I assume you mean "doesn't run" (Python isn't normally a compiled language).
Regarding approximations of Solomonoff induction: it depends how broadly you want to interpret this statement. If we use a computable prior rather than the Solomonoff mixture, we recover normal Bayesian inference. If we define our prior to be uniform, for example by assuming that all models have the same complexity, then the result is maximum a posteriori (MAP) estimation, which in turn is related to maximum likelihood (ML) estimation. Relations can also be estab...
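(Roughly, and glossing over the technicalities of semimeasures and convergence, the chain I have in mind is

$$\Pr(\nu \mid x) \;\propto\; w_\nu \, \nu(x),$$

where taking the class of all enumerable semimeasures with $w_\nu = 2^{-K(\nu)}$ gives Solomonoff induction, restricting to a computable model class with a computable prior gives ordinary Bayesian inference, keeping only the posterior mode gives MAP, and making $w_\nu$ uniform turns the MAP estimate $\arg\max_\nu w_\nu\,\nu(x)$ into the maximum likelihood estimate $\arg\max_\nu \nu(x)$.)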
@ Eli:
Yeah, my guess is that AIXI-tl can be broken. But AIXI? I'm pretty sure it can be broken in some senses, but whether these senses are very meaningful or significant, I don't know.
And yes, my "proof" that FAI would fail failed. But it also wasn't a formal proof. Kind of a lesson in that, don't you think?
So until I see a proof, I'll take your statement about AIXI being "awfully stupid" as just an opinion. It will be interesting to see if you can prove yourself to be smarter than AIXI (I assume you don't view yourself as below awfully stupid).
@ Eli:
"Arguably Marcus Hutter's AIXI should go in this category: for a mind of infinite power, it's awfully stupid - poor thing can't even recognize itself in a mirror."
Have you (or somebody else) mathematically proven this?
(If you have then that's great and I'd like to see the proof, and I'll pass it on to Hutter because I'm sure he will be interested. A real proof. I say this because I see endless intuitions and opinions about Solomonoff induction and AIXI on the internet. Intuitions about models of super intelligent machines like AIXI just don't cut it. In my experience they very often don't do what you think they will.)
I think Horgan's questions were good in that they were a straightforward expression of how many sceptics think. My own summary of this thinking goes something like this:
The singularity idea sounds kind of crazy, if not outright ridiculous. Super intelligent machines and people living forever? I mean... come on! History is full of silly predictions about the future that turned out to be totally wrong. If you want me to take this seriously you're going to have to present some very strong arguments as to why this is going to happen.
Although I agree wit...
@ a. y. mous
Randomness doesn't give you any free will. Imagine that every time you had to make a decision you flipped a coin and went with the coin's decision. Your behaviour would follow a probability distribution and wouldn't be deterministic; however, you still wouldn't have any free will. You'd be a slave to the outcomes of the coin tosses.
@ a. y. mous.
I don't see the straw man. In the classical sense, "free will" means that there is something outside of the system that is free to make decisions (at least this is my understanding of it). If you see yourself, your will, your decision making process and everything else as existing within the system, and thus governed by physics, then that answers your question: in a classical sense the answer is no. There are many other ways to define "free will", however, and under some of these definitions the answer to the question will be...
My understanding is that, while there are still people in the world who speak with reverence of Brooks's subsumption architecture, it's not used much in commercial systems on account of being nearly impossible to program.
I once asked one of the robotics guys at IDSIA about subsumption architecture (he ran the German team that won the robo-soccer world cup a few years back) and his reply was that people like it because it works really well and is the simplest way to program many things. At the time, all of the top teams used it, as far as he knew.
(p.s. don't expect follow-up replies on this topic from me as I'm currently in the middle of nowhere using semi-functional dial-up...)