XiXiDu comments on Siren worlds and the perils of over-optimised search - Less Wrong
Comments (411)
Why do you think such an AI wouldn't just fail at being powerful, rather than being powerful in a catastrophic way?
If programs fail in the real world then they are not working well. You don't happen to come across a program that manages to prove the Riemann hypothesis when you designed it to prove the irrationality of the square root of 2.
If it fails at being powerful, we don't have to worry about it, so I feel free to ignore those probabilities.
But you might come across a program motivated to eliminate all humans if you designed it to optimise the economy...
So you're not pursuing the claim that a SAI will probably be dangerous, you are just worried that it might be?
My claim has always been that the probability that an SAI will be dangerous is too high to ignore. I fluctuate on the exact probability, but I've never seen anything that drives it down to a level I feel comfortable with (in fact, I've never seen anything drive it below 20%).
This line of reasoning still seems flawed to me. It's just like saying that you can build an airplane that can fly and land, autonomously, except that your plane is going to forcefully crash into a nuclear power plant.
The gist of the matter is that there are a vast number of ways that you can fail at predicting your program's behavior. Most of these failure modes are detrimental to the overall optimization power of the program. This is because being able to predict the behavior of your AI, to the extent necessary for it to outsmart humans, is analogous to predicting that your airplane will fly without crashing. Eliminating humans, in order to optimize the economy, is about as likely as your autonomous airplane crashing into a nuclear power plant, in order to land safely.
I don't know why you think you can predict the likely outcome of an artificial general intelligence by making surface analogies to things that aren't even optimization processes. People have been using analogies to "predict" nonsense for centuries.
In this case there are a variety of reasons that a programmer might succeed at preventing a UAV from crashing into a nuclear power plant, yet fail at preventing AGI from eliminating all humans. Mainly revolving around the fact that most programmers wouldn't even consider the "eliminate all humans" option as a serious possibility until it had already occurred, while the problem of physical obstructions is explicitly a part of the UAV problem definition. That itself has to do with the fact that an AGI can represent internally features of the world that weren't even considered by the designers (due to general intelligence).
As an aside, serious misconfigurations or unintended results of computer programs happen all the time today, but you don't generally hear or care about them because they don't end the world.
This is why the Wise employ normative uncertainty and the learning of utility functions from data, rather than hardcoding verbal instructions that only make sense in light of a complete human mind and social context.
Indeed. But the more of the problem you can formalise and solve (eg maintaining a stable utility function over self-improvements) the more likely the learning approach is to succeed.
Well yes, of course. I mean, if you can't build an agent that was capable of maintaining its learned utility while becoming vastly smarter (and thus capable of more accurately learning and enacting capital-G Goodness), then all that utility-learning was for nought.
The very idea underlying AI is enabling people to get a program to do what they mean without having to explicitly encode all details. What AI risk advocates do is to turn the whole idea upside down, claiming that, without explicitly encoding what you mean, your program will do something else. The problem here is that it is conjectured that the program will do what it was not meant to do in a very intelligent and structured manner. But this can't happen when it comes to intelligently designed systems (as opposed to evolved systems), because the nature of unintended consequences is overall chaotic.
How often have you heard of intelligently designed programs that achieved something highly complex and marvelous, but unintended, thanks to the programmers being unable to predict the behavior of the program? I don't know of any such case. But this is exactly what AI risk advocates claim will happen, namely that a program designed to do X (calculate 1+1) will perfectly achieve Y (take over the world).
If artificial general intelligence will eventually be achieved by some sort of genetic/evolutionary computation, or neuromorphic engineering, then I can see how this could lead to unfriendly AND capable AI. But an intelligently designed AI will either work as intended or be incapable of taking over the world (read: highly probable).
This of course does not ensure a positive singularity (if you believe that this is possible at all), since humans might use such intelligent and capable AIs to wreak havoc (ask the AI to do something stupid, or something that clashes with most human values). So there is still a need for "friendly AI". But this is quite different from the idea of interpreting "make humans happy" as "tile the universe with smiley faces". Such a scenario contradicts the very nature of intelligently designed AI, which is an encoding of “Understand What Humans Mean” AND “Do What Humans Mean”. More here.
Alexander, have you even bothered to read the works of Marcus Hutter and Juergen Schmidhuber, or have you spent all your AI-researching time doing additional copy-pastas of this same argument every single time the subject of safe or Friendly AGI comes up?
Your argument makes a measure of sense if you are talking about the social process of AGI development: plainly, humans want to develop AGI that will do what humans intend for it to do. However, even a cursory look at the actual research literature shows that the mathematically most simple agents (ie: those that get discovered first by rational researchers interested in finding universal principles behind the nature of intelligence) are capital-U Unfriendly, in that they are expected-utility maximizers with not one jot or tittle in their equations for peace, freedom, happiness, or love, or the Ideal of the Good, or sweetness and light, or anything else we might want.
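For concreteness, here is the AIXI action rule in the form it is usually written in Hutter's work. The notation is a sketch under the usual conventions: $U$ is a universal Turing machine, $q$ ranges over programs of length $\ell(q)$, $a_i$, $o_i$ and $r_i$ are actions, observations and rewards, $k$ is the current step and $m$ the horizon.

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The only preference-like quantity in it is the reward sum $r_k + \cdots + r_m$; everything else is prediction and maximisation.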
(Did you actually expect that in this utterly uncaring universe of blind mathematical laws, you would find that intelligence necessitates certain values?)
No, Google Maps will never turn superintelligent and tile the solar system in computronium to find me a shorter route home from a pub crawl. However, an AIXI or Goedel Machine instance will, because these are in fact entirely distinct algorithms.
In fact, when dealing with AIXI and Goedel Machines we have an even bigger problem than "tile everything in computronium to find the shortest route home": the much larger problem of not being able to computationally encode even a simple verbal command like "find the shortest route home". We are faced with the task of trying to encode our values into a highly general, highly powerful expected-utility maximizer at the level of, metaphorically speaking, pre-verbal emotion.
Otherwise, the genie will know, but not care.
Now, if you would like to contribute productively, I've got some ideas I'd love to talk over with someone for actually doing something about some few small corners of Friendliness subproblems. Otherwise, please stop repeating yourself.
I asked several people what they think about it, and to provide a rough explanation. I've also had e-Mail exchanges with Hutter, Schmidhuber and Orseau. I also informally thought about whether practically general AI that falls into the category “consequentialist / expected utility maximizer / approximation to AIXI” could ever work. And I am not convinced.
If general AI, which is capable of a hard-takeoff, and able to take over the world, requires fewer lines of code to work than it takes to constrain it not to take over the world, then that's an existential risk. But I don't believe this to be the case.
Since I am not a programmer, or computer scientist, I tend to look at general trends, and extrapolate from there. I think this makes more sense than to extrapolate from some unworkable model such as AIXI. And the general trend is that humans become better at making software behave as intended. And I see no reason to expect some huge discontinuity here.
Here is what I believe to be the case:
(1) The abilities of systems are part of human preferences as humans intend to give systems certain capabilities and, as a prerequisite to build such systems, have to succeed at implementing their intentions.
(2) Error detection and prevention is such a capability.
(3) Something that is not better than humans at preventing errors is no existential risk.
(4) Without a dramatic increase in the capacity to detect and prevent errors it will be impossible to create something that is better than humans at preventing errors.
(5) A dramatic increase in the human capacity to detect and prevent errors is incompatible with the creation of something that constitutes an existential risk as a result of human error.
Here is what I doubt:
(1) Present-day software is better than previous software generations at understanding and doing what humans mean.
(2) There will be future generations of software which will be better than the current generation at understanding and doing what humans mean.
(3) If there is better software, there will be even better software afterwards.
(4) Magic happens.
(5) Software will be superhumanly good at understanding what humans mean but catastrophically worse than all previous generations at doing what humans mean.
This is a much bigger problem for your ability to reason about this area than you think.
A relevant quote from Eliezer Yudkowsky (source):
And another one (source):
So since academic consensus on the topic is not reliable, and domain knowledge in the field of AI is negatively useful, what are the prerequisites for grasping the truth when it comes to AI risks?
I think that in saying this, Eliezer is making his opponents' case for them. Yes, of course the standard would also let you discard cryonics. One solution to that is to say that the standard is bad. Another solution is to say "yes, and I don't much care for cryonics either".
Nah, those are all plausibly correct things that mainstream science has mostly ignored and/or made researching taboo.
If you prefer a more clear-cut example, science was wrong about continental drift for about half a century -- until overwhelming, unmistakable evidence became available.
Ability to program is probably not sufficient, but it is definitely necessary. But not because of domain relevance; it's necessary because programming teaches cognitive skills that you can't get any other way, by presenting a tight feedback loop where every time you get confused, or merge concepts that needed to be distinct, or try to wield a concept without fully sharpening your understanding of it first, the mistake quickly gets thrown in your face.
And, well... it's pretty clear from your writing that you haven't mastered this yet, and that you aren't going to become less confused without stepping sideways and mastering the basics first.
On a complete sidenote, this is a lot of why programming is fun. I've also found that learning the Coq theorem-prover has exactly the same effect, to the point that studying Coq has become one of the things I do to relax.
That looks highly doubtful to me.
People have been telling him this for years. I doubt it will get much better.
At a minimum, a grasp of computer programming and CS. Computer programming, not even AI.
I'm inclined to disagree somewhat with Eliezer_2009 on the issue of traditional AI - even basic graph search algorithms supply valuable intuitions about what planning looks like, and what it is not. But even that same (obsoleted now, I assume) article does list computer programming knowledge as a requirement.
What counts as "a grasp" of computer programming/science? I can e.g. program a simple web crawler and solve a bunch of Project Euler problems. I've read books such as "The C Programming Language".
I would have taken the Udacity courses on machine learning by now, but the stated requirement is a strong familiarity with Probability Theory, Linear Algebra and Statistics. I wouldn't describe my familiarity as strong; that will take a few more years.
I am skeptical though. If the reason that I dismiss certain kinds of AI risks is that I lack the necessary education, then I expect to see rebuttals of the kind "You are wrong because of (add incomprehensible technical justification)...". But that's not the case. All I see are half-baked science fiction stories and completely unconvincing informal arguments.
Don't twist Eliezer's words. There's a vast difference between "a PhD in what they call AI will not help you think about the mathematical and philosophical issues of AGI" and "you don't need any training or education in computing to think clearly about AGI".
Too bad. I can download an inefficient but functional subhuman AGI from Github. Making it superhuman is just a matter of adding an entire planet's worth of computing power. Strangely, doing so will not make it conform to your ideas about "eventual future AGI", because this one is actually existing AGI, and reality doesn't have to listen to you.
That is exactly the situation we face, your refusal to believe in actually-existing AGI models notwithstanding. Whine all you please: the math will keep on working.
Then I recommend you shut up about matters of highly involved computer science until such time as you have acquired the relevant knowledge for yourself. I am a trained computer scientist, and I held lots of skepticism about MIRI's claims, so I used my training and education to actually check them. And I found that the actual evidence of the AGI research record showed MIRI's claims to be basically correct, modulo Eliezer's claims about an intelligence explosion taking place versus Hutter's claim that an eventual optimal agent will simply scale itself up in intelligence with the amount of computing power it can obtain.
That's right, not everyone here is some kind of brainwashed cultist. Many of us have exercised basic skepticism against claims with extremely low subjective priors. But we exercised our skepticism by doing the background research and checking the presently available object-level evidence rather than by engaging in meta-level speculations about an imagined future in which everything will just work out.
Take a course at your local technical college, or go on a MOOC, or just dust off a whole bunch of textbooks in computer-scientific and mathematical subjects, study the necessary knowledge to talk about AGI, and then you get to barge in telling everyone around you how we're all full of crap.
I think you are underestimating this by many orders of magnitude.
Yeah. A starting point could be the AI writing some 1000 letter essay (action space of 27^1000 without punctuation) or talking through a sound card (action space of 2^(16*44100) per second). If he was talking about mc-AIXI on github, the relevant bits seem to be in the agent.cpp and it ain't looking good.
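To put rough numbers on those two action spaces, here is a minimal sketch. It assumes a 27-symbol alphabet (26 letters plus space) for the essay and raw 16-bit samples at 44100 Hz for the sound card; the variable names are only illustrative.

```python
import math

# Essay: 1000 characters, each drawn from 27 symbols (assumed: 26 letters + space).
essay_log10 = 1000 * math.log10(27)      # log10 of 27^1000
essay_bits = 1000 * math.log2(27)        # same quantity measured in bits

# Sound card: one second of 16-bit samples at 44100 Hz.
audio_bits_per_second = 16 * 44100       # exponent in 2^(16*44100)
audio_log10 = audio_bits_per_second * math.log10(2)

print(f"essay: about 10^{essay_log10:.0f} possible texts ({essay_bits:.0f} bits)")
print(f"audio: about 10^{audio_log10:.0f} possible one-second outputs "
      f"({audio_bits_per_second} bits)")
```

A planner that has to consider each candidate action even once is hopeless at either size, which is the point being made.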
Which one are you talking about, to be completely exact?
then use that training and figure out how many galaxies worth of computing power it's going to take.
Of bleeding course I was talking about AIXI. What I find strange to the point of suspiciousness here is the evinced belief on the part of the "AI skeptics" that the inefficiency of MC-AIXI means there will never, ever be any such thing as near-human, human-equivalent, or greater-than-human AGIs. After all, if intelligence is impossible without converting whole galaxies to computronium first, then how do we work?
And if we admit that sub-galactic intelligence is possible, why not artificial intelligence? And if we admit that sub-galactic artificial intelligence is possible, why not something from the "Machine Learning for Highly General Hypothesis Classes + Decision Theory of Active Environments = Universal AI" paradigm started by AIXI?
I'm not at all claiming current implementations of AIXI or Goedel Machines are going to cleanly evolve into planet-dominating superintelligences that run on a home PC next year, or even next decade (for one thing, I don't think planet dominating superintelligences will run on a present-day home PC ever). I am claiming that the underlying scientific paradigm of the thing is a functioning reduction of what we mean by the word "intelligence", and given enough time to work, this scientific paradigm is very probably (in my view) going to produce software you can run on an ordinary massive server farm that will be able to optimize arbitrary, unknown or partially unknown environments according to specified utility functions.
And eventually, yes, those agents will become smarter than us (causing "MIRI's issues" to become cogent), because we, actual human beings, will figure out the relationships between compute-power, learning efficiency (rates of convergence to error-minimizing hypotheses in terms of training data), reasoning efficiency (moving probability information from one proposition or node in a hypothesis to another via updating), and decision-making efficiency (compute-power needed to plan well given models of the environment). Actual researchers will figure out the fuel efficiency of artificial intelligence, and thus be able to design at least one gigantic server cluster running at least one massive utility-maximizing algorithm that will be able to reason better and faster than a human (while they have the budget to keep it running).
The notion that AI is possible is mainstream. The crank stuff such as "I can download an inefficient but functional subhuman AGI from Github. Making it superhuman is just a matter of adding an entire planet's worth of computing power.", that's to computer science as hydrinos are to physics.
As for your server farm optimizing unknown environments, the last time I checked, we knew some laws of physics, and did things like making software tools that optimize simulated environments that follow said laws of physics, incidentally it also being mathematically nonsensical to define a "utility function" without a well-defined domain. So you've got your academic curiosity that's doing it all on its own and using some very general and impractical representations for modelling the world, so what? You're talking of something that is less - in terms of its market value, power, anything - than its parts and underlying technologies.
That suggestion would make LW a sad and lonely place.
Are you sure you mean it?
So why aren't MIRI's claims accepted by the mainstream, then? Is it because all the "trained computer scientists" are too dumb or too lazy to see the truth? Or is it the case that the "evidence" is contested, ambiguous, and inconclusive?
Because they've never heard of them. I am not joking. Most computer scientists are not working in artificial intelligence, have not the slightest idea that there exists a conference on AGI backed by Google and held every single year, and certainly have never heard of Hutter's "Universal AI" that treats the subject with rigorous mathematics.
In their ignorance, they believe that the principles of intelligence are a highly complex "emergent" phenomenon for neuroscientists to figure out over decades of slow, incremental toil. Since most of the public, including their scientifically-educated colleagues, already believe this, it doesn't seem to them like a strange belief to hold, and besides, anyone who reads even a layman's introduction to neuroscience finds out that the human brain is extremely complicated. Given the evidence that the only known actually-existing minds are incredibly complicated, messy things, it is somewhat more rational to believe that minds are all incredibly complicated, messy things, and thus to dismiss anyone talking about working "strong AI" as a science-fiction crackpot.
How are they supposed to know that the actual theory of intelligence is quite simple, and the hard part is fitting it inside realizable, finite computers?
Also, the dual facts that Eliezer has no academic degree in AI and that plenty of people who do have such degrees have turned out to be total crackpots anyway means that the scientific public and the "public public" are really quite entitled to their belief that the base rate of crackpottery among people talking about knowing how AI works is quite high. It is high! But it's not 100%.
(How did I tell the crackpottery apart from the real science? Well, frankly, I looked for patterns that appeared to have come from the process of doing real science: instead of a grand revelation, I looked for a slow build-up of ideas that were each ground out into multiple publications. I also filtered for AGI theorists who managed to apply their principles of broad AGI to usages in narrower machine-learning problems, resulting again in published papers. I looked for a theory that sounded like programming rather than like psychology. Hence my zeroing in on Schmidhuber, Hutter, Legg, Orseau, etc. as the AGI Theorists With a Clue.
Hutter, by the way, has written a position paper about potential Singularities in which he actually cites Yudkowsky, so hey.)
OK then. Among the scientists who have heard of them and bothered to have an opinion on the topic, does the opinion that MIRI is correct dominate? And if not so, why, given your account that the evidence unambiguously points in only one direction?
I don't think I'm going to believe you about that. The fact that in some contexts it's convenient to define intelligence as a cross-domain optimizer does not mean that it is nothing but.
If the only way to shoehorn theoretically pure intelligence into a finite architecture is to turn it into a messy combination of specialised mindless...then everyone's right.
As far as I know, MIRI's main beliefs are listed in the post 'Five theses, two lemmas, and a couple of strategic implications'.
I am not sure how you could verify any of those beliefs by a literature review. Where 'verify' means that the probability of their conjunction is high enough in order to currently call MIRI the most important cause. If that's not your stance, then please elaborate. My stance is that it is important to keep in mind that general AI could turn out to be very dangerous but that it takes a lot more concrete AI research before action relevant conclusions about the nature and extent of the risk can be drawn.
As someone who is no domain expert I can only think about it informally or ask experts what they think. And currently there is not enough that speaks in favor of MIRI. But this might change. If for example the best minds at Google would thoroughly evaluate MIRI's claims and agree with MIRI, then that would probably be enough for me to shut up. If MIRI would become a top-charity at GiveWell, then this would also cause me to strongly update in favor of MIRI. There are other possibilities as well. For example strong evidence that general AI is only 5 decades away (e.g. the existence of a robot that could navigate autonomously in a real-world environment and survive real-world threats and attacks with approximately the skill of an insect / an efficient and working emulation of a fly brain).
MIRI's claims also aren't accepted by domain experts who have been invited to discuss them here, and so know about them.
Or that you need just so much education, neither more nor less, to see them.
Trust me, an LW without XiXiDu is neither a sad nor lonely place, as evidenced by his multiple attempts at leaving.
Mainstream CS people are in general neither dumb nor lazy. AI as a field is pretty fringe to begin with, and AGI is more so. Why is AI a fringe field? In the 70's MIT thought they could save the world with LISP. They failed, and the rest of CS became immunized to the claims of AGI.
Unless an individual sees AGI as a credible threat, it's not pragmatic for them to start researching it, due to the various social and political pressures in academia.
I read the grandparent post as an attempt to assert authority and tell people to sit down, shut up, and attend to their betters.
You're reading it as a direct personal attack on XiXiDu.
Neither interpretation is particularly appealing.
I consider efficiency to be a crucial part of the definition of intelligence. Otherwise, as someone else told you in another comment, unlimited computing power implies that you can do "an exhaustive brute-force search through the entire solution space and be done in an instant."
I'd be grateful if you could list your reasons (or the relevant literature) for believing that AIXI related research is probable enough to lead to efficient artificial general intelligence (AGI) in order for it to make sense to draw action relevant conclusions from AIXI about efficient AGI.
I do not doubt the math. I do not doubt that evolution (variation + differential reproduction + heredity + mutation + genetic drift) underlies all of biology. But that we understand evolution does not mean that it makes sense to call synthetic biology an efficient approximation of evolution.
Even if you ran an AIXI on all the world's computers, you could still box it.
Did you check the claim that we have something dangerously unfriendly?
As a matter of fact, yes. There is a short sentence in Hutter's textbook indicating that he has heard of the possibility that AIXI might overpower its operators in order to gain more reward, and he acknowledged that such a thing could happen, but he considered it outside the scope of his book.
I asked Laurent Orseau about this here.
Did he not know that AIXI is uncomputable?
what
https://github.com/moridinamael/mc-aixi
We won't get a chance to test the "planet's worth of computing power" hypothesis directly, since none of us have access to that much computing power. But, from my own experience implementing mc-aixi-ctw, I suspect that is an underestimate of the amount of compute power required.
The main problem is that the sequence prediction algorithm (CTW) makes inefficient use of sense data by "prioritizing" the most recent bits of the observation string, so only weakly makes connections between bits that are temporally separated by a lot of noise. Secondarily, plain Monte Carlo tree search is not well-suited to decision making in huge action spaces, because it wants to think about each action at least once. But that can most likely be addressed by reusing sequence prediction to reduce the "size" of the action space by chunking actions into functional units.
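The "wants to think about each action at least once" behaviour is easy to see in a bare UCB1 selection rule, which is the kind of rule tree-search planners typically use at each node. This is a generic sketch, not code from mc-aixi; `ucb1_select`, `run_bandit` and the uniform random reward are all hypothetical stand-ins.

```python
import math
import random


def ucb1_select(counts, values, total, c=math.sqrt(2)):
    """Pick an action index by UCB1; any untried action is chosen first."""
    for action, n in enumerate(counts):
        if n == 0:
            return action
    return max(
        range(len(counts)),
        key=lambda a: values[a] / counts[a]
        + c * math.sqrt(math.log(total) / counts[a]),
    )


def run_bandit(num_actions, budget, seed=0):
    """Spend `budget` samples on `num_actions` actions; return visit counts."""
    rng = random.Random(seed)
    counts = [0] * num_actions
    values = [0.0] * num_actions
    for t in range(1, budget + 1):
        action = ucb1_select(counts, values, t)
        reward = rng.random()  # placeholder reward, assumed uniform in [0, 1)
        counts[action] += 1
        values[action] += reward
    return counts


counts = run_bandit(num_actions=10_000, budget=1_000)
print("most visits to any action:", max(counts))
print("actions tried at least once:", sum(1 for n in counts if n > 0))
```

With fewer samples than actions, the whole budget goes on first visits and nothing is ever revisited, so no comparison between actions takes place.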
Unfortunately, both of these problems are only really technical ones, so it's always possible that some academic will figure out a better sequence predictor, lifting mc-aixi on an average laptop from "wins at Pac-Man" to "wins at robot wars", which is about the level at which it may start posing a threat to human safety.
only?
Mc-aixi is not going to win at something as open ended as robot wars just by replacing CTW or CTS with something better.
And anyway, even if it did, it wouldn't be about the level at which it may start posing a threat to human safety. Do you think that the human robot wars champions are a threat to human safety? Are they even at the level of taking over the world? I don't think so.
Actually... :-D
what is this I don't even
I look forward to the falsifiable claim.
Why don't you make your research public? Would be handy to have a thorough validation of MIRI's claims. Even if people like me wouldn't understand it, you could publish it and thereby convince the CS/AI community of MIRI's mission.
Does this also apply to people who support MIRI without having your level of insight?
If only you people would publish all this research.
Now you're just dissembling on the meaning of the word "research", which was clearly used in this context as "literature search".
The idea is not to put it in a journal, but to make it public. You can certainly publish, in that sense, the results of a literature search. The point is to put it where people other than yourself can see it. It would certainly be informative if you were to post, even here, something saying "I looked up X claim and I found it in the literature under Y".
Of course we haven't discovered anything dangerously unfriendly...
Or anything that can't be boxed. Remind me how AIs are supposed to get out of boxes?
Of course we have, it's called AIXI. Do I need to download a Monte Carlo implementation from Github and run it on a university server with environmental access to the entire machine and show logs of the damn thing misbehaving itself to convince you?
AIs can be causally boxed, just like anything else. That is, as long as the agent's environment absolutely follows causal rules without any exception that would leak information about the outside world into the environment, the agent will never infer the existence of a world outside its "box".
But then it's also not much use for anything besides Pac-Man.
Given how slow and dumb it is, I have a hard time seeing an approximation to AIXI as a threat to anyone, except maybe itself.
True, but that's an issue of raw compute-power, rather than some innate Friendliness of the algorithm.
It would still be useful to have an example of innate unfriendliness, rather than "it doesn't really run or do anything".
Not just raw compute-power. An approximation to AIXI is likely to drop a rock on itself just to see what happens long before it figures out enough to be dangerous.
FWIW, I think that would make for a pretty interesting post.
And now I think I know what I might do for a hobby during exams month and summer vacation. Last I looked at the source-code, I'd just have to write some data structures describing environment-observations (let's say... of the current working directory of a Unix filesystem) and potential actions (let's say... Unix system calls) in order to get the experiment up and running. Then it would just be a matter of rewarding the agent instance for any behavior I happen to find interesting, and watching what happens.
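A rough sketch of what that harness could look like, with a random policy standing in for the agent. Everything here is hypothetical: the three toy actions replace real system calls, the observation is just a file count, the reward is an arbitrary example, and none of it uses the mc-aixi interface.

```python
import os
import random
import tempfile

ACTIONS = ("create", "delete", "noop")  # toy stand-ins for system calls


def observe(sandbox):
    """Observation: number of entries in the sandbox directory."""
    return len(os.listdir(sandbox))


def act(sandbox, action, step):
    """Apply one toy action inside the sandbox directory."""
    if action == "create":
        open(os.path.join(sandbox, f"f{step}"), "w").close()
    elif action == "delete":
        files = sorted(os.listdir(sandbox))
        if files:
            os.remove(os.path.join(sandbox, files[0]))


def reward(before, after):
    """Arbitrary example reward: 1 whenever the file count grew."""
    return 1 if after > before else 0


def run_episode(steps=20, seed=0):
    """Run a random policy in a temporary directory; return (total, history)."""
    rng = random.Random(seed)
    total, history = 0, []
    with tempfile.TemporaryDirectory() as sandbox:
        for step in range(steps):
            before = observe(sandbox)
            action = rng.choice(ACTIONS)  # a learning agent would choose here
            act(sandbox, action, step)
            after = observe(sandbox)
            r = reward(before, after)
            history.append((action, after, r))
            total += r
    return total, history


total, history = run_episode()
print("total reward:", total)
```

Swapping the random choice for a learning agent is the part that carries all the computational cost discussed in this thread.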
Initial prediction: since I won't have a clearly-developed reward criterion and the agent won't have huge exponential sums of CPU cycles at its disposal, not much will happen.
However, I do strongly believe that the agent will not suddenly develop a moral sense out of nowhere.
No. But it will be eminently boxable. In fact, if you're not nuts, you'll be running it in a box.
I think you'll have serious trouble getting an AIXI approximation to do much of anything interesting, let alone misbehave. The computational costs are too high.
Do you even know what "monte carlo" means? It means it tries to build a predictor of environment by trying random programs. Even very stupid evolutionary methods do better.
Once you throw away this whole 'can and will try absolutely anything' and enter the domain of practical software, you'll also enter the domain where the programmer is specifying what the AI thinks about and how. The immediate practical problem of "uncontrollable" (but easy to describe) AI is that it is too slow by a ridiculous factor.
Private_messaging, can you explain why you open up with such a hostile question at eli? Why the implied insult? Is that the custom here? I am new, should I learn to do this?
For example, I could have opened with your same question, because Monte Carlo methods are very different from what you describe (I happened to be a mathematical physicist back in the day). Let me quote an actual definition:
Monte Carlo Method: A problem solving technique used to approximate the probability of certain outcomes by running multiple trial runs, called simulations, using random variables.
A classic very very simple example is a program that approximates the value of 'pi' thusly:
Estimate pi by dropping $total_hits random points into a square with corners at -1,-1 and 1,1
(then count how many are inside radius one circle centered on origin)
(loop here for as many runs as you like) { define variables $x,$y, $hitsinsideradius = 0, $radius =1.0, $totalhits=0, piapprox;
} output data for this particular run } print nice report exit();
OK, this is a nice toy Monte Carlo program for a specific problem. Real world applications typically have thousands of variables and explore things like strange attractors in high dimensional spaces, or particle physics models, or financial programs, etc. etc. It's a very powerful methodology and very well known.
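For readers who want something runnable, the pi estimator described above could be sketched in Python roughly as follows. This is a minimal sketch and not the commenter's original program: the function name `estimate_pi`, the sample count, and the optional `seed` parameter are assumptions added for illustration.

```python
import random


def estimate_pi(total_points, seed=None):
    """Estimate pi by dropping random points into the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # seed is an illustrative assumption, for repeatable runs
    hits_inside_radius = 0
    for _ in range(total_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # Count the point if it falls inside the radius-one circle centred on the origin.
        if x * x + y * y <= 1.0:
            hits_inside_radius += 1
    # Circle area / square area = pi / 4, so scale the hit fraction by 4.
    return 4.0 * hits_inside_radius / total_points


if __name__ == "__main__":
    # Loop for as many runs as you like and report each estimate.
    for run in range(3):
        print(f"run {run}: pi is approximately {estimate_pi(100_000, seed=run):.4f}")
```

The error of such an estimate shrinks roughly with the square root of the number of points, so each extra digit of accuracy costs about a hundred times more samples.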
In what way is this little program an instance of throwing a lot of random programs at the problem of approximating 'pi'? What would your very stupid evolutionary program to solve this problem more efficiently be? I would bet you a million dollars to a thousand (if I had a million) that my program would win a race against a very stupid evolutionary program to estimate pi to six digits accurately, that you write. Eli and Eliezer can judge the race, how is that?
I am sorry if you feel hurt by my making fun of your ignorance of Monte Carlo methods, but I am trying to get in the swing of the culture here and reflect your cultural norms by copying your mode of interaction with Eli, that is, bullying on the basis of presumed superior knowledge.
If this is not pleasant for you I will desist. I assume it is some sort of ritual you enjoy, consensual on Eli's part and, by inference, yours: that you are either enjoying this public humiliation masochistically or hoping people will give you aversive conditioning when you publicly display stupidity, ignorance, discourtesy and so on. If I have violated your consent then I plead that I am from a future where this is considered acceptable when a person advertises that they do it to others. Also, I am a baby eater and human ways are strange to me.
OK. Now some serious advice:
If you find that you have just typed "Do you even know what X is?" then given a little condescending mini lecture about X, please check that you yourself actually know what X is before you post. I am about to check Wikipedia before I post in case I'm having a brain cloud, and I promise that I will annotate any corrections I need to make after I check; everything up to HERE was done before the check. (Off half recalled stuff from grad school a quarter century ago...)
OK, Wikipedia's article is much better than mine. But I don't need to change anything, so I won't.
P.S. It's ok to look like an idiot in public, it's a core skill of rationalists to be able to tolerate this sort of embarrassment, but another core skill is actually learning something if you find out that you were wrong. Did you go to Wikipedia or other sources? Do you know anything about Monte Carlo Methods now? Would you like to say something nice about them here?
P.P.S. Would you like to say something nice about eli_sennesh, since he actually turns out to have had more accurate information than you did when you publicly insulted his state of knowledge? If you too are old pals with a joking relationship, no apology needed to him, but maybe an apology for lazily posting false information that could have misled naive readers with no knowledge of Monte Carlo methods?
P.P.P.S. I am curious, is the psychological pleasure of viciously putting someone else down as ignorant in front of their peers worth the presumed cost of misinforming your rationalist community about the nature of an important scientific and mathematical tool? I confess I feel a little pleasure in twisting the knife here, this is pretty new to me. Should I adopt your style of intellectual bullying as a matter of course? I could read all your posts and viciously hold up your mistakes to the community, would you enjoy that?
I'm well aware of what Monte Carlo methods are (I work in computer graphics where those are used a lot), I'm also aware of what AIXI does.
Furthermore eli (and the "robots are going to kill everyone" group - if you're new you don't even know why they're bringing up monte-carlo AIXI in the first place) are being hostile to TheAncientGeek.
edit: to clarify, Monte-Carlo AIXI is most assuredly not an AI which is inventing and applying some clever Monte Carlo methods to predict the environment. No, it's estimating the sum over all predictors of environment with a random subset of predictors of environment (which doesn't work all too well, and that's why hooking it up to the internet is not going to result in anything interesting happening, contrary to what has been ignorantly asserted all over this site). I should've phrased it differently, perhaps - like "Do you even know what "monte carlo" means as applied to AIXI?".
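The general idea in the paragraph above, approximating a weighted sum over many predictors by evaluating only a random subset of them, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the weights, and the use of a single number per predictor are hypothetical, and this is not a description of how any actual AIXI approximation is implemented.

```python
import random


def exact_mixture(weights, predictions):
    """Weighted average of every predictor's output (the full sum)."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


def sampled_mixture(weights, predictions, num_samples, seed=None):
    """Estimate the same average from a random sample of predictors.

    Predictors are drawn with probability proportional to their weight,
    so a plain mean of the sampled outputs estimates the weighted sum.
    """
    rng = random.Random(seed)
    sampled = rng.choices(predictions, weights=weights, k=num_samples)
    return sum(sampled) / num_samples


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    # Hypothetical prior weights and per-predictor outputs in [0, 1].
    weights = [2.0 ** -(i % 20 + 1) for i in range(n)]
    predictions = [rng.random() for _ in range(n)]
    print("full sum:      ", exact_mixture(weights, predictions))
    print("random subset: ", sampled_mixture(weights, predictions, 1_000, seed=1))
```

In this toy the predictors are cheap and finite, so the full sum is easy to compute and compare against. The comment's complaint concerns the setting where the predictors are arbitrary programs and no such shortcut exists.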
It is completely irrelevant how human-invented Monte-Carlo solutions behave, when the subject is hooking up AIXI to a server.
edit2: to borrow from your example:
" Of course we haven't discovered anything dangerously good at finding pi..."
"Of course we have, it's called area of the circle. Do I need to download a Monte Carlo implementation from Github and run it... "
"Do you even know what "monte carlo" means? It means it tries random points and checks if they're in a circle. Even very stupid geometric methods do better."
You appear to have posted this as a reply to the wrong comment. Also, you need to indent code 4 spaces and escape underscores in text mode with a \_.
On the topic, I don't mind if you post tirades against people posting false information (I personally flipped the bozo bit on private_messaging a long time ago). But you should probably keep it short. A few paragraphs would be more effective than two pages. And there's no need for lengthy apologies.
"Good, I can feel your anger. ... Strike me down with all of your hatred and your journey towards the dark side will be complete!"
Once you enter the domain of practical software you've entered the domain of Narrow AI, where the algorithm designer has not merely specified a goal but a method as well, thus getting us out of dangerous territory entirely.
On rereading this I feel I should vote myself down if I knew how, it seems a little over the top.
Let me post about my emotional state since this is a rationality discussion and if we can't deconstruct our emotional impulses and understand them we are pretty doomed to remaining irrational.
I got quite emotional when I saw a post that seemed like intellectual bullying followed by self-congratulation. I am very sensitive to this type of bullying, more so when it is directed at others than at myself: due to freakish test scores and so on as a child I feel fairly secure about my intellectual abilities, but I know how bad people feel when others consider them stupid. My reaction is to leap to the defense of the victim; however, I put this down to a local custom of friendly ribbing or something and tried not to jump on it.
Then I saw that private_messaging seemed to be pretending to be an authority on Monte Carlo methods while spreading false information about them, either out of ignorance (very likely) or malice. Normally ignorance would have elicited a sympathy reaction from me and a very gentle explanation of the mistake, but in the context of having just seen private_messaging attack eli_sennesh for his supposed ignorance of Monte Carlo methods, I flew into a sort of berserker sardonic mode, i.e. "If private_messaging thinks that people who post about Monte Carlo methods while not knowing what they are should be mocked in public, I am happy to play by their rules!" And that led to the result you see, a savage mocking.
I do not regret doing it, because the comment with the attack on eli_sennesh and the calumnies against Monte Carlo still seems to me to have been in flagrant violation of rationalist ethics: in particular, presenting himself as, if not an expert, at least someone with the moral authority to diss someone else for their ignorance on an important topic, and then following up with false and misleading information about MC methods. This seemed like an action with a strongly negative utility to the community because it could potentially lead many readers to ignore the extremely useful Monte Carlo methodology.
If I posed as an authority and went around telling people Bayesian inference was a bad methodology that was basically just "a lot of random guesses" and that "even a very stupid evolutionary program" would do better at assessing probabilities, should I be allowed to get away scot-free? I think not. If I do something like that I would actually hope for chastisement or correction from the community, to help me learn better.
Also it seemed like it might make readers think badly of those who rely heavily on Monte Carlo Methods. "Oh those idiots, using those stupid methods, why don't they switch to evolutionary algorithms". I'm not a big MC user but I have many friends who are, and all of them seem like nice, intelligent, rational individuals.
So I went off a little heavily on private_messaging, who I am sure is a good person at heart.
Now, I acted emotionally there, but my hope is that in the Big Searles Room that constitutes our room, I managed to pass a message that (through no virtue of my own) might ultimately improve the course of our discourse.
I apologize to anyone who got emotionally hurt by my tirade.
To think of the good an EPrime style ban on "is" could do here....
That would be the AIXI that is uncomputable?
And don't AIs get out of boxes by talking their way out, round here?
It's incomputable because the Solomonoff prior is, but you can approximate it -- to arbitrary precision if you've got the processing power, though that's a big "if" -- with statistical methods. Searching Github for the Monte Carlo approximations of AIXI that eli_sennesh mentioned turned up at least a dozen or so before I got bored.
Most of them seem to operate on tightly bounded problems, intelligently enough. I haven't tried running one with fewer constraints (maybe eli has?), but I'd expect it to scribble over anything it could get its little paws on.
But people do run these things that aren't actually AIXIs, and they haven't actually taken over the world, so they aren't actually dangerous.
So there is no actually dangerous actual AI.
Sir Lancelot: Look, my liege!
[trumpets play a fanfare as the camera cuts briefly to the sight of a majestic castle]
King Arthur: [in awe] Camelot!
Sir Galahad: [in awe] Camelot!
Sir Lancelot: [in awe] Camelot!
Patsy: [derisively] It's only a model!
King Arthur: Shh!
:-D
Is that ... possible?
Is it possible to run an AIXI approximation as root on a machine somewhere and give it the tools to shoot itself in the foot? Sure. Will it actually end up shooting itself in the foot? I don't know. I can't think of any theoretical reasons why it wouldn't, but there are practical obstacles: a modern computer architecture is a lot more complicated than anything I've seen an AIXI approximation working on, and there are some barriers to breaking one by thrashing around randomly.
It'd probably be easier to demonstrate if it was working at the core level rather than the filesystem level.
Huh. I was under the impression it would require far too much computing power to approximate AIXI well enough that it would do anything interesting. Thanks!
This can easily be done, and be done safely, since you could give an AIXI root access to a virtualised machine.
I'm still waiting for evidence that it would do something destructive in the pursuit of a goal that is not obviously destructive.
Since many humans are difficult to box, I would have to disagree with you there.
And, obviously, not all humans are Friendly.
An intelligent, charismatic psychopath seems like they would fit both your criteria. And, of course, there is no shortage of them. We can only be thankful they are too rare relative to equivalent semi-Friendly intelligences, and too incompetent, to have done more damage than all the deaths and so on.
Most humans are easy to box, since they can be contained in prisons.
How likely is an AI to be psychopathic if it is not designed to be psychopathic?
If I believed that anything as simple as AIXI could possibly result in practical general AI, or that expected utility maximizing was at all feasible, then I would tend to agree with MIRI. I don't. And I think it makes no sense to draw conclusions about practical AI from these models.
This is crucial.
That's largely irrelevant and misleading. Your autonomous car does not need to feature an encoding of an amount of human values that corresponds to its level of autonomy.
That post has been completely debunked.
ETA: Fixed a link to expected utility maximization.
There's the famous example of the AI trained to spot tanks that actually learnt to spot sunny days. That seems to underlie a lot of MIRI thinking, although at the same time the point is disguised by emphasising explicit coding over training.
I have never seen AI characterised like that before. Sounds like moonshine to me. Programming languages, libraries, and development environments yes, that's what they're for, but those don't take away the task of having to explicitly and precisely think about what you mean, they just automate the routine grunt work for you. An AI isn't going to superintelligently (that is to say, magically) know what you mean, if you didn't actually mean anything.
Non-AI systems uncontroversially require explicit coding. How would you characterise AI systems, then?
XiXiDu's characterisation seems suitable enough: programs able to perform tasks normally requiring human intelligence. One might add "or superhuman intelligence", as long as one is not simply wishing for magic there. This is orthogonal to the question of how you tell such a system what you want it to do.
Indeed. But there is a how-to-do-it definition of AI, and it is kind of not about explicit coding. For instance, if a student takes an AI course as part of a degree, they are not taught explicit coding all over again. They are taught about learning algorithms, neural networks, etc.
They definitely require some amount of explicit coding of their values. You can try to reduce the burden of such explicit value-loading through various indirect means, such as value learning, indirect normativity, extrapolated volition, or even reinforcement learning (though that's the most primitive and dangerous form of value-loading). You cannot, however, dodge the bullet.
Because?
What does improvement in the field of AI refer to? I think it isn't wrong to characterize it as the development of programs able to perform tasks normally requiring human intelligence.
I believe that companies like Apple would like their products, such as Siri, to be able to increasingly understand what their customers expect their gadgets to do, without them having to learn programming.
In this context it seems absurd to imagine that when eventually our products become sophisticated enough to take over the world, they will do so due to objectively stupid misunderstandings.
That's a reasonably good description of the stuff that people call AI. Any particular task, however, is just an application area, not the definition of the whole thing. Natural language understanding is one of those tasks.
The dream of being able to tell a robot what to do, and it knowing exactly what you meant, goes beyond natural language understanding, beyond AI, beyond superhuman AI, to magic. In fact, it seems to me a dream of not existing -- the magic AI will do everything for us. It will magically know what we want before we ask for it, before we even know it. All we do in such a world is to exist. This is just another broken utopia.
I agree. All you need is a robot that does not mistake "earn a college degree" for "kill all other humans and print an official paper confirming that you earned a college degree".
All trends I am aware of indicate that software products will become better at knowing what you meant. But in order for them to constitute an existential risk they would have to become catastrophically worse at understanding what you meant while at the same time becoming vastly more powerful at doing what you did not mean. But this doesn't sound at all likely to me.
What I imagine is that at some point we'll have a robot that can enter a classroom, sit down, and process what it hears and sees in such a way that it will be able to correctly fill out a multiple choice test at the end of the lesson. Maybe the robot will literally step on someone's toes. This will then have to be fixed.
What I don't think is that the first robot entering a classroom, in order to master a test, will take over the world after hacking the school's WLAN and solving molecular nanotechnology. That's just ABSURD.
Um, I think you meant "disagree".
It just blows my mind that after the countless hours you've spent reading and writing about the Friendly AI problem, not to mention the countless hours people have spent patiently explaining (and re- re- re- re-explaining) it to you, that you still don't understand what the FAI problem is. It's unbelievable.
Yeah, but hardcoding is an easier sell to people who know how to code but have never done AI... It's like political demagogues selling unworkable but easily understood ideas.
Not really, no. Most people don't recognize the "hidden complexity of wishes" in Far Mode, or when it's their wishes. However, I think if I explain to them that I'll be encoding my wishes, they'll quickly figure out that my attempts to hardcode AI Friendliness are going to be very bad for them. Human intelligence evolved for winning arguments when status, wealth, health, and mating opportunities are at issue: thus, convince someone to treat you as an opponent, and leave the correct argument lying right where they can pick it up, and they'll figure things out quickly.
Hmmm... I wonder if that bit of evolutionary psychology explains why many people act rude and nasty even to those close to them. Do we engage more intelligence when trying to win a fight than when trying to be nice?