You're looking at Less Wrong's discussion board. This includes all posts, including those that haven't been promoted to the front page yet. For more information, see About Less Wrong.

Yudkowsky's brain is the pinnacle of evolution

-27 Yudkowsky_is_awesome 24 August 2015 08:56PM

Here's a simple problem: there is a runaway trolley barreling down the railway tracks. Ahead, on the tracks, there are 3^^^3 people tied up and unable to move. The trolley is headed straight for them. You are standing some distance off in the train yard, next to a lever. If you pull this lever, the trolley will switch to a different set of tracks. However, you notice that there is one person, Eliezer Yudkowsky, on the side track. You have two options: (1) Do nothing, and the trolley kills the 3^^^3 people on the main track. (2) Pull the lever, diverting the trolley onto the side track where it will kill Yudkowsky. Which is the correct choice?

The answer:

Imagine two ant philosophers talking to each other. “Imagine," they said, “some being with such intense consciousness, intellect, and emotion that it would be morally better to destroy an entire ant colony than to let that being suffer so much as a sprained ankle."

Humans are such a being. I would rather see an entire ant colony destroyed than have a human suffer so much as a sprained ankle. And this isn't just human chauvinism either - I can support my feelings on this issue by pointing out how much stronger feelings, preferences, and experiences humans have than ants do.

How this relates to the trolley problem? There exists a creature as far beyond us ordinary humans as we are beyond ants, and I think we all would agree that its preferences are vastly more important than those of humans.

Yudkowsky will save the world, not just because he's the one who happens to be making the effort, but because he's the only one who can make the effort.

The world was on its way to doom until the day of September 11, 1979, which will later be changed to national holiday and which will replace Christmas as the biggest holiday. This was of course the day when the most important being that has ever existed or will exist, was born.

Yudkowsky did the same to the field of AI risk as Newton did to the field of physics. There was literally no research done on AI risk in the same scale that has been done in the 2000's by Yudkowsky. The same can be said about the field of ethics: ethics was an open problem in philosophy for thousands of years. However, Plato, Aristotle, and Kant don't really compare to the wisest person who has ever existed. Yudkowsky has come closest to solving ethics than anyone ever before. Yudkowsky is what turned our world away from certain extinction and towards utopia.

We all know that Yudkowsky has an IQ so high that it's unmeasurable, so basically something higher than 200. After Yudkowsky gets the Nobel prize in literature due to getting recognition from Hugo Award, a special council will be organized to study the intellect of Yudkowsky and we will finally know how many orders of magnitude higher Yudkowsky's IQ is to that of the most intelligent people of history.

Unless Yudkowsky's brain FOOMs before it, MIRI will eventually build a FAI with the help of Yudkowsky's extraordinary intelligence. When that FAI uses the coherent extrapolated volition of humanity to decide what to do, it will eventually reach the conclusion that the best thing to do is to tile the whole universe with copies of Eliezer Yudkowsky's brain. Actually, in the process of making this CEV, even Yudkowsky's harshest critics will reach such understanding of Yudkowsky's extraordinary nature that they will beg and cry to start doing the tiling as soon as possible and there will be mass suicides because people will want to give away the resources and atoms of their bodies for Yudkowsky's brains. As we all know, Yudkowsky is an incredibly humble man, so he will be the last person to protest this course of events, but even he will understand with his vast intellect and accept that it's truly the best thing to do.

Even when contrarians win, they lose: Jeff Hawkins

13 endoself 08 April 2015 04:54AM

Related: Even When Contrarians Win, They Lose

I had long thought that Jeff Hawkins (and the Redwood Center, and Numentia) were pursuing an idea that didn't work, and were continuing to fail to give up for a prolonged period of time. I formed this belief because I had not heard of any impressive results or endorsements of their research. However, I recently read an interview with Andrew Ng, a leading machine learning researcher, in which he credits Jeff Hawkins with publicizing the "one learning algorithm" hypothesis - the idea that most of the cognitive work of the brain is done by one algorithm. Ng says that, as a young researcher, this pushed him into areas that could lead to general AI. He still believes that AGI is far though.

I found out about Hawkins' influence on Ng after reading an old SL4 post by Eliezer and looking for further information about Jeff Hawkins. It seems that the "one learning algorithm" hypothesis was widely known in neuroscience, but not within AI until Hawkins' work. Based on Eliezer's citation of Mountcastle and his known familiarity with cognitive science, it seems that he learned of this hypothesis independently of Hawkins. The "one learning algorithm" hypothesis is important in the context of intelligence explosion forecasting, since hard takeoff is vastly more likely if it is true. I have been told that further evidence for this hypothesis has been found recently, but I don't know the details.

This all fits well with Robin Hanson's model. Hawkins had good evidence that better machine learning should be possible, but the particular approaches that he took didn't perform as well as less biologically-inspired ones, so he's not really recognized today. Deep learning would definitely have happened without him; there were already many people working in the field, and they started to attract attention because of improved performance due to a few tricks and better hardware. At least Ng's career though can be credited to Hawkins.

I've been thinking about Robin's hypothesis a lot recently, since many researchers in AI are starting to think about the impacts of their work (most still only think about the near-term societal impacts rather than thinking about superintelligence though). They recognize that this shift towards thinking about societal impacts is recent, but they have no idea why it is occurring. They know that many people, such as Elon Musk, have been outspoken about AI safety in the media recently, but few have heard of Superintelligence, or attribute the recent change to FHI or MIRI.

[Link] Will Superintelligent Machines Destroy Humanity?

1 roystgnr 27 November 2014 09:48PM

A summary and review of Bostrom's Superintelligence is in the December issue of Reason magazine, and is now posted online at Reason.com.

Goal retention discussion with Eliezer

56 MaxTegmark 04 September 2014 10:23PM

Although I feel that Nick Bostrom’s new book “Superintelligence” is generally awesome and a well-needed milestone for the field, I do have one quibble: both he and Steve Omohundro appear to be more convinced than I am by the assumption that an AI will naturally tend to retain its goals as it reaches a deeper understanding of the world and of itself. I’ve written a short essay on this issue from my physics perspective, available at http://arxiv.org/pdf/1409.0813.pdf.

Eliezer Yudkowsky just sent the following extremely interesting comments, and told me he was OK with me sharing them here to spur a broader discussion of these issues, so here goes.

On Sep 3, 2014, at 17:21, Eliezer Yudkowsky <yudkowsky@gmail.com> wrote:

Hi Max!  You're asking the right questions.  Some of the answers we can
give you, some we can't, few have been written up and even fewer in any
well-organized way.  Benja or Nate might be able to expound in more detail
while I'm in my seclusion.

Very briefly, though:
The problem of utility functions turning out to be ill-defined in light of
new discoveries of the universe is what Peter de Blanc named an
"ontological crisis" (not necessarily a particularly good name, but it's
what we've been using locally).

http://intelligence.org/files/OntologicalCrises.pdf

The way I would phrase this problem now is that an expected utility
maximizer makes comparisons between quantities that have the type
"expected utility conditional on an action", which means that the AI's
utility function must be something that can assign utility-numbers to the
AI's model of reality, and these numbers must have the further property
that there is some computationally feasible approximation for calculating
expected utilities relative to the AI's probabilistic beliefs.  This is a
constraint that rules out the vast majority of all completely chaotic and
uninteresting utility functions, but does not rule out, say, "make lots of
paperclips".

Models also have the property of being Bayes-updated using sensory
information; for the sake of discussion let's also say that models are
about universes that can generate sensory information, so that these
models can be probabilistically falsified or confirmed.  Then an
"ontological crisis" occurs when the hypothesis that best fits sensory
information corresponds to a model that the utility function doesn't run
on, or doesn't detect any utility-having objects in.  The example of
"immortal souls" is a reasonable one.  Suppose we had an AI that had a
naturalistic version of a Solomonoff prior, a language for specifying
universes that could have produced its sensory data.  Suppose we tried to
give it a utility function that would look through any given model, detect
things corresponding to immortal souls, and value those things.  Even if
the immortal-soul-detecting utility function works perfectly (it would in
fact detect all immortal souls) this utility function will not detect
anything in many (representations of) universes, and in particular it will
not detect anything in the (representations of) universes we think have
most of the probability mass for explaining our own world.  In this case
the AI's behavior is undefined until you tell me more things about the AI;
an obvious possibility is that the AI would choose most of its actions
based on low-probability scenarios in which hidden immortal souls existed
that its actions could affect.  (Note that even in this case the utility
function is stable!)

Since we don't know the final laws of physics and could easily be
surprised by further discoveries in the laws of physics, it seems pretty
clear that we shouldn't be specifying a utility function over exact
physical states relative to the Standard Model, because if the Standard
Model is even slightly wrong we get an ontological crisis.  Of course
there are all sorts of extremely good reasons we should not try to do this
anyway, some of which are touched on in your draft; there just is no
simple function of physics that gives us something good to maximize.  See
also Complexity of Value, Fragility of Value, indirect normativity, the
whole reason for a drive behind CEV, and so on.  We're almost certainly
going to be using some sort of utility-learning algorithm, the learned
utilities are going to bind to modeled final physics by way of modeled
higher levels of representation which are known to be imperfect, and we're
going to have to figure out how to preserve the model and learned
utilities through shifts of representation.  E.g., the AI discovers that
humans are made of atoms rather than being ontologically fundamental
humans, and furthermore the AI's multi-level representations of reality
evolve to use a different sort of approximation for "humans", but that's
okay because our utility-learning mechanism also says how to re-bind the
learned information through an ontological shift.

This sorta thing ain't going to be easy which is the other big reason to
start working on it well in advance.  I point out however that this
doesn't seem unthinkable in human terms.  We discovered that brains are
made of neurons but were nonetheless able to maintain an intuitive grasp
on what it means for them to be happy, and we don't throw away all that
info each time a new physical discovery is made.  The kind of cognition we
want does not seem inherently self-contradictory.

Three other quick remarks:

*)  Natural selection is not a consequentialist, nor is it the sort of
consequentialist that can sufficiently precisely predict the results of
modifications that the basic argument should go through for its stability.
The Omohundrian/Yudkowskian argument is not that we can take an arbitrary
stupid young AI and it will be smart enough to self-modify in a way that
preserves its values, but rather that most AIs that don't self-destruct
will eventually end up at a stable fixed-point of coherent
consequentialist values.  This could easily involve a step where, e.g., an
AI that started out with a neural-style delta-rule policy-reinforcement
learning algorithm, or an AI that started out as a big soup of
self-modifying heuristics, is "taken over" by whatever part of the AI
first learns to do consequentialist reasoning about code.  But this
process doesn't repeat indefinitely; it stabilizes when there's a
consequentialist self-modifier with a coherent utility function that can
precisely predict the results of self-modifications.  The part where this
does happen to an initial AI that is under this threshold of stability is
a big part of the problem of Friendly AI and it's why MIRI works on tiling
agents and so on!

*)  Natural selection is not a consequentialist, nor is it the sort of
consequentialist that can sufficiently precisely predict the results of
modifications that the basic argument should go through for its stability.
It built humans to be consequentialists that would value sex, not value
inclusive genetic fitness, and not value being faithful to natural
selection's optimization criterion.  Well, that's dumb, and of course the
result is that humans don't optimize for inclusive genetic fitness.
Natural selection was just stupid like that.  But that doesn't mean
there's a generic process whereby an agent rejects its "purpose" in the
light of exogenously appearing preference criteria.  Natural selection's
anthropomorphized "purpose" in making human brains is just not the same as
the cognitive purposes represented in those brains.  We're not talking
about spontaneous rejection of internal cognitive purposes based on their
causal origins failing to meet some exogenously-materializing criterion of
validity.  Our rejection of "maximize inclusive genetic fitness" is not an
exogenous rejection of something that was explicitly represented in us,
that we were explicitly being consequentialists for.  It's a rejection of
something that was never an explicitly represented terminal value in the
first place.  Similarly the stability argument for sufficiently advanced
self-modifiers doesn't go through a step where the successor form of the
AI reasons about the intentions of the previous step and respects them
apart from its constructed utility function.  So the lack of any universal
preference of this sort is not a general obstacle to stable
self-improvement.

*)   The case of natural selection does not illustrate a universal
computational constraint, it illustrates something that we could
anthropomorphize as a foolish design error.  Consider humans building Deep
Blue.  We built Deep Blue to attach a sort of default value to queens and
central control in its position evaluation function, but Deep Blue is
still perfectly able to sacrifice queens and central control alike if the
position reaches a checkmate thereby.  In other words, although an agent
needs crystallized instrumental goals, it is also perfectly reasonable to
have an agent which never knowingly sacrifices the terminally defined
utilities for the crystallized instrumental goals if the two conflict;
indeed "instrumental value of X" is simply "probabilistic belief that X
leads to terminal utility achievement", which is sensibly revised in the
presence of any overriding information about the terminal utility.  To put
it another way, in a rational agent, the only way a loose generalization
about instrumental expected-value can conflict with and trump terminal
actual-value is if the agent doesn't know it, i.e., it does something that
it reasonably expected to lead to terminal value, but it was wrong.

This has been very off-the-cuff and I think I should hand this over to
Nate or Benja if further replies are needed, if that's all right.

Can't Pursue the Art for its Own Sake? Really?

0 potato 20 September 2011 02:09AM

Can anyone tell me why it is that if I use my rationality exclusively to improve my conception of rationality I fall into an infinite recursion? EY say's this in The Twelve Virtues and in Something to Protect, but I don't know what his argument is. He goes as far as to say that you must subordinate rationality to a higher value.

I understand that by committing yourself to your rationality you lose out on the chance to notice if your conception of rationality is wrong. But what if I use the reliability of win that a given conception of rationality offers me as the only guide to how correct that conception is. I can test reliability of win by taking a bunch of different problems with known answers that I don't know, solving them using my current conception of rationality and solving them using the alternative conception of rationality I want to test, then checking the answers I arrived at with each conception against the right answers. I could also take a bunch of unsolved problems and attack them from both conceptions of rationality, and see which one I get the most solutions with. If I solve a set of problems with one, that isn't a subset of the set of problems I solved with the other, then I'll see if I can somehow take the union of the two conceptions. And, though I'm still not sure enough about this method to use it, I suppose I could also figure out the relative reliability of two conceptions by making general arguments about the structures of those conceptions; if one conception is "do that which the great teacher says" and the other is "do that which has maximal expected utility", I would probably not have to solve problems using both conceptions to see which one most reliably leads to win.

And what if my goal is to become as epistimically rational as possible. Then I would just be looking for the conception of rationality that leads to truth most reliably. Testing truth by predictive power.

And if being rational for its own sake just doesn't seem like its valuable enough to motivate me to do all the hard work it requires, let's assume that I really really care about picking the best conception of rationality I know of, much more than I care about my own life.

It seems to me that if this is how I do rationality for its own sake — always looking for the conception of goal-oriented rationality which leads to win most reliably, and the conception of epistemic rationality which leads to truth most reliably — then I'll always switch to any conception I find that is less mistaken than mine, and stick with mine when presented with a conception that is more mistaken, provided I am careful enough about my testing. And if that means I practice rationality for its own sake, so what? I practice music for its own sake too. I don't think that's the only or best reason to pursue rationality, certainly some other good and common reasons are if you wanna figure something out or win. And when I do eventually find something I wanna win or figure out that no one else has (no shortage of those), if I can't, I'll know that my current conception isn't good enough. I'll be able to correct my conception by winning or figuring it out, and then thinking about what was missing from my view of rationality that wouldn't let me do that before. But that wouldn't mean that I care more about winning or figuring some special fact than I do about being as rational as possible; it would just mean that I consider my ability to solve problems a judge of my rationality.

I don't understand what I loose out on if I pursue the Art for its own sake in the way described above. If you do know of something I would loose out on, or if you know Yudkowsky's original argument showing the infinite recursion when you motivate yourself to be rational by your love of rationality, then please comment and help me out.  Thanks ahead of time.

Timeline of major Eliezer Yudkowsky publications

14 lukeprog 18 September 2011 03:53PM

Is That Your True Rejection? by Eliezer Yudkowsky @ Cato Unbound

30 XiXiDu 07 September 2011 06:27PM

A response essay written by Eliezer Yudkowsky posted at Cato Unbound for the issue Brain, Belief, and Politics:

Is That Your True Rejection? by Eliezer Yudkowsky

Eliezer Yudkowsky suggests that the partial mutability of human traits is an auxiliary reason at best for Michael Shermer’s libertarianism. Take that fact away, and Shermer’s politics probably wouldn’t go with it. Yudkowsky says that his own small-l libertarian tendencies come from the long history of government incompetence, indifference, and outright malevolence. These, and not brain science, are the best reasons for libertarians to believe what they do.

Moreover, we make a logical error when we infer shares of causality from shares of observed variance; the relationship between nature and nurture is cooperative, not zero-sum. One thing, however, is clear: Human genetic variance is tiny, as indeed it must be for human beings all to constitute a single species. Environmental manipulation can only achieve so much in part because of this universal human inheritance.

The lead essay has been written by Michael Shermer:

Liberty and Science by Michael Shermer

Michael Shermer discusses scientific findings about belief formation. Beliefs, including political beliefs, are usually the result of automatic or intuitive moral judgments, not rational calculations. One cluster of those intuitions presumes that human nature is malleable; these usually produce a liberal politics. Another group of intuitions presumes that human nature is static; these tend to produce conservatism. But Shermer argues that humans really fall somewhere in between — malleable, within some important limits. He argues that this set of findings should produce a libertarian politics.

Hanson Debating Yudkowsky, Jun 2011

14 XiXiDu 03 July 2011 04:59PM

On Wednesday I debated my ex-co-blogger Eliezer Yudkowsky at a private Jane Street Capital event (crude audio here, from 4:45; better video here [as of July 14]).

I “won” in the sense of gaining more audience votes — the vote was 45-40 (him to me) before, and 32-33 after the debate. That makes me two for two, after my similar “win” over Bryan Caplan (42-10 before, 25-20 after). This probably says little about me, however, since contrarians usually “win” such debates.

Our topic was: Compared to the farming and industrial revolutions, intelligence explosion first-movers will quickly control a much larger fraction of their new world. He was pro, I was con. We also debated this subject here on Overcoming Bias from June to December 2008. Let me now try to summarize my current position.

[...]

It thus seems quite unlikely that one AI team could find an architectural innovation powerful enough to let it go from tiny to taking over the world within a few weeks.

Link: overcomingbias.com/2011/07/debating-yudkowsky.html

Eliezer passes 100,000 karma points

14 lukeprog 09 March 2011 07:59PM

Eliezer Yudkowsky has passed an arbitrary milestone: 100,000 karma points on Less Wrong. Allow me just a moment to celebrate this like we celebrate other arbitrary milestones, like birthdays.

I think that Eliezer's karma score vastly under-rates his relative contribution to this site. For example, his score is only about 13x my own score, but I think it's obvious he has contributed far more than 13x as much value to this community as I have.

This is probably due to the fact that good posts today get far more upvotes than earlier good posts, when I suspect the community was smaller. For example, my rather simple and insignificant post Secure Your Beliefs received 34 upvotes, which is more than almost any of Eliezer's epic and brilliant posts of the past have received, for example Terminal Values and Instrumental Values.

So at this arbitrary milestone, I'd just like to say a quick word to Eliezer:

 

Thanks.

You've done a lot.

 

Okay, that's all! I hope this doesn't come across as "sucking up to the Dear Leader," but instead as the sincere appreciation it is. There is a good reason I list Eliezer as one of my heroes-even-though-we-shouldn't-have-'heroes' over here.

Discussion for Eliezer Yudkowsky's paper: Timeless Decision Theory

10 Alexei 06 January 2011 12:28AM

I have not seen any place to discuss Eliezer Yudkowsky's new paper, titled Timeless Decision Theory, so I decided to create a discussion post. (Have I missed an already existing post or discussion?)