
What is optimization power, formally?

10 sbenthall 18 October 2014 06:37PM

I'm interested in thinking formally about AI risk. I believe that a proper mathematization of the problem is important to making intellectual progress in that area.

I have been trying to understand the rather critical notion of optimization power. I was hoping that I could find a clear definition in Bostrom's Superintelligence. But having looked in the index at all the references to optimization power that it mentions, as far as I can tell he defines it nowhere. The closest he comes is defining it in terms of rate of change and recalcitrance (pp. 62-77). This is an effectively empty definition: it just defines optimization power tautologically in terms of other equally vague terms.

Looking around, this post by Yudkowsky, "Measuring Optimization Power", doesn't directly formalize optimization power. He does discuss how one would predict or identify whether a system is the result of an optimization process in a Bayesian way:

The quantity we're measuring tells us how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities.  To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around.  This plugs directly into Bayesian updating: it says that highly optimized events are strong evidence for optimization processes that produce them.
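In symbols (my paraphrase, not notation from Yudkowsky's post): if $P_0$ is the prior measure describing the unoptimized probabilities and $E$ is the event we observe, the quantity he describes is roughly

$$\mathrm{OP}(E) \;=\; \log_2 \frac{1}{P_0(E)},$$

i.e. the surprisal of the event, in bits, under the hypothesis that no optimization process is at work.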

This is not, however, a definition that can be used to help identify the pace of AI development, for example. Rather, it is just an expression of how one would infer anything in a Bayesian way, applied to the vague 'optimization process' phenomenon.

Alex Altair has a promising attempt at formalization here, but it looks inconclusive. He points out the difficulty of identifying optimization power with just the shift in the probability mass of utility according to some utility function. I may be misunderstanding, but my gloss on this is that defining optimization power purely in terms of differences in probability of utility doesn't say anything substantive about how a process has power. Which is important if it is going to be related to some other concept like recalcitrance in a useful way.

Has there been any further progress in this area?

It's notable that this discussion makes zero references to computational complexity, formally or otherwise. That's notable because the informal discussion about 'optimization power' is about speed and capacity to compute--whether it be brains, chips, or whatever. There is a very well-developed formal theory of computational complexity that's at the heart of contemporary statistical learning theory. I would think that the tools for specifying optimization power would be in there somewhere.

Those of you interested in the historical literature on this sort of thing may be interested in cyberneticists Rosenblueth, Wiener, and Bigelow's 1943 paper "Behavior, Purpose and Teleology", one of the first papers to discuss machine 'purpose', which they associate with optimization, but in the particular sense of a process that is driven by a negative feedback loop as it approaches its goal. That does not exactly square with an 'explosive' teleology. This is one indicator that explosively purposeful machines might be quite rare or bizarre. In general, the 20th century cybernetics movement has a lot in common with the contemporary AI research community. Which is interesting, because its literature is rarely directly referenced. I wonder why.

Mutual Worth without default point (but with potential threats)

6 Stuart_Armstrong 31 July 2013 09:52AM

Though I planned to avoid posting anything more until well after baby, I found this refinement to MWBS yesterday, so I'm posting it while Miriam sleeps during a pause in contractions.

The mutual worth bargaining solution was built from the idea that the true value of a trade is having your utility function access the decision points of the other player. This gave the idea of utopia points: what happens when you are granted complete control over the other person's decisions. This gave a natural 1 to normalise your utility function. But the 0 point is chosen according to a default point. This is arbitrary, and breaks the symmetry between the top and bottom point of the normalisation.

We'd also want normalisations that function well when players have no idea what their opponents will be. This includes not knowing what their utility functions will be. Can we model what a 'generic' opposing utility function would be?

It's tricky, in general, to know what 'value' to put on an opponent's utility function: it's unclear what kind of utilities you would like to see them have. That's because game theory comes into play, with Nash equilibria, multiple solution concepts, bargaining and threats: there is no universal default for the result of a game between two agents. There are two situations, however, that are respectively better and worse than all others: the situation where your opponent shares your exact utility function, and the situation where they have the negative of it (they're essentially your 'anti-agent').

If your opponent shares your utility function, then there is a clear ideal outcome: act as if you and the opponent were the same person, acting to maximise your joint utility. This is the utopia point for MWBS, which can be standardised to take value 1.

If your opponent has the negative of your utility, then the game is zero-sum: any gain to you is a loss to your opponent, and there is no possibility for mutually pleasing compromise. But zero-sum games also have a single canonical outcome! For zero-sum games, the concepts of Nash equilibrium, minimax, and maximin are all equivalent (and are generally mixed outcomes). The game has a single defined value: each player can guarantee they get as much utility as that value, and the other player can guarantee that they get no more.

It seems natural to normalise that point to -1 (0 would be equivalent, but -1 feels more appropriate). Given this normalisation for each utility, the two utilities can then be summed and joint maximised in the usual way.
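As a concrete sketch of how the two reference points could be computed for a finite matrix game (this is my own illustrative code, not part of the solution's definition; it assumes numpy/scipy, and that my side of the interaction is summarised by a single payoff matrix with rows as my actions and columns as the opponent's):

```python
import numpy as np
from scipy.optimize import linprog

def zero_sum_value(payoff):
    """Value of the zero-sum game in which the row player maximises `payoff`
    and the column player (the 'anti-agent') minimises it, mixed strategies allowed."""
    payoff = np.asarray(payoff, dtype=float)
    m, n = payoff.shape
    # Variables: the row player's mixed strategy x (m entries) and the game value v.
    c = np.zeros(m + 1)
    c[-1] = -1.0                                    # maximise v  <=>  minimise -v
    A_ub = np.hstack([-payoff.T, np.ones((n, 1))])  # for each column j: v - x . payoff[:, j] <= 0
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # x sums to 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

def normalise(my_payoff):
    """Affine rescaling of my utility so that the utopia point (an opponent sharing
    my utility, jointly maximising it) maps to +1 and the anti-agent's zero-sum
    value maps to -1, as sketched above."""
    my_payoff = np.asarray(my_payoff, dtype=float)
    utopia = my_payoff.max()               # joint maximisation of my utility
    dystopia = zero_sum_value(my_payoff)   # what I can guarantee against my anti-agent
    return lambda u: 2 * (u - dystopia) / (utopia - dystopia) - 1

# Matching pennies with +/-1 payoffs: utopia point 1, zero-sum value 0.
f = normalise([[1, -1], [-1, 1]])
print(f(1), f(0), f(0.5))  # approximately 1.0, -1.0, 0.0
```

The linear program is just the standard one for the value of a finite zero-sum game; the utopia point is simply the best entry of the matrix, since a copy of me controlling both sides would steer straight there.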

This bargaining solution has a lot of attractive features - it's symmetric in minimal and maximal utilities, does not require a default point, reflects the players' relative power, and captures the spread of opponents' utilities that could be encountered without needing to go into game theory. It is vulnerable to (implicit) threats, however! If I can (potentially) cause a lot of damage to you and your cause, then when you normalise your utility, you get penalised because of what your anti-agent could do if they controlled my decision nodes. So just by having the power to do bad stuff to you, I come out better than I would otherwise (and vice-versa, of course).

I feel it's worth exploring further (especially what happens with multiple agents) - but for me, after the baby.

Best causal/dependency diagram software for fluid capture?

1 [deleted] 08 April 2013 07:20PM

I've found most graphing software too clunky, or carrying too much mental friction, for my purposes of creating graphically represented plans, converting written diagrams into digital form, or doing preference inference based on the structure of my goals (amongst other things).

So far the only tool that I've seen that reduces this friction is GraphViz [1], since I think I can literally just list down connection after connection in markup, with no care for structure or reasonableness, and then prune connections after I see how the entire thing looks. Point and click is for suckers.
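For instance, here is the kind of friction-free capture I mean, as a rough sketch (it assumes the `graphviz` Python package plus a Graphviz install; the node names are invented for illustration):

```python
from graphviz import Digraph

plan = Digraph('plan')

# Just list connection after connection, with no care for structure or
# reasonableness, then prune once the rendered layout is visible.
plan.edge('learn statistics', 'build model')
plan.edge('collect data', 'build model')
plan.edge('build model', 'write thesis')
plan.edge('write thesis', 'graduate')

plan.render('plan', format='png', cleanup=True)  # writes plan.png
```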

However, I also like Freemind's approach of quickly producing a visual map that is easily traversable; it just doesn't do much for me when the causality is more involved.

Are there any alternatives that anyone is aware of?

[1] If you are not familiar with GraphViz, see this amusing introduction that maps the social network in R. Kelly's hit hip hopera, "Trapped in the Closet".

Notes on Psychopathy

18 gwern 19 December 2012 04:02AM

This is some old work I did for SI. See also Notes on the Psychology of Power.

Deviant but not necessarily diseased or dysfunctional minds can demonstrate resistance to all treatment and attempts to change their mind (think No Universally Compelling Arguments). The premier example is probably psychopaths: no drug treatments are at all useful, nor are there any therapies with solid evidence of even marginal effectiveness (one widely cited chapter, “Treatment of psychopathy: A review of empirical findings”, concludes that some attempted therapies merely made them more effective manipulators! We’ll look at that later). While some psychopathic traits bear resemblance to general characteristics of the powerful, psychopaths are still a pretty unique group and worth looking at.

The main focus of my excerpts is on whether they are treatable, their effectiveness, possible evolutionary bases, and what other issues they have or don’t have which might lead one to not simply write them off as “broken” and of no relevance to AI.

(For example, if we were to discover that psychopaths were healthy human beings who were not universally mentally retarded or ineffective in gaining wealth/power, and were destructive and amoral despite being completely human and often socialized normally, then what does this say about the fragility of human values and about how likely it is that an AI will just be nice to us?)

continue reading »

Modifying Universal Intelligence Measure

2 Alex_Altair 18 September 2012 11:44PM

In 2007, Legg and Hutter wrote a paper using the AIXI model to define a measure of intelligence. It's pretty great, but I can think of some directions of improvement.
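(For reference, and quoting the definition from memory, so check the paper for the exact statement: the universal intelligence of an agent $\pi$ is a complexity-weighted sum of its expected total reward $V^{\pi}_{\mu}$ over the space $E$ of computable environments,

$$\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu},$$

where $K(\mu)$ is the Kolmogorov complexity of environment $\mu$.)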

  • Reinforcement learning. I think this term and formalism are historically from much simpler agent models which actually depended on being reinforced to learn. In its present form (Hutter 2005 section 4.1) it seems arbitrarily general, but it still feels kinda gross to me. Can we formalize AIXI and the intelligence measure in terms of utility functions, instead? And perhaps prove them equivalent?
  • Choice of Horizon. AIXI discounts the future by requiring that total future reward is bounded, and therefore so does the intelligence measure. This seems to me like a constraint that does not reflect reality, and possibly an infinitely important one. How could we remove this requirement? (Much discussion on the "Choice of the Horizon" in Hutter 2005 section 5.7).
  • Unknown utility function. When we reformulate it in terms of utility functions, let's make sure we can measure its intelligence/optimization power without having to know its utility function. Perhaps by using an average of utility functions weighted by their K-complexity.
  • AI orientation. Finally, and least importantly, it tests agents across all possible programs, even those which are known to be inconsistent with our universe. This might be okay if your agent is playing arbitrary games on a computer, but if you are trying to determine how powerful an agent will be in this universe, you probably want to replace the Solomonoff prior with the posterior resulting from updating the Solomonoff prior with data from our universe.

Any thought or research on this by others? I imagine lots of discussion has occurred over these topics; any referencing would be appreciated.

Notes on the Psychology of Power

34 gwern 27 July 2012 07:22PM

Luke/SI asked me to look into what the academic literature might have to say about people in positions of power. This is a summary of some of the recent psychology results.

The powerful or elite are: fast-planning abstract thinkers who take action (1) in order to pursue single/minimal objectives, are in favor of strict rules for their stereotyped out-group underlings (2) but are rationalizing (3) & hypocritical when it serves their interests (4), especially when they feel secure in their power. They break social norms (5, 6) or ignore context (1) which turns out to be worsened by disclosure of conflicts of interest (7), and lie fluently without mental or physiological stress (6).

What are powerful members good for? They can help in shifting among equilibria: solving coordination problems or inducing contributions towards public goods (8), and their abstracted Far perspective can be better than the concrete Near of the weak (9).

  1. Galinsky et al 2003; Guinote, 2007; Lammers et al 2008; Smith & Bargh, 2008
  2. Eyal & Liberman
  3. Rustichini & Villeval 2012
  4. Lammers et al 2010
  5. Kleef et al 2011
  6. Carney et al 2010
  7. Cain et al 2005; Cain et al 2011
  8. Eckel et al 2010
  9. Slabu et al; Smith & Trope 2006; Smith et al 2008

continue reading »

Thoughts and problems with Eliezer's measure of optimization power

17 Stuart_Armstrong 08 June 2012 09:44AM

Back in the day, Eliezer proposed a method for measuring the optimization power (OP) of a system S. The idea is to get a measure of how small a target the system can hit:

You can quantify this, at least in theory, supposing you have (A) the agent or optimization process's preference ordering, and (B) a measure of the space of outcomes - which, for discrete outcomes in a finite space of possibilities, could just consist of counting them - then you can quantify how small a target is being hit, within how large a greater region.

Then we count the total number of states with equal or greater rank in the preference ordering to the outcome achieved, or integrate over the measure of states with equal or greater rank.  Dividing this by the total size of the space gives you the relative smallness of the target - did you hit an outcome that was one in a million?  One in a trillion?

Actually, most optimization processes produce "surprises" that are exponentially more improbable than this - you'd need to try far more than a trillion random reorderings of the letters in a book, to produce a play of quality equalling or exceeding Shakespeare.  So we take the log base two of the reciprocal of the improbability, and that gives us optimization power in bits.

For example, assume there were eight equally likely possible states {X0, X1, ... , X7}, and S gives them utilities {0, 1, ... , 7}. Then if S can make X6 happen, there are two states better or equal to its achievement (X6 and X7), hence it has hit a target filling 1/4 of the total space. Hence its OP is log2 4 = 2. If the best S could manage is X4, then it has only hit half the total space, and has an OP of only log2 2 = 1. Conversely, if S reached the perfect X7, 1/8 of the total space, then it would have an OP of log2 8 = 3.
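As a minimal sketch of the uniform-measure case just described (my own code; the function name is chosen for illustration):

```python
import math

def optimization_power(achieved_utility, state_utilities):
    """OP in bits: log2 of the reciprocal of the fraction of (equally likely)
    states whose utility is at least that of the achieved outcome."""
    at_least_as_good = sum(1 for u in state_utilities if u >= achieved_utility)
    fraction = at_least_as_good / len(state_utilities)
    return math.log2(1 / fraction)

# The eight-state example above: utilities 0..7, all states equally likely.
states = list(range(8))
print(optimization_power(6, states))  # 2.0 bits (top quarter of the space)
print(optimization_power(4, states))  # 1.0 bit  (top half)
print(optimization_power(7, states))  # 3.0 bits (the single best state)
```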

continue reading »

The Craft And The Community: Wealth And Power And Tsuyoku Naritai

3 [deleted] 23 April 2012 04:06PM

In this post, I'll try to tackle the question of whether this community and its members should focus more effort and resources on improving their strength as individuals and as a community than on directly tackling the problem of the singularity. I'll start off with a personal anecdote, because, while I know it's not indispensable, I think anecdotes help the reader to think in near rather than far mode, and this post's topic is already too easily thought of in far mode in the first place.

The other day, I was in an idle conversation with a cab driver when he asked me: What would you do if you won the lottery? Is there some particular dream you have, such as travelling the world or something? I said (and I apologize in advance for the grandiosity and egotism of what follows, mostly because it might show a poor appraisal of my own competence and ability)

Well, it's not like I would ever play at the lottery, but if I did, and somehow won, I would

  • Pay for the very best tutors and the very best education (I'm thinking Master's degrees, a PhD, and so on; that's all pretty damned expensive depending on where you take it) in my chosen speciality.
  • Pay for the best aid in achieving peak sustainable physical, mental, and emotional condition (as optimized for the struggles and stresses of a daily life of extreme academic exertion, not for, say, performing on the battlefield, at the Olympics, or in competitive chess). Coaches, gurus, chemicals, whatever it takes.
  • Spend one or two or even three years around the world learning as many "important" languages as I can. Not in order of ease or priority: Portuguese, Italian, Russian, Mandarin Chinese, Japanese, Hindi, Urdu, Farsi, and Turkish and Arabic and Hebrew and their Ancient variants, because of all the doors these could open... and Basque and Navajo (those two would be just for the hell of it). (I already know English, Spanish, French, and a fair amount of German and Arabic).
  • With the acquired technical knowledge and skills, and the help of the contact network and the better understanding of human nature that learning so many languages and exploring so many cultures and travelling so much will have netted me, use the remaining money to start a business, one that involves as many people as possible in a way such that I can train them to be a Chaos Legion.
  • Hopefully, once I have achieved enough profits to make the growth of my business secure, donate a constant stipendium to my favourite nonprofits.
  • In my old age, use the returns from all the previous efforts to found a school (actually an integral education system; think something between Summerhill School and the Mahora Academy Complex) which would be optimized for great justice and the rigorous use, promotion, and exponential spread of modern rationality.

 

 

My reply surprised both of us. Him, because it was atypical (apparently most people would spend it on luxury items and so on; that is, they would spend their newfound money on signalling that they have it... I think the mistake comes from seeing rich people doing it and then assuming that that's what you should do if you become rich, the only other option apparently being saving it up in an account). That a modern rationalist came up with an atypical answer to such a question is only to be expected.

But I was surprised too, because I found it strange that what I thought I ought to do and what I wanted to do coincided so perfectly. I wasn't even expecting those last two points; they sort of naturally came out in the spur of the moment. Upon further thought, I was also surprised that this turned out to be merely an exaggeration and heavy scaling-up of my pre-existing plan, which I am already attempting to follow with far less material means. That is to say, the dramatic change in money did not fundamentally change what I wanted to do with my (currently limited) lifetime.

But then I asked myself: if my priority is reducing existential risk, why am I not giving all the money to my favourite nonprofits immediately?

And that's where it hit me: I wanted to make myself stronger. And the point I'm trying to make is that, well, so should we all. Why?

There's a strong selfish component to that (not that there's anything wrong with healthy selfishness), but, for someone who considers existential risk an extremely important factor, enlightened self-interest might still be on the side of donating immediately.

But it might also be a sound strategy, perhaps a sounder one, to exponentially increase our ability to help fight existential risk (in terms of what we fear) and to improve the general level of human rationality (in terms of what we desire); I understand that we would all be happier in a world with more rational people, for many, many reasons, not all of which are altruistic. So, how would we go about this? I submit to you this tentative strategy draft.

 

  • Step Zero: Improve our own physical and mental condition. We need the best possible hardware to operate on. This will give us raw ability. The ability to gain abilities, so to speak. This includes making ourselves happy (which itself overlaps with and is enabled by some of the following points: it's a virtuous cycle).
  • Step One: Increasing our own personal, intrinsic worth: buying with our money goods that improve our ability to both obtain and enjoy more goods, and that could never be taken away from us by economic transaction. While we could teach ourselves these at only the opportunity cost of not spending that time earning wages, well-thought-out and carefully applied expenditure can significantly accelerate and smooth the process. This will make us powerful, useful tools, for our own goals and for the goals of those who would associate with us (employers, allies, and so on). This will give us authority. We become acknowledged experts in a socially useful field. Scientists, engineers, artists, and other highly skilled folks are on this level: they can already get a lot done and change the world, but, as the Creationism issue (among others) proves, it isn't nearly enough. You often don't get to choose what to work on or how many resources are made available to you, you don't have any control over the fruit of your work once it's done and released, and you may always have trouble getting people to follow your advice, no matter how much you think you know better. Step One is the step of Intrinsic Power.
  • Step Two: Increasing our ability to take advantage of social status: learning and perfecting languages, social skills, communication and manipulation tools, dress sense and a sense of signalling, dancing, romantic and sexual intelligence and skill (how many powerful people had their careers and/or reputations ruined forever because of badly handled sex and romance?)... That is, we will learn and master the rules of the game, the game we are all playing all the time by virtue of associating with other human beings, and avoid defecting by accident, among other possible mistakes. This will give us urbanity. Together with authority, it already means both credit and influence. At this level, you can actually get a lot more stuff done, because you're much better at persuading people to want to follow your suggestions of their own volition.
    • This includes the ability to delegate, divide work and manage specialists, empower and motivate people to help you, helping them grow themselves in the process, etc. Step Two is the step of Soft Power.

A lot of effort has already been expended by the community in working on these first steps. But there's a third step that isn't getting worked on much, perhaps because of aesthetic values, perhaps because it's one of the most dangerous to wield, both to the world and to ourselves and our own personal integrity:

  • Step Three: Increase how much coercive power we hold over how many of our fellow human beings: the ability to make them do things or else. My impression is that economic power (both affluence and assets) is much more secure and far less vulnerable than other sources such as, say, media influence, political clout, or social power brokering (which is greatly enhanced by joining support groups or becoming one ourselves), although the feedback between these tends to be positive, on average, and overlap and migration between them is hardly unheard of. This power is increased exponentially, and is much easier to maintain, by having both Step One (you actually know what you're doing, or at least where to get the information, and are more able to judge it) and Step Two (you know what not to do and how to achieve the greatest results with the minimal expenditure of your power) under your belt, and of course all three steps profit from Step Zero. At this level, to a certain extent, people will do what you want them to, their own feelings, initiatives and desires factoring far less into the actions they end up choosing than they otherwise would.

 

Those are partly selfish goals unto themselves: power means freedom to do what you want, and that and high social status are already very enjoyable for their own sake. Additionally, the more of us achieve them (and the larger the capacity in which they achieve them), the more resources they can get assigned and the more support they can gather (or force) for the sake of efforts towards preventing existential risk. But I suggest that they be mainly planned, optimized and instrumentalized for Step Four, the most dangerous of all:

 

  • Step Four: Use the gained knowledge, skills, assets, and position to improve the overall level of both cognitive and instrumental rationality of humanity.

 

Which has the following advantages I can think of, listed without regard for altruism or selfishness:

 

  • Humans are enabled to be far more successful in the pursuit of happiness, whatever that is, and in otherwise improving themselves, their lives, and the world around them. Their liberty in terms of choices is also greatly increased, once free of akrasia and with an enhanced ability to identify and accurately assess choices.
  • We as rationalists feel far less isolated and vulnerable and far more at home in a world where there are more people like us and where people are more like us (not quite the same thing). Life will generally be more fun and interesting.
  • We'll get a much wider pool of potential candidates from which to draw people with the ability to help prevent existential risk, and the cost and difficulty of gathering more support and resources will be greatly diminished. In other words, we'll be much more effective at preventing existential risk.

 

Does achieving Step Four mean humanity will actually be in less danger of self-destructing at that point? It's not a rhetorical question, and I don't think its answer is trivial: in particular, having many half-rationalists (I might well still be one myself) running around might represent a considerable danger, and one that could be sustained over time. However, projects such as Methods of Rationality or The Centre for Modern Rationality, as well as this site's very existence, seem to hint that some of the smartest among us are willing to take the risk.

So, the immediate question I ask of you in earnest, the whole point of this post: How do we go about spending our money and effort in the most effective way to prevent existential risk? How much do we expend in directly attacking the problem as we are, and how much do we expend in actually making ourselves stronger?

In sillier terms: Should the Z Warriors go and try to confront Cell right now, before he grows too strong to beat, or should they avoid the fight and go train instead? (assume that they do nothing with their lives but be in fights, train to prepare for fights, or run away from fights they are not prepared for yet)

Ranking the "competition" based on optimization power

-1 blogospheroid 17 October 2010 04:55PM

Most long-term users on Less Wrong understand the concept of optimization power and how a system can be called intelligent if it can restrict the future in significant ways. Now, I believe that in this world, only institutions are close to superintelligence in any significant way.

I believe it is important for us to have at least some outside idea of which institutions/systems are powerful in today's world, so that we can at least see some outlines of how their increasing optimization power will end up affecting normal people.

So, my question is: what are the present institutions or systems that you would classify as having the maximum optimization power? Please present your reasoning if you feel you are mentioning some little-known institution. I am presenting my guesses after the break.

continue reading »