The Unique Games Conjecture and FAI: A Troubling Obstacle
I am not a computer scientist and do not know much about complexity theory. However, it's a field that interests me, so I occasionally browse some articles on the subject. I was brought to https://www.simonsfoundation.org/mathematics-and-physical-science/approximately-hard-the-unique-games-conjecture/ by a link on Scott Aaronson's blog, and read the article to reacquaint myself with the Unique Games Conjecture, which I had partially forgotten about. If you are not familiar with the UGC, that article will explain it to you better than I can.
One phrase in the article stuck out to me: "there is some number of colors k for which it is NP-hard (that is, effectively impossible) to distinguish between networks in which it is possible to satisfy at least 99% of the constraints and networks in which it is possible to satisfy at most 1% of the constraints". I think this sentence is concerning for those interested in the possibility of creating FAI.
It is impossible to perfectly satisfy human values, as matter and energy are limited, and so will be the capabilities of even an enormously powerful AI. Thus, in trying to maximize human happiness, we are dealing with a problem that's essentially isomorphic to the UGC's coloring problem. Additionally, our values themselves are ill-formed. Human values are numerous, ambiguous, even contradictory. Given the complexities of human value systems, I think it's safe to say we're dealing with a particularly nasty variation of the problem, worse than what computer scientists studying it have dealt with.
Not all specific instances of complex optimization problems are subject to the UGC and thus NP-hard, of course. So this does not in itself mean that building an FAI is impossible. Also, even if maximizing human values is NP-hard (or maximizing the probability of maximizing human values, or maximizing the probability of maximizing the probability of human values), we can still assess a machine's code and actions heuristically. However, even the best heuristics are limited, as the UGC itself demonstrates. At bottom, all heuristics must rely on inflexible assumptions of some sort.
What is optimization power, formally?
I'm interested in thinking formally about AI risk. I believe that a proper mathematization of the problem is important to making intellectual progress in that area.
I have been trying to understand the rather critical notion of optimization power. I was hoping that I could find a clear definition in Bostrom's Superintelligence. But having looked in the index at all the references to optimization power that it mentions, as far as I can tell he defines it nowhere. The closest he gets is defining it in terms of rate of change and recalcitrance (pp.62-77). This is an empty definition--just tautologically defining it in terms of other equally vague terms.
Looking around, this post by Yudkowsky, "Measuring Optimization Power", doesn't directly formalize optimization power. He does discuss how one would predict or identify whether a system were the result of an optimization process in a Bayesian way:
The quantity we're measuring tells us how improbable this event is, in the absence of optimization, relative to some prior measure that describes the unoptimized probabilities. To look at it another way, the quantity is how surprised you would be by the event, conditional on the hypothesis that there were no optimization processes around. This plugs directly into Bayesian updating: it says that highly optimized events are strong evidence for optimization processes that produce them.
This is not, however, a definition that can be used to help identify the pace of AI development, for example. Rather, it is just an expression of how one would infer anything in a Bayesian way, applied to the vague 'optimization process' phenomenon.
Alex Altair has a promising attempt at formalization here but it looks inconclusive. He points out the difficulty of identifying optimization power with just the shift in the probability mass of utility according to some utility function. I may be misunderstanding, but my gloss on this is that defining optimization power purely in terms of differences in probability of utility doesn't say anything substantive about how a process has power - which is important if it is going to be related to some other concept like recalcitrance in a useful way.
Has there been any further progress in this area?
It's notable that this discussion makes zero references to computational complexity, formally or otherwise. That's notable because the informal discussion about 'optimization power' is about speed and capacity to compute--whether it be brains, chips, or whatever. There is a very well-developed formal theory of computational complexity that's at the heart of contemporary statistical learning theory. I would think that the tools for specifying optimization power would be in there somewhere.
Those of you interested in the historical literature on this sort of thing may be interested in cyberneticists Rosenblueth, Wiener, and Bigelow's 1943 paper "Behavior, Purpose and Teleology", one of the first papers to discuss machine 'purpose', which they associate with optimization, but in the particular sense of a process that is driven by a negative feedback loop as it approaches its goal. That does not exactly square with an 'explosive' teleology. This is one indicator that explosively purposeful machines might be quite rare or bizarre. In general, the 20th century cybernetics movement has a lot in common with the contemporary AI research community. Which is interesting, because its literature is rarely directly referenced. I wonder why.
Moloch: optimisation, "and" vs "or", information, and sacrificial ems
Go read Yvain/Scott's Meditations On Moloch. It's one of the most beautiful, disturbing, poetical looks at the future that I've ever seen.
Go read it.
Don't worry, I can wait. I'm only a piece of text, my patience is infinite.
De-dum, de-dum.
You sure you've read it?
Ok, I believe you...
Really.
I hope you wouldn't deceive an innocent and trusting blog post? You wouldn't be a monster enough to abuse the trust of a being as defenceless as a constant string of ASCII symbols?
Of course not. So you'd have read that post before proceeding to the next paragraph, wouldn't you? Of course you would.
Academic Moloch
Ok, now to the point. The "Moloch" idea is very interesting, and, at the FHI, we may try to do some research in this area (naming it something more respectable/boring, of course, something like "how to avoid stable value-losing civilization attractors").
The project hasn't started yet, but a few caveats to the Moloch idea have already occurred to me. First of all, it's not obligatory for an optimisation process to trample everything we value into the mud. This is likely to happen with an AI's motivations, but it's not obligatory in general.
One way of seeing this is the difference between "or" and "and". Take the democratic election optimisation process. It's clear, as Scott argues, that this optimises badly in some ways. It encourages appearance over substance, some types of corruption, etc... But it also optimises along some positive axes, with some clear, relatively stable differences between the parties which reflect some voters' preferences, and punishment for particularly inept behaviour from leaders (I might argue that the main benefit of democracy is not the final vote between the available options, but the filtering out of many pernicious options because they'd never be politically viable). The question is whether these two strands of optimisation can be traded off against each other, or if a minimum of each is required. So can we make a campaign that is purely appearance-based, without any substantive position ("or": maximum on one axis is enough), or do you need a minimum of substance and a minimum of appearance to buy off different constituencies ("and": you need some achievements on all axes)? And no, I'm not interested in discussing current political examples.
Another example Scott gave was of the capitalist optimisation process, and how it in theory matches customers' and producers' interests, but could go very wrong:
Suppose the coffee plantations discover a toxic pesticide that will increase their yield but make their customers sick. But their customers don't know about the pesticide, and the government hasn't caught up to regulating it yet. Now there's a tiny uncoupling between "selling to [customers]" and "satisfying [customers'] values", and so of course [customers'] values get thrown under the bus.
This effect can be combated to some extent with extra information. If the customers (or journalists, bloggers, etc...) know about this, then the coffee plantations will suffer. "Our food is harming us!" isn't exactly a hard story to publicise. This certainly doesn't work in every case, but increased information is something that technological progress would bring, and this needs to be considered when asking whether optimisation processes will inevitably tend to a bad equilibrium as technology improves. An accurate theory of nutrition, for instance, would have great positive impact if its recommendations could be measured.
Finally, Zack Davis's poem about the em stripped of (almost all) humanity got me thinking. The end result of that process is tragic for two reasons: first, the em retains enough humanity to have curiosity, only to get killed for it. And secondly, that em once was human. If the em were entirely stripped of human desires, the situation would be less tragic. And if the em were further constructed by a process that didn't destroy any humans, this would be even more desirable. Ultimately, if the economy could be powered by entities developed non-destructively from humans, and which were clearly not conscious or suffering themselves, this would be no different from powering the economy with the non-conscious machines we use today. This might happen if certain pieces of a human-em could be extracted, copied and networked into an effective, non-conscious entity. In that scenario, humans and human-ems could be the capital owners, and the non-conscious modified ems could be the workers. The connection of this with the Moloch argument is that it shows that certain nightmare scenarios could, in some circumstances, be adjusted into much better outcomes with a small amount of coordination.
The point of the post
The reason I posted this is to get people's suggestions about ideas relevant to a "Moloch" research project, and what they thought of the ideas I'd had so far.
Optimizing Workouts for Intellectual Performance
So this year I've stopped working out, and my grades have improved drastically, but at the cost of losing muscle mass and gaining fat, and becoming physically slower and lazier just as I became faster and more active intellectually. One effect I especially noticed was the disappearance of that perpetual state of happiness/satisfaction that comes from frequent physical exertion, which I think had a tendency to get in the way of a feeling of urgency regarding studies; why bother with tiresome and frustrating intellectual exercise when physical exercise yielded results and pleasure/satisfaction much more easily and reliably?
Anyway, this got me thinking: "I need to figure out a training regimen that is optimized for intellectual performance." Aspects that might be interesting to work on would be:
- getting as much blood (oxygen, nutrients) as possible to the brain, whenever needed.
- minimizing the amount of other tissue (including muscle in excess of what is strictly needed for a comfortable daily life, and digestive organs in excess of what is needed to get the nutrients from the food).
- optimizing the diet in order to feed the brain according to its needs while avoiding dietary imbalances that would result in damage of some sort or another (too much sugar can damage the pancreas, too much protein and the kidneys can suffer, etc.)
- something that is easy and quick to implement and follow, relatively inexpensive and straightforward; the idea is to save as much time, resources and energy as possible for the needs of studying/working.
These ideas I'm throwing around from a position of extreme ignorance. I've tried hiring nutritionists, but their diets were optimized for bodybuilding, not for intellectual efficacy, and were incredibly troublesome to follow. They involved about five to eight meals a day and large amounts of meat or meat substitutes, which is expensive to sustain, and left me in a perpetual state of either hunger or digestive lethargy, plus permanent muscular soreness from the training regime that goes with it... and then there are the supplements.
So, yeah, I'm no gwern, but I'd love to figure out a diet that allows me to work at maximum efficacy. Other concerns, such as feeling strong or looking attractive or even dancing well, are quite far behind in priority. How should I go about this? How about you lads and ladies? What's your experience with dieting/working-out? More importantly, what does the research say?
P.S. I tried to read "Good Calories Bad Calories", but I never managed to finish it: it spent so much time attacking the current paradigm that I grew tired of waiting for it to actually list and summarize its recommendations. If anyone here finished reading that and drew out the conclusions, I'd love to hear them.
P.P.S. The main post will update as the discussion advances; once enough proper information is gathered, a top level post might emerge.
Modifying Universal Intelligence Measure
In 2007, Legg and Hutter wrote a paper using the AIXI model to define a measure of intelligence. It's pretty great, but I can think of some directions of improvement.
- Reinforcement learning. I think this term and formalism are historically from much simpler agent models which actually depended on being reinforced to learn. In its present form (Hutter 2005 section 4.1) it seems arbitrarily general, but it still feels kinda gross to me. Can we formalize AIXI and the intelligence measure in terms of utility functions, instead? And perhaps prove them equivalent?
- Choice of Horizon. AIXI discounts the future by requiring that total future reward is bounded, and therefore so does the intelligence measure. This seems to me like a constraint that does not reflect reality, and possibly an infinitely important one. How could we remove this requirement? (Much discussion on the "Choice of the Horizon" in Hutter 2005 section 5.7).
- Unknown utility function. When we reformulate it in terms of utility functions, let's make sure we can measure its intelligence/optimization power without having to know its utility function. Perhaps by using an average of utility functions weighted by their K-complexity.
- AI orientation. Finally, and least importantly, it tests agents across all possible programs, even those which are known to be inconsistent with our universe. This might be okay if your agent is playing arbitrary games on a computer, but if you are trying to determine how powerful an agent will be in this universe, you probably want to replace the Solomonoff prior with the posterior resulting from updating the Solomonoff prior with data from our universe.
Any thoughts or research on this by others? I imagine lots of discussion has occurred over these topics; any references would be appreciated.
Roommate interest and coordination thread
This thread is for the discussion of options for people interested in changing their living environments some time in the next year or so. It's a place to:
- Share your situation to get an outside view
- Get on the radar of potential roommates
- Discuss existing communities or places that may be a good fit
- Describe what you're looking for in a living environment
- Post your procedure for deciding where to live
- Coordinate with others to find compatible roommates
- Discuss which factors are relevant to deciding where to live
- Post resources or data relevant to deciding where to live
Whether you're graduating from college, moving for a new job, or looking to further optimize your living environment for other reasons, talking with others can help you identify options, catch inaccurate beliefs or poor reasoning, meet potential roommates, and more. Thanks to everyone who contributes!
(This thread has been on my mind for a while. Reading this recent roommate-seeking post inspired me to actually write and post it. I'll post my own situation in the comments below.)
To discuss the concept of this thread (rather than participating in the thread's intended discussion), please reply to this comment. Credit goes to the open transactions thread and group rationality diary for some of the style and wording of this post.
Thoughts and problems with Eliezer's measure of optimization power
Back in the day, Eliezer proposed a method for measuring the optimization power (OP) of a system S. The idea is to get a measure of how small a target the system can hit:
You can quantify this, at least in theory, supposing you have (A) the agent or optimization process's preference ordering, and (B) a measure of the space of outcomes - which, for discrete outcomes in a finite space of possibilities, could just consist of counting them - then you can quantify how small a target is being hit, within how large a greater region.
Then we count the total number of states with equal or greater rank in the preference ordering to the outcome achieved, or integrate over the measure of states with equal or greater rank. Dividing this by the total size of the space gives you the relative smallness of the target - did you hit an outcome that was one in a million? One in a trillion?
Actually, most optimization processes produce "surprises" that are exponentially more improbable than this - you'd need to try far more than a trillion random reorderings of the letters in a book, to produce a play of quality equalling or exceeding Shakespeare. So we take the log base two of the reciprocal of the improbability, and that gives us optimization power in bits.
For example, assume there were eight equally likely possible states {X0, X1, ... , X7}, and S gives them utilities {0, 1, ... , 7}. Then if S can make X6 happen, there are two states better or equal to its achievement (X6 and X7), hence it has hit a target filling 1/4 of the total space. Hence its OP is log2 4 = 2. If the best S could manage is X4, then it has only hit half the total space, and has an OP of only log2 2 = 1. Conversely, if S reached the perfect X7, 1/8 of the total space, then it would have an OP of log2 8 = 3.
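Eliezer's worked example above translates into a few lines of code. This is a minimal sketch (the function name and the equiprobable-states assumption are mine, not from the original post):

```python
import math

def optimization_power(utilities, achieved):
    """Optimization power in bits: log2 of (total states / states at least
    as good as the achieved one), assuming all states are equally likely."""
    at_least_as_good = sum(1 for u in utilities if u >= achieved)
    return math.log2(len(utilities) / at_least_as_good)

utilities = list(range(8))               # eight states with utilities 0..7
print(optimization_power(utilities, 6))  # top quarter of the space: 2.0 bits
print(optimization_power(utilities, 4))  # top half: 1.0 bit
print(optimization_power(utilities, 7))  # top eighth: 3.0 bits
```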
How to measure optimisation power
As every school child knows, an advanced AI can be seen as an optimisation process - something that hits a very narrow target in the space of possibilities. The Less Wrong wiki entry proposes some measure of optimisation power:
One way to think mathematically about optimization, like evidence, is in information-theoretic bits. We take the base-two logarithm of the reciprocal of the probability of the result. A one-in-a-million solution (a solution so good relative to your preference ordering that it would take a million random tries to find something that good or better) can be said to have log2(1,000,000) = 19.9 bits of optimization.
This doesn't seem a fully rigorous definition - what exactly is meant by a million random tries? Also, it measures how hard it would be to come up with that solution, but not how good that solution is. An AI that comes up with a solution that is ten thousand bits harder to find, but only a tiny bit better than the human solution, is not one to fear.
Other potential measurements could be taking any of the metrics I suggested in the reduced impact post, but used in reverse: to measure large deviations from the status quo, not small ones.
Anyway, before I reinvent the coloured wheel, I just wanted to check whether there was a fully defined agreed upon measure of optimisation power.
Why would we think artists perform better on drugs ?
Introduction
It is common knowledge that many artists have used drugs (alcohol, opiates, cannabis, LSD, ...) and that this accounts for part of their creativity. This common knowledge is usually wielded against people advocating rationality, in sentences like "but with only your rationality, we wouldn't have much art", "you need chaos to make art" or even "the best artists were that great because they were irrational".
Eliezer partly addressed the issue in the lawful intelligence Sequence. While this Sequence is very interesting, I feel it didn't completely address the issue (unlike most of the Sequences). My hypothesis is that it's mostly focused on what's important for building a Friendly AI (which is a worthy goal; this should not be taken as a criticism), not so much on explaining creativity in actual humans. So I'm writing an article with my current thoughts on the topic, and I would welcome any additional argument, hypothesis, research paper, ... that anyone from the LW community can point me to. This article is not supposed to come to any definitive conclusion, but to show my current state of thinking on this issue. I hope to both give and receive in writing it.
Reasons for which it could be an illusion
Availability bias
The first question to ask about "it is common knowledge that many artists were using drugs" is: is this common knowledge true or not, and if not, why do so many people believe something which is false?
Availability bias comes with full power on this issue: when we hear that a given artist (musician, writer, painter, ...) was taking drugs, we add a "drug addict" tag to him. Or more accurately, we create a link between the "drug addict" node and his node in our belief network. When asked about artists who did take drugs, we can easily state many names: for example Hemingway, Van Gogh, the Beatles. When asked about artists who didn't take drugs... well, we usually don't have "did not take drugs" nodes in our belief networks, and no easy way to say that Asimov or Bach didn't take drugs.
Even when doing specific research, we can know with almost absolute certainty that Hemingway was drinking a lot of alcohol, but not so confidently that Asimov didn't. It's easier to be sure of the existence of something, than to be sure of its non-existence.
Reverse causality
The second question - if, even after considering the effect of availability bias, it still seems that artists take drugs more often than average - is to ask which way the causality flows. Statistical correlation points to a causality, but doesn't tell you its direction, nor whether it's direct or indirect.
There can be many reasons for which the causality works backwards: someone is not a good artist because he takes drugs, but rather takes drugs because he is an artist.
The lifestyle of a professional artist is usually different from the lifestyle of most other people. They usually don't have to wake up at 7 to be at work at 8, since they can work at any time. They also tend to be either very poor (many artists were only praised and recognized after their death) or very rich (for the few who reach success while still alive). And we know that very poor people tend to fall back on alcohol more often, while very rich people more frequently use some of the very expensive drugs like cocaine.
Being an artist also usually induces a higher uncertainty about the future than with most regular jobs, which may trigger the use of drugs to make the angst easier to withstand.
Common cause
Apart from direct causality one way or the other, a statistical correlation can also indicate that there is a hidden common cause behind the two phenomena. If artists take drugs more often, it could be because there is a common reason that pushes people both to be a great artist and to take drugs.
Many reasons can be invoked that way, given the superexponential hypothesis space. At the risk of privileging the hypothesis, I can name a few. For example, someone with an overdeveloped emotional sensitivity could be both great at creating art that appeals to our emotions and more tempted to use drugs as relief from over-experienced negative emotions. Or someone who happens to be an outcast could be more likely to produce art (since it is usually solitary work, not team work) and at the same time use more drugs to escape the pain of being an outcast.
So, where do we stand now ?
When faced with a statement such as "artists take drugs more often than average, so drugs help creativity", we can propose 4 different classes of hypotheses:
- The initial statement is wrong, artists don't take more drugs than average.
- Artists take more drugs than average, but the causality is reversed (it's being an artist that makes you take drugs, not the other way around).
- Artists take more drugs than average, but that's because of a common factor that increases likelihood of taking drugs and of making great art, not the drugs themselves increasing artistic creativity.
- This is true, for a reason or another, drugs help creativity.
We saw some possible reasons for 1., 2. and 3. Some of them seem to be very real to me, especially the availability bias, but I do not think they totally account for the facts.
As much as I would love to be able to stop here and say that drugs and chaos play no positive role in creativity - that creativity is purely lawful and rational - I fear that would be wishful thinking and a refusal to attack my belief's weak points. To state it more lightly: my D&D alignment could very well be lawful-good (as my friends tease me it is), but that shouldn't prevent me from admitting that chaos plays a positive role somewhere, if it actually does.
Reasons for which it could be real
Chaos and optimization
Generating great art can be seen as an optimization process. The actual function that evaluates a piece of art may be very complex, partly depending on the recipient, and its formalization unknown, but it can still be considered an optimization process: generating a book, or a painting, or a song that scores very high in most people's evaluation functions.
In general, chaos is not an optimization process. Adding chaos to an optimization process usually makes it worse. But there are known counter-examples, where an imperfect optimization process will gain from a slight controlled increase of chaos.
Lawfully controlled chaotic optimization
The first known example is the first optimization process ever: evolution. Evolution involves two parts: mutations, which are chaotic and done at random, and natural selection, which is lawful and selects the few mutations that happen to be positive. Roger Zelazny's picture of the universe as an equilibrium between Order and Chaos may come from that pattern. If you increase chaos too much in natural selection, the information will not be replicated enough from generation to generation, and not much optimization will occur. But if you don't have any mutation, if you remove all the chaos, the process will freeze too.
I remember an experiment from biology lessons in high school: take two small glass boxes, and put cotton with water and sugar at the bottom of each. Put some bacteria on one side of each box, and an antibiotic pill on the other side. Put box A in safe storage. Put box B in safe storage too, but every day, expose it to a small amount of UV light. The bacteria of box A will quickly spread over the cotton, but will not go anywhere close to the antibiotic pill. The bacteria of box B will start doing the same, but after two or three weeks, they conquer even the antibiotic area. After a longer time period, box A's bacteria will also overcome the antibiotic, but it will take them much longer. The UV light increased the mutation rate, and sped up the optimization process of evolution. But only a very small dose of UV light does that; overdose it, and the bacteria in box B will all die.
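The "right amount of chaos" effect is easy to reproduce in a toy model. Below is a minimal sketch (all names and parameters are mine, hypothetical, not from any cited source): a population of bit-string "genomes" evolves under random mutation plus lawful selection, and we compare no mutation, moderate mutation, and extreme mutation:

```python
import random

def evolve(mutation_rate, generations=200, pop_size=50, genome_len=40, seed=0):
    """Toy evolution: maximize the number of 1-bits in a genome.
    Random mutation supplies the chaos; truncation selection supplies the law."""
    rng = random.Random(seed)
    pop = [[0] * genome_len for _ in range(pop_size)]
    for _ in range(generations):
        # chaos: each bit flips with probability mutation_rate
        pop = [[bit ^ (rng.random() < mutation_rate) for bit in g] for g in pop]
        # law: keep the better half of the population, duplicated
        pop.sort(key=sum, reverse=True)
        pop = pop[:pop_size // 2] * 2
    return max(sum(g) for g in pop)

print(evolve(0.0))   # no chaos: evolution is frozen, fitness stays at 0
print(evolve(0.02))  # a little chaos: near-optimal genomes emerge
print(evolve(0.5))   # too much chaos: selection can't retain information
```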
That's what I call "lawfully controlled chaotic optimization": there is a lawful control process (here, natural selection) that selects among randomly tried solutions. This applies directly to artists: the control process (be it the filter of editors/publishers, or the filter of public reaction) is, up to a point, lawful, but the process that generates solutions could benefit from a slight increase in chaos. Or more exactly, the combined (generator + filter) algorithm could perform better with a slightly more chaotic generator. To borrow Eliezer's definition of creativity - "the creative surprise is the idea that ranks high in your preference ordering but low in your search ordering" - adding chaos to the preference ordering would be pointless, but adding chaos to the search ordering can allow more creative surprises to happen in a given finite time.
There is still a major difference between the two processes described here (evolution and human creativity): evolution uses a fully random generator, whereas the human brain is very good at generating non-random designs, allowing a much faster improvement rate. You'll never get a book by Hemingway or a painting by Van Gogh by randomly selecting letters or randomly throwing paint on a canvas; the chance of that is infinitesimally low. So the generator has to stay mostly lawful. Hemingway used words and respected the rules of grammar. Van Gogh painted something that looks very much like real sunflowers. A fully chaotic process would never produce anything near their masterpieces, even given billions of years. So artistic creativity must be mostly lawful, even in generating the hypotheses it selects from.
As spotted by Vaniver in the comments, Hemingway himself said something very similar to that thesis : "Write drunk; edit sober."
Avoiding local minima
One big problem with optimization processes is local minima. Most naive optimization processes, like gradient descent, will get trapped in local minima. Let's have a look at this curve (borrowed from Wikipedia):
If you start a naive optimization algorithm in the right part of the curve, you'll very likely end up in the local minimum, while the global minimum would rank much better in your preferences. Adding some form of controlled chaos to the algorithm is an easy way to increase the chance of reaching the global minimum, even in much more complex setups than this simple curve.
For a relatively broad class of problems - like selecting the best positions of nodes to minimize the length of edges when drawing a bitmap representation of a graph structure - an algorithm which works quite well and is simple to code is simulated annealing. It works by doing local optimization, but with a global temperature that adds chaos (the higher the temperature, the more random the process). The temperature itself decreases as the process runs, and ultimately reaches 0 (pure lawful optimization).
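As a sketch of the idea (the function names and the example curve are hypothetical, chosen just to have one local and one deeper global minimum; this is not the Wikipedia curve from the post):

```python
import math
import random

def simulated_annealing(f, x0, steps=10000, temp0=5.0, seed=0):
    """Minimize f by chaotic local moves; uphill steps are accepted with
    probability exp(-delta/T), and T cools toward 0 (pure lawful descent)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    for i in range(steps):
        t = temp0 * (1 - i / steps) + 1e-9       # cooling schedule
        x_new = x + rng.uniform(-0.5, 0.5)       # random local proposal
        fx_new = f(x_new)
        if fx_new < fx or rng.random() < math.exp((fx - fx_new) / t):
            x, fx = x_new, fx_new
    return x, fx

# A curve with a shallow local minimum near x = 3 and a deeper one near x = -2.
f = lambda x: 0.1 * (x - 3) ** 2 * (x + 2) ** 2 + x

# Started inside the shallow basin, annealing typically crosses the barrier
# and settles in the deeper basin, where a pure descent would stay stuck.
x, fx = simulated_annealing(f, x0=3.0)
print(x, fx)
```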
Such methods are of course "dirty hacks", that are used only when the problem is too complicated and we don't have a purely lawful algorithm that gives the answer, or (most of the time) when we do have one, but with an exponential complexity, meaning we can't run it in real life.
The same idea applies to human creativity: chaos wouldn't be needed, nor useful, if we had a fully working algorithm to write the best books or songs or make the best paintings. But since we don't, using a purely lawful process carries a risk (but yes, only a risk) of getting us stuck in a local minimum - improving on the methods of the previous generation of artists, but never inventing brand new styles of art. This is similar to the "jumping out of the system" described by Douglas Hofstadter and analyzed by Eliezer. JOOTSing is escaping a local minimum. It's leaving the safe warmth of the valley, climbing the cold and dangerous mountain top, to find another, more fertile valley on the other side. That requires violating the rules of "staying in the safe and warm valley".
(Note: there is something of an analogy between drug use and simulated annealing: drugs induce a state of high chaos, which then slowly subsides as the drug's effects wear off. Or at least so I'm told, since I've never tried them personally. But that seems like a surface analogy to me, so I won't give it much credit.)
Inhibitions and art
Another way to consider it is to look at inhibitions: the human mind contains a process that checks your actions (painting and writing in this case, but it applies more broadly) and sometimes says "no, don't do that, you'll look like a fool". Those inhibitions are usually there to protect us from blundering in social situations. But they are (like most of the human brain) imperfectly calibrated, and will tend to repress anything that departs from the current norm. Lowering those inhibitions increases the risk of blundering, but also the chance of doing something awesome.
This points to a much more general pattern, which applies when what matters is not improving your average result, but your chance of being one of the few best. Suppose you have a task to do, and two ways to achieve it. Way A is quite classical, and doesn't involve much risk. Way B is much less proven, with a real chance of doing either much better or much worse. Being a role player, I usually use dice rolls to model these kinds of processes. Let's say process A is 20d10: rolling a 10-sided die 20 times and summing. This gives an expected value of 110, with only 1% of rolls above 140. Process B is 2d100 (rolling a 100-sided die twice and summing). This gives a lower expected value, 101 instead of 110, but 18% of rolls land above 140. Here is a picture of the two processes (way A in green, way B in red), generated with a quick Python script:

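The original script isn't reproduced here, but the numbers are easy to check with a quick Monte Carlo simulation (a sketch of my own, not the original plotting script):

```python
import random

def roll(n, sides):
    """Roll an n-dice sum, each die with the given number of sides."""
    return sum(random.randint(1, sides) for _ in range(n))

random.seed(0)
trials = 100_000
a = [roll(20, 10) for _ in range(trials)]   # way A: 20d10
b = [roll(2, 100) for _ in range(trials)]   # way B: 2d100

mean_a = sum(a) / trials                    # close to 110
mean_b = sum(b) / trials                    # close to 101
tail_a = sum(x > 140 for x in a) / trials   # around 1%
tail_b = sum(x > 140 for x in b) / trials   # around 18%
```

The exact values agree: E[20d10] = 20 × 5.5 = 110 and E[2d100] = 2 × 50.5 = 101, while the flat, wide distribution of 2d100 puts far more mass in the high tail than the tightly concentrated 20d10.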
If what matters is doing well on average (say your score at the task maps directly to an amount of money between $2 and $200), then the best choice is simply the one with the higher expected value: A in this case. As you can see, the green curve peaks at a higher value.
But suppose what matters is not doing well on average, but being the best: 100 people perform the task, the best one takes the prize, and the rest get nothing. Then you expect at least one of the 100 to score above 140, even if they all use way A. So if you use way A, you have only 1 chance in 100 of being above 140; if you use way B, you have 18 chances in 100 of beating the 140 mark. Looking at the curve, a much bigger blob of the red curve extends to very high values.
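The winner-take-all effect can also be simulated directly (again a sketch of my own; the function name and trial counts are arbitrary): one contestant picks a strategy, the other 99 all play the safe 20d10, and we estimate the lone contestant's chance of taking the prize.

```python
import random

def roll(n, sides):
    return sum(random.randint(1, sides) for _ in range(n))

def win_prob(my_dice, field_dice, players=100, contests=4000):
    """Estimated chance that one player using my_dice strictly beats
    every one of (players - 1) rivals using field_dice."""
    wins = 0
    for _ in range(contests):
        mine = roll(*my_dice)
        if all(mine > roll(*field_dice) for _ in range(players - 1)):
            wins += 1
    return wins / contests

random.seed(0)
safe = win_prob((20, 10), (20, 10))   # conforming: about 1 chance in 100
risky = win_prob((2, 100), (20, 10))  # gambling on the high tail
```

By symmetry, a conformist among conformists wins about 1 time in 100; the risky 2d100 player, despite a lower average, wins an order of magnitude more often, because only the far right tail matters.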
When we look at the arts, we don't regard the average. Countless people write books or paint; almost everyone has at least tried once. What history remembers are the few best of their time. Not those who did better on average, but those who managed to do better than most of their peers: those at the right of the picture, where the green curve is nearly zero but the red curve still has mass.
The complexity of testing such hypotheses
I have put forward many hypotheses in this article, trying to explain the common knowledge that "so many great artists take drugs", and more generally to examine the reasons why chaos can, in some cases, improve a result.
All these hypotheses seem entirely plausible to me, and I would say they all play some role in the process. But saying "everything plays a role" is not saying much: a graph with all possible edges contains as much information as a graph with no edges. What would be required now is to measure how much each hypothesis contributes to the result; probably one or two will then turn out to account for most of it.
But how can we set up such a test? In physics, doing experiments is relatively easy. It can cost a lot, like building the LHC or sending the Hubble space telescope into orbit, but still, devising experiments is relatively easy. In the social sciences, it's often much harder. Most social science experiments are done on a panel of test subjects (with a control group, and so on). But here we are speaking of the best artists. How can we build such a panel? Defining who the best artists are is a very hard task. And then, getting them to participate in studies...
The simplest hypothesis to test, the availability bias, would require a procedure like this (the numbers can be adjusted):
1. Take 1000 people at random, from various ages, social classes and backgrounds.
2. Ask each of them to name the 10 artists they like the most (without, of course, mentioning the purpose of the experiment).
3. For each artist named by at least 4 people, check whether that artist took drugs.
4. Compare with the average rate of drug use.
But even that is not without trouble: for step 3, how can you be sure an artist didn't take drugs secretly, especially in times and places where drug use was prohibited or frowned upon? For step 4, how do you adjust for the variation in drug use across places and times?
Does anyone know of such a study? (I couldn't find any, but I'm not well versed in the art of searching for social science studies.)
The other hypotheses are even harder to test.
Conclusion
As Eliezer explained, pure chaos cannot lead to anything but static on a TV screen. Any optimization process, and art is one, requires a lawful part. But as I showed, for several reasons, an imperfect optimization process may perform better with a limited amount of added chaos. Since the human brain is an imperfect optimization process, it would not be surprising if, for the purpose of creating awesome pieces of art in limited time, some added chaos helped. On the other hand, there are other reasons why "artistic creativity requires some chaos" could become common knowledge even if it were not true. And it is very hard to tell the various reasons apart.
But even if some amount of chaos can help in generating exceptionally awesome pieces of art, that should not overshadow the fact that the lawful part of the process is absolutely required, and is in fact the most important part; nor that chaos can only be useful when the optimization is itself imperfect. Improving the quality of the optimization process (by, for example, raising the sanity waterline or better understanding the human brain) would lower the need for chaos to generate the same awesomeness.
PS: I'm posting this to Less Wrong Discussion for initial review, and because it's halfway between a "real" article and a call for discussion on the topic. Depending on the feedback, I hope to repost it to Main, hopefully improved.
Ranking the "competition" based on optimization power
Most long-term users of Less Wrong understand the concept of optimization power, and how a system can be called intelligent if it can restrict the future in significant ways. I believe that in today's world, only institutions come close to superintelligence in any significant way.
I believe it is important for us to have at least some outside view of which institutions and systems are powerful in today's world, so that we can at least see some outlines of how increasing optimization power will end up affecting ordinary people.
So, my question is: which present institutions or systems would you classify as having the most optimization power? Please explain your reasoning if you are mentioning a little-known institution. I present my own guesses after the break.