Less Wrong is a community blog devoted to refining the art of human rationality.
This post is for all the people who have been following Arbital's progress since 2015 via whispers, rumors, and clairvoyant divination. That is to say: we didn't do a very good job of communicating. I hope this post corrects some of that.
The top question on your mind is probably: "Man, I was promised that Arbital would solve X! Why hasn't it solved X already?" Where X could be intuitive explanations, online debate, all LessWrong problems, AGI, or just cancer. Well, we did try to solve the first two, and it didn't work. Math explanations didn't work because we couldn't find enough people who would spend the time to write good ones. (That said, we did end up with some decent posts on abstract algebra. Thank you to everyone who contributed!) Debates didn't work because... well, it's a very complicated problem. There was also some disagreement within the team about the best approach, and we ended up moving too slowly.
So what now?
You are welcome to use Arbital in its current version. It's mostly stable, though a little slow sometimes. It has a few features some might find very helpful for their type of content. Eliezer is still writing AI Alignment content on it, and he heavily relies on the specific Arbital features, so it's pretty certain that the platform is not going away. In fact, if the venture fails completely, it's likely MIRI will adopt Arbital for their personal use.
I'm starting work on Arbital 2.0. It's going to be a (micro-)blogging platform. (If you are a serious blogger / Tumblr user, let me know; I'd love to ask you some questions!) I'm not trying to solve online debates, build LW 2.0, or cure cancer. It's just going to be a damn good blogging platform. If it goes well, then at some point I'd love to revisit the Arbital dream.
I'm happy to answer any and all questions in the comments.
Some of the ensuing responses discussed the fidelity with which such a simulation would need to be run, in order to keep the population living within it guessing as to whether they were in a digital simulation, which is a topic that's been discussed before on LessWrong:
If a simulation can be not just run, but also loaded from previously saved states and then edited, it should be possible for the simulation's Architect to start it running at low granularity and wait for some inhabitant to notice an anomaly. The Architect could then rewind a little, use a more accurate but computing-intensive algorithm in the relevant parts of the inhabitant's timecone, and edit the saved state to include that additional detail, before setting the simulation running again and waiting for the next anomaly.
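The rewind-and-refine loop described above can be sketched as a toy program. Everything here is invented for illustration (the "simulation" is just a counter, and anomalies fire at random); the point is only the control flow of save, detect, rewind, refine, resume:

```python
import random

def run_with_lazy_detail(steps, anomaly_prob=0.1, seed=0):
    """Advance a toy 'simulation' with a cheap coarse step; when an anomaly
    fires, rewind to the last saved state and redo that step with the
    (notionally expensive) fine-grained step, recording which steps were
    refined."""
    rng = random.Random(seed)
    state, saves, refined = 0, [], []
    for t in range(steps):
        saves.append(state)                 # save state before stepping
        state += 1                          # coarse step (cheap)
        if rng.random() < anomaly_prob:     # an inhabitant notices something
            state = saves[-1]               # rewind a little
            state += 1                      # redo with the fine step
            refined.append(t)               # this step now has full detail
    return state, refined
```

The attraction for the Architect is that the expensive step only ever runs on the (hopefully small) fraction of the timeline where anyone is looking closely.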
construct a system with easy-to-verify but arbitrarily-hard-to-compute behavior ("Project: Piss Off God"), and then scrupulously observe its behavior. Then we could keep making it more expensive until we got to a system that really shouldn't be practically computable in our universe.
but I'm wondering how easy that would be.
The problem would need to be physical (for example, make a net with labelled strands of differing lengths joining the nodes, then hang it from one corner), else humanity would have to be doing as much work as the simulation.
The solution should be discrete (for example, what are the labels on the strands making up the limiting path that prevents the lowest point from hanging further down?).
The solution should be not just analytic, but also difficult to get via numerical analysis.
The problem should be scalable to very large sizes (so, for example, the net problem wouldn't work, because with large size nets making the strands sufficiently different in length that you could tell two close solutions apart would be a limiting factor)
And, ideally, the problem would be one that occurs (and is solved) naturally, such that humanity could just record data in multiple locations over a period of years, then later decide which examples of the problem to verify. (See this paper by Scott Aaronson: "NP-complete Problems and Physical Reality")
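The easy-to-verify/hard-to-compute asymmetry that the proposal relies on can be illustrated with a standard NP-complete problem, subset sum, as a toy stand-in for the physical systems discussed above: finding a solution takes exponential search, while checking a claimed solution is linear.

```python
from itertools import combinations

def solve_subset_sum(nums, target):
    """Find a subset summing to target by brute force: exponential time."""
    for r in range(1, len(nums) + 1):
        for combo in combinations(nums, r):
            if sum(combo) == target:
                return combo
    return None

def verify_subset_sum(nums, target, candidate):
    """Check a claimed solution: linear time in the candidate's size."""
    return (candidate is not None
            and sum(candidate) == target
            and all(x in nums for x in candidate))

nums = [267, 961, 1153, 1000, 1922, 493, 1598, 869, 1766, 1246]
solution = solve_subset_sum(nums, 4111)          # cost explodes with size
assert verify_subset_sum(nums, 4111, solution)   # always cheap
```

Scaling up `nums` makes solving astronomically harder while verification stays trivial, which is the shape of the "Project: Piss Off God" experiment.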
[Epistemic Status: I suspect that this is at least partially wrong. But I don’t know why yet, and so I figured I’d write it up and let people tell me. First post on Less Wrong, for what that’s worth.]
First thesis: IQ is more akin to a composite measure of performance such as the decathlon than it is to a single characteristic such as height or speed.
Second thesis: When looking at extraordinary performance in any specific field, IQ will usually be highly correlated with success, but it will not fully explain or predict top-end performance, because extraordinary performance in a specific field is a result of extraordinary talent in a sub-category of intelligence (or even a sub-category of a sub-category), rather than truly top-end achievement in the composite metric.
Before we go too far, here are some of the things I’m not arguing:
- That IQ is largely immutable (though perhaps not totally immutable).
- That IQ is a heritable, polygenic trait.
- That IQ is highly correlated with a variety of achievement measures, including academic performance, longevity, wealth, happiness, and health.
- That parenting and schooling matter far less than IQ in predicting performance.
- That IQ matters more than “grit” and “mindset” when explaining performance.
- That most extraordinary performers, from billionaire tech founders to chess prodigies, to writers and artists and musicians, will possess well-above-average IQ.
Here is one area where I’m certain I’m in the minority:
- I believe that Spearman’s G is a reification. At least one smart person has also expressed this opinion, but most experts disagree with him (this ties in with the First Thesis).
Here is the issue where I’m not sure if my opinion is controversial, and thus why I’m writing to get feedback:
- While IQ is almost certainly highly correlated with high-end performance, IQ fails as a metric to explain or, more importantly, to predict top-end individual performance (the Second Thesis).
Why IQ Isn’t Like Height
Height is a single, measurable characteristic. Speed over any distance is a single, measurable characteristic. Ability to bench-press is a single, measurable characteristic.
But intelligence is more like the concept of athleticism than it is the concept of height, speed, or the ability to bench-press.
Here is an excerpt from the Slate Star Codex article Talents part 2, Attitude vs. Altitude:
The average eminent theoretical physicist has an IQ of 150-160. The average NBA player has a height of 6’ 7”. Both of these are a little over three standard deviations above their respective mean. Since z-scores are magic and let us compare unlike domains, we conclude that eminent theoretical physicists are about as smart as pro basketball players are tall.
Any time people talk about intelligence, height is a natural sanity check. It’s another strongly heritable polygenic trait which is nevertheless susceptible to environmental influences, and which varies in a normal distribution across the population – but which has yet to accrete the same kind of cloud of confusion around it that IQ has.
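The z-score arithmetic in that excerpt can be sanity-checked. The population figures below (IQ mean 100, SD 15; US adult male height mean 69.2 inches, SD 2.9 inches) are my own assumptions, not from the excerpt:

```python
def z(value, mean, sd):
    """Standard score: how many SDs above the population mean."""
    return (value - mean) / sd

z_physicist = z(155, 100, 15)    # mid-range of "150-160" on the IQ scale
z_nba = z(79, 69.2, 2.9)         # 6'7" = 79 inches
# Both land a little over three standard deviations above their means.
assert 3 < z_physicist < 4 and 3 < z_nba < 4
```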
All of this is certainly true. But here’s what I’d like to discuss more in depth:
Height is a trait that can be measured in a single stroke. IQ has to be measured by multiple sub-tests.
IQ measures the following sub-components of intelligence:
- Verbal Intelligence
- Mathematical Ability
- Spatial Reasoning Skills
- Visual/Perceptual Skills
- Classification Skills
- Logical Reasoning Skills
- Pattern Recognition Skills
Even though both height and intelligence are polygenic traits, there is a category difference between the two.
That’s why I think athleticism is a better polygenic-trait comparator to intelligence than height is. Obviously, people are born with different degrees of athletic talent. Athleticism can be affected by environmental factors (nutrition, lack of access to athletic facilities, etc.). And because athleticism, like intelligence, is composed of different sub-variables (speed, agility, and coordination – or, for intelligence, verbal ability, mathematical ability, and spatial reasoning), it can be measured in a variety of ways. You could measure athleticism with an athlete’s performance in the decathlon, or you could measure it with a series of other tests. Those results would be highly correlated, but not identical. And those results would probably be highly correlated with lots of seemingly unrelated but important physical outcomes.
Measure intelligence with an LSAT vs. IQ test vs. GRE vs. SAT vs. ACT vs. an IQ test from 1900 vs. 1950 vs. 2000 vs. the blink test, and the results will be highly correlated, but again, not identical.
Whether you measure height in centimeters or feet, however, the ranking of the people you measure will be identical no matter how you measure it.
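That distinction can be made concrete with a toy simulation (the model and numbers are invented for illustration): a pure unit change preserves the ranking exactly, while two noisy measurements of a shared underlying factor correlate highly but rank people differently.

```python
import random

def rank(xs):
    """Rank of each item (0 = smallest value)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0] * len(xs)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

random.seed(0)
height_cm = [random.gauss(170, 8) for _ in range(100)]
height_in = [h / 2.54 for h in height_cm]      # unit change only
assert rank(height_cm) == rank(height_in)      # ordering is identical

# Two "tests" of the same underlying factor, each with test-specific noise.
g = [random.gauss(100, 15) for _ in range(100)]
test_a = [x + random.gauss(0, 5) for x in g]
test_b = [x + random.gauss(0, 5) for x in g]
assert rank(test_a) != rank(test_b)   # highly correlated, but not identical
```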
To me, that distinction matters.
I think this athleticism/height distinction explains part (but not all) of the “cloud” surrounding IQ.
Athletic Quotient (“AQ”)
Play along with me for a minute.
Imagine we created a single, composite metric to measure overall athletic ability. Let’s call it AQ, or Athletic Quotient. We could measure AQ just as we measure IQ, with 100 as the median score, and with two standard deviations above at 130 and four standard deviations above at 160.
For the sake of simplicity, let’s measure athletes’ athletic ability with the decathlon. This event is an imperfect test of speed, strength, jumping ability, and endurance.
An Olympic-caliber decathlete could compete at a near-professional level in most sports. But the best decathletes aren’t the people whom we think of when we think of the best athletes in the world. When we think of great athletes, we think of the top performers in one individual discipline, rather than the composite.
When people think of the best athlete in the world, they think of Leo Messi or LeBron James, not Ashton Eaton.
IQ and Genius
Here’s where my ideas might start to get controversial.
I don’t think most of the people we consider geniuses necessarily had otherworldly IQs. People with 200-plus IQs are like Olympic decathletes. They’re amazingly intelligent people who can thrive in any intellectual environment. They’re intellectual heavyweights without specific weaknesses. But those aren’t necessarily the superstars of the intellectual world. The Einsteins, Mozarts, Picassos, or the Magnus Carlsens of the world – they’re great because of domain-specific talent, rather than general intelligence.
Phlogiston and Albert Einstein’s IQ
Check out this article.
The article declares, without evidence, that Einstein had an IQ of 205-225.
The thinking seems to go like this: Most eminent physicists have IQs of around 150-160. Albert Einstein created a paradigm shift in physics (or perhaps multiple such shifts). So he must have had an IQ around 205-225. We’ll just go ahead and retroactively apply that IQ to this man who’s been dead for 65 years and that’ll be great for supporting the idea that IQ and high-end field-specific performance are perfectly correlated.
As an explanation of intelligence, that’s no more helpful than phlogiston in chemistry.
But here’s the thing: It’s easy to ascribe super-high IQs retroactively to highly accomplished dead people, but I have never heard of IQ predicting an individual’s world-best achievement in a specific field. I have never read an article that says, “this kid has an IQ of 220; he’s nearly certain to create a paradigm-shift in physics in 20 years.” There are no Nate Silvers predicting individual achievement based on IQ. IQ does not predict Nobel Prize winners or Fields Medal winners or the next chess #1. A kid with a 220 IQ may get a Ph.D. at age 17 from Caltech, but that doesn’t mean he’s going to be the next Einstein.
Einstein was Einstein because he was an outsider. Because he was intransigent. Because he was creative. Because he was an iconoclast. Because he had the ability to focus. Of course he was super smart; according to the Isaacson biography, at least, there were other pre-eminent physicists who were stronger at math than he was. But there is no evidence that he had a super-high IQ (as in, above 200).
We’ve been using IQ as a measure of intelligence for over 100 years and it has never predicted an Einstein, a Musk, or a Carlsen. Who is the best counter-example to this argument? Terence Tao? Without obvious exception, those who have been recognized for early-age IQ are still better known for their achievements as prodigies than their achievements as adults.
Is it unfair to expect that predictive capacity from IQ? Early-age prediction of world-class achievement does happen. Barcelona scooped up Leo Messi from the hinterlands of Argentina at age 12, and he became Leo Messi. LeBron James was on the cover of Sports Illustrated while still in high school.
In some fields, predicting world-best performance happens at an early age. But IQ – whatever its other merits – does not seem to serve as an effective mechanism for predicting world-best performance in specific individualized activities.
Magnus Carlsen’s IQ
When I type Magnus Carlsen’s name into Google, the first thing that autofills (after chess) is “Magnus Carlsen IQ.”
People seem to want to believe that his IQ score can explain why he is the Mozart of chess.
We don’t know what his IQ is, but the instinct people have to try to explain his performance in terms of IQ feels very similar to people’s desire to ascribe an IQ of 225 to Einstein. It’s phlogiston.
Magnus Carlsen probably has a very high IQ. He obviously has well above-average intelligence. Maybe his IQ is 130, 150, or 170 (there's a website called ScoopWhoop that claims, without citation, that it's 190). But however high his IQ, doubtless there are many or at least a few chess players in the world who have higher IQs than he has. But he’s the #1 chess player in the world – not his competitors with higher IQs. And I don’t think the explanation for why he’s so great is his “mindset” or “grit” or anything like that.
It’s because IQ is akin to an intellectual decathlon, whereas chess is a single-event competition. If we dug deep into the sub-components of Carlsen’s IQ (or perhaps the sub-components of the sub-components), we’d probably find some sub-component where he measures off the charts. I’m not saying there’s a “chess gene,” but I suspect there is a trait, measurable as a sub-component of intelligence and more specific than IQ, that would explain his abilities better than raw IQ does.
Leo Messi isn’t the greatest soccer player in the world because he’s the best overall athlete in the world. He’s the best soccer player in the world because of his agility and quickness in incredibly tight spaces. Because of his amazing coordination in his lower extremities. Because of his ability to change direction with the ball before defenders have time to react. These are all natural talents. But they are only particularly valuable because of the arbitrary constraints in soccer.
Leo Messi is a great natural athlete. If we had a measure of AQ, he’d probably be in the 98th or 99th percentile. But that doesn’t begin to explain his otherworldly soccer-playing talents. He probably could have been a passable high-school point guard at a school of 1000 students. He would have been a well-above-average decathlete (though I doubt he could throw the shot put worth a damn).
But it’s the unique athletic gifts that are particularly well suited to soccer that enabled him to be the best in the world at soccer. So, too, with Magnus Carlsen with chess, Elon Musk with entrepreneurialism, and Albert Einstein with paradigm-shifting physics.
The decathlon won’t predict the next Leo Messi or the next Lebron James. And IQ won’t predict the next Magnus Carlsen, Elon Musk, Picasso, Mozart, or Albert Einstein.
And so we shouldn’t seek it out as an after-the-fact explanation for their success, either.
 Of course, high performance in some fields is probably more closely correlated with IQ than in others: physics professor > English professor > tech founder > lawyer > actor > bassist in grunge band. [Note: this footnote is total unfounded speculation]
 The other part is that people don’t like to be defined by traits that they feel they cannot change or improve.
 Let me know if I am missing any famous examples here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday, and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "
I need help getting out of a logical trap I've found myself in after reading The Age of Em.
Some statements needed to set the trap:
If mind-uploading is possible, then a mind can theoretically exist for an arbitrary length of time.
If a mind is contained in software, it can be copied, and therefore can be stolen.
An uploaded mind can retain human attributes indefinitely.
Some subset of humans are sadistic jerks; many of these humans have temporal power.
All humans, under certain circumstances, can behave like sadistic jerks.
Human power relationships will not simply disappear with the advent of mind uploading.
Some minor negative implications:
Torture becomes embarrassingly parallel.
US states with the death penalty may adopt death plus simulation as a penalty for some offenses.
Over a long enough timeline, the probability of a copy of any given uploaded mind falling into the power of a sadistic jerk approaches unity. Once an uploaded mind has fallen under the power of a sadistic jerk, there is no guarantee that it will ever be 'free', and the quantity of experienced suffering could be arbitrarily large, due in part to the embarrassingly parallel nature of torture enabled by running multiple copies of a captive mind.
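The "approaches unity" step is just compounding probability. Under the toy assumption of a small, constant, independent per-century risk of capture:

```python
def p_capture(per_century_risk, centuries):
    """Probability that capture happens at least once, assuming an
    independent, constant risk per century (a toy assumption)."""
    return 1 - (1 - per_century_risk) ** centuries

# Even a tiny per-period risk compounds toward certainty over enough time.
assert p_capture(0.001, 10) < 0.01        # short horizon: nearly safe
assert p_capture(0.001, 10_000) > 0.999   # long horizon: nearly certain
```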
Therefore! If you believe that mind uploading will become possible in a given individual's lifetime, the most ethical thing you can do from the utilitarian standpoint of minimizing aggregate suffering, is to ensure that the person's mind is securely deleted before it can be uploaded.
Imagine the heroism of a soldier, who faced with capture by an enemy capable of uploading minds and willing to parallelize torture spends his time ensuring that his buddies' brains are unrecoverable at the cost of his own capture.
I believe that mind uploading will become possible in my lifetime, please convince me that running through the streets with a blender screaming for brains is not an example of effective altruism.
On a more serious note, can anyone else think of examples of really terrible human decisions that would be incentivised by the development of AGI or mind uploading? This problem appears related to AI safety.
Hello guys, I am currently writing my master's thesis on biases in the investment context. One sub-sample that I am studying is people who are educated about biases in a general context, but not in the investment context. I guess LW is the right place to find some of those, so I would be very happy if some of you would participate, especially since people who are aware of biases are hard to come by elsewhere. I also explicitly ask about activity in the LW community in the survey, so if enough LWers participate I could analyse them as an individual sub-sample. It would be interesting to know how LWers perform compared to psychology students, for example. The link to the survey is: https://survey.deadcrab.de/
It’s only been recently that I’ve been thinking about epistemics in the context of figuring out my behavior and debiasing. Aside from trying to figure out how I actually behave (as opposed to what I merely profess I believe), I’ve been thinking about how to confront uncertainty—and what it feels like.
For many areas of life, I think we shy away from confronting uncertainty and instead flee into the comforting non-falsifiability of vagueness.
Consider these examples:
1) You want to get things done today. You know that writing things down can help you finish more things. However, it feels aversive to write down what you specifically want to do. So instead, you don’t write things down and instead just keep a hazy notion of “I will do things today”.
2) You try to make a confidence interval for a prediction where money is on the line. You notice yourself feeling uncomfortable, no matter what your bounds are; it feels bad to set down any number at all, which is accompanied by a dread feeling of finality.
3) You’re trying to find solutions to a complex, entangled problem. Coming up with specific solutions feels bad because none of them seem to completely solve the problem. So instead you decide to create a meta-framework that produces solutions, or argue in favor of some abstract process like a “democratized system that focuses on holistic workarounds”.
In each of the above examples, it feels like we move away from making specific claims because that opens us up to specific criticism. But instead of trying to improve the strengths of specific claims, we retreat to fuzzily-defined notions that allow us to incorporate any criticism without having to really update.
I think there’s a sense in which, in some areas of life, we embrace shoddy epistemology (e.g. not wanting to validate or falsify our beliefs) because we fear failing, or fear the effort an update would take. I think this fear is what fuels the feeling of aversion.
It seems useful to face this feeling of badness or aversion with the understanding that this is what confronting uncertainty feels like. The best action doesn’t always feel comfortable and easy; it can just as easily feel aversive and final.
Look for situations where you might be flinching away from making specific claims and replacing them with vacuous claims that accommodate any evidence you might see.
If you never put your beliefs to the test with specific claims, then you can never verify them in the real world. And if your beliefs don’t map well onto the real world, they don’t seem very useful to even have in the first place.
Not really sure where else I might post this, but there seems to be a UI issue on the site. When I hit the homepage of lesswrong.com while logged in I no longer see the user sidebar or the header links for Main and Discussion. This is kind of annoying because I have to click into an article first to get to a page where I can access those things. Would be nice to have them back on the front page.
About a year ago, I made a setting of the Litany of Tarski for four-part a cappella (i.e. unaccompanied) chorus.
More recently, in the process of experimenting with MuseScore for potential use in explaining musical matters on the internet (it makes online sharing of playback-able scores very easy), the thought occurred to me that perhaps the Tarski piece might be of interest to some LW readers (if no one else!), so I went ahead and re-typeset it in MuseScore for your delectation.
Here it is (properly notated :-)).
Here it is (alternate version designed to avoid freaking out those who aren't quite the fanatical enthusiasts of musical notation that I am).
Home appliances, such as washing machines, are apparently much less durable now than they were decades ago. [ETA: Thanks to commenters for providing lots of reasons to doubt this claim (especially from here and here).]
Perhaps this is a kind of mirror image of "cost disease". In many sectors (education, medicine), we pay much more now for a product that is no better than what we got decades ago at a far lower cost, even accounting for inflation. It takes more money to buy the same level of quality. Scott Alexander (Yvain) argues that the cause of cost disease is a mystery. There are several plausible accounts, but they don't cover all the cases in a satisfying way. (See the link for more on the mystery of cost disease.)
Now, what if the mysterious cause of cost disease were to set to work in a sector where price can't go up, for whatever reason? Then you would expect quality to take a nosedive. If price per unit quality goes up, but total price can't go up, then quality must go down. So maybe the mystery of crappy appliances is just cost disease in another guise.
In the spirit of inadequate accounts of cost disease, I offer this inadequate account of crappy appliances:
As things get better globally, they get worse locally.
Global goodness provides a buffer against local badness. This makes greater local badness tolerable. That is, the cheapest tolerable thing gets worse. Thus, worse and worse things dominate locally as things get better globally.
This principle applies in at least two ways to washing machines:
Greater global wealth: Consumers have more money, so they can afford to replace washing machines more frequently. Thus, manufacturers can sell machines that require frequent replacement.
Manufacturers couldn't get away with this if people were poorer and could buy only one machine every few decades. If you're poor, you prioritize durability more. In the aggregate, the market will reward durability more. But a rich market accepts less durability.
Better materials science: Globally, materials science has improved. Hence, at the local level, manufacturers can get away with making worse materials.
Rich people might tolerate a washer that lasts 3 years, give or take. But even they don't want a washer that breaks in one month. If you build washers, you need to be sure that nearly every single one lasts a full month, at least. But, with poor materials science, you have to overshoot by a lot to ensure of that. Maybe you have to aim for a mean duration of decades to guarantee that the minimum duration doesn't fall below one month. On the other hand, with better materials science, you can get the distribution of duration to cluster tightly around 3 years. You still have very few washers lasting only one month, but the vast majority of your washers are far less durable than they used to be.
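The overshoot argument can be made concrete with a toy model (all numbers invented): if lifetimes are normally distributed and the manufacturer tolerates at most roughly 0.1% of washers failing inside one month, the required mean lifetime depends heavily on how tight the distribution is.

```python
# Lifetimes modelled as Normal(mean, sd) in years; the maker tolerates
# ~0.1% of washers failing inside one month. z-score of the 0.1th
# percentile of a normal distribution is about -3.09.
Z_FLOOR = -3.09

def required_mean(sd_years, floor_years=1 / 12):
    """Smallest mean lifetime keeping the 0.1th percentile above the floor."""
    return floor_years - Z_FLOOR * sd_years

print(required_mean(8.0))   # poor process control: must aim for ~25 years
print(required_mean(1.0))   # tight process control: ~3.2 years suffices
```

With sloppy materials science you must aim for decades of mean durability just to protect the one-month floor; with tight control you can aim near three years, exactly as the paragraph above argues.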
Maybe this is just Nassim Taleb's notion of antifragility. I haven't read the book, but I gather that the idea is that individuals grow stronger in environments that contain more stressors (within limits). Conversely, if you take away the stressors (i.e., make the environment globally better), then you get more fragile individuals (i.e., things are locally worse).
In this post, I'll argue that Joyce's equilibrium CDT (eCDT) can be made into FDT (functional decision theory) with the addition of an intermediate step - a step that should have no causal consequences. This would show that eCDT is unstable under causally irrelevant changes, and is in fact a partial version of FDT.
Joyce's principle is:
Full Information. You should act on your time-t utility assessments only if those assessments are based on beliefs that incorporate all the evidence that is both freely available to you at t and relevant to the question about what your acts are likely to cause.
When confronted by a problem with a predictor (such as Death in Damascus or the Newcomb problem), this allows eCDT to recursively update their probabilities of the behaviour of the predictor, based on their own estimates of their own actions, until this process reaches equilibrium. This allows it to behave like FDT/UDT/TDT on some (but not all) problems. I'll argue that you can modify the setup to make eCDT into a full FDT.
Death in Damascus
In this problem, Death has predicted whether the agent will stay in Damascus (S) tomorrow, or flee to Aleppo (F). And Death has promised to be in the same city as the agent (D or A), to kill them. Having made its prediction, Death then travels to that city to wait for the agent. Death is known to be a perfect predictor, and the agent values survival at $1000, while fleeing costs $1.
Then eCDT recommends fleeing to Aleppo with probability 999/2000. To check this, let x be the probability of fleeing to Aleppo (F), and y the probability of Death being there (A). The expected utility is then
1000(x(1-y) + (1-x)y) - x    (1)
Differentiating this with respect to x gives 999-2000y, which is zero for y=999/2000. Since Death is a perfect predictor, y=x and eCDT's expected utility is 499.5.
The true expected utility, however, is -999/2000, since Death will get the agent anyway, and the only cost is the trip to Aleppo.
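Both valuations can be checked directly from equation (1) with a few lines of code:

```python
def ecdt_eu(x, y):
    """Equation (1): expected utility with P(flee) = x, P(Death in Aleppo) = y."""
    return 1000 * (x * (1 - y) + (1 - x) * y) - x

# d/dx of equation (1) is 999 - 2000y, vanishing at y = 999/2000; perfect
# prediction then forces x = y, giving eCDT's (mistaken) valuation:
x = 999 / 2000
assert abs(ecdt_eu(x, x) - 499.5) < 1e-9

# The true expected utility: Death always finds the agent, so the survival
# term is really zero and only the expected flight cost remains.
true_eu = 1000 * 0 - x
assert abs(true_eu - (-999 / 2000)) < 1e-9
```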
The eCDT decision process seems rather peculiar. It seems to allow updating of the value of y dependent on the value of x - hence allow acausal factors to be considered - but only in a narrow way. Specifically, it requires that the probability of F and A be equal, but that those two events remain independent. And it then differentiates utility according to the probability of F only, leaving that of A fixed. So, in a sense, x correlates with y, but small changes in x don't correlate with small changes in y.
That's somewhat unsatisfactory, so consider the problem now with an extra step. The eCDT agent no longer considers whether to stay or flee; instead, it outputs X, a value between 0 and 1. There is a uniform random process Z, also valued between 0 and 1. If Z<X, then the agent flees to Aleppo; if not, it stays in Damascus.
This seems identical to the original setup, for the agent. Instead of outputting a decision as to whether to flee or stay, it outputs the probability of fleeing. This has moved the randomness in the agent's decision from inside the agent to outside it, but this shouldn't make any causal difference, because the agent knows the distribution of Z.
Death remains a perfect predictor, which means that it can predict X and Z, and will move to Aleppo if and only if Z<X.
Now let the eCDT agent consider outputting X=x for some x. In that case, it updates its opinion of Death's behaviour, expecting that Death will be in Aleppo if and only if Z<x. Then it can calculate the expected utility of setting X=x, which is simply 0 (Death will always find the agent) minus x (the expected cost of fleeing to Aleppo), hence -x. Among the "pure" strategies, X=0 is clearly the best.
Now let's consider mixed strategies, where the eCDT agent can consider a distribution PX over values of X (this is a sort of second order randomness, since X and Z already give randomness over the decision to move to Aleppo). If we wanted the agent to remain consistent with the previous version, the agent then models Death as sampling from PX, independently of the agent. The probability of fleeing is just the expectation of PX; but the higher the variance of PX, the harder it is for Death to predict where the agent will go. The best option is as before: PX will set X=0 with probability 1001/2000, and X=1 with probability 999/2000.
But is this a fair way of estimating mixed strategies?
Average Death in Aleppo
Consider a weaker form of Death, Average Death. Average Death cannot predict X, but can predict PX, and will use that to determine its location, sampling independently from it. Then, from eCDT's perspective, the mixed-strategy behaviour described above is the correct way of dealing with Average Death.
But that means that the agent above is incapable of distinguishing between Death and Average Death. Joyce argues strongly for considering all the relevant information, and the distinction between Death and Average Death is relevant. Thus it seems when considering mixed strategies, the eCDT agent must instead look at the pure strategies, compute their value (-x in this case) and then look at the distribution over them.
One might object that this is no longer causal, but the whole equilibrium approach undermines the strictly causal aspect anyway. It feels daft to be allowed to update on Average Death predicting PX, but not on Death predicting X. Especially since moving from PX to X is simply some random process Z' that samples from the distribution PX. So Death is allowed to predict PX (which depends on the agent's reasoning) but not Z'. It's worse than that, in fact: Death can predict PX and Z', and the agent can know this, but the agent isn't allowed to make use of this knowledge.
Given all that, it seems that in this situation, the eCDT agent must be able to compute the mixed strategies correctly and realise (like FDT) that staying in Damascus (X=0 with certainty) is the right decision.
Let's recurse again, like we did last summer
This deals with Death, but not with Average Death. Ironically, the "X=0 with probability 1001/2000..." solution is not the correct solution for Average Death. To get that, we need to take equation (1), set x=y first, and then differentiate with respect to x. This gives x=1999/4000, so setting "X=0 with probability 2001/4000 and X=1 with probability 1999/4000" is actually the FDT solution for Average Death.
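The two orders of operations can be checked numerically. Since equation (1) isn't reproduced here, the expected utility below is reconstructed from the Death in Damascus setup (survival worth $1000, fleeing costs $1), with x = P(flee) and y = P(Death in Aleppo); this is an illustrative sketch, not the post's exact equation:

```python
# Numeric check of the two differentiation orders discussed above.
# x = P(flee to Aleppo), y = P(Death in Aleppo). Reconstructed from
# the setup: survival is worth $1000, fleeing costs $1.

def eu(x, y):
    # Survive by staying (prob 1-x) if Death is in Aleppo (prob y),
    # or by fleeing (prob x) if Death is in Damascus (prob 1-y);
    # fleeing costs $1 whether or not you survive.
    return 1000 * (1 - x) * y + 1000 * x * (1 - y) - x

# Order 1: differentiate in x holding y fixed, then impose x = y.
# d(EU)/dx = 1000 - 2000*y - 1 = 0 gives the indifference point.
y_indiff = 999 / 2000

# Order 2: set y = x first, then maximise EU(x, x) = 2000*x*(1-x) - x.
# Found here by grid search rather than by hand.
grid = [i / 100000 for i in range(100001)]
x_best = max(grid, key=lambda x: eu(x, x))

print(y_indiff)  # 0.4995   = 999/2000
print(x_best)    # 0.49975  = 1999/4000
```

The two orders really do give the two different answers quoted above, which is the whole source of the Death vs Average Death discrepancy.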
And we can make the eCDT agent reach that. Simply recurse to the next level, and have the agent choose PX directly, via a distribution PPX over possible PX.
But these towers of recursion are clunky and unnecessary. It's simpler to state that eCDT is unstable under recursion, and that it's a partial version of FDT.
You should always cooperate with an identical copy of yourself in the prisoner's dilemma. This is obvious, because you and the copy will reach the same decision.
That justification implicitly assumes that you and your copy are (somewhat) antagonistic: that you have opposite aims. But the conclusion doesn't require that at all. Suppose that you and your copy were instead trying to ensure that one of you got maximal reward (it doesn't matter which). Then you should still jointly cooperate, because (C,C) is possible, while (C,D) and (D,C) are not (I'm ignoring randomising strategies for the moment).
Now look at the Newcomb problem. Your decision enters twice: once when you decide how many boxes to take, and once when Omega is simulating or estimating you to decide how much money to put in box B. You would dearly like your two "copies" (one of which may just be an estimate) to be out of sync - for the estimate to 1-box while the real you two-boxes. But without any way of distinguishing between the two, you're stuck with taking the same action - (1-box,1-box). Or, seeing it another way, (C,C).
This also makes the Newcomb problem into an anti-coordination game, where you and your copy/estimate try to pick different options. But, since this is not possible, you have to stick to the diagonal. This is why the Newcomb problem can be seen both as an anti-coordination game and a prisoners' dilemma - the differences only occur in the off-diagonal terms that can't be reached.
Note: This post is in error, I've put up a corrected version of it here. I'm leaving the text in place, as historical record. The source of the error is that I set Pa(S)=Pe(D) and then differentiated with respect to Pa(S), while I should have differentiated first and then set the two values to be the same.
Nate Soares and Ben Levinstein have a new paper out on "Functional Decision Theory", the most recent development of UDT and TDT.
It's good. Go read it.
This post is about further analysing the "Death in Damascus" problem, and to show that Joyce's "equilibrium" version of CDT (causal decision theory) is in a certain sense intermediate between CDT and FDT. If eCDT is this equilibrium theory, then it can deal with a certain class of predictors, which I'll call distribution predictors.
Death in Damascus
In the original Death in Damascus problem, Death is a perfect predictor. It finds you in Damascus, and says that it's already planned its trip for tomorrow - and it'll be in the same place you will be.
You value surviving at $1000, and can flee to Aleppo for $1.
Classical CDT will put some prior P over Death being in Damascus (D) or Aleppo (A) tomorrow. And then, if P(A)>999/2000, you should stay (S) in Damascus, while if P(A)<999/2000, you should flee (F) to Aleppo.
FDT estimates that Death will be wherever you will, and thus there's no point in F, as that will just cost you $1 for no reason.
But it's interesting what eCDT produces. This decision theory requires that Pe (the equilibrium probability of A and D) be consistent with the action distribution that eCDT computes. Let Pa(S) be the action probability of S. Since Death knows what you will do, Pa(S)=Pe(D).
The expected utility is 1000.Pa(S)Pe(A)+1000.Pa(F)Pe(D)-Pa(F). At equilibrium, this is 2000.Pe(A)(1-Pe(A))-Pe(A). And that quantity is maximised when Pe(A)=1999/4000 (and thus the probability of you fleeing is also 1999/4000).
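A quick sanity check of these numbers, and of why the next paragraph calls this the wrong decision. The sketch below contrasts the expected utility eCDT computes at equilibrium with the actual payoff against a genuinely perfect predictor (against real Death you always die, so fleeing just costs $1 with probability p):

```python
# Sanity check of eCDT's equilibrium against Death.
# p = Pe(A) = Pa(F): the equilibrium probability of Death being in
# Aleppo, which equals the probability of fleeing.

def perceived_eu(p):
    # eCDT's equilibrium expected utility: 2000*p*(1-p) - p.
    return 2000 * p * (1 - p) - p

def actual_eu(p):
    # Against a perfect predictor, Death is always where you are:
    # survival probability is 0, and you pay $1 whenever you flee.
    return -p

grid = [i / 100000 for i in range(100001)]
p_star = max(grid, key=perceived_eu)

print(p_star)                 # 0.49975 = 1999/4000, as in the text
print(perceived_eu(p_star))   # ~499.5: what eCDT thinks it gets
print(actual_eu(p_star))      # ~-0.5: what it actually gets
print(actual_eu(0.0))         # 0.0: stay with certainty, pay nothing
```

The gap between the perceived ~$499.50 and the actual -$0.50 is exactly the error the next paragraph points at.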
This is still the wrong decision, as paying the extra $1 is pointless, even if it's not a certainty to do so.
So far, nothing interesting: both CDT and eCDT fail. But consider the next example, on which eCDT does not fail.
Statistical Death in Damascus
Let's assume now that Death has an assistant, Statistical Death, that is not a perfect predictor, but is a perfect distribution predictor. It can predict the distribution of your actions, but not your actual decision. Essentially, you have access to a source of true randomness that it cannot predict.
It informs you that its probability over whether to be in Damascus or Aleppo will follow exactly the same distribution as yours.
Classical CDT follows the same reasoning as before. So does eCDT: since Statistical Death follows the same distribution as you do, Pa(S)=Pe(D) still holds.
But what about FDT? Well, note that FDT will reach the same conclusion as eCDT. This is because 1000.Pa(S)Pe(A)+1000.Pa(F)Pe(D)-Pa(F) is the correct expected utility, the Pa(S)=Pe(D) assumption is correct for Statistical Death, and (S,F) is independent of (A,D) once the action probabilities have been fixed.
So on the Statistical Death problem, eCDT and FDT say the same thing.
Factored joint distribution versus full joint distributions
What's happening is that there is a joint distribution over (S,F) (your actions) and (D,A) (Death's actions). FDT is capable of reasoning over all types of joint distributions, and fully assessing how its choice of Pa acausally affects Death's choice of Pe.
But eCDT is only capable of reasoning over ones where the joint distribution factors into a distribution over (S,F) times a distribution over (D,A). Within the confines of that limitation, it is capable of (acausally) changing Pe via its choice of Pa.
Death in Damascus does not factor into two such distributions, so eCDT fails on it. Statistical Death in Damascus does so factor, so eCDT succeeds on it. Thus eCDT seems to be best conceived of as a version of FDT that is strangely limited in terms of which joint distributions it's allowed to consider.
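The factoring distinction can be made concrete with a toy check: a joint distribution factors exactly when it equals the outer product of its marginals. The probabilities below are illustrative, not from the post:

```python
# Toy check of when a joint distribution over ((S,F), (D,A)) factors
# into a product of two marginal distributions. Numbers are illustrative.

def factors(joint, tol=1e-9):
    """True iff joint[i][j] == row_marginal[i] * col_marginal[j] everywhere."""
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return all(
        abs(joint[i][j] - rows[i] * cols[j]) < tol
        for i in range(len(joint))
        for j in range(len(joint[0]))
    )

p = 0.3  # P(stay); any value in (0,1) shows the same contrast

# Perfect Death: your action and Death's location are perfectly
# correlated -- (S,D) with probability p, (F,A) with probability 1-p.
death = [[p, 0.0],
         [0.0, 1 - p]]

# Statistical Death: it samples independently from your action
# distribution, so the joint is exactly the product of the marginals.
stat_death = [[p * p, p * (1 - p)],
              [(1 - p) * p, (1 - p) * (1 - p)]]

print(factors(death))       # False -- outside eCDT's reach
print(factors(stat_death))  # True  -- the case eCDT handles
```

This is just the statement in the paragraph above, made mechanical: perfect correlation can't be written as a product, independent sampling can.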
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday, and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "
Attractor Theory: A Model of Minds and Motivation
[Epistemic status: Moderately strong. Attractor Theory is a model based on the well-researched concept of time-inconsistent preferences combined with anecdotal evidence that extends the theory to how actions affect our preferences in general. See the Caveats at the end for a longer discussion on what this model is and isn’t.]
<Cross-posted from mindlevelup>
I’ve been thinking about minds and motivation on and off for about a year now, and I think I now have a model that merges some related ideas together into something useful. The model is called Attractor Theory, and it brings together ideas from Optimizing Your Mindstate, behavioral economics, and flow.
Attractor Theory is my attempt to provide a way of looking at the world that hybridizes ideas from the Resolve paradigm (where humans Actually Try and exert their will) and the “click-whirr” paradigm (where humans are driven by “if-then” loops and proceduralized habits).
As a brief summary, Attractor Theory basically states that you should consider any action you take as being easier to continue than to start, as well as having meta-level effects on changing your perception of which actions feel desirable.
Here’s a metaphor that provides most of the intuitions behind Attractor Theory:
Imagine that you are in a hamster ball:
As a human inside this ball, you can kinda roll around by exerting energy. But it’s hard to do so all of the time — you’d likely get tired. Still, if you really wanted to, you could push the ball and move.
These are Utilons. They represent productivity hours, lives saved, HPMOR fanfictions written, or anything else you care about maximizing. You are trying to roll around and collect as many Utilons as possible.
But the terrain isn’t actually smooth. Instead, there are all these Attractors that pull you towards them. Attractors are like valleys, or magnets, or point charges. Or maybe electrically charged magnetic valleys. (I’m probably going to Physics Hell for that.)
The point is that they draw you towards them, and it’s hard to resist their pull.
Also, Attractors have an interesting property: Once you’re being pulled in by one, this actually modifies other Attractors. This usually manifests by changing how strongly other ones are pulling you in. Sometimes, though, this even means that some Attractors will disappear, and new ones may appear.
As a human, your goal is to navigate this tangle of Utilons and Attractors from your hamster ball, trying to collect Utilons.
Now you could just try to take a direct path to all the nearest Utilons, but that would mean exerting a lot of energy to fight the pull of Attractors that pull you in Utilon-sparse directions.
Instead, given that you can’t avoid Attractors (they’re everywhere!) and that you want to get as many Utilons as possible, the best thing to do seems to be to strategically choose which Attractors you’re drawn to and selectively choose when to exert energy to move from one to another to maximize your overall trajectory.
In the above metaphor, actions and situations serve as Attractors, which are like slippery slopes that pull you in. Your agency is represented by the “meta-human” that inhabits the ball, which has some limited control when it comes to choosing which Attractor-loops to dive into and which ones to pop out of.
So the default view of humans and decisions seems to be something like viewing actions as time-chunks that we can just slot into our schedule. Attractor Theory attempts to present a model that moves away from that and shifts our intuitions to:
a) think less about our actions in a vacuum / individually
b) consider starting / stopping costs more
c) see our preferences in a more mutable light
It’s my hope that thinking about actions as “things that draw you in” can improve our intuitions about global optimization:
My point here is that, phenomenologically, it feels like our actions change the sorts of things we might want. Every time we take an action, this will, in turn, prime how we view other actions, often in somewhat predictable ways. I might not know exactly how they’ll change, but we can get good, rough ideas from past experience and our imaginations.
For example, the set of things that feel desirable to me after running a marathon may differ greatly from the set of things after I read a book on governmental corruption.
(I may still have core values, like wanting everyone to be happy, which I place higher up in my sense of self, which aren’t affected by these, but I’m mainly focusing on how object-level actions feel for this discussion. There’s a longer decision-theoretic discussion here that I’ll save for a later post.)
I think it’s useful to start seeing your actions in terms of not just their direct effects, but also their effects on the further actions you can take. It changes your decision algorithm to something like:
“Choose actions such that their meta-level effects on me by my taking them allow me to take more actions of this type in the future and maximize the number of Utilons I can earn in the long run.”
Phrasing it this way makes it clearer that most things in life are a longer-term endeavor that involves trying to optimize globally, rather than locally. It also provides a model for evaluating actions on a new axis — the extent to which an action influences your future, which seems like an important thing to consider.
(While it’s arguable that a naive view of maximization should by default take this into account from a consequentialist lens, I think making it explicitly clear, as the above formulation does, is a useful distinction.)
This allows us to better evaluate actions which, by themselves, might not be too useful, but do a good job of reorienting ourselves into a better state of mind. For example, spending a few minutes outside to get some air might not be directly useful, but it’ll likely help clear my mind, which has good benefits down the line.
Along the same lines, you want to view actions not as one-time deals, but as a sort of process that actively changes how you perceive other actions. In fact, these effects should sometimes be as important a consideration as time or effort when looking at a task.
Attractor Theory also conceptually models the idea of precommitment:
Humans often face situations where we fall prey to “in the moment” urges, which soon turn to regret. These are known as time-inconsistent preferences, where what we want quickly shifts, often because we are in the presence of something that really tempts us.
An example of this is the dieter who proclaims “I’ll just give in a little today” when seeing a delicious cake on the restaurant menu, and then feeling “I wish I hadn’t done that” right after gorging themselves.
Precommitment is the idea that you can often “lock-in” your choices beforehand, such that you will literally be unable to give into temptation when the actual choice comes before you, or entirely avoid the opportunity to even face the choice.
An example from the above would be something like having a trustworthy friend bring food over instead of eating out, so you can’t stuff yourself on cake because you weren’t even the one who ordered food.
There seems to be a general principle here of going “upstream”: targeting the places where you have the most control, so that you can improve your experiences later down the line. This seems to be a useful idea, whether the question is about finding leverage or self-control.
Attractor Theory views all actions and situations as self-reinforcing slippery slopes. As such, it more realistically models the act of taking certain actions as leading you to other Attractors, so you’re not just looking at things in isolation.
In this model, we can reasonably predict, for example, that any video on YouTube will likely lead to more videos because the “sucked-in-craving-more-videos Future You” will have different preferences than “needing-some-sort-of-break Present You”.
This view allows you to better see certain "traps", where an action will lead you deeper and deeper down an addiction/reward cycle, like a huge bag of chips or a webcomic. These are situations where, after the initial buy-in, it becomes incredibly attractive to continue down the same path, as these actions reinforce themselves, making it easy to continue on and on…
Under the Attractor metaphor, your goal, then, is to focus on finding ways of being drawn to certain actions and avoiding others. You want to find ways to avoid specific actions which could lead you down bad spirals, even if the initial actions themselves may not be that distracting.
The result is chaining together actions and their effects on how you perceive things in an upstream way, like precommitment.
Exploring, Starting, and Stopping:
Local optima are also visually represented in this model: we can get caught in certain chains of actions that do a good job of netting Utilons. Similar to the above traps, it can be hard to try new things once we’ve found an effective route already.
Chances are, though, that there’s probably even more Utilons to be had elsewhere. In which case, being able to break out to explore new areas could be useful.
Attractor Theory also does a good job of modeling how actions seem much harder to start than to stop. Moving from one Attractor to a disparate one can be costly in terms of energy, as you need to move against the pull of the current Attractor.
Once you’re pulled in, though, it’s usually easier to keep going with the flow. So using this model ascribes costs to starting and places less of a cost on continuing actions.
By “pulled in”, I mean making it feel effortless or desirable to continue with the action. I’m thinking of the feeling you get when you have a decent album playing music, and you feel sort of tempted to switch it to a better album, except that, given that this good song is already playing, you don’t really feel like switching.
Given the costs of switching, you want to invest your efforts and agency not in always choosing the immediately Utilon-maximizing action moment-by-moment, but in choosing the actions / situations whose Attractors pull you in desirable directions, or make it such that other desirable paths are now easier to take.
Summary and Usefulness:
Attractor Theory attempts to retain willpower as a coherent idea, while also hopefully more realistically modeling how actions can affect our preferences with regards to other actions.
It can serve as an additional intuition pump behind using willpower in certain situations. Thinking about “activation energy” in terms of putting in some energy to slide into positive Attractors removes the mental block I’ve recently had on using willpower. (I’d been stuck in the “motivation should come from internal cooperation” mindset.)
Looking at how Attractors modify the pull of other Attractors provides a clearer mental image of why you might want to precommit to avoiding certain actions.
For example, when thinking about taking breaks, I now think about which actions can help me relax without strongly modifying my preferences. This means things like going outside, eating a snack, and drawing as far better break-time activities than playing an MMO or watching Netflix.
This is because the latter are powerful self-reinforcing Attractors that also pull me towards more reward-seeking directions, which might distract me from my task at hand. The former activities can also serve as breaks, but they don’t do much to alter your preferences, and thus, help keep you focused.
I see Attractor Theory as being useful when it comes to thinking upstream and providing an alternative view of motivation that isn’t exactly internally based.
Hopefully, this model can be useful when you look at your schedule to identify potential choke-points / bottlenecks that can arise from factors you hadn’t previously considered when evaluating actions.
Attractor Theory assumes that different things can feel desirable depending on the situation. It relinquishes some agency by assuming that you can’t always choose what you “want” because of external changes to how you perceive actions. It also doesn’t try to explain internal disagreements, so it’s still largely at odds with the Internal Double Crux model.
I think this is fine. The goal here isn’t exactly to create a wholly complete prescriptive model or a descriptive one. Rather, it’s an attempt to create a simplified model of humans, behavior, and motivation into a concise, appealing form your intuitions can crystallize, similar to the System 1 and System 2 distinction.
I admit that if you tend to use an alternate ontology when it comes to viewing how your actions relate to the concept of “you”, this model might be less useful. I think that’s also fine.
This is not an attempt to capture all of the nuances / considerations in decision-making. It simply takes a few pieces that I’d previously had as disparate nodes and chunks them together into a more coherent, unified model of how we think about doing things.
Rationalists like to live in group houses. We are also as a subculture moving more and more into a child-having phase of our lives. These things don't cooperate super well - I live in a four bedroom house because we like having roommates and guests, but if we have three kids and don't make them share we will in a few years have no spare rooms at all. This is frustrating in part because amenable roommates are incredibly useful as alloparents if you value things like "going to the bathroom unaccompanied" and "eating food without being screamed at", neither of which are reasonable "get a friend to drive for ten minutes to spell me" situations. Meanwhile there are also people we like living around who don't want to cohabit with a small child, which is completely reasonable, small children are not for everyone.
For this and other complaints ("househunting sucks", "I can't drive and need private space but want friends accessible", whatever) the ideal solution seems to be somewhere along the spectrum between "a street with a lot of rationalists living on it" (no rationalist-friendly entity controls all those houses and it's easy for minor fluctuations to wreck the intentional community thing) and "a dorm" (sorta hard to get access to those once you're out of college, usually not enough kitchens or space for adult life). There's a name for a thing halfway between those, at least in German - "baugruppe" - buuuuut this would require community or sympathetic-individual control of a space and the money to convert it if it's not already baugruppe-shaped.
Maybe if I complain about this in public a millionaire will step forward or we'll be able to come up with a coherent enough vision to crowdfund it or something. I think there is easily enough demand for a couple of ten-to-twenty-adult baugruppen (one in the east bay and one in the south bay) or even more/larger, if the structures materialized. Here are some bulleted lists.
- Units that it is really easy for people to communicate across and flow between during the day - ideally, to my mind, to the point where a family who had more kids than fit in their unit could move the older ones into a kid unit with some friends for permanent sleepover, but still easily supervise them. The units can be smaller and more modular the more this desideratum is accomplished.
- A pricing structure such that the gamut of rationalist financial situations (including but not limited to rent-payment-constraining things like "impoverished app academy student", "frugal Google engineer effective altruist", "NEET with a Patreon", "CfAR staffperson", "not-even-ramen-profitable entrepreneur", etc.) could live there. One thing I really like about my house is that Spouse can pay for it himself and would by default anyway, and we can evaluate roommates solely on their charming company (or contribution to childcare) even if their financial situation is "no". However, this does require some serious participation from people whose financial situation is "yes" and a way to balance the two so arbitrary numbers of charity cases don't bankrupt the project.
- Variance in amenities suited to a mix of Soylent-eating restaurant-going takeout-ordering folks who only need a fridge and a microwave and maybe a dishwasher, and neighbors who are not that, ideally such that it's easy for the latter to feed neighbors as convenient.
- Some arrangement to get repairs done, ideally some compromise between "you can't do anything to your living space, even paint your bedroom, because you don't own the place and the landlord doesn't trust you" and "you have to personally know how to fix a toilet".
- I bet if this were pulled off at all it would be pretty easy to have car-sharing bundled in, like in Benton House That Was which had several people's personal cars more or less borrowable at will. (Benton House That Was may be considered a sort of proof of concept of "20 rationalists living together" but I am imagining fewer bunk beds in the baugruppe.) Other things that could be shared include longish-term storage and irregularly used appliances.
- Dispute resolution plans and resident- and guest-vetting plans which thread the needle between "have to ask a dozen people before you let your brother crash on the couch, let alone a guest unit" and "cannot expel missing stairs". I think there are some rationalist community Facebook groups that have medium-trust networks of the right caution level and experiment with ways to maintain them.
- Bikeshedding. Not that it isn't reasonable to bikeshed a little about a would-be permanent community edifice that you can't benefit from or won't benefit from much unless it has X trait - I sympathize with this entirely - but too much from too many corners means no baugruppen go up at all even if everything goes well, and that's already dicey enough, so please think hard on how necessary it is for the place to be blue or whatever.
- Location. The only really viable place to do this for rationalist population critical mass is the Bay Area, which has, uh, problems, with new construction. Existing structures are likely to be unsuited to the project both architecturally and zoningwise, although I would not be wholly pessimistic about one of those little two-story hotels with rooms that open to the outdoors or something like that.
- Principal-agent problems. I do not know how to build a dormpartment building and probably neither do you.
- Community norm development with buy-in and a good match for typical conscientiousness levels even though we are rules-lawyery contrarians.
Please share this wherever rationalists may be looking; it's definitely the sort of thing better done with more eyes on it.
In the deep dark lurks of the internet, several proactive lesswrong and diaspora leaders have been meeting each day. If we could have cloaks and silly hats, we would.
We have been discussing the great diversification, and noticed some major hubs starting to pop up. The ones that have been working together include:
- Lesswrong slack
- SlateStarCodex Discord
- Reddit/Rational Discord
- Lesswrong Discord
- Exegesis (unofficial rationalist tumblr)
The ones that we hope to bring together in the future include (on the willingness of those servers):
- Lesswrong IRC (led by Gwern)
- Slate Star Codex IRC
- AGI slack
- Transhumanism Discord
- Artificial Intelligence Discord
How will this work?
About a year ago, the lesswrong slack tried to bridge across to the lesswrong IRC. That was bad. From that experience we learnt a lot about what can go wrong, and have worked out how to avoid those mistakes. So here is the general setup.
Each server currently has its own set of channels, each with its own style of talking, addressing problems, sharing details, and engaging with each other. We definitely don't want to do anything that will harm those existing cultures. In light of this, taking the main channel from one server and mashing it into the main channel of another server is going to reincarnate into HELL ON EARTH, and generally leave both sides with the sentiment that "<the other side> is wrecking up <our> beautiful paradise". Some servers may have a low-volume buzz at all times, other servers may become active in bursts; it's not good to try to marry those things.
I am in <exegesis, D/LW, R/R, SSC> what does this mean?
If you want to peek into the lesswrong slack and see what happens in their #open channel, you can join or unmute your respective bridge channel and listen in, or contribute (it's a two-way relay) to their chat. Obviously if everyone does this at once we end up spamming the other chat, and probably after a week we cut the bridge off because it didn't work. So while it's favourable to grow the community, be mindful of what goes on across the divide and try not to anger our friends.
I am in Lesswrong-Slack, what does this mean?
We have new friends! Posts in #open will be relayed to all 4 children rooms where others can contribute if they choose. Mostly they have their own servers to chat on, and if they are not on an info-diet already, then maybe they should be. We don't anticipate invasion or noise.
Why do they get to see our server and we don't get to see them?
So glad you asked - we do. There is an identical setup relaying their servers into our bridge channels. In fact the whole diagram looks something like this:
Pretty right? No it's not. But that's in the backend.
For extra clarification: the rows are the channels that are linked. Which is to say that Discord-SSC is linked to a child channel in each of the other servers. The last thing we want to do is impact these existing channels in a negative way.
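The routing rule above can be sketched in a few lines: each server's main channel is mirrored into a dedicated child channel on every other server, and never echoed back to its origin. The server and channel names below are made up for illustration, and this models only the routing decision, not any actual Slack or Discord API:

```python
# Minimal sketch of the bridge routing rule: a post in one server's
# main channel is relayed to a "bridge-to-<origin>" child channel on
# every OTHER server, never back to the origin (avoiding echo loops).
# Server names are illustrative, not the real bridge configuration.

SERVERS = ["lw-slack", "ssc-discord", "rr-discord", "lw-discord", "exegesis"]

def route(origin_server, message):
    """Return the (server, channel) pairs a main-channel post is relayed to."""
    return [
        (server, f"bridge-to-{origin_server}")
        for server in SERVERS
        if server != origin_server  # never relay a message back home
    ]

for dest in route("lw-slack", "hello from #open"):
    print(dest)
# ('ssc-discord', 'bridge-to-lw-slack')
# ('rr-discord', 'bridge-to-lw-slack')
# ...one child channel per other server
```

This also makes the "rows are the channels that are linked" point concrete: all the `bridge-to-lw-slack` channels across servers form one linked row.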
But what if we don't want to share our open and we just want to see the other side's open? (/our talk is private, what about confidential and security?)
Oh, you mean like the prisoner's dilemma? Where you can defect (not share) and still be rewarded (get to see other servers). Yeah, it's a problem. It tends to be that when one group defects, others also defect. There is a chance that the bridge doesn't work. That this all slides, and we do spam each other, and we end up giving up on the whole project. If it weren't worth taking the risk, we wouldn't have tried.
We have not rushed into this bridge thing, we have been talking about it calmly and slowly and patiently for what seems like forever. We are all excited to be taking a leap, and keen to see it take off.
Yes, security is a valid concern, walled gardens being bridged into is a valid concern, we are trying our best. We are just as hesitant as you, and being very careful about the process. We want to get it right.
So if I am in <server1> and I want to talk to <server3> I can just post in the <bridge-to-server2> room and have the message relayed around to server 3 right?
Whilst that is correct, please don't do that. You wouldn't like people relaying through your main channel to talk to other people. Also it's pretty silly; you can just post in <server1>'s main and let other people see it if they want to.
This seems complicated, why not just have one room where everyone can go and hang out?
- How do you think we ended up with so many separate rooms?
- Why don't we all just leave <your-favourite server> and go to <that other server>? It's not going to happen.
Why don't all you kids get off my lawn and stay in your own damn servers?
Thanks, grandpa. No one is coming to invade; we all have our own servers and stuff to do. We don't NEED to be on your lawn, but sometimes it's nice to know we have friends.
<server2> shitposted our server, what do we do now?
This is why we have mods, why we have mute, and why we have ban. It might happen, but here's a deal: don't shit on other people and they won't shit on you. Also, if asked nicely to leave people alone, please leave people alone. Remember anyone can tap out of any discussion at any time.
I need a picture to understand all this.
Great! Friends on exegesis made one for us.
Who are our new friends:
Lesswrong slack has been active since 2015, and has a core community. The slack has 50 channels for various conversations on specific topics; the #open channel is for general topics, and all kinds of interesting discoveries are shared there.
Discord-Exegesis (private, entry via tumblr)
Exegesis is a discord set up by a tumblr rationalist for all his friends (not just rats). It took off so well and became such a hive in such a short time that it's now a regular hub.
Following Exegesis's growth, a discord was set up for lesswrong. It's not as active yet, but it has the advantage of a low barrier to entry, and it's filled with lesswrongers.
Scott posted a link on an open thread to the SSC discord, and now it holds activity from users who hail from the SSC comment section. It probably has more conversation about politics than the other servers, but also covers every topic relevant to his subscribers.
The reddit rational discord grew from the rationality and rational-fiction subreddits; it's quite busy and covers all topics.
As of the publishing of this post, the bridge is not live, but it will go live when we flip the switch.
Meta: this took 1 hour to write (actual time writing), and halfway through I had to stop and have a voice conference about it with the channels we were bridging.
Cross posted to lesswrong: http://lesswrong.com/lw/oqz
Could utility functions be for narrow AI only, and downright antithetical to AGI? That's a quite fundamental question and I'm kind of afraid there's an obvious answer that I'm just too uninformed to know about. But I did give this some thought and I can't find the fault in the following argument, so maybe you can?
Eliezer Yudkowsky says that when AGI exists, it will have a utility function. For a long time I didn't understand why, but he gives an explanation in AI Alignment: Why It's Hard, and Where to Start. You can look it up there, but the gist of the argument I got from it is:
- (explicit) If an agent's decisions are incoherent, the agent is behaving foolishly.
- Example 1: If an agent's preferences aren't ordered (it prefers A to B, B to C, but also C to A), it behaves foolishly.
- Example 2: If an agent allocates resources incoherently, it behaves foolishly.
- Example 3: If an agent's preferences depend on the probability of the choice even having to be made, it behaves foolishly.
- (implicit) An AGI shouldn't behave foolishly, so its decisions have to be coherent.
- (explicit) Making coherent decisions is the same thing as having a utility function.
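The money-pump consequence of incoherent preferences (Example 1) is easy to make concrete. Here is a minimal sketch in Python, assuming an illustrative agent with cyclic preferences that will pay a small fee for any trade up to something it prefers (the function names and numbers are mine, not from the talk):

```python
# Toy money pump: an agent with cyclic preferences (A > B, B > C, C > A)
# will pay a small fee for every "upgrade" and can be cycled forever.

CYCLIC_PREFS = {("A", "B"), ("B", "C"), ("C", "A")}  # (x, y) means x is preferred to y

def prefers(x, y):
    return (x, y) in CYCLIC_PREFS

def money_pump(start_item, start_cash, fee, rounds):
    """Repeatedly offer the agent whichever item it prefers to its current one."""
    item, cash = start_item, start_cash
    for _ in range(rounds):
        for candidate in ("A", "B", "C"):
            if prefers(candidate, item) and cash >= fee:
                item, cash = candidate, cash - fee  # the agent happily pays to trade up
                break
    return item, cash

# The agent cycles B -> A -> C -> B -> ..., steadily losing cash
# while ending up holding the same kind of item it started with.
print(money_pump("B", start_cash=10.0, fee=1.0, rounds=9))
```

A coherent agent, by contrast, maximizes a fixed utility function and refuses to go around such a cycle, which is the content of the coherence claim above.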
I accept that if all of these were true, AGI should have a utility function. I also accept points 1 and 3. I doubt point 2.
Before I get to why, I should state my suspicion why discussions of AGI really focus on utility functions so much. Utility functions are fundamental to many problems of narrow AI. If you're trying to win a game, or to provide a service using scarce computational resources, a well-designed utility function is exactly what you need. Utility functions are essential in narrow AI, so it seems reasonable to assume they should be essential in AGI because... we don't know what AGI will look like but it sounds similar to narrow AI, right?
So that's my motivation. I hope to point out that maybe we're confused about AGI because we took a wrong turn way back when we decided it should have a utility function. But I'm aware it is more likely I'm just too dumb to see the wisdom of that decision.
The reasons for my doubt are the following.
- Humans don't have a utility function and make very incoherent decisions. Humans are also the most intelligent organisms on the planet. In fact, it seems to me that the less intelligent an organism is, the more easily its behavior can be approximated with a model that has a utility function!
- Apes behave more coherently than humans. They have a far smaller range of behaviors. They switch between them relatively predictably. They do have culture - one troop of chimps will fish for termites using a twig, while another will do something like a rain dance - but their cultural specifics number in the dozens, while those of humans are innumerable.
- Cats behave more coherently than apes. There are shy cats and bold ones, playful ones and lazy ones, but once you know a cat, you can predict fairly precisely what kind of thing it is going to do on a random day.
- Earthworms behave more coherently than cats. There aren't playful earthworms and lazy ones, they basically all follow the nutrients that they sense around them and occasionally mate.
- And single-celled organisms are so coherent that we think we can model them entirely on standard computing hardware. Which, if it succeeds, means we actually know E. coli's utility function to the last decimal point.
- The randomness of human decisions seems essential to human success (on top of other essentials such as speech and cooking). Humans seem to have a knack for sacrificing precious lifetime for fool's errands that very occasionally create benefit for the entire species.
A few occasions where such fool's errands happen to work out will later look like the most intelligent things people ever did - after hindsight bias kicks in. Before Einstein revolutionized physics, he was not obviously more sane than those contemporaries of his who spent their lives doing earnest work in phrenology and theology.
And many people trying many different things, most of them forgotten and a few seeming really smart in hindsight - that isn't a special case that is only really true for Einstein, it is the typical way humans have randomly stumbled into the innovations that accumulate into our technological superiority. You don't get to epistemology without a bunch of people deciding to spend decades of their lives thinking about why a stick looks bent when it goes through a water surface. You don't settle every little island in the Pacific without a lot of people deciding to go beyond the horizon in a canoe, and most of them dying like the fools that they are. You don't invent rocketry without a mad obsession with finding new ways to kill each other.
- An AI whose behavior is determined by a utility function has a couple of problems that human (or squid or dolphin) intelligence doesn't have, and they seem to be fairly intrinsic to having a utility function in the first place. Namely, the vast majority of possible utility functions lead directly into conflict with all other agents.
To define a utility function is to define a (direction towards a) goal. So a discussion of an AI with one, single, unchanging utility function is a discussion of an AI with one, single, unchanging goal. That isn't just unlike the intelligent organisms we know, it isn't even a failure mode of intelligent organisms we know. The nearest approximations we have are the least intelligent members of our species.
- Two agents with identical utility functions are arguably functionally identical to a single agent that exists in two instances. Two agents with utility functions that are not identical are at best irrelevant to each other and at worst implacable enemies.
This enormously limits the interactions between agents and is again very different from the intelligent organisms we know, which frequently display intelligent behavior in exactly those instances where they interact with each other. We know communicating groups (or "hive minds") are smarter than their members, that's why we have institutions. AIs with utility functions as imagined by e.g. Yudkowsky cannot form these.
They can presumably create copies of themselves instead, which might be as good or even better, but we don't know that, because we don't really understand whatever it is exactly that makes institutions more intelligent than their members. It doesn't seem to be purely multiplied brainpower, because a person thinking for ten hours often doesn't find solutions that ten persons thinking together find in an hour. So if an AGI can multiply its own brainpower, that doesn't necessarily achieve the same result as thinking with others.
Now I'm not proposing an AGI should have nothing like a utility function, or that it couldn't temporarily adopt one. Utility functions are great for evaluating progress towards particular goals. Within well-defined areas of activity (such as playing Chess), even humans can temporarily behave as if they had utility functions, and I don't see why AGI shouldn't.
I'm also not saying that something like a paperclip maximizer couldn't be built, or that it could be stopped once underway. The AI alignment problem remains real.
I do contend that the paperclip maximizer wouldn't be an AGI, it would be narrow AI. It would have a goal, it would work towards it, but it would lack what we look for when we look for AGI. And whatever that is, I propose we don't find it within the space of things that can be described with (single, unchanging) utility functions.
And there are other places we could look. Maybe some of it is in whatever it is exactly that makes institutions more intelligent than their members. Maybe some of it is in why organisms (especially learning ones) play - playfulness and intelligence seem correlated, and playfulness has that incoherence that may be protective against paperclip-maximizer-like failure modes. I don't know.
I've posted about this once before, but here's a more developed version of the idea. Does this pose a serious problem for the simulation hypothesis, or does it merely complicate the idea?
1. Which room am I in?
Imagine two rooms, A and B. At a timeslice t2, there are exactly 1000 people in room B and only 1 person in room A. Neither room contains any clues as to which it is; i.e., no one can see anyone else in room B. If you were placed in one of these rooms with only the information above, which would you guess that you were in? The correct answer appears to be room B. After all, if everyone were to bet that they are in room B, almost everyone would win, whereas if everyone were to bet that they are in room A, almost everyone would lose.
Now imagine that you are told that during a time segment t1 to t2, a total of 100 trillion people had sojourned in room A and only 1 billion in room B. How does this extra information influence your response? The question posed above is not which room you are likely to have been in, all things considered, but which room you are currently in at t2. Insofar as betting odds guide rational belief, it still follows that if everyone at t2 were to bet that they are in room A, almost everyone would lose. This differs from what appears to be the correct conclusion if one reasons across time, from t1 to t2. Thus, we can imagine that at some future moment t3 everyone who ever sojourned in either room A or B is herded into another room C and then asked whether their journeys from t1 to t3 took them through room A or B. In this case, most people would win the bet if they were to point at room A rather than room B.
Let’s complicate this situation. Since more people in total pass through room A than room B, imagine that people are swapped in and out of room A faster than room B. Once in either room, a blindfold is removed and the occupant is asked which room they are in. After they answer, the blindfold is put back on. Thus, there are more total instances of removing blindfolds in room A than room B between t1 and t2. Should this fact change your mind about where you are at exactly t2? Surely one could argue that the directly relevant information is that pertaining to each individual timeslice, rather than the historical details of occupants being swapped in and out of rooms. After all, the bet is being made at a particular timeslice about a particular timeslice, and the fact is that most people who bet at t2 that they are in room B at t2 will win some cash, whereas those who bet that they are in room A will lose.
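The betting arithmetic in the thought experiment can be checked directly; here is a minimal sketch using the numbers given above (the variable names are mine):

```python
# At timeslice t2: 1 person is in room A and 1000 are in room B.
IN_A_NOW, IN_B_NOW = 1, 1000
# Across t1..t2: 100 trillion people passed through A, 1 billion through B.
THROUGH_A, THROUGH_B = 100 * 10**12, 10**9

# Bet made at t2 about which room you occupy at t2 (timeslice reasoning):
p_win_bet_B_now = IN_B_NOW / (IN_A_NOW + IN_B_NOW)

# Bet made at t3 about which room your journey passed through (atemporal reasoning):
p_win_bet_A_history = THROUGH_A / (THROUGH_A + THROUGH_B)

print(p_win_bet_B_now)      # ~0.999: almost everyone betting "B now" wins
print(p_win_bet_A_history)  # ~0.99999: almost everyone betting "A ever" wins
```

Both bets are simultaneously favorable, which is exactly the tension between the two modes of reasoning.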
2. The simulation argument
Nick Bostrom (2003) argues that at least one of the following disjuncts is true: (1) civilizations like ours tend to self-destruct before reaching technological maturity, (2) civilizations like ours tend to reach technological maturity but refrain from running a large number of ancestral simulations, or (3) we are almost certainly in a simulation. The third disjunct corresponds to the “simulation hypothesis.” It is based on the following premises: first, assume the truth of functionalism, i.e., that physical systems that exhibit the right functional organization will give rise to conscious mental states like ours. Second, consider the computational power that could be available to future humans. Bostrom provides a convincing analysis that future humans will have at least the capacity to run a large number of ancestral simulations—or, more generally, simulations in which minds sufficiently “like ours” exist.
The final step of the argument proceeds as follows: if (1) and (2) are false, then we do not self-destruct before reaching a state of technological maturity and do not refrain from running a large number of ancestral simulations. It follows that we run a large number of ancestral simulations. If so, we have no independent knowledge of whether we exist in vivo or in machina. A “bland” version of the principle of indifference thus tells us to distribute our probabilities equally among all the possibilities. Since the number of sims would far exceed the number of non-sims in this scenario, we should infer that we are almost certainly simulated. As Bostrom writes, “it may also be worth to ponder that if everybody were to place a bet on whether they are in a simulation or not, then if people use the bland principle of indifference, and consequently place their money on being in a simulation if they know that that’s where almost all people are, then almost everyone will win their bets. If they bet on not being in a simulation, then almost everyone will lose. It seems better that the bland indifference principle be heeded” (Bostrom 2003).
Now, let us superimpose the scenario of Section 1 onto the simulation argument. Imagine that our posthuman descendants colonize the galaxy and their population grows to 100 billion individuals in total. Imagine further that at t2 they are running 100 trillion simulations, each of which contains 100 billion individuals. Thus, the total number of sims equals 10^25. If one of our posthuman descendants were asked whether she is a sim or non-sim, she should therefore answer that she is almost certainly a sim. Alternatively, imagine that at t2 our posthuman descendants decide to run only a single simulation in the universe that contains a mere 1 billion sims, ceteris paribus. Given this situation: if one of our posthuman descendants were asked whether she is a sim given this information, she should quite clearly answer that she is most likely a non-sim.
With this in mind, consider a final possible scenario: our posthuman descendants decide to run simulations with relatively small populations in a serial fashion, that is, one at a time. These simulations could be sped up a million times to enable complete recapitulations of our evolutionary history (as per Bostrom). The result is that at any given timeslice the total number of non-sims will far exceed the total number of sims—yet across time the total number of sims will accumulate and eventually far exceed the total number of non-sims. The result is that if one takes a bird’s-eye view of our posthuman civilization from its inception to its decline (say, because of the entropy death of the cosmos), and if one were asked whether she is more likely to have existed in vivo or in machina, it appears that she should answer “I was a sim.”
But this might not be the right way to reason about the situation. Consider that history is nothing more than a series of timeslices, one after the other. Since the ratio of non-sims to sims favors the former at every possible timeslice, one might argue that one should always answer the question, “Are you right now more likely to exist in vivo or in machina?” with “I probably exist in vivo.” Again, the difficulty that skeptics of this answer must overcome is the ostensible fact that if everyone were to bet on being simulated at any given timeslice—even billions of years after the first serial simulation is run—then nearly everyone would lose, whereas if everyone were to bet that they are a non-sim, then almost everyone would win.
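The serial-simulation scenario can also be put in numbers; a small sketch with illustrative population sizes of my own choosing, matching the qualitative setup:

```python
# Serial simulations: at every timeslice there are far more non-sims than sims,
# but a fresh simulation runs each slice, so sims accumulate across history.
N_NONSIM = 100 * 10**9     # posthuman (non-sim) population, alive throughout
N_SIM_PER_RUN = 10**9      # sims in the single simulation running at any slice
RUNS = 1000                # serial simulations run over the civilization's history

# Timeslice reasoning: at any single moment, bet "non-sim".
assert N_NONSIM > N_SIM_PER_RUN

# Atemporal reasoning: across all of history, bet "sim".
total_sims = N_SIM_PER_RUN * RUNS
assert total_sims > N_NONSIM
print(total_sims / N_NONSIM)  # sims outnumber non-sims 10-to-1 overall
```

With these numbers, both inequalities hold at once: "non-sim" wins at every individual timeslice while "sim" wins over the whole history.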
The tension here emerges from the difference between timeslice reasoning and the sort of “atemporal” reasoning that Bostrom employs. If the former is epistemically robust, then Bostrom’s tripartite argument fails because none of the disjuncts are true. This is because the scenario above entails (a) we survive to reach technological maturity, and (b) we run a large number of ancestor simulations, yet (c) we do not have reason to believe that we are in a simulation at any particular moment. The latter proposition depends, of course, upon how we run the simulations (serially versus in parallel) and, relatedly, how we decide to reason about our metaphysical status at each moment in time.
In conclusion, I am unsure about whether this constitutes a refutation of Bostrom or merely complicates the picture. At the very least, I believe it does the latter, requiring more work on the topic.
Bostrom, Nick. 2003. Are You Living in a Computer Simulation? Philosophical Quarterly. 53(211): 243-255.
Original post: http://bearlamp.com.au/in-support-of-yak-shaving/
Yak shaving is decried as pretty much "the devil" of trying to get things done. The anti-yak-shaving movement identifies the problem as one of focus. The moral of the story they give is "don't yak shave".
Originally posted in MIT's media lab with the description:
Any seemingly pointless activity which is actually necessary to solve a problem which solves a problem which, several levels of recursion later, solves the real problem you're working on.
But I prefer the story by Seth Godin:
"I want to wax the car today."
"Oops, the hose is still broken from the winter. I'll need to buy a new one at Home Depot."
"But Home Depot is on the other side of the Tappan Zee bridge and getting there without my EZPass is miserable because of the tolls."
"But, wait! I could borrow my neighbor's EZPass..."
"Bob won't lend me his EZPass until I return the mooshi pillow my son borrowed, though."
"And we haven't returned it because some of the stuffing fell out and we need to get some yak hair to restuff it."
And the next thing you know, you're at the zoo, shaving a yak, all so you can wax your car.
I disagree with the conclusion to not yak shave, and here's why.
The problem here is that you didn't wax the car because you spent all day shaving yaks (see also "there's a hole in my bucket"). In a startup that translates to not doing the tasks that get customers - the tasks which get money and actually make an impact - and instead, say, "playing with the UI". It's easy to see why such anti-yak-shaving sentiment exists (see also: bikeshedding, rearranging deck chairs on the Titanic, hamming questions). You can spend a whole day doing a whole lot of nothing, then get to bed and wonder what you actually accomplished that day (hint: a whole lot of running in circles).
Or at least that's what it looks like on the surface. But let's look a little deeper into what the problems and barriers are in the classic scenario.
- Want to wax car
- Broken hose
- Hardware store is far away
- No EZpass for tolls
- Neighbour won't lend the pass until pillow is returned
- Broken mooshi pillow
- Have to go get yak hair.
So it's not just one problem, but a series of problems that come up in a sequence. Hopefully by the end of the list you can turn around and walk all the way straight back up the list. But in the real world there might even be other problems like, you get to the hardware store and realise you don't know the hose-fitting size of your house so you need to call someone at home to check...
On closer inspection, this sort of behaviour is not like bikeshedding at all. Nor is it doing insignificant things under the guise of "real work". Instead it is about tackling what stands in the way of your problem. In real-world problem solving, "Don't yak shave" is not what I have found to be the solution. Experiencing this for the first time, it feels like a sequence of discoveries. First you discover the hose, then you discover the EZpass problem, then the pillow problem, at which point you are pretty sick of trying to wax your car and want a break or to work on something else.
I propose that classic yak shaving presents a very important sign that things are broken. In order to get to the classic scenario we had to
- have borrowed a pillow from our neighbour,
- have it break and not get fixed,
- not own our own EZpass,
- live far from a hardware store,
- have a broken hose, and
- want to wax a car.
Each item in this scenario represents an open problem, an open loop. Yak shaving is a warning sign that you are in a Swiss-cheese-model scenario of problems. This might sound familiar because it's the kind of situation which led to the Fukushima reactor meltdown. It's the kind of scenario where you try to work out why the handyman fell off your roof and died, and you notice that:
- He wasn't wearing a helmet
- He wasn't tied on safely
- His ladder wasn't tied down
- It was a windy day
- His harness was old and worn out
- He was on his phone while on the roof...
And you realise that any five of those things could have gone wrong without causing much of a problem. But put all six of those mistakes together, line the wind up in just the right way, and everything comes tumbling down.
Yak shaving is a sign that you are living with problems waiting to crash down. And living in a situation where you don't have time to do the sort of maintenance that would fix things and keep smoulders from bursting into flames.
I can almost guarantee that when your house of cards comes falling down, it will happen on a day when you don't have the spare time to waste on ridiculous-seeming problems.
What should you do if you are in this situation?
Yak shave. The best thing you can do if half your projects are unfinished and spread around the room is to tidy up. Get things together, organise things, initiate the GTD system (or any system), wrap up old bugs, close the open loops (advice from GTD), and, as many times as you can, YAK SHAVE for all you are worth!
If something is broken, and you are living with it, that's not acceptable. You need a system in your life for regularly getting around to fixing it. Notepads, reviews, list-keeping: set time aside and plan to fix things.
So I say, Yak Shave, as much, as long, and as many times as it takes till there are no more yaks to shave.
Something not mentioned often enough is a late addition to my list of common human goals.
Improve the tools available - sharpen the axe, write a new app that can do the thing you want, invent systems that work for you. Prepare for when the rest of the work comes along.
People often ask how you can plan for lucky breaks in your life. How do you cultivate opportunity? I can tell you right here and now, this is how.
Keep a toolkit at the ready, a work-space (post coming soon) at the ready, spare time for things to go wrong and things to go right. And don't forget to play. Why do we sharpen the axe? Clear Epistemics, or clear Instrumental Rationality. Be prepared for the situation that will come up.
Yak Shave like your life depends on it. Because your life might one day depend on it. Your creativity certainly does.
Meta: this took 2.5 hrs to write.
Some theater people at NYU wanted to demonstrate how gender stereotypes affected the 2016 US presidential election. So they decided to put on a theatrical performance of the presidential debates - but with the genders of the principals swapped. They assumed that this would show how much of a disadvantage Hillary Clinton was working under because of her gender. They were shocked to discover the opposite - audiences full of Clinton supporters, watching the gender-swapped debates, came away thinking that Trump was a better communicator than they'd thought.
The principals don't seem to have come into this with a fair-minded attitude. Instead, it seems to have been a case of "I'll show them!":
Salvatore says he and Guadalupe began the project assuming that the gender inversion would confirm what they’d each suspected watching the real-life debates: that Trump’s aggression—his tendency to interrupt and attack—would never be tolerated in a woman, and that Clinton’s competence and preparedness would seem even more convincing coming from a man.
Let's be clear about this. This was not epistemic even-handedness. This was a sincere attempt at confirmation bias. They believed one thing, and looked only for confirming evidence to prove their point. It was only when they started actually putting together the experiment that they realized they might learn the opposite lesson:
But the lessons about gender that emerged in rehearsal turned out to be much less tidy. What was Jonathan Gordon smiling about all the time? And didn’t he seem a little stiff, tethered to rehearsed statements at the podium, while Brenda King, plainspoken and confident, freely roamed the stage? Which one would audiences find more likeable?
What made this work? I think what happened is that they took their own beliefs literally. They actually believed that people hated Hillary because she was a woman, and so their idea of something that they were confident would show this clearly was a fair test. Because of this, when things came out the opposite of the way they'd predicted, they noticed and were surprised, because they actually expected the demonstration to work.
But they went further. Even though they knew in advance of the public performances that the experiment got the wrong answer, they neither falsified nor file-drawered the evidence. They tried to show, they got a different answer, they showed it anyway.
This is much, much better science than contemporary medical or psychology research was before the replication crisis.
Sometimes, when I think about how epistemically corrupt our culture is, I'm tempted to adopt a permanent defensive crouch and disbelieve anything I can't fact-check, to explicitly adjust for all the relevant biases, and this prospect sounds exhausting. It's not actually necessary. You don't have to worry too much about your biases. Just take your own beliefs literally, as though they mean what they say they mean, and try to believe all their consequences as well. And, when you hit a contradiction – well, now you have an opportunity to learn where you're wrong.
(Cross-posted at my personal blog.)