Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
It's unlikely that by pure chance we are currently writing the correct number of LW posts. So it might be useful to try to figure out if we're currently writing too few or too many LW posts. If commenters are evenly divided on this question then we're probably close to the optimal number; otherwise we have an opportunity to improve. Here's my case for why we should be writing more posts.
Let's say you came up with a new and useful life hack, you have a novel line of argument on an important topic, or you stumbled across some academic research that seems valuable and isn't frequently discussed on Less Wrong. How valuable would it be for you to share your findings by writing up at post for Less Wrong?
Recently I visited a friend of mine and commented on the extremely bright lights he had in his room. He referenced this LW post written over a year ago. That got me thinking. The bright lights in my friend's room make his life better every day, for a small upfront cost. And my friend is probably just one of tens or hundreds of people to use bright lights this way as a result of that post. Given that the technique seems to be effective, that number will probably continue going up, and will grow exponentially via word of mouth (useful memes tend to spread). So by my reckoning, chaosmage has created and will create a lot of utility. If they had kept that idea to themselves, I suspect they would have captured less than 1% of the total value to be had from the idea.
You can reach orders of magnitude more people writing an obscure Less Wrong comment than you can talking to a few people at a party in person. For example, at least 100 logged in users read this fairly obscure comment of mine. So if you're going to discuss an important topic, it's often best to do it online. Given enough eyeballs, all bugs in human reasoning are shallow.
Yes, peoples' time does have opportunity costs. But people are on Less Wrong because they need a break anyway. (If you're a LW addict, you might try the technique I describe in this post for dealing with your addiction. If you're dealing with serious cravings, for LW or video games or drugs or anything else, perhaps look at N-acetylcysteine... a variety of studies suggest it helps reduce cravings (behavioral addictions are pretty similar to drug addictions neurologically btw), it has a good safety profile, and you can buy it on Amazon. Not prescribed by doctors because it's not approved by the FDA. Yes, you could use willpower (it's worked so well in the past...) or you could hit the "stop craving things as much" button, and then try using willpower. Amazing what you can learn on Less Wrong isn't it?)
And LW does a good job of indexing content by how much utility people are going to get out of it. It's easy to look at a post's keywords and score and guess if it's worth reading. If your post is bad it will vanish in to obscurity and few will be significantly harmed. (Unless it's bad and inflammatory, or bad with a linkbait title... please don't write posts like that.) If your post is good, it will spread virally on its own and you'll generate untold utility.
Given that above-average posts get read much more than below-average posts, if you're post's expected quality is average, sharing it on Less Wrong has a high positive expected utility. Like Paul Graham, I think we should be spreading our net wide and trying to capture all of the winners we can.
I'm going to call out a particular subset of LW commenters in particular. If you're a commenter and you (a) have at least 100 karma, (b) it's over 80% positive, and (c) you have a draft post with valuable new ideas you've been sitting on for a while, you should totally polish it off and share it with us! In general, the better your track record, the more you should be inclined to share ideas that seem valuable. Worst case you can delete your post and cut your losses.
This is crossposted from my new blog. I was planning to write a short post explaining how Newcomblike problems are the norm and why any sufficiently powerful intelligence built to use causal decision theory would self-modify to stop using causal decision theory in short order. Turns out it's not such a short topic, and it's turning into a short intro to decision theory.
I've been motivating MIRI's technical agenda (decision theory and otherwise) to outsiders quite frequently recently, and I received a few comments of the form "Oh cool, I've seen lots of decision theory type stuff on LessWrong, but I hadn't understood the connection." While the intended audience of my blog is wider than the readerbase of LW (and thus, the tone might seem off and the content a bit basic), I've updated towards these posts being useful here. I also hope that some of you will correct my mistakes!
This sequence will probably run for four or five posts, during which I'll motivate the use of decision theory, the problems with the modern standard of decision theory (CDT), and some of the reasons why these problems are an FAI concern.
I'll be giving a talk on the material from this sequence at Purdue next week.
Choice is a crucial component of reasoning. Given a set of available actions, which action do you take? Do you go out to the movies or stay in with a book? Do you capture the bishop or fork the king? Somehow, we must reason about our options and choose the best one.
Of course, we humans don't consciously weigh all of our actions. Many of our choices are made subconsciously. (Which letter will I type next? When will I get a drink of water?) Yet even if the choices are made by subconscious heuristics, they must be made somehow.
In practice, decisions are often made on autopilot. We don't weigh every available alternative when it's time to prepare for work in the morning, we just pattern-match the situation and carry out some routine. This is a shortcut that saves time and cognitive energy. Yet, no matter how much we stick to routines, we still spend some of our time making hard choices, weighing alternatives, and predicting which available action will serve us best.
The study of how to make these sorts of decisions is known as Decision Theory. This field of research is closely intertwined with Economics, Philosophy, Mathematics, and (of course) Game Theory. It will be the subject of today's post.
If the future is determined by physics, how can anyone control it?
In Thou Art Physics, I pointed out that since you are within physics, anything you control is necessarily controlled by physics. Today we will talk about a different aspect of the confusion, the words "determined" and "control".
The "Block Universe" is the classical term for the universe considered from outside Time. Even without timeless physics, Special Relativity outlaws any global space of simultaneity, which is widely believed to suggest the Block Universe—spacetime as one vast 4D block.
When you take a perspective outside time, you have to be careful not to let your old, timeful intuitions run wild in the absence of their subject matter.
In the Block Universe, the future is not determined before you make your choice. "Before" is a timeful word. Once you descend so far as to start talking about time, then, of course, the future comes "after" the past, not "before" it.
Although I feel that Nick Bostrom’s new book “Superintelligence” is generally awesome and a well-needed milestone for the field, I do have one quibble: both he and Steve Omohundro appear to be more convinced than I am by the assumption that an AI will naturally tend to retain its goals as it reaches a deeper understanding of the world and of itself. I’ve written a short essay on this issue from my physics perspective, available at http://arxiv.org/pdf/1409.0813.pdf.
give you, some we can't, few have been written up and even fewer in any
well-organized way. Benja or Nate might be able to expound in more detail
while I'm in my seclusion.
Very briefly, though:
The problem of utility functions turning out to be ill-defined in light of
new discoveries of the universe is what Peter de Blanc named an
"ontological crisis" (not necessarily a particularly good name, but it's
what we've been using locally).
The way I would phrase this problem now is that an expected utility
maximizer makes comparisons between quantities that have the type
"expected utility conditional on an action", which means that the AI's
utility function must be something that can assign utility-numbers to the
AI's model of reality, and these numbers must have the further property
that there is some computationally feasible approximation for calculating
expected utilities relative to the AI's probabilistic beliefs. This is a
constraint that rules out the vast majority of all completely chaotic and
uninteresting utility functions, but does not rule out, say, "make lots of
Models also have the property of being Bayes-updated using sensory
information; for the sake of discussion let's also say that models are
about universes that can generate sensory information, so that these
models can be probabilistically falsified or confirmed. Then an
"ontological crisis" occurs when the hypothesis that best fits sensory
information corresponds to a model that the utility function doesn't run
on, or doesn't detect any utility-having objects in. The example of
"immortal souls" is a reasonable one. Suppose we had an AI that had a
naturalistic version of a Solomonoff prior, a language for specifying
universes that could have produced its sensory data. Suppose we tried to
give it a utility function that would look through any given model, detect
things corresponding to immortal souls, and value those things. Even if
the immortal-soul-detecting utility function works perfectly (it would in
fact detect all immortal souls) this utility function will not detect
anything in many (representations of) universes, and in particular it will
not detect anything in the (representations of) universes we think have
most of the probability mass for explaining our own world. In this case
the AI's behavior is undefined until you tell me more things about the AI;
an obvious possibility is that the AI would choose most of its actions
based on low-probability scenarios in which hidden immortal souls existed
that its actions could affect. (Note that even in this case the utility
function is stable!)
Since we don't know the final laws of physics and could easily be
surprised by further discoveries in the laws of physics, it seems pretty
clear that we shouldn't be specifying a utility function over exact
physical states relative to the Standard Model, because if the Standard
Model is even slightly wrong we get an ontological crisis. Of course
there are all sorts of extremely good reasons we should not try to do this
anyway, some of which are touched on in your draft; there just is no
simple function of physics that gives us something good to maximize. See
also Complexity of Value, Fragility of Value, indirect normativity, the
whole reason for a drive behind CEV, and so on. We're almost certainly
going to be using some sort of utility-learning algorithm, the learned
utilities are going to bind to modeled final physics by way of modeled
higher levels of representation which are known to be imperfect, and we're
going to have to figure out how to preserve the model and learned
utilities through shifts of representation. E.g., the AI discovers that
humans are made of atoms rather than being ontologically fundamental
humans, and furthermore the AI's multi-level representations of reality
evolve to use a different sort of approximation for "humans", but that's
okay because our utility-learning mechanism also says how to re-bind the
learned information through an ontological shift.
This sorta thing ain't going to be easy which is the other big reason to
start working on it well in advance. I point out however that this
doesn't seem unthinkable in human terms. We discovered that brains are
made of neurons but were nonetheless able to maintain an intuitive grasp
on what it means for them to be happy, and we don't throw away all that
info each time a new physical discovery is made. The kind of cognition we
want does not seem inherently self-contradictory.
Three other quick remarks:
*) Natural selection is not a consequentialist, nor is it the sort of
consequentialist that can sufficiently precisely predict the results of
modifications that the basic argument should go through for its stability.
The Omohundrian/Yudkowskian argument is not that we can take an arbitrary
stupid young AI and it will be smart enough to self-modify in a way that
preserves its values, but rather that most AIs that don't self-destruct
will eventually end up at a stable fixed-point of coherent
consequentialist values. This could easily involve a step where, e.g., an
AI that started out with a neural-style delta-rule policy-reinforcement
learning algorithm, or an AI that started out as a big soup of
self-modifying heuristics, is "taken over" by whatever part of the AI
first learns to do consequentialist reasoning about code. But this
process doesn't repeat indefinitely; it stabilizes when there's a
consequentialist self-modifier with a coherent utility function that can
precisely predict the results of self-modifications. The part where this
does happen to an initial AI that is under this threshold of stability is
a big part of the problem of Friendly AI and it's why MIRI works on tiling
agents and so on!
*) Natural selection is not a consequentialist, nor is it the sort of
consequentialist that can sufficiently precisely predict the results of
modifications that the basic argument should go through for its stability.
It built humans to be consequentialists that would value sex, not value
inclusive genetic fitness, and not value being faithful to natural
selection's optimization criterion. Well, that's dumb, and of course the
result is that humans don't optimize for inclusive genetic fitness.
Natural selection was just stupid like that. But that doesn't mean
there's a generic process whereby an agent rejects its "purpose" in the
light of exogenously appearing preference criteria. Natural selection's
anthropomorphized "purpose" in making human brains is just not the same as
the cognitive purposes represented in those brains. We're not talking
about spontaneous rejection of internal cognitive purposes based on their
causal origins failing to meet some exogenously-materializing criterion of
validity. Our rejection of "maximize inclusive genetic fitness" is not an
exogenous rejection of something that was explicitly represented in us,
that we were explicitly being consequentialists for. It's a rejection of
something that was never an explicitly represented terminal value in the
first place. Similarly the stability argument for sufficiently advanced
self-modifiers doesn't go through a step where the successor form of the
AI reasons about the intentions of the previous step and respects them
apart from its constructed utility function. So the lack of any universal
preference of this sort is not a general obstacle to stable
*) The case of natural selection does not illustrate a universal
computational constraint, it illustrates something that we could
anthropomorphize as a foolish design error. Consider humans building Deep
Blue. We built Deep Blue to attach a sort of default value to queens and
central control in its position evaluation function, but Deep Blue is
still perfectly able to sacrifice queens and central control alike if the
position reaches a checkmate thereby. In other words, although an agent
needs crystallized instrumental goals, it is also perfectly reasonable to
have an agent which never knowingly sacrifices the terminally defined
utilities for the crystallized instrumental goals if the two conflict;
indeed "instrumental value of X" is simply "probabilistic belief that X
leads to terminal utility achievement", which is sensibly revised in the
presence of any overriding information about the terminal utility. To put
it another way, in a rational agent, the only way a loose generalization
about instrumental expected-value can conflict with and trump terminal
actual-value is if the agent doesn't know it, i.e., it does something that
it reasonably expected to lead to terminal value, but it was wrong.
This has been very off-the-cuff and I think I should hand this over to
Nate or Benja if further replies are needed, if that's all right.
Related to: Policy Debates Should Not Appear One-Sided
There is a well-known fable which runs thus:
“Driven by hunger, a fox tried to reach some grapes hanging high on the vine but was unable to, although he leaped with all his strength. As he went away, the fox remarked 'Oh, you aren't even ripe yet! I don't need any sour grapes.' People who speak disparagingly of things that they cannot attain would do well to apply this story to themselves.”
This gives rise to the common expression ‘sour grapes’, referring to a situation in which one incorrectly claims to not care about something to save face or feel better after being unable to get it.
This seems to be related to a general phenomenon, in which motivated cognition leads one to flinch away from the prospect of an action that is inconvenient or painful in the short term by concluding that a less-painful option strictly dominates the more-painful one.
In the fox’s case, the allegedly-dominating option is believing (or professing) that he did not want the grapes. This spares him the pain of feeling impotent in face of his initial failure, or the embarrassment of others thinking him to have failed. If he can’t get the grapes anyway, then he might as well erase the fact that he ever wanted them, right? The problem is that considering this line of reasoning will make it more tempting to conclude that the option really was dominating—that he really couldn’t have gotten the grapes. But maybe he could’ve gotten the grapes with a bit more work—by getting a ladder, or making a hook, or Doing More Squats in order to Improve His Vert.
The fable of the fox and the grapes doesn’t feel like a perfect fit, though, because the fox doesn’t engage in any conscious deliberation before giving up on sour grapes; the whole thing takes place subconsciously. Here are some other examples that more closely illustrate the idea of conscious rationalization by use of overly convenient partitions:
“Be who you are and say what you feel, because those who mind don't matter and those who matter don't mind.”
This advice is neither good in full generality nor bad in full generality. Clearly there are some situations where some person is worrying too much about other people judging them, or is anxious about inconveniencing others without taking their own preferences into account. But there are also clearly situations (like dealing with an unpleasant, incompetent boss) where fully exposing oneself or saying whatever comes into one’s head is not strategic and outright disastrous. Without taking into account the specifics of the situation of the recipient of the advice, it is of limited use.
It is convenient to absolve oneself of blame by writing off anybody who challenges our first impulse as someone who ‘doesn’t matter’; it means that if something goes wrong, one can avoid the painful task of analysing and modifying one’s behaviour.
In particular, we have the following corollary:
The Fundamental Fallacy of Dating:
“Be yourself and don’t hide who you are. Be up-front about what you want. If it puts your date off, then they wouldn’t have been good for you anyway, and you’ve dodged a bullet!”
In the short-term it is convenient to not have to filter or reflect on what one says (face-to-face) or writes (online dating). In the longer term, having no filter is not a smart way to approach dating. As the biases and heuristics program has shown, people are often mistaken about what they would prefer under reflection, and are often inefficient and irrational in pursuing what they want. There are complicated courtship conventions governing timelines for revealing information about oneself and negotiating preferences, that have evolved to work around these irrationalities, to the benefit of both parties. In particular, people are dynamically inconsistent, and willing to compromise a lot more later on in a courtship than they thought they would earlier on; it is often a favour to both of you to respect established boundaries regarding revealing information and getting ahead of the current stage of the relationship.
For those who have not much practised the skill of avoiding triggering Too Much Information reactions, it can feel painful and disingenuous to even try changing their behaviour, and they rationalise it via the Fundamental Fallacy. At any given moment, changing this behaviour is painful and causes a flinch reaction, even though the value of information of trying a different approach might be very high, and might cause less pain (e.g. through reduced loneliness) in the long term.
We also have:
PR rationalization and incrimination:
“There’s already enough ammunition out there if anybody wants to assassinate my character, launch a smear campaign, or perform a hatchet job. Nothing I say at this point could make it worse, so there’s no reason to censor myself.”
This is an overly convenient excuse. It does not take into account, for example, that new statements provide a new opportunity for one to come to the attention of quote miners in the first place, or that different statements might be more or less easy to seed a smear campaign; ammunition can vary in type and accessibility, so that adding more can increase the convenience of a hatchet job. It might turn out, after weighing the costs and benefits, that speaking honestly is the right decision. But one can’t know that on the strength of a convenient deontological argument that doesn’t consider those costs. Similarly:
“I’ve already pirated so much stuff I’d be screwed if I got caught. Maybe it was unwise and impulsive at first, but by now I’m past the point of no return.”
This again fails to take into account the increased risk of one’s deeds coming to attention; if most prosecutions are caused by (even if not purely about) offences shortly before the prosecution, and you expect to pirate long into the future, then your position now is the same as when you first pirated; if it was unwise then, then it’s unwise now.
The common fallacy in all these cases is that one looks at only the extreme possibilities, and throws out the inconvenient, ambiguous cases. This results in a disconnected space of possibilities that is engineered to allow one to prove a convenient conclusion. For example, the Seating Fallacy throws out the possibility that there are people who mind but also matter; the Fundamental Fallacy of Dating prematurely rules out people who are dynamically inconsistent or are imperfect introspectors, or who have uncertainty over preferences; PR rationalization fails to consider marginal effects and quantify risks in favour of a lossy binary approach.
What are other examples of situations where people (or Less Wrongers specifically) might fall prey to this failure mode?
Summary: Overhead expenses' (CEO salary, percentage spent on fundraising) are often deemed a poor measure of charity effectiveness by Effective Altruists, and so they disprefer means of charity evaluation which rely on these. However, 'funding cannibalism' suggests that these metrics (and the norms that engender them) have value: if fundraising is broadly a zero-sum game between charities, then there's a commons problem where all charities could spend less money on fundraising and all do more good, but each is locally incentivized to spend more. Donor norms against increasing spending on zero-sum 'overheads' might be a good way of combating this. This valuable collective action of donors may explain the apparent underutilization of fundraising by charities, and perhaps should make us cautious in undermining it.
The EA critique of charity evaluation
Pre-Givewell, the common means of evaluating charities (Guidestar, Charity Navigator) used a mixture of governance checklists 'overhead indicators'. Charities would gain points both for having features associated with good governance (being transparent in the right ways, balancing budgets, the right sorts of corporate structure), but also in spending its money on programs and avoiding 'overhead expenses' like administration and (especially) fundraising. For shorthand, call this 'common sense' evaluation.
The standard EA critique is that common sense evaluation doesn't capture what is really important: outcomes. It is easy to imagine charities that look really good to common sense evaluation yet have negligible (or negative) outcomes. In the case of overheads, it becomes unclear whether these are even proxy measures of efficacy. Any fundraising that still 'turns a profit' looks like a good deal, whether it comprises five percent of a charity's spending or fifty.
A summary of the EA critique of common sense evaluation that its myopic focus on these metrics gives pathological incentives, as these metrics frequently lie anti-parallel to maximizing efficacy. To score well on these evaluations, charities may be encouraged to raise less money, hire less able staff, and cut corners in their own management, even if doing these things would be false economies.
Funding cannibalism and commons tragedies
In the wake of the ALS 'Ice bucket challenge', Will MacAskill suggested there is considerable of 'funding cannabilism' in the non-profit sector. Instead of the Ice bucket challenge 'raising' money for ALS, it has taken money that would have been donated to other causes instead - cannibalizing other causes. Rather than each charity raising funds independently of one another, they compete for a fairly fixed pie of aggregate charitable giving.
The 'cannabilism' thesis is controversial, but looks plausible to me, especially when looking at 'macro' indicators: proportion of household charitable spending looks pretty fixed whilst fundraising has increased dramatically, for example.
If true, cannibalism is important. As MacAskill points out, the money tens of millions of dollars raised for ALS is no longer an untrammelled good, alloyed as it is with the opportunity cost of whatever other causes it has cannibalized (q.v.). There's also a more general consideration: if there is a fixed pot of charitable giving insensitive to aggregate fundraising, then fundraising becomes a commons problem. If all charities could spend less on their fundraising, none would lose out, so all could spend more of their funds on their programs. However, for any alone to spend less on fundraising allows the others to cannibalize it.
Civilizing Charitable Cannibals, and Metric Meta-Myopia
Coordination among charities to avoid this commons tragedy is far fetched. Yet coordination of donors on shared norms about 'overhead ratio' can help. By penalizing a charity for spending too much on zero-sum games with other charities like fundraising, donors can stop a race to the bottom fundraising free for all and burning of the charitable commons that implies. The apparently-high marginal return to fundraising might suggest this is already in effect (and effective!)
The contrarian take would be that it is the EA critique of charity evaluation which is myopic, not the charity evaluation itself - by looking at the apparent benefit for a single charity of more overhead, the EA critique ignores the broader picture of the non-profit ecosystem, and their attack undermines a key environmental protection of an important commons - further, one which the right tail of most effective charities benefit from just as much as the crowd of 'great unwashed' other causes. (Fundraising ability and efficacy look like they should be pretty orthogonal. Besides, if they correlate well enough that you'd expect the most efficacious charities would win the zero-sum fundraising game, couldn't you dispense with Givewell and give to the best fundraisers?)
The contrarian view probably goes too far. Although there's a case for communally caring about fundraising overheads, as cannibalism leads us to guess it is zero sum, parallel reasoning is hard to apply to administration overhead: charity X doesn't lose out if charity Y spends more on management, but charity Y is still penalized by common sense evaluation even if its overall efficacy increases. I'd guess that features like executive pay lie somewhere in the middle: non-profit executives could be poached by for-profit industries, so it is not as simple as donors prodding charities to coordinate to lower executive pay; but donors can prod charities not to throw away whatever 'non-profit premium' they do have in competing with one another for top talent (c.f.). If so, we should castigate people less for caring about overhead, even if we still want to encourage them to care about efficacy too.
The invisible hand of charitable pan-handling
If true, it is unclear whether the story that should be told is 'common sense was right all along and the EA movement overconfidently criticised' or 'A stopped clock is right twice a day, and the generally wrong-headed common sense had an unintended feature amongst the bugs'. I'd lean towards the latter, simply the advocates of the common sense approach have not (to my knowledge) articulated these considerations themselves.
However, many of us believe the implicit machinery of the market can turn without many of the actors within it having any explicit understanding of it. Perhaps the same applies here. If so, we should be less confident in claiming the status quo is pathological and we can do better: there may be a rationale eluding both us and its defenders.
Attempt at the briefest content-full Less Wrong post:
Once AI is developed, it could "easily" colonise the universe. So the Great Filter (preventing the emergence of star-spanning civilizations) must strike before AI could be developed. If AI is easy, we could conceivably have built it already, or we could be on the cusp of building it. So the Great Filter must predate us, unless AI is hard.
Some of the comments on the link by James_Miller exactly six months ago provided very specific estimates of how the events might turn out:
- The odds of Russian intervening militarily = 40%.
- The odds of the Russians losing the conventional battle (perhaps because of NATO intervention) conditional on them entering = 30%.
- The odds of the Russians resorting to nuclear weapons conditional on them losing the conventional battle = 20%.
"Russians intervening militarily" could be anything from posturing to weapon shipments to a surgical strike to a Czechoslovakia-style tank-roll or Afghanistan invasion. My guess that the odds of the latter is below 5%.
A bet between James_Miller and solipsist:
I will bet you $20 U.S. (mine) vs $100 (yours) that Russian tanks will be involved in combat in the Ukraine within 60 days. So in 60 days I will pay you $20 if I lose the bet, but you pay me $100 if I win.
While it is hard to do any meaningful calibration based on a single event, there must be lessons to learn from it. Given that Russian armored columns are said to capture key Ukrainian towns today, the first part of James_Miller's prediction has come true, even if it took 3 times longer than he estimated.
Note that even the most pessimistic person in that conversation (James) was probably too optimistic. My estimate of 5% appears way too low in retrospect, and I would probably bump it to 50% for a similar event in the future.
Now, given that the first prediction came true, how would one reevaluate the odds of the two further escalations he listed? I still feel that there is no way there will be a "conventional battle" between Russia and NATO, but having just been proven wrong makes me doubt my assumptions. If anything, maybe I should give more weight to what James_Miller (or at least Dan Carlin) has to say on the issue. And if I had any skin in the game, I would probably be even more cautious.
There are two insights from Bayesianism which occurred to me and which I hadn't seen anywhere else before.
I like lists in the two posts linked above, so for the sake of completeness, I'm going to add my two cents to a public domain. This post is about the second penny, the first one is here.
Or to ask the question another way, is there such a thing as a theory of bounded rationality, and if so, is it the same thing as a theory of general intelligence?
The LW Wiki defines general intelligence as "ability to efficiently achieve goals in a wide range of domains", while instrumental rationality is defined as "the art of choosing and implementing actions that steer the future toward outcomes ranked higher in one's preferences". These definitions seem to suggest that rationality and intelligence are fundamentally the same concept.
However, rationality and AI have separate research communities. This seems to be mainly for historical reasons, because people studying rationality started with theories of unbounded rationality (i.e., with logical omniscience or access to unlimited computing resources), whereas AI researchers started off trying to achieve modest goals in narrow domains with very limited computing resources. However rationality researchers are trying to find theories of bounded rationality, while people working on AI are trying to achieve more general goals with access to greater amounts of computing power, so the distinction may disappear if the two sides end up meeting in the middle.
We also distinguish between rationality and intelligence when talking about humans. I understand the former as the ability of someone to overcome various biases, which seems to consist of a set of skills that can be learned, while the latter is a kind of mental firepower measured by IQ tests. This seems to suggest another possibility. Maybe (as Robin Hanson recently argued on his blog) there is no such thing as a simple theory of how to optimally achieve arbitrary goals using limited computing power. In this view, general intelligence requires cooperation between many specialized modules containing domain specific knowledge, so "rationality" would just be one module amongst many, which tries to find and correct systematic deviations from ideal (unbounded) rationality caused by the other modules.
I was more confused when I started writing this post, but now I seem to have largely answered my own question (modulo the uncertainty about the nature of intelligence mentioned above). However I'm still interested to know how others would answer it. Do we have the same understanding of what "rationality" and "intelligence" mean, and know what distinction someone is trying to draw when they use one of these words instead of the other?
ETA: To clarify, I'm asking about the difference between general intelligence and rationality as theoretical concepts that apply to all agents. Human rationality vs intelligence may give us a clue to that answer, but isn't the main thing that I'm interested here.
View more: Next