Crazy Global Warming Solution Ideas
Mine was to work tax policy to incentivize companies to make all their packaging shiny and white, incentivize people to litter, and disincentivize everybody from recycling.
My friend's was to use a giant rocket to push the earth farther away from the sun.
Experiment: Changing minds vs. preaching to the choir
1. Problem
In the market economy production is driven by monetary incentives – higher reward for an economic activity makes more people willing to engage in it. Internet forums follow the same principle but with a different currency - instead of money the main incentive of internet commenters is the reaction of their audience. A strong reaction expressed by a large number of replies or “likes” encourages commenters to increase their output. Its absence motivates them to quit posting or change their writing style.
On neutral topics, using audience reaction as an incentive works reasonably well: attention focuses on the most interesting or entertaining comments. However, on partisan issues, such incentives become counterproductive. Political forums and newspaper comment sections demonstrate the same patterns:
- The easiest way to maximize “likes” for a given amount of effort is by posting an emotionally charged comment which appeals to audience’s biases (“preaching to the choir”).
- The easiest way to maximize the number of replies is by posting a low quality comment that goes against audience’s biases (“trolling”).
- Both effects are amplified when the website places comments with most replies or “likes” at the top of the page.
The problem is not restricted to low-brow political forums. The following graph, which shows the average number of comments as a function of an article’s karma, was generated from the Lesswrong data.

The data suggests that the easiest way to maximize the number of replies is to write posts that are disliked by most readers. For instance, articles with the karma of -1 on average generate twice as many comments (20.1±3.4) as articles with the karma of +1 (9.3±0.8).
2. Technical Solution
Enabling constructive discussion between people with different ideologies requires reversing the incentives – people need to be motivated to write posts that sound persuasive to the opposite side rather than to their own supporters.
We suggest addressing this problem by changing the voting system. In brief, instead of votes from all readers, comment ratings and position on the page should be based on votes from the opposite side only. For example, in the debate on minimum wage, for arguments against minimum wage only the upvotes of minimum wage supporters would be counted and vice versa.
The new voting system can simultaneously achieve several objectives:
· eliminate incentives for preaching to the choir
· give posters a more objective feedback on the impact of their contributions, helping them improve their writing style
· focus readers’ attention on comments most likely to change their minds instead of inciting comments that provoke an irrational defensive reaction.
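The voting rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Comment` class, the `"pro"`/`"con"` labels, and the way each voter's side is recorded are all hypothetical, and the post does not say how a reader's side would be determined in practice.

```python
from dataclasses import dataclass, field

# Hypothetical side labels; the post does not specify how sides are recorded.
OPPOSITE = {"pro": "con", "con": "pro"}


@dataclass
class Comment:
    text: str
    side: str  # the position the comment argues for: "pro" or "con"
    # Each upvote is stored as (voter_id, voter_side).
    upvotes: list = field(default_factory=list)


def opposite_side_score(comment):
    """Count only the upvotes cast by readers on the opposing side."""
    target = OPPOSITE[comment.side]
    return sum(1 for _, voter_side in comment.upvotes if voter_side == target)


def rank_comments(comments):
    """Order comments by opposite-side score, highest first."""
    return sorted(comments, key=opposite_side_score, reverse=True)


# A comment with three same-side upvotes ranks below one with a single
# upvote from the other side.
choir = Comment("Applause line", "pro", [("u1", "pro"), ("u2", "pro"), ("u3", "pro")])
bridge = Comment("Careful argument", "pro", [("u4", "con")])
ranked = rank_comments([choir, bridge])
```

Under this rule, preaching to the choir earns nothing, since same-side votes are simply ignored when scoring.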
3. Testing
If you are interested in measuring and improving your persuasive skills and would like to help others to do the same, you are invited to take part in the following experiment:
Step I. Submit Pro or Con arguments on any of the following topics (up to 3 arguments in total):
Should the government give all parents vouchers for private school tuition?
Should developed countries increase the number of immigrants they receive?
Should there be a government mandated minimum wage?
Step II. For each argument you have submitted, rate 15 arguments submitted by others.
Step III. Participants will be emailed the results of the experiment including:
- ratings their arguments receive from different reviewer groups (supporters, opponents and neutrals)
- the list of the most persuasive Pro & Con arguments on each topic (i.e. arguments that received the highest ratings from opposing and neutral groups)
- rating distribution in each group
Step IV (optional). If interested, sign up for the next round.
The experiment will help us test the effectiveness of the new voting system and develop the best format for its application.
Is my brain a utility minimizer? Or, the mechanics of labeling things as "work" vs. "fun"
I recently encountered something that is, in my opinion, one of the most absurd failure modes of the human brain. I first encountered this after introspection on useful things that I enjoy doing, such as programming and writing. I noticed that my enjoyment of the activity doesn't seem to help much when it comes to motivation for earning income. This was not boredom from too much programming, as it did not affect my interest in personal projects. What it seemed to be was the brain categorizing activities into "work" and "fun" boxes. On one memorable occasion, after taking a break due to being exhausted with work, I entertained myself by programming some more, this time on a personal hobby project (as a freelancer, I pick the projects I work on, so this is not from being told what to do). I was relaxing by doing the exact same thing that had exhausted me in the first place.
The absurdity of this becomes evident when you think about what distinguishes "work" and "fun" in this case, which is added value. Nothing changes about the activity except the addition of more utility, making a "work" strategy always dominate a "fun" strategy, assuming the activity is the same. If you are having fun doing something, handing you some money can't make you worse off. Yet making an outcome better makes you avoid it. This means the brain is adopting a strategy that has a (side?) effect of minimizing future utility, and it seems like it is utility and not just money here - as anyone who took a class in an area that personally interested them knows, other benefits like grades recreate this effect just as well. This is the reason I think this is among the most absurd biases - I can understand akrasia, wanting the happiness now and hyperbolically discounting what happens later, or biases that make something seem like the best option when it really isn't. But knowingly punishing what brings happiness just because it also benefits you in the future? It's like the discounting curve dips into the negative region. I would really like to learn where the dividing line is between the kinds of added value that create this effect and the kinds that don't (money obviously does, and immediate enjoyment obviously doesn't). Currently I'm led to believe that the difference is present utility vs. future utility (as I mentioned above), or final vs. instrumental goals; please correct me if I'm wrong here.
This is an effect that has been studied in psychology and called the overjustification effect, called that because the leading theory explains it in terms of the brain assuming the motivation comes from the instrumental gain instead of the direct enjoyment, and then reducing the motivation accordingly. This would suggest that the brain has trouble seeing a goal as being both instrumental and final, and for some reason the instrumental side always wins in a conflict. However, its explanation in terms of self-perception bothers me a little, since I find it hard to believe that a recent creation like self-perception can override something as ancient and low-level as enjoyment of final goals. I searched LessWrong for discussions of the overjustification effect, and the ones I found discussed it in the context of self-perception, not decision-making and motivation. It is the latter that I wanted to ask for your thoughts on.
Some concepts are like Newton's Gravity, others are like... Luminiferous Aether?
Let's compare two theories. One is Newton's Gravity, the other Luminiferous Aether. When Einstein's Theory of Relativity arrived, Newton's Gravity turned out to be a subset of it, an approximation that works under specific conditions.
On the other hand, Luminiferous Aether is just plain wrong.
Now, imagine that a scientist in the era before Theory of Relativity built a Strong AI (just roll with me here :-) ) and tasked it with finding out why Newton's Gravity doesn't work quite right around Mercury. The AI derived the Theory of Relativity.
Now, imagine this scientist asking the AI what Luminiferous Aether is made from. The AI is going to throw an OutOfLuminiferousAether exception (don't ask me why the AI is written in Java).
Humorous prelude aside, I am wondering which concepts we have today are only slightly wrong, and which are completely wrong? I am asking mostly about the concepts that are discussed on this forum.
Obviously, the more abstract the concept, the greater the risk that it will turn out to be bunkum.
Personally, I don't trust the concept of values. It's already so complex and fragile, I'm afraid it doesn't actually exist.
Two-boxing, smoking and chewing gum in Medical Newcomb problems
I am currently learning about the basics of decision theory, most of which is common knowledge on LW. I have a question, related to why EDT is said not to work.
Consider the following Newcomblike problem: A study shows that most people who two-box in Newcomblike problems such as the following have a certain gene (and one-boxers don't have the gene). Now, Omega could put you into something like Newcomb's original problem, but instead of having run a simulation of you, Omega has only looked at your DNA: If you don't have the "two-boxing gene", Omega puts $1M into box B, otherwise box B is empty. And there is $1K in box A, as usual. Would you one-box (take only box B) or two-box (take box A and B)? Here's a causal diagram for the problem:
Since Omega does not do much other than translating your genes into money under a box, it does not seem to hurt to leave it out:
I presume that most LWers would one-box. (And as I understand it, not only CDT but also TDT would two-box, am I wrong?)
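To make the disagreement concrete, here is a rough sketch of how an evidential and a causal calculation come apart on the genetic version of the problem. The conditional probabilities are assumptions chosen for illustration (the post only says "most" two-boxers have the gene and one-boxers don't), and the function names are hypothetical.

```python
BOX_A = 1_000        # always present in box A
BOX_B = 1_000_000    # in box B only if you lack the two-boxing gene


def edt_value(action, p_gene_given_action):
    """Evidential estimate: treat the action as evidence about the gene."""
    p_gene = p_gene_given_action[action]
    bonus = BOX_A if action == "two-box" else 0
    return (1 - p_gene) * BOX_B + bonus


def cdt_value(action, p_gene):
    """Causal estimate: the choice cannot change your DNA, so the
    probability of having the gene is held fixed across actions."""
    bonus = BOX_A if action == "two-box" else 0
    return (1 - p_gene) * BOX_B + bonus


# Assumed numbers: 90% of two-boxers carry the gene, no one-boxers do.
p_gene_given_action = {"one-box": 0.0, "two-box": 0.9}

edt_one = edt_value("one-box", p_gene_given_action)
edt_two = edt_value("two-box", p_gene_given_action)

# For the causal estimate, any fixed prior gives the same ordering.
cdt_one = cdt_value("one-box", 0.5)
cdt_two = cdt_value("two-box", 0.5)
```

With these assumed numbers the evidential estimate favors one-boxing by a wide margin. The causal estimate favors two-boxing by exactly $1K, whatever fixed probability of the gene is plugged in.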
Now, how does this problem differ from the smoking lesion or Yudkowsky's (2010, p.67) chewing gum problem? Chewing Gum (or smoking) seems to be like taking box A to get at least/additional $1K, the two-boxing gene is like the CGTA gene, the illness itself (the abscess or lung cancer) is like not having $1M in box B. Here's another causal diagram, this time for the chewing gum problem:
As far as I can tell, the difference between the two problems is some additional, unstated intuition in the classic medical Newcomb problems. Maybe, the additional assumption is that the actual evidence lies in the "tickle", or that knowing and thinking about the study results causes some complications. In EDT terms: The intuition is that neither smoking nor chewing gum gives the agent additional information.
Pattern-botching: when you forget you understand
It’s all too easy to let a false understanding of something replace your actual understanding. Sometimes this is an oversimplification, but it can also take the form of an overcomplication. I have an illuminating story:
Years ago, when I was young and foolish, I found myself in a particular romantic relationship that would later end for epistemic reasons, when I was slightly less young and slightly less foolish. Anyway, this particular girlfriend of mine was very into healthy eating: raw, organic, home-cooked, etc. During her visits my diet would change substantially for a few days. At one point, we got in a tiny fight about something, and in a not-actually-desperate chance to placate her, I semi-jokingly offered: “I’ll go vegetarian!”
“I don’t care,” she said with a sneer.
…and she didn’t. She wasn’t a vegetarian. Duhhh... I knew that. We’d made some ground beef together the day before.
So what was I thinking? Why did I say “I’ll go vegetarian” as an attempt to appeal to her values?
(I’ll invite you to take a moment to come up with your own model of why that happened. You don't have to, but it can be helpful for evading hindsight bias of obviousness.)
(Got one?)
Here's my take: I pattern-matched a bunch of actual preferences she had with a general "healthy-eating" cluster, and then I went and pulled out something random that felt vaguely associated. It's telling, I think, that I don't even explicitly believe that vegetarianism is healthy. But to my pattern-matcher, they go together nicely.
I'm going to call this pattern-botching.† Pattern-botching is when you pattern-match a thing "X", as following a certain model, but then implicit queries to that model return properties that aren't true about X. What makes this different from just having false beliefs is that you know the truth, but you're forgetting to use it because there's a botched model that is easier to use.
†Maybe this already has a name, but I've read a lot of stuff and it feels like a distinct concept to me.
Examples of pattern-botching
So, that's pattern-botching, in a nutshell. Now, examples! We'll start with some simple ones.
Calmness and pretending to be a zen master
In my Againstness Training video, past!me tries a bunch of things to calm down. In the pursuit of "calm", I tried things like...
- dissociating
- trying to imitate a zen master
- speaking really quietly and timidly
None of these are the desired state. The desired state is present, authentic, and can project well while speaking assertively.
But that would require actually being in a different state, which to my brain at the time seemed hard. So my brain constructed a pattern around the target state, and said "what's easy and looks vaguely like this?" and generated the list above. Not as a list, of course! That would be too easy. It generated each one individually as a plausible course of action, which I then tried, and which Val then called me out on.
Personality Types
I'm quite gregarious, extraverted, and generally unflappable by noise and social situations. Many people I know describe themselves as HSPs (Highly Sensitive Persons) or as very introverted, or as "not having a lot of spoons". These concepts are related—or perhaps not related, but at least correlated—but they're not the same. And even if these three terms did all mean the same thing, individual people would still vary in their needs and preferences.
Just this past week, I found myself talking with an HSP friend L, and noting that I didn't really know what her needs were. Like I knew that she was easily startled by loud noises and often found them painful, and that she found motion in her periphery distracting. But beyond that... yeah. So I told her this, in the context of a more general conversation about her HSPness, and I said that I'd like to learn more about her needs.
L responded positively, and suggested we talk about it at some point. I said, "Sure," then added, "though it would be helpful for me to know just this one thing: how would you feel about me asking you about a specific need in the middle of an interaction we're having?"
"I would love that!" she said.
"Great! Then I suspect our future interactions will go more smoothly," I responded. I realized what had happened was that I had conflated L's HSPness with... something else. I'm not exactly sure what, but a preference for indirect communication, perhaps? I have another friend, who is also sometimes short on spoons, who I model as finding that kind of question stressful because it would kind of put them on the spot.
I've only just recently been realizing this, so I suspect that I'm still doing a ton of this pattern-botching with people, that I haven't specifically noticed.
Of course, having clusters makes it easier to have heuristics about what people will do, without knowing them too well. A loose cluster is better than nothing. I think the issue is when we do know the person well, but we're still relying on this cluster-based model of them. It's telling that I was not actually surprised when L said that she would like it if I asked about her needs. On some level I kind of already knew it. But my botched pattern was making me doubt what I knew.
False aversions
CFAR teaches a technique called "Aversion Factoring", in which you try to break down the reasons why you don't do something, and then consider each reason. In some cases, the reasons are sound reasons, so you decide not to try to force yourself to do the thing. If not, then you want to make the reasons go away. There are three types of reasons, with different approaches.
One is for when you have a legitimate issue, and you have to redesign your plan to avert that issue. The second is where the thing you're averse to is real but isn't actually bad, and you can kind of ignore it, or maybe use exposure therapy to get yourself more comfortable with it. The third is... when the outcome would be an issue, but it's not actually a necessary outcome of the thing. As in, it's a fear that's vaguely associated with the thing at hand, but the thing you're afraid of isn't real.
All of these share a structural similarity with pattern-botching, but the third one in particular is a great example. The aversion is generated from a property that the thing you're averse to doesn't actually have. Unlike a miscalibrated aversion (#2 above) it's usually pretty obvious under careful inspection that the fear itself is based on a botched model of the thing you're averse to.
Taking the training wheels off of your model
One other place this structure shows up is in the difference between what something looks like when you're learning it versus what it looks like once you've learned it. Many people learn to ride a bike while actually riding a four-wheeled vehicle: training wheels. I don't think anyone makes the mistake of thinking that the ultimate bike will have training wheels, but in other contexts it's much less obvious.
The remaining three examples look at how pattern-botching shows up in learning contexts, where people implicitly forget that they're only partway there.
Rationality as a way of thinking
CFAR runs 4-day rationality workshops, which currently are evenly split between specific techniques and how to approach things in general. Let's consider what kinds of behaviours spring to mind when someone encounters a problem and asks themselves: "what would be a rational approach to this problem?"
- someone with a really naïve model, who hasn't actually learned much about applied rationality, might pattern-match "rational" to "hyper-logical", and think "What Would Spock Do?"
- someone who is somewhat familiar with CFAR and its instructors but who still doesn't know any rationality techniques, might complete the pattern with something that they think of as being archetypal of CFAR-folk: "What Would Anna Salamon Do?"
- CFAR alumni, especially new ones, might pattern-match "rational" as "using these rationality techniques" and conclude that they need to "goal factor" or "use trigger-action plans"
- someone who gets rationality would simply apply that particular structure of thinking to their problem
In the case of a bike, we see hundreds of people biking around without training wheels, and so that becomes the obvious example from which we generalize the pattern of "bike". In other learning contexts, though, most people—including, sometimes, the people at the leading edge—are still in the early learning phases, so the training wheels are the rule, not the exception.
So people start thinking that the figurative bikes are supposed to have training wheels.
Incidentally, this can also be the grounds for strawman arguments where detractors of the thing say, "Look at these bikes [with training wheels]! How are you supposed to get anywhere on them?!"
Effective Altruism
We potentially see a similar effect with topics like Effective Altruism. It's a movement that is still in its infancy, which means that nobody has it all figured out. So when trying to answer "How do I be an effective altruist?" our pattern-matchers might pull up a bunch of examples of things that EA-identified people have been commonly observed to do.
- donating 10% of one's income to a strategically selected charity
- going to a coding bootcamp and switching careers, in order to Earn to Give
- starting a new organization to serve an unmet need, or to serve a need more efficiently
- supporting the Against Malaria Fund
...and this generated list might be helpful for various things, but be wary of thinking that it represents what Effective Altruism is. It's possible—it's almost inevitable—that we don't actually know what the most effective interventions are yet. We will potentially never actually know, but we can expect that in the future we will generally know more than at present. Which means that the current sampling of good EA behaviours likely does not actually even cluster around the ultimate set of behaviours we might expect.
Creating a new (platform for) culture
At my intentional community in Waterloo, we're building a new culture. But that's actually a by-product: our goal isn't to build this particular culture but to build a platform on which many cultures can be built. It's like how as a company you don't just want to be building the product but rather building the company itself, or "the machine that builds the product," as Foursquare founder Dennis Crowley puts it.
What I started to notice, though, is that we started to confuse the particular, transitional culture that we have at our house with either (a) the particular target culture that we're aiming for, or (b) the more abstract range of cultures that will be constructable on our platform.
So from a training wheels perspective, we might totally eradicate words like "should". I did this! It was really helpful. But once I had removed the word from my idiolect, it became unhelpful to still be treating it as being a touchy word. Then I heard my mentor use it, and I remembered that the point of removing the word wasn't to not ever use it, but to train my brain to think without a particular structure that "should" represented.
This shows up on much larger scales too. Val from CFAR was talking about a particular kind of fierceness, "hellfire", that he sees as fundamental and important, and he noted that it seemed to be incompatible with the kind of culture my group is building. I initially agreed with him, which was kind of dissonant for my brain, but then I realized that hellfire was only incompatible with our training culture, not the entire set of cultures that could ultimately be built on our platform. That is, engaging with hellfire would potentially interfere with the learning process, but it's not ultimately proscribed by our culture platform.
Conscious cargo-culting
I think it might be helpful to repeat the definition:
Pattern-botching is when you pattern-match a thing "X", as following a certain model, but then implicit queries to that model return properties that aren't true about X. What makes this different from just having false beliefs is that you know the truth, but you're forgetting to use it because there's a botched model that is easier to use.
It's kind of like if you were doing a cargo-cult, except you knew how airplanes worked.
(Cross-posted from malcolmocean.com)
Taking Effective Altruism Seriously
Epistemic status: 90% confident.
Inspiration: Arjun Narayan, Tyler Cowen.
The noblest charity is to prevent a man from accepting charity, and the best alms are to show and enable a man to dispense with alms.
Background
Effective Altruism (EA) is "a philosophy and social movement that applies evidence and reason to determine the most effective ways to improve the world." Along with the related organisation GiveWell, it often focuses on getting the most "bang for your buck" in charitable donations. Unfortunately, despite their stated aims, their actual charitable recommendations are generally wasteful, such as cash transfers to poor Africans. This leads to the obvious question - how can we do better?
Doing better
One of the positive aspects of EA theory is its attempt to widen the scope of altruism beyond the traditional. For instance, to take into account catastrophic risks, and the far future. However, altruism often produces a far-mode bias where intentions matter above results. This can be a particular problem for EA - for example, it is very hard to get evidence about how we are affecting the far future. An effective method needs to rely on a tight feedback loop between action and results, so that continual updates are possible. At the extreme, Far Mode operates in a manner where no updating on results takes place at all. However, it is also important that those results are of sufficient magnitude to justify the effort. EA has mostly fallen into the latter trap - achieving measurable results, but ones of no great consequence.
The population of sub-Saharan Africa is around 950 million people, and growing. They have been a prime target of aid for generations, but it remains the poorest region of the world. Providing cash transfers to them mostly just raises consumption, rather than substantially raising productivity. A truly altruistic program would enable the people in these countries to generate their own wealth so that they no longer needed charity - unconditional transfers, by contrast, are an idea so lazy even Bob Geldof could stumble on it. The only novel thing about the GiveWell program is that the transfers are in cash.
Unfortunately, no-one knows how to turn poor African countries into productive Western ones, short of colonization. The problem is emphatically not a shortage of capital, but rather low productivity, and the absence of effective institutions in which that capital can be deployed. Sadly, these conditions and institutions cannot simply be transplanted into those countries.
A greater charity
However, there do exist countries with high productivity, and effective institutions in which that capital can be deployed. That capital then raises world productivity. As F.A. Harper wrote:
Savings invested in privately owned economic tools of production amount to... the greatest economic charity of all.
That is because those tools increase the productivity of labour, and so raise output. The pie has grown. Moreover, the person who invests their portion of the pie into new capital is particularly altruistic, both because they are not taking a share themselves, and because they are making a particularly large contribution to future pies.
In the same way that using steel to build tanks means (on the margin) fewer cars and vice-versa, using craftsmen to build a new home means (on the margin) fewer factories and vice-versa. Investment in capital is foregone consumption. Moreover, you do not need to personally build those economic tools; rather, you can part-finance a range of those tools by investing in the stock market, or other financial mechanisms.
Now, it's true that little of that capital will be deployed in sub-Saharan Africa at present, due to the institutional problems already mentioned. Investing in these countries will likely lead to your capital being stolen or becoming unproductive - the same trap that prevents locals from advancing equally prevents foreign investors from doing so. However, if sub-Saharan Africa ever does fix its culture and institutions, then the availability of that capital will then serve to rapidly raise productivity and then living standards, much as is taking place in China. Moreover, by making the rest of the world richer, this increases the level of aid other countries could provide to sub-Saharan Africa in future, should this ever be judged desirable. It also serves to improve the emigration prospects of individuals within these countries.
Feedback
Another great benefit of capital investment is the sharp feedback mechanism. The market economy in general, and financial markets in particular, serve to redistribute capital from ineffective to effective ventures, and from ineffective to effective investors. As a result, it is no longer necessary to make direct (and expensive) measurements of standards of living in sub-Saharan Africa; as long as your investment fund is gaining in value, you can rest safe in the knowledge that its growth is contributing, in a small way, to future prosperity.
Commitment mechanisms
However, if investment in capital is foregone consumption, then consumption is foregone investment. If I invest in the stock market today (altruistic), then in ten years' time spend my profits on a bigger house (selfish), then some of the good is undone. So the true altruist will not merely create capital, he will make sure that capital will never get spent down. One good way of doing that would be to donate to an institution likely to hold onto its capital in perpetuity, and likely to grow that capital over time. Perhaps the best example of such an institution would be a richly-endowed private university, such as Harvard, which has existed for almost 400 years and is said to have an endowment of $32 billion.
John Paulson recently gave Harvard $400 million. Unfortunately, this meant he came in for a torrent of criticism from people claiming he should have given the money to poor Africans, etc. I hope to see Effective Altruists defending him, as he has clearly followed through on their concepts in the finest way.
Further thoughts and alternatives
- Some people say that we are currently going through a "savings glut" in which capital is less productive than previously thought. In this case, it may be that Effective Altruists should focus on funding (and becoming!) successful entrepreneurs in different spaces.
- I am sympathetic to the Thielian critique that innovation is being steadily stifled by hostile forces. I view the past 50 years, and the foreseeable future, as a race between technology and regulation, which technology is by no means certain to win. It may be that Effective Altruists should focus on political activity, to defend and expand economic liberty where it exists - this is currently the focus of my altruism.
- However, government is not the enemy; rather, the enemy is the cultural beliefs and conditions that create a demand for the destruction of economic liberty. To the extent this critique holds, it may be that Effective Altruists should focus on promoting a pro-innovation and pro-liberty mindset; for example, through movies and novels.
Conclusion
A Proposal for Defeating Moloch in the Prison Industrial Complex
Summary
I'd like to increase the well-being of those in the justice system while simultaneously reducing crime. I'm missing something here but I'm not sure what. I'm thinking this may be a worse idea than I originally thought based on comment feedback, though I'm still not 100% sure why this is the case.
Current State
While the prison system may not constitute an existential threat, at this moment more than 2,266,000 adults are incarcerated in the US alone, and I expect that being in prison greatly decreases QALYs for those incarcerated, that further QALYs are lost to victims of crime, family members of the incarcerated, and through the continuing effects of institutionalization and PTSD from sentences served in the current system, not to mention the brainpower and man-hours lost to any productive use.
If you haven't read Meditations on Moloch, I highly recommend it. It's long though, so the executive summary is: Moloch is the personification of the forces of competition which produce perverse incentives, a "race to the bottom" type situation where all human values are discarded in an effort to survive. This can be solved with better coordination, but it is very hard to coordinate when perverse incentives also penalize the coordinators and reward dissenters. The prison industrial complex is an example of these perverse incentives. No one thinks that the current system is ideal but incentives prevent positive change and increase absolute unhappiness.
- Politicians compete for electability. Convicts can’t vote, prisons make campaign contributions and jobs, and appearing “tough on crime” appeals to a large portion of the voter base.
- Jails compete for money: the more prisoners they house, the more they are paid and the longer they can continue to exist. This incentive is strong for public prisons and doubly strong for private prisons.
- Police compete for bonuses and promotions, both of which are given as rewards to cops who bring in and convict more criminals.
- Many of the inmates themselves are motivated to commit criminal acts by the small number of non-criminal opportunities for financial success available to them. After a conviction, this number of opportunities is further narrowed by background checks.
The incentives have come far out of line with human values. What can be done to bring incentives back in alignment with the common good?
My Proposal
Using a model that predicts recidivism at sixty days, one year, three years, and five years, predict the expected recidivism rate for the inmates at each individual prison, given average recidivism. Sixty days after release, if recidivism is below the predicted rate, the prison gets a small sum of money equaling 25% of the predicted cost to the state of dealing with the predicted recidivism (including lawyer fees, court fees, and jailing costs). This is repeated at one year, three years, and five years.
The statistical models would be readjusted with current data every year, so if this model causes recidivism to drop across the board, jails would be competing against ever higher standards, competing to create the most innovative and groundbreaking counseling, job skills, and restorative methods so that they don’t lose their edge against other prisons competing for the same money. As it becomes harder and harder to edge out the competition’s advanced methods, and as the prison population is reduced, additional incentives could come from ending state contracts with the bottom 10% of prisons, or with any prisons that have recidivism rates larger than expected for multiple years in a row.
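To make the payout rule concrete, here is a minimal sketch of one literal reading of it: at each checkpoint, a prison that beats its predicted recidivism rate receives 25% of the predicted cost of the predicted recidivism, and otherwise receives nothing. The names `Checkpoint` and `checkpoint_bonus`, the per-recidivist cost, and all of the numbers are hypothetical and chosen only for illustration; the proposal does not specify them.

```python
from dataclasses import dataclass

BONUS_SHARE = 0.25  # the 25% share named in the proposal


@dataclass
class Checkpoint:
    label: str             # e.g. "60 days", "1 year"
    predicted_rate: float  # model's predicted recidivism rate, 0..1
    actual_rate: float     # observed recidivism rate, 0..1


def checkpoint_bonus(released: int, cost_per_recidivist: float,
                     cp: Checkpoint, share: float = BONUS_SHARE) -> float:
    """Bonus owed to a prison at one checkpoint (assumed, literal reading)."""
    if cp.actual_rate >= cp.predicted_rate:
        return 0.0  # no bonus unless the prison beats the prediction
    # Predicted cost to the state of handling the predicted recidivism.
    predicted_cost = released * cp.predicted_rate * cost_per_recidivist
    return share * predicted_cost


# Made-up numbers: 200 people released, $40,000 assumed cost per recidivist.
checkpoints = [
    Checkpoint("60 days", predicted_rate=0.10, actual_rate=0.08),
    Checkpoint("1 year", predicted_rate=0.30, actual_rate=0.31),
]
for cp in checkpoints:
    bonus = checkpoint_bonus(released=200, cost_per_recidivist=40_000, cp=cp)
    print(f"{cp.label}: bonus = ${bonus:,.0f}")
```

Under this reading the bonus is all-or-nothing at each checkpoint. A different reading, in which the bonus scales with how far actual recidivism falls below the prediction, would change only the last two lines of `checkpoint_bonus`.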
Note that this proposal makes no policy recommendations or value judgement besides changing the incentive structure. I have opinions on the sanity of certain laws and policies and the private prison system itself, but this specific proposal does not. Ideally, this will reduce some amount of partisan bickering.
Using this added success incentive, here are the modified motivations of each of the major actors.
- Politicians compete for electability. Convicts still can’t vote, prisons make campaign contributions, and appearing “tough on crime” still appeals to a large portion of the voter base. The politician can promise a reduction in crime without making any specific policy or program recommendations, thus shielding themselves from criticism of being soft on crime that might come from endorsing restorative justice or psychological counselling, for instance. They get to claim success for programs that other people are in charge of administering and designing. Further, they are saving 75% of the money predicted to have been spent administering criminals. Prisons love getting more money for doing the same amount of work, so campaign contributions would stay stable or go up for politicians who support reduced recidivism bonuses.
- Prisons compete for money. It costs the state a huge amount of money to house prisoners, and the net profit from housing a prisoner is small after paying for food, clothing, supervision, space, repairs, entertainment, etc. An additional 25% of that cost, with no additional expenditures, is very attractive. I predict that some amount of book-cooking will happen, but that the gains possible with book-cooking are small compared to gains from actual improvements in their prison program. Small differences in prisons have the potential to make large differences in post-prison behavior. I expect having an on-staff CBT psychiatrist would make a big difference; an addiction specialist would as well. A new career field is born: expert consultants who travel from private prison to private prison and make recommendations for what changes would reduce recidivism at the lowest possible cost.
- Police and judges retain the same incentives as before, for bonuses, prestige, and promotions. This is good for the system, because if their incentives were not running counter to those of the prisons and jails, then there would be a lot of pressure to cook the books by looking the other way on criminals until after the 60 day/1 year/5 year mark. I predict that there will be a couple of scandals of cops found to be in league with prisons for a cut of the bonus, but that this method isn’t very profitable. For one thing, an entire police force would have to be corrupt, and for another, criminals are mobile and can commit crimes in other precincts. Police are also motivated to work in safer areas, so the general program of rewarding reduced recidivism is to their advantage.
Roadmap
If it can be shown that a model for predicting recidivism is highly predictive, we will need to create another model to predict how much the government could save by switching to a bonus system, and what reduction in crime could be expected.
Halfway houses in Pennsylvania are already receiving non-recidivism bonuses. Is a pilot project using this pricing structure feasible?
Are consequentialism and deontology not even wrong?
I was stunned to read the accounts quoted below. They're claiming that the notion of morality - in the sense of there being a special category of things that you should or should not do for the sake of the things themselves being inherently right or wrong - might not only be a recent invention, but also an incoherent one. Even when I had read debates about e.g. moral realism, I had always understood even the moral irrealists as acknowledging that there are genuine moral attitudes that are fundamentally ingrained in people. But I hadn't run into a position claiming that it was actually possible for whole cultures to simply not have a concept of morality in the first place.
I'm amazed that I haven't heard these claims discussed more. If they're accurate, then they seem to me to provide a strong argument for both deontology and consequentialism - at least as they're usually understood here - to be not even wrong. Just rationalizations of concepts that got their origin from Judeo-Christian laws and which people held onto because they didn't know of any other way of thinking.
As for morally, we must observe at once – again following Anscombe – that Plato and Aristotle, having no word for “moral,” could not even form a phrase equivalent to “morally right.” The Greek êthikê aretê means “excellence of character,” not “moral virtue”; Cicero's virtus moralis, from which the English phrase descends directly, is simply the Latin for êthikê aretê. This is not the lexical fallacy; it is not just that the word ‘moral’ was missing. The whole idea of a special category called “the moral” was missing. Strictly speaking, the Aristotelian phrase ta êthika is simply a generalizing substantive formed on êthê, “characteristic behaviors,” just as the Ciceronian moralia is formed on mores. To be fully correct – admittedly it would be a bit cumbersome – we should talk not of Aristotle's Nicomachean Ethics but of his Studies-of-our-characteristic-behaviors Edited-by-Nicomachus.
Plato and Aristotle were interested – especially Plato – in the question how the more stringent demands of a good disposition like justice or temperance or courage could be reasonable demands, demands that it made sense to obey even at extreme cost. It never occurred to them, as it naturally does to moderns, to suggest that these demands were to be obeyed simply because they were demands of a special, magically compulsive sort: moral demands.
Their answer was always that, to show that we have reason to obey the strong demands that can emerge from our good dispositions, we must show that what they demand is in some way a necessary means to or part of human well-being (eudaimonia). If it must be classified under the misconceived modern distinction between “the moral” and “the prudential,” this answer clearly falls into the prudential category. When modern readers who have been brought up on our moral/prudential distinction see Plato's and Aristotle's insistence on rooting the reasons that the virtues give us in the notion of well-being, they regularly classify both as “moral egoists.” But that is a misapplication to them of a distinction that they were right not to recognize.
When we turn from the Greeks to Kant and the classical utilitarians, we may doubt whether they shared the modern interest in finding a neat definition of the “morally right” any more than Plato or Aristotle did. Kant proposed, at most, a necessary (not necessary and sufficient) condition on rationally permissible (not morally right) action for an individual agent – and had even greater than his usual difficulty expressing this condition at all pithily. The utilitarians often were more interested in jurisprudence than in individual action, and where they addressed the latter – as J. S. Mill often does, but Bentham usually does not – tended, in the interests of long-term utility, to stick remarkably close to the deliverances of that version of “common-sense morality” that was recognized by high-minded Victorian liberals like themselves. When Kant and the utilitarians disagreed, it was not about the question “What are the necessary and sufficient conditions of morally right action?” They weren't even asking that question.
The terms "should" or "ought" or "needs" relate to good and bad: e.g. machinery needs oil, or should or ought to be oiled, in that running without oil is bad for it, or it runs badly without oil. According to this conception, of course, "should" and "ought" are not used in a special "moral" sense when one says that a man should not bilk. (In Aristotle's sense of the term "moral" [...], they are being used in connection with a moral subject-matter: namely that of human passions and (non-technical) actions.) But they have now acquired a special so-called "moral" sense — i.e. a sense in which they imply some absolute verdict (like one of guilty/not guilty on a man) on what is described in the "ought" sentences used in certain types of context: not merely the contexts that Aristotle would call "moral" — passions and actions — but also some of the contexts that he would call "intellectual."
The ordinary (and quite indispensable) terms "should," "needs," "ought," "must" — acquired this special sense by being equated in the relevant contexts with "is obliged," or "is bound," or "is required to," in the sense in which one can be obliged or bound by law, or something can be required by law.
How did this come about? The answer is in history: between Aristotle and us came Christianity, with its law conception of ethics. For Christianity derived its ethical notions from the Torah. [...]
In consequence of the dominance of Christianity for many centuries, the concepts of being bound, permitted, or excused became deeply embedded in our language and thought. The Greek word "ἁμαρτάνειν," the aptest to be turned to that use, acquired the sense "sin," from having meant "mistake," "missing the mark," "going wrong." The Latin peccatum which roughly corresponded to ἁμάρτημα was even apter for the sense "sin," because it was already associated with "culpa" — "guilt" — a juridical notion.
The blanket term "illicit," "unlawful," meaning much the same as our blanket term "wrong," explains itself. It is interesting that Aristotle did not have such a blanket term. He has blanket terms for wickedness — "villain," "scoundrel"; but of course a man is not a villain or a scoundrel by the performance of one bad action, or a few bad actions. And he has terms like "disgraceful," "impious"; and specific terms signifying defect of the relevant virtue, like "unjust"; but no term corresponding to "illicit." The extension of this term (i.e. the range of its application) could be indicated in his terminology only by a quite lengthy sentence: that is "illicit" which, whether it is a thought or a consented-to passion or an action or an omission in thought or action, is something contrary to one of the virtues the lack of which shows a man to be bad qua man. That formulation would yield a concept co-extensive with the concept "illicit."
To have a law conception of ethics is to hold that what is needed for conformity with the virtues failure in which is the mark of being bad qua man (and not merely, say, qua craftsman or logician) — that what is needed for this, is required by divine law. Naturally it is not possible to have such a conception unless you believe in God as a law-giver; like Jews, Stoics, and Christians. But if such a conception is dominant for many centuries, and then is given up, it is a natural result that the concepts of "obligation," of being bound or required as by a law, should remain though they had lost their root; and if the word "ought" has become invested in certain contexts with the sense of "obligation," it too will remain to be spoken with a special emphasis and special feeling in these contexts.
It is as if the notion "criminal" were to remain when criminal law and criminal courts had been abolished and forgotten.
A Hume discovering this situation might conclude that there was a special sentiment, expressed by "criminal," which alone gave the word its sense. So Hume discovered the situation in which the notion "obligation" survived, and the notion "ought" was invested with that peculiar force having which it is said to be used in a "moral" sense, but in which the belief in divine law had long since been abandoned: for it was substantially given up among Protestants at the time of the Reformation.
The situation, if I am right, was the interesting one of the survival of a concept outside the framework of thought that made it a really intelligible one.
Leaving LessWrong for a more rational life
You are unlikely to see me posting here again, after today. There is a saying here that politics is the mind-killer. My heretical realization lately is that philosophy, as generally practiced, can also be mind-killing.
As many of you know, I am, or was, running a twice-monthly Rationality: From AI to Zombies reading group. One of the bits I desired to include in each reading group post was a collection of contrasting views. To research such views I've found myself listening during my commute to talks given by other thinkers in the field, e.g. Nick Bostrom, Anders Sandberg, and Ray Kurzweil, and people I feel are doing “ideologically aligned” work, like Aubrey de Grey, Christine Peterson, and Robert Freitas. Some of these were talks I had seen before, or generally views I had been exposed to in the past. But looking through the lens of learning and applying rationality, I came to a surprising (to me) conclusion: it was philosophical thinkers that demonstrated the largest and most costly mistakes. On the other hand, de Grey and others who are primarily working on the scientific and/or engineering challenges of singularity and transhumanist technologies were far less likely to subject themselves to epistemic mistakes of significant consequence.
Philosophy as the anti-science...
What sort of mistakes? Most often reasoning by analogy. To cite a specific example, one of the core underlying assumptions of the singularity interpretation of super-intelligence is that just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? Were any chimps consulted?), we would be equally inept in the face of a super-intelligence. This argument is, however, nonsense. The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available. The wave-particle duality of quantum physics, or the 11-dimensional space of string theory may defy human intuition, i.e. our built-in intelligence. But we have proven ourselves perfectly capable of understanding the logical implications of models which employ them. We may not be able to build intuition for how a super-intelligence thinks. Maybe—that's not proven either. But even if that is so, we will be able to reason about its intelligent behaviour in advance, just like string theorists are able to reason about 11-dimensional space-time without using their evolutionarily derived intuitions at all.
This post is not about the singularity nature of super-intelligence—that was merely my choice of an illustrative example of a category of mistakes that are too often made by those with a background in philosophy rather than in the empirical sciences: reasoning by analogy instead of building and analyzing predictive models. The fundamental mistake here is that reasoning by analogy is not in itself a sufficient explanation for a natural phenomenon, because it says nothing about the context sensitivity or insensitivity of the original example and under what conditions it may or may not hold true in a different situation.
A successful physicist or biologist or computer engineer would have approached the problem differently. A core part of being successful in these areas is knowing when it is that you have insufficient information to draw conclusions. If you don't know what you don't know, then you can't know when you might be wrong. To be an effective rationalist, it is often not important to answer “what is the calculated probability of that outcome?” The better first question is “what is the uncertainty in my calculated probability of that outcome?” If the uncertainty is too high, then the data supports no conclusions. And the way you reduce uncertainty is that you build models for the domain in question and empirically test them.
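As an illustration of asking "what is the uncertainty in my calculated probability?", the sketch below puts an interval around a probability estimated from observed counts. It uses the Wilson score interval, which is one standard choice picked here for illustration and not a method this post prescribes; the function name `wilson_interval` and the counts are made up.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)
    )
    return center - half_width, center + half_width


# Same point estimate (30%), very different amounts of evidence.
for successes, trials in [(3, 10), (300, 1000)]:
    low, high = wilson_interval(successes, trials)
    print(f"{successes}/{trials}: estimate 0.30, "
          f"interval [{low:.2f}, {high:.2f}]")
```

Both cases give the same point estimate, but the interval from ten observations is wide enough that it supports almost no conclusion, while the interval from a thousand is narrow. Collecting more data by testing the model is what shrinks it.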
The lens that sees its own flaws...
Coming back to LessWrong and the sequences. In the preface to Rationality, Eliezer Yudkowsky says his biggest regret is that he did not make the material in the sequences more practical. The problem is in fact deeper than that. The art of rationality is the art of truth seeking, and empiricism is part and parcel of truth seeking. There's lip service paid to empiricism throughout, but in all the “applied” sequences relating to quantum physics and artificial intelligence it appears to be forgotten. We get instead definitive conclusions drawn from thought experiments only. It is perhaps not surprising that these sequences seem the most controversial.
I have for a long time been concerned that those sequences in particular promote some ungrounded conclusions. I had thought that while annoying this was perhaps a one-off mistake that was fixable. Recently I have realized that the underlying cause runs much deeper: what is taught by the sequences is a form of flawed truth-seeking (thought experiments favored over real world experiments) which inevitably results in errors, and the errors I take issue with in the sequences are merely examples of this phenomenon.
And these errors have consequences. Every single day, 100,000 people die of preventable causes, and every day we continue to risk extinction of the human race at unacceptably high odds. There is work that could be done now to alleviate both of these issues. But within the LessWrong community there is actually outright hostility to work that has a reasonable chance of alleviating suffering (e.g. artificial general intelligence applied to molecular manufacturing and life-science research) due to concerns arrived at by flawed reasoning.
I now regard the sequences as a memetic hazard, one which may at the end of the day be doing more harm than good. One should work to develop one's own rationality, but I now fear that the approach taken by the LessWrong community as a continuation of the sequences may result in more harm than good. The anti-humanitarian behaviors I observe in this community are not the result of initial conditions but the process itself.
What next?
How do we fix this? I don't know. On a personal level, I am no longer sure engagement with such a community is a net benefit. I expect this to be my last post to LessWrong. It may happen that I check back in from time to time, but for the most part I intend to try not to. I wish you all the best.
A note about effective altruism…
One shining light of goodness in this community is the focus on effective altruism—doing the most good to the most people as measured by some objective means. This is a noble goal, and the correct goal for a rationalist who wants to contribute to charity. Unfortunately it too has been poisoned by incorrect modes of thought.
Existential risk reduction, the argument goes, trumps all forms of charitable work because reducing the chance of extinction by even a small amount has far more expected utility than would accomplishing all other charitable works combined. The problem lies in the likelihood of extinction, and the actions selected in reducing existential risk. There is so much uncertainty regarding what we know, and so much uncertainty regarding what we don't know, that it is impossible to determine with any accuracy the expected risk of, say, unfriendly artificial intelligence creating perpetual suboptimal outcomes, or what effect charitable work in the area (e.g. MIRI) is having on that risk, if any.
This is best explored by an example of existential risk done right. Asteroid and cometary impacts are perhaps the category of external (not-human-caused) existential risk which we know the most about, and have done the most to mitigate. When it was recognized that impactors were a risk to be taken seriously, we recognized what we did not know about the phenomenon: what were the orbits and masses of Earth-crossing asteroids? We built telescopes to find out. What is the material composition of these objects? We built space probes and collected meteorite samples to find out. How damaging would an impact be for various material properties, speeds, and incidence angles? We built high-speed projectile test ranges to find out. What could be done to change the course of an asteroid found to be on collision course? We have executed at least one impact probe and will monitor the effect that had on the comet's orbit, and have on the drawing board probes that will use gravitational mechanisms to move their target. In short, we identified what it is that we don't know and sought to resolve those uncertainties.
How then might one approach an existential risk like unfriendly artificial intelligence? By identifying what it is we don't know about the phenomenon, and seeking to experimentally resolve that uncertainty. What relevant facts do we not know about (unfriendly) artificial intelligence? Well, much of our uncertainty about the actions of an unfriendly AI could be resolved if we were to know more about how such agents construct their thought models, and relatedly what language is used to construct their goal systems. We could also stand to benefit from knowing more practical information (experimental data) about in what ways AI boxing works and in what ways it does not, and how much that is dependent on the structure of the AI itself. Thankfully there is an institution that is doing that kind of work: the Future of Life Institute (not MIRI).
Where should I send my charitable donations?
Aubrey de Grey's SENS Research Foundation.
100% of my charitable donations are going to SENS. Why they do not get more play in the effective altruism community is beyond me.
If you feel you want to spread your money around, here are some non-profits which I have vetted for doing reliable, evidence-based work on singularity technologies and existential risk:
- Robert Freitas and Ralph Merkle's Institute for Molecular Manufacturing does research on molecular nanotechnology. They are the only group that works on the long-term Drexlerian vision of molecular machines, and they publish their research online.
- Future of Life Institute is the only existential-risk AI organization which is actually doing meaningful evidence-based research into artificial intelligence.
- B612 Foundation is a non-profit seeking to launch a spacecraft with the capability to detect, to the extent possible, ALL Earth-crossing asteroids.
I wish I could recommend a skepticism, empiricism, and rationality promoting institute. Unfortunately I am not aware of an organization which does not suffer from the flaws I identified above.
Addendum regarding unfinished business
I will no longer be running the Rationality: From AI to Zombies reading group, as I am no longer in good conscience able or willing to host it, or participate in this site, even from my typically contrarian point of view. Nevertheless, I am enough of a libertarian that I feel it is not my role to put up roadblocks to others who wish to delve into the material as it is presented. So if someone wants to take over the role of organizing these reading groups, I would be happy to hand over the reins to that person. If you think that person should be you, please leave a reply in another thread, not here.
EDIT: Obviously I'll stick around long enough to answer questions below :)