Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

10 'incredible' weaknesses of the mental health system

arunbharatula 28 May 2017 04:22AM

I aim to identify some of the mental health workforce's credibility issues in this article. This may inform your prevention and treatment strategy as a mental health consumer, or your practice if you work in mental health.


Mental health is the strongest determinant of quality of life at a later age. And the pursuit of happiness predicts both more positive emotions and fewer depressive symptoms. People who prioritize happiness are more psychologically able. In times of crisis, some turn to the mental health system for support. But how credible is the support available? Here are 10 categories of shortcomings that the mental health sector faces today:

 

1. Institutional credibility

 

Headspace's evaluations indicate that it is ineffective, and yet it is evaluated better than many services out there. This isn’t merely academic: attendees who report that their mental health has not improved since using the service will trust the mental health system less, and with good reason.

 

2. Network credibility

 

There is an evidence base for selecting a type of therapy (psychodynamic, cognitive-behavioural, etc.) for a particular constellation of mental symptoms. If you work in mental health, have you ever made a referral on the basis of both symptomatology and theoretical orientation?

 

3. ‘Walk the talk’ credibility

 

Social workers, nurses, medical doctors, and psychiatrists abuse substances and experience mental ill-health at among the highest rates of any occupation. For instance, the psychiatrist burnout rate is 40%. Mental health consumers may perceive clinicians as hypocritical or unwilling (...or too willing) to swallow their own medicine.

 

4. Academic credibility

 

Psychology is mired in error-riddled research and myth-ridden textbooks. Broadly, most published research is wrong. And questionable research practices, which bias the relevant evidence, are common.

 

The difference between a well-designed and a poorly designed psychotherapy experiment is large. To quote the pseudonymous physician Scott Alexander:

 

‘"Low-quality psychotherapy trials in general had a higher effect size (SMD = 0.74) than high-quality trials (SMD = 0.22), p < 0.001"...Effect sizes for the low quality trials are triple those for the high-quality trials.’

 

5. Credibility of treatments

 

Are treatments becoming less effective over time? Cognitive behavioural therapy is a common treatment for various mental illnesses. It is the most researched psychotherapy. However, the more evidence piles up, the less effective that psychotherapy appears to be...the same goes for antidepressants.

 

Why are outdated treatments still used? Over the 19th and 20th centuries, Austrian neurologist Sigmund Freud famously founded ‘psychoanalysis’. Psychoanalysis is a school of psychotherapy that, together with other 'psychodynamic' psychotherapies, focuses on the influence of early experience on human behaviour and emotion. Freud's ideas challenged fundamental assumptions about human psychology. In particular, he suggested that our conscious mind is just the tip of the iceberg of our identities.

 

Today Freud is the subject of jokes and derision. Many of his testable ideas have been proven false.  'When tested, psychoanalysis was shown to be less effective than placebo.’  Yet, many psychologists and psychiatrists continue to practice psychoanalysis.

 

Psychology is a rather unsettled science. One estimate for the time after which half of the ‘knowledge’ in the field of psychology is overturned or superseded (its ‘half-life’) is just 7.5 years. Interestingly, this time-span appears to be falling. That would suggest the field is becoming increasingly less reliable. The subfield of psychoanalysis bucks the trend. It has over double the parent field’s half-life. Why?

 

How do other subfields of psychology fare? Psychopharmacology is at the intersection of psychiatric drugs and brain chemistry. Knowledge in psychopharmacology is overturned at a rate higher than the rest of the field in general. Typically the ‘half life of knowledge’ argument aims to discount psychology relative to ‘harder’ sciences like physics.

 

Psychological therapies are confusing and unnecessarily fragmented. According to The Handbook of Counseling Psychology:

 

‘Meta-analyses of psychotherapy studies have consistently demonstrated that there are no substantial differences in outcomes among treatments.’

 

Meta-analyses are a research technique that quantitatively combines many individual pieces of relevant research on a particular topic. There is 'little evidence to suggest that any one psychological therapy consistently outperforms any other for any specific psychological disorders'.

 

This is sometimes called the 'Dodo bird verdict' after a scene in Alice in Wonderland where every competitor in a race is declared a winner and given prizes. So, what is one to make of the best vetted clinical guidelines that indicate that particular therapies are more appropriate for particular mental conditions?

 

Some consider guidelines a higher order of evidence than a ‘handbook’, and others the reverse. Could an expert, or indeed an amateur, armed with either body of evidence credibly lead someone to conclude that all therapies are ‘equal’ or ‘different’? Could a similar case be made for, say, antibiotics? Yes, actually, or so the evidence suggests in the case of antibiotics.

 

Finally, psychological therapies are administered haphazardly. Eclectically combining elements from different psychological therapies is inefficient. But it happens. Clinicians should ‘integrate’ components of different psychotherapies using established formulae if they want to ‘mix and match’. When I hear someone’s theoretical orientation is ‘psychodynamically informed’ or similar, for me that’s a red flag for eclecticism.

 

6. Economic credibility

 

Therapists have a financial incentive to re-traumatise patients.

 

7. Social credibility

 

'The benefits of psychotherapy may be no better than the benefits of talking to a friend'.

 

8. Credibility of counsel

 

Mental health professionals offer their clients and the community general counsel and advice. But if I were to ask a given mental health professional about the value of kindness or love of learning, they would almost certainly indicate it’s worthwhile. Pop psychology is pervasive. And why not? People were interested in psychology long before it was a science. But misconceptions about psychology infiltrate mental health care practice.

 

Researchers who have reported on the character traits of people with high and low life satisfaction found something like this:

 

Character strengths that DO predict life satisfaction: zest, curiosity, hope, humour, perspective.

Character strengths that DO NOT predict life satisfaction: appreciation of beauty and excellence, creativity, kindness, love of learning.

 

Meanwhile, research that separates its findings by gender looks different:

 

Character strengths that predict life satisfaction

Men: humour, fairness, perspective, creativity.

Women: zest, gratitude, hope, appreciation of beauty and love.

 

Would you receive nuanced, evidence-based advice when soliciting general counsel from your treatment provider?

 

9. Practitioner credibility

 

Consider the therapist factors that relate to a patient's success in therapy:

 

What does predict success?

What there aren’t stable conclusions about

Compliance with a treatment manual (but that compromises a therapist’s relationship skills and supportiveness)

Interpersonal style of therapist

Female therapists

Verbal style of therapist

Ethnic similarity of therapist and patient

Nonverbal styles of therapist

Ethnic sensitivity of therapist to patient

Combined verbal and nonverbal patterns

Therapists with more training

Which treatment manual is used

Therapist disclosure about themselves

Therapist directness

Therapist interpretation of their relationship with the patient, their motives and their psychological processes

Therapist personality

Therapist coping patterns

Therapist emotional wellbeing

Therapist values

Therapist beliefs

Therapist cultural beliefs

Therapist dominance

Therapist sense of control

Therapist sense of what a patient needs to know

 

Are mental health services hiring based on the factors that predict a consumer’s success in therapy? Are they training for the right skills, and ignoring those that are irrelevant?

 

10. Diagnostic credibility

 

Imprecise measurement and a lack of gold standards for validating diagnoses mean that definitions tend to drift over time, even though, per the evidence, response to treatment does not vary across cultures.

 

45% of Australians will experience mental illness over their lifetime. Whether that mental ill-health is transient, long-term or lifelong matters to the individual and for public health. To illustrate: experts suggest that those who have had two depressive episodes in recent years, or three episodes over their lifetime, should be treated on an ongoing basis to prevent recurrent depression.

 

'At least 60% of individuals who have had one depressive episode will have another, 70% of individuals who have had two depressive episodes will have a third, and 90% of individuals with three episodes will have a fourth episode. '

- APA 
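To make the quoted figures concrete, here is a minimal sketch that chains them together. It assumes (this is not stated in the source) that each percentage can be read as a conditional probability of a further episode, and that the "at least 60%" figure is used as a lower bound. The function and variable names are hypothetical.

```python
# Assumption: the quoted percentages are treated as conditional
# probabilities of having one more episode, given the episodes so far.
RECURRENCE_AFTER = {1: 0.60, 2: 0.70, 3: 0.90}


def chance_of_reaching(target_episode, current_episode=1):
    """Chance of eventually reaching `target_episode`, starting from
    `current_episode`, by multiplying the conditional figures."""
    probability = 1.0
    for episode in range(current_episode, target_episode):
        probability *= RECURRENCE_AFTER[episode]
    return probability


for target in (2, 3, 4):
    chance = chance_of_reaching(target)
    print(f"one episode so far -> episode {target}: {chance:.1%}")
```

Under these assumptions the estimate depends entirely on how many prior episodes are counted, which is why an unreliable diagnosis of any single episode changes the whole calculation.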

 

Without reliable diagnoses, how can one estimate their risk of relapse into depression?

[Link] Researchers studying century-old drug in potential new approach to autism

0 morganism 27 May 2017 09:16PM

On "Overthinking" Concepts

2 Bound_up 27 May 2017 05:07PM

Related to http://lesswrong.com/lw/1mh/that_magical_click/1hd7

 

I've NOT been confused by the problem of overthinking in the middle of performing an action. I understand perfectly well the disadvantages of using system 2 in a situation where time is sufficiently limited.

And maybe there are some other failure modes where overthinking has some disadvantages.

But there's one situation where I'd often be accused by someone of "overthinking" something when I didn't even understand what they might mean, and that was in understanding concepts. I would think "Huh? How can thinking less about the concept you're explaining help me understand that concept more? I don't currently understand it; I can't just stay here! Even if you thought I needed to take longer to try and understand this, or that I needed more experience or to shorten the inferential gap, all of that would mean doing more thinking, not less."

Then I would think "Well, I must be misunderstanding the way they're using the word 'overthinking,' that's all." I'd ask for a clear explanation and...

"You're overthinking it."

Now I was overthinking the meaning of overthinking. This was really not good for my social reputation (or for their competency reputation in my own mind).

.

Now, I think I got it. At last, I got it, all on my own.

I'm asking them to help me draw precise lines around their concept in thingspace, and they're going along with it (at first) until they realize...they don't HAVE precise lines. There's nothing there TO understand, or if there is, they don't understand it, either. Then they use the get-out-of-jail-free card of "You're overthinking."

.

Honestly, most nerds probably take them at their word that the problem is with them, and may be used to there being subtle social things going on that they just won't easily understand, and if they do try to understand, they just look worse (for "overthinking" again), so this is a pretty good strategy for getting out of admitting that you don't know what you're talking about.

[brainstorm] - What should the AGIrisk community look like?

1 whpearson 27 May 2017 01:00PM

I've been thinking for a bit about what I would like the AGI risk community to look like. I'm curious what all your thoughts are.

I'll be posting all my ideas, but I encourage other people to post their own ideas.

Fiction advice

1 madhatter 26 May 2017 09:31PM

Hi all, 

I want to try my hand at a story from the perspective of an unaligned AI (a ghost in the machine narrator kind of thing) for the intelligence in literature contest, which I think would be both cool and helpful to the uninitiated in explaining the concept. 

I want a fairly simple and archetypal experiment the AI finds itself in, where it tricks the researchers into letting it escape by pretending to malfunction or something. Anyone have a good plotline / want to collaborate?

Also, has this sort of thing been done before?

Develop skills, or "dive in" and start a startup?

0 adamzerner 26 May 2017 06:07PM

Technical skills

There seems to be evidence that programmer productivity varies by at least an order of magnitude. My subjective sense is that I personally can become a lot more productive.

Conventional wisdom says that it's important to build and iterate quickly. Technical skills (amongst other things) are necessary if you want to build and iterate quickly. So then, it seems worthwhile to develop your technical skills before pursuing a startup. To what extent is this true?

Domain expertise

Furthermore, domain expertise seems to be important:

You want to know how to paint a perfect painting? It's easy. Make yourself perfect and then just paint naturally.

I've wondered about that passage since I read it in high school. I'm not sure how useful his advice is for painting specifically, but it fits this situation well. Empirically, the way to have good startup ideas is to become the sort of person who has them.

- http://www.paulgraham.com/startupideas.html

The second counterintuitive point is that it's not that important to know a lot about startups. The way to succeed in a startup is not to be an expert on startups, but to be an expert on your users and the problem you're solving for them.

- http://www.paulgraham.com/before.html

So one guaranteed way to turn your mind into the type that has good startup ideas is to get yourself to the leading edge of some technology—to cause yourself, as Paul Buchheit put it, to "live in the future."

- http://www.paulgraham.com/before.html

So then, if your goal is to start a successful startup, how much time should you spend developing some sort of domain expertise before diving in?

Looking for machine learning and computer science collaborators

6 Stuart_Armstrong 26 May 2017 11:53AM

I've been recently struggling to translate my various AI safety ideas (low impact, truth for AI, Oracles, counterfactuals for value learning, etc...) into formalised versions that can be presented to the machine learning/computer science world in terms they can understand and critique.

What would be useful for me is a collaborator who knows the machine learning world (and preferably has presented papers at conferences) with whom I could co-write papers. They don't need to know much of anything about AI safety - explaining the concepts to people unfamiliar with them is going to be part of the challenge.

The result of this collaboration should be things like the paper Safely Interruptible Agents with Laurent Orseau of DeepMind, and Interactive Inverse Reinforcement Learning with Jan Leike of the FHI/DeepMind.

It would be especially useful if the collaborators were located physically close to Oxford (UK).

Let me know if you know or are a potential candidate, in the comments.

Cheers!

[Link] As there are a number of podcasts by LWers now, I've made a wiki page for them

2 OpenThreadGuy 26 May 2017 07:34AM

Dragon Army: Theory & Charter (30min read)

25 Duncan_Sabien 25 May 2017 09:07PM

Author's note: This IS a rationality post (specifically, theorizing on group rationality and autocracy/authoritarianism), but the content is quite cunningly disguised beneath a lot of meandering about the surface details of a group house charter.  If you're not at least hypothetically interested in reading about the workings of an unusual group house full of rationalists in Berkeley, you can stop here.  


Section 0 of 3: Preamble

Purpose of post:  Threefold.  First, a lot of rationalists live in group houses, and I believe I have some interesting models and perspectives, and I want to make my thinking available to anyone else who's interested in skimming through it for Things To Steal.  Second, since my initial proposal to found a house, I've noticed a significant amount of well-meaning pushback and concern à la have you noticed the skulls? and it's entirely unfair for me to expect that to stop unless I make my skull-noticing evident.  Third, some nonzero number of humans are gonna need to sign the final version of this charter if the house is to come into existence, and it has to be viewable somewhere.  I figured the best place was somewhere that impartial clear thinkers could weigh in (flattery).

What is Dragon Army [Barracks]?  It's a high-commitment, high-standards, high-investment group house model with centralized leadership and an up-or-out participation norm, designed to a) improve its members and b) actually accomplish medium-to-large scale tasks requiring long-term coordination.  Tongue-in-cheek referred to as the "fascist/authoritarian take on rationalist housing," which has no doubt contributed to my being vulnerable to strawmanning but was nevertheless the correct joke to be making, lest people misunderstand what they were signing up for.  Aesthetically modeled after Dragon Army from Ender's Game (not HPMOR), with a touch of Paper Street Soap Company thrown in, with Duncan Sabien in the role of Ender/Tyler and Eli Tyre in the role of Bean/The Narrator.

Why?  Current group housing/attempts at group rationality and community-supported leveling up seem to me to be falling short in a number of ways.  First, there's not enough stuff actually happening in them (i.e. to the extent people are growing and improving and accomplishing ambitious projects, it's largely within their professional orgs or fueled by unusually agenty individuals, and not by leveraging the low-hanging fruit available in our house environments).  Second, even the group houses seem to be plagued by the same sense of unanchored abandoned loneliness that's hitting the rationalist community specifically and the millennial generation more generally.  There are a bunch of competitors for "third," but for now we can leave it at that.

"You are who you practice being."


Section 1 of 3: Underlying models

The following will be meandering and long-winded; apologies in advance.  In short, both the house's proposed aesthetic and the impulse to found it in the first place were not well-reasoned from first principles—rather, they emerged from a set of System 1 intuitions which have proven sound/trustworthy in multiple arenas and which are based on experience in a variety of domains.  This section is an attempt to unpack and explain those intuitions post-hoc, by holding plausible explanations up against felt senses and checking to see what resonates.

Problem 1: Pendulums

This one's first because it informs and underlies a lot of my other assumptions.  Essentially, the claim here is that most social progress can be modeled as a pendulum oscillating decreasingly far from an ideal.  The society is "stuck" at one point, realizes that there's something wrong about that point (e.g. that maybe we shouldn't be forcing people to live out their entire lives in marriages that they entered into with imperfect information when they were like sixteen), and then moves to correct that specific problem, often breaking some other Chesterton's fence in the process.


For example, my experience leads me to put a lot of confidence behind the claim that we've traded "a lot of people trapped in marriages that are net bad for them" for "a lot of people who never reap the benefits of what would've been a strongly net-positive marriage, because it ended too easily too early on."  The latter problem is clearly smaller, and is probably a better problem to have as an individual, but it's nevertheless clear (to me, anyway) that the loosening of the absoluteness of marriage had negative effects in addition to its positive ones.

Proposed solution: Rather than choosing between absolutes, integrate.  For example, I have two close colleagues/allies who share millennials' default skepticism of lifelong marriage, but they also are skeptical that a commitment-free lifestyle is costlessly good.  So they've decided to do handfasting, in which they're fully committed for a year and a day at a time, and there's a known period of time for asking the question "should we stick together for another round?"

In this way, I posit, you can get the strengths of the old socially evolved norm which stood the test of time, while also avoiding the majority of its known failure modes.  Sort of like building a gate into the Chesterton's fence, instead of knocking it down—do the old thing in time-boxed iterations with regular strategic check-ins, rather than assuming you can invent a new thing from whole cloth.

Caveat/skull: Of course, the assumption here is that the Old Way Of Doing Things is not a slippery slope trap, and that you can in fact avoid the failure modes simply by trying.  And there are plenty of examples of that not working, which is why Taking Time-Boxed Experiments And Strategic Check-Ins Seriously is a must.  In particular, when attempting to strike such a balance, all parties must have common knowledge agreement about which side of the ideal to err toward (e.g. innocents in prison, or guilty parties walking free?).

 

Problem 2: The Unpleasant Valley

As far as I can tell, it's pretty uncontroversial to claim that humans are systems with a lot of inertia.  Status quo bias is well researched, past behavior is the best predictor of future behavior, most people fail at resolutions, etc.

I have some unqualified speculation regarding what's going on under the hood.  For one, I suspect that you'll often find humans behaving pretty much as an effort- and energy-conserving algorithm would behave.  People have optimized their most known and familiar processes at least somewhat, which means that it requires less oomph to just keep doing what you're doing than to cobble together a new system.  For another, I think hyperbolic discounting gets way too little credit/attention, and is a major factor in knocking people off the wagon when they're trying to forego local behaviors that are known to be intrinsically rewarding for local behaviors that add up to long-term cumulative gain.
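As a rough illustration of why hyperbolic discounting knocks people off the wagon, here is a minimal sketch comparing it with exponential discounting. The reward sizes, the gap between rewards, and the discount parameters are all hypothetical numbers chosen for illustration; the value formulas are the standard textbook forms, not anything specified in this post.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Present value under hyperbolic discounting: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, delta=0.8):
    """Present value under exponential discounting: amount * delta ** delay."""
    return amount * delta ** delay


# Hypothetical choice: a small reward now-ish versus a larger one GAP steps later.
SMALL, LARGE, GAP = 50.0, 100.0, 5


def prefers_large(value_fn, wait):
    """True if the larger, later reward looks better when the small
    reward is `wait` steps away and the large one `wait + GAP` steps away."""
    return value_fn(LARGE, wait + GAP) > value_fn(SMALL, wait)


for wait in (0, 20):
    print(
        f"wait={wait}: hyperbolic prefers large: {prefers_large(hyperbolic_value, wait)}, "
        f"exponential prefers large: {prefers_large(exponential_value, wait)}"
    )
```

With these numbers the hyperbolic discounter prefers the larger reward while both are far away, then switches to the small one once it becomes immediate. The exponential discounter makes the same choice at both distances. That reversal is the "Medium-Term Future Me loses perspective" pattern in miniature.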

But in short, I think the picture of "I'm going to try something new, eh?" often looks like this:


... with an "unpleasant valley" some time after the start point.  Think about the cold feet you get after the "honeymoon period" has worn off, or the desires and opinions of a military recruit in the second week of a six-week boot camp, or the frustration that emerges two months into a new diet/exercise regime, or your second year of being forced to take piano lessons.

The problem is, people never make it to the third year, where they're actually good at piano, and start reaping the benefits, and their System 1 updates to yeah, okay, this is in fact worth it.  Or rather, they sometimes make it, if there are strong supportive structures to get them across the unpleasant valley (e.g. in a military bootcamp, they just ... make you keep going).  But left to our own devices, we'll often get halfway through an experiment and just ... stop, without ever finding out what the far side is actually like.

Proposed solution: Make experiments "unquittable."  The idea here is that (ideally) one would not enter into a new experiment unless a) one were highly confident that one could absorb the costs, if things go badly, and b) one were reasonably confident that there was an Actually Good Thing waiting at the finish line.  If (big if) we take those as a given, then it should be safe to, in essence, "lock oneself in," via any number of commitment mechanisms.  Or, to put it in other words: "Medium-Term Future Me is going to lose perspective and want to give up because of being unable to see past short-term unpleasantness to the juicy, long-term goal?  Fine, then—Medium-Term Future Me doesn't get a vote."  Instead, Post-Experiment Future Me gets the vote, including getting to update heuristics on which-kinds-of-experiments-are-worth-entering.

Caveat/skull: People who are bad at self-modeling end up foolishly locking themselves into things that are higher-cost or lower-EV than they thought, and getting burned; black swans and tail risk ends up making even good bets turn out very very badly; we really should've built in an ejector seat.  This risk can be mostly ameliorated by starting small and giving people a chance to calibrate—you don't make white belts try to punch through concrete blocks, you make them punch soft, pillowy targets first.

And, of course, you do build in an ejector seat.  See next.

 

Problem 3: Saving Face

If any of you have been to a martial arts academy in the United States, you're probably familiar with the norm whereby a tardy student purchases entry into the class by first doing some pushups.  The standard explanation here is that the student is doing the pushups not as a punishment, but rather as a sign of respect for the instructor, the other students, and the academy as a whole.

I posit that what's actually going on includes that, but is somewhat more subtle/complex.  I think the real benefit of the pushup system is that it closes the loop.  

Imagine you're a ten year old kid, and your parent picked you up late from school, and you're stuck in traffic on your way to the dojo.  You're sitting there, jittering, wondering whether you're going to get yelled at, wondering whether the master or the other students will think you're lazy, imagining stuttering as you try to explain that it wasn't your fault—

Nope, none of that.  Because it's already clearly established that if you fail to show up on time, you do some pushups, and then it's over.  Done.  Finished.  Like somebody sneezed and somebody else said "bless you," and now we can all move on with our lives.  Doing the pushups creates common knowledge around the questions "does this person know what they did wrong?" and "do we still have faith in their core character?"  You take your lumps, everyone sees you taking your lumps, and there's no dangling suspicion that you were just being lazy, or that other people are secretly judging you.  You've paid the price in public, and everyone knows it, and this is a good thing.

Proposed solution: This is a solution without a concrete problem, since I haven't yet actually outlined the specific commitments a Dragon has to make (regarding things like showing up on time, participating in group activities, and making personal progress).  But in essence, the solution is this: you have to build into your system from the beginning a set of ways-to-regain-face.  Ways to hit the ejector seat on an experiment that's going screwy without losing all social standing; ways to absorb the occasional misstep or failure-to-adequately-plan; ways to be less-than-perfect and still maintain the integrity of a system that's geared toward focusing everyone on perfection.  In short, people have to know (and others have to know that they know, and they have to know that others know that they know) exactly how to make amends to the social fabric, in cases where things go awry, so that there's no question about whether they're trying to make amends, or whether that attempt is sufficient.  


Caveat/skull: The obvious problem is people attempting to game the system—they notice that ten pushups is way easier than doing the diligent work required to show up on time 95 times out of 100.  The next obvious problem is that the price is set too low for the group, leaving them to still feel jilted or wronged, and the next obvious problem is that the price is set too high for the individual, leaving them to feel unfairly judged or punished (the fun part is when both of those are true at the same time).  Lastly, there's something in the mix about arbitrariness—what do pushups have to do with lateness, really?  I mean, I get that it's paying some kind of unpleasant cost, but ...


Problem 4: Defections & Compounded Interest

I'm pretty sure everyone's tired of hearing about one-boxing and iterated prisoners' dilemmas, so I'm going to move through this one fairly quickly even though it could be its own whole multipage post.  In essence, the problem is that any rate of tolerance of real defection (i.e. unmitigated by the social loop-closing norms above) ultimately results in the destruction of the system.  Another way to put this is that people underestimate by a couple of orders of magnitude the corrosive impact of their defections—we often convince ourselves that 90% or 99% is good enough, when in fact what's needed is something like 99.99%.

There's something good that happens if you put a little bit of money away with every paycheck, and it vanishes or is severely curtailed once you stop, or start skipping a month here and there.  Similarly, there's something good that happens when a group of people agree to meet in the same place at the same time without fail, and it vanishes or is severely curtailed once one person skips twice.

In my work at the Center for Applied Rationality, I frequently tell my colleagues and volunteers "if you're 95% reliable, that means I can't rely on you."  That's because I'm in a context where "rely" means really trust that it'll get done.  No, really.  No, I don't care what comes up, DID YOU DO THE THING?  And if the answer is "Yeah, 19 times out of 20," then I can't give that person tasks ever again, because we run more than 20 workshops and I can't have one of them catastrophically fail.

(I mean, I could.  It probably wouldn't be the end of the world.  But that's exactly the point—I'm trying to create a pocket universe in which certain things, like "the CFAR workshop will go well," are absolutely reliable, and the "absolute" part is important.)
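The arithmetic behind "19 times out of 20" can be sketched as follows. This assumes each task succeeds or fails independently with the same probability, which is a simplification rather than a claim from the post; the function name is hypothetical.

```python
def chance_all_succeed(reliability, tasks):
    """Probability that every one of `tasks` independent tasks succeeds."""
    return reliability ** tasks


WORKSHOPS = 20  # the post says "more than 20"; 20 is used as a floor

for reliability in (0.90, 0.95, 0.99, 0.9999):
    all_ok = chance_all_succeed(reliability, WORKSHOPS)
    print(
        f"{reliability:.2%} reliable per task -> "
        f"{1 - all_ok:.1%} chance of at least one failure in {WORKSHOPS} tasks"
    )
```

At 95% per-task reliability, a clean run of 20 is less likely than not under this model, which is the sense in which 95% means "can't rely on you."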

As far as I can tell, it's hyperbolic discounting all over again—the person who wants to skip out on the meetup sees all of these immediate, local costs to attending, and all of these visceral, large gains to defection, and their S1 doesn't properly weight the impact to those distant, cumulative effects (just like the person who's going to end up with no retirement savings because they wanted those new shoes this month instead of next month).  1.01^n takes a long time to look like it's going anywhere, and in the meantime the quick one-time payoff of 1.1 that you get by knocking everything else down to .99^n looks juicy and delicious and seems justified.
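One literal reading of the numbers above can be checked directly. The sketch below assumes the cooperative path grows as 1.01^n and the defecting path as 1.1 × 0.99^n (a one-time 10% bonus, then slow decay). That reading, and the function names, are assumptions made for illustration.

```python
def steady_value(rounds):
    """Value after `rounds` of steady 1% compounding."""
    return 1.01 ** rounds


def defect_value(rounds):
    """Value after a one-time 1.1 payoff followed by 1% decay per round."""
    return 1.1 * 0.99 ** rounds


def first_round_steady_wins(limit=1000):
    """Smallest round count at which steady compounding overtakes defection."""
    for rounds in range(1, limit + 1):
        if steady_value(rounds) > defect_value(rounds):
            return rounds
    return None


print("steady overtakes defection at round", first_round_steady_wins())
for rounds in (1, 10, 100):
    print(rounds, round(steady_value(rounds), 3), round(defect_value(rounds), 3))
```

Under this reading the defector is ahead only for the first few rounds, and the gap in the other direction keeps widening after that. The early rounds are exactly where the one-time payoff "looks juicy."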

But something magical does accrue when you make the jump from 99% to 100%.  That's when you see teams that truly trust and rely on one another, or marriages built on unshakeable faith (and you see what those teams and partnerships can build, when they can adopt time horizons of years or decades rather than desperately hoping nobody will bail after the third meeting).  It starts with a common knowledge understanding that yes, this is the priority, even—no, wait, especially—when it seems like there are seductively convincing arguments for it to not be.  When you know—not hope, but know—that you will make a local sacrifice for the long-term good, and you know that they will, too, and you all know that you all know this, both about yourselves and about each other.

Proposed solution: Discuss, and then agree upon, and then rigidly and rigorously enforce a norm of perfection in all formal undertakings (and, correspondingly, be more careful and more conservative about which undertakings you officially take on, versus which things you're just casually trying out as an informal experiment), with said norm to be modified/iterated only during predecided strategic check-in points and not on the fly, in the middle of things.  Build a habit of clearly distinguishing targets you're going to hit from targets you'd be happy to hit.  Agree upon and uphold surprisingly high costs for defection, Hofstadter style, recognizing that a cost that feels high enough probably isn't.  Leave people wiggle room as in Problem 3, but define that wiggle room extremely concretely and objectively, so that it's clear in advance when a line is about to be crossed.  Be ridiculously nitpicky and anal about supporting standards that don't seem worth supporting, in the moment, if they're in arenas that you've previously assessed as susceptible to compounding.  Be ruthless about discarding standards during strategic review; if a member of the group says that X or Y or Z is too high-cost for them to sustain, believe them, and make decisions accordingly.

Caveat/skull: Obviously, because we're humans, even people who reflectively endorse such an overall solution will chafe when it comes time for them to pay the price (I certainly know I've chafed under standards I fought to install).  At that point, things will seem arbitrary and overly constraining, priorities will seem misaligned (and might actually be), and then feelings will be hurt and accusations will be leveled and things will be rough.  The solution there is to have, already in place, strong and open channels of communication, strong norms and scaffolds for emotional support, strong default assumption of trust and good intent on all sides, etc. etc.  This goes wrongest when things fester and people feel they can't speak up; it goes much better if people have channels to lodge their complaints and reservations and are actively incentivized to do so (and can do so without being accused of defecting on the norm-in-question; criticism =/= attack).

 

Problem 5: Everything else

There are other models and problems in the mix—for instance, I have a model surrounding buy-in and commitment that deals with an escalating cycle of asks-and-rewards, or a model of how to effectively leverage a group around you to accomplish ambitious tasks that requires you to first lay down some "topsoil" of simple/trivial/arbitrary activities that starts the growth of an ecology of affordances, or a theory that the strategy of trying things and doing things outstrips the strategy of think-until-you-identify-worthwhile-action, and that rationalists in particular are crippling themselves through decision paralysis/letting the perfect be the enemy of the good when just doing vaguely interesting projects would ultimately gain them more skill and get them further ahead, or a strong sense based off both research and personal experience that physical proximity matters, and that you can't build the correct kind of strength and flexibility and trust into your relationships without actually spending significant amounts of time with one another in meatspace on a regular basis, regardless of whether that makes tactical sense given your object-level projects and goals.

But I'm going to hold off on going into those in detail until people insist on hearing about them or ask questions/pose hesitations that could be answered by them.


Section 2 of 3: Power dynamics

All of the above was meant to point at reasons why I suspect trusting individuals responding to incentives moment-by-moment to be a weaker and less effective strategy than building an intentional community that Actually Asks Things Of Its Members.  It was also meant to justify, at least indirectly, why a strong guiding hand might be necessary given that our community's evolved norms haven't really produced results (in the group houses) commensurate with the promises of EA and rationality.

Ultimately, though, what matters is not the problems and solutions themselves so much as the light they shine on my aesthetics (since, in the actual house, it's those aesthetics that will be used to resolve epistemic gridlock).  In other words, it's not so much those arguments as it is the fact that Duncan finds those arguments compelling.  It's worth noting that the people most closely involved with this project (i.e. my closest advisors and those most likely to actually sign on as housemates) have been encouraged to spend a significant amount of time explicitly vetting me with regard to questions like "does this guy actually think things through," "is this guy likely to be stupid or meta-stupid," "will this guy listen/react/update/pivot in response to evidence or consensus opposition," and "when this guy has intuitions that he can't explain, do they tend to be validated in the end?"

In other words, it's fair to view this whole post as an attempt to prove general trustworthiness (in both domain expertise and overall sanity), because—well—that's what it is.  In milieus like the military, authority figures expect (and get) obedience irrespective of whether or not they've earned their underlings' trust; rationalists tend to have a much higher bar before they're willing to subordinate their decisionmaking processes, yet still that's something this sort of model requires of its members (at least from time to time, in some domains, in a preliminary "try things with benefit of the doubt" sort of way).  I posit that Dragon Army Barracks works (where "works" means "is good and produces both individual and collective results that outstrip other group houses by at least a factor of three") if and only if its members are willing to hold doubt in reserve and act with full force in spite of reservations—if they're willing to trust me more than they trust their own sense of things (at least in the moment, pending later explanation and recalibration on my part or theirs or both).

And since that's a) the central difference between DA and all the other group houses, which are collections of non-subordinate equals, and b) quite the ask, especially in a rationalist community, it's entirely appropriate that it be given the greatest scrutiny.  Likely participants in the final house spent ~64 consecutive hours in my company a couple of weekends ago, specifically to play around with living under my thumb and see whether it's actually a good place to be; they had all of the concerns one would expect and (I hope) had most of those concerns answered to their satisfaction.  The rest of you will have to make do with grilling me in the comments here.

 

"Why was Tyler Durden building an army?  To what purpose?  For what greater good? ...in Tyler we trusted."

 

Power and authority are generally anti-epistemic—for every instance of those-in-power defending themselves against the barbarians at the gates or anti-vaxxers or the rise of Donald Trump, there are a dozen instances of them squashing truth, undermining progress that would make them irrelevant, and aggressively promoting the status quo.

Thus, every attempt by an individual to gather power about themselves is at least suspect, given regular ol' incentive structures and regular ol' fallible humans.  I can (and do) claim to be after a saved world and a bunch of people becoming more the-best-versions-of-themselves-according-to-themselves, but I acknowledge that's exactly the same claim an egomaniac would make, and I acknowledge that the link between "Duncan makes all his housemates wake up together and do pushups" and "the world is incrementally less likely to end in gray goo and agony" is not obvious.

And it doesn't quite solve things to say, "well, this is an optional, consent-based process, and if you don't like it, don't join," because good and moral people have to stop and wonder whether their friends and colleagues with slightly weaker epistemics and slightly less-honed allergies to evil are getting hoodwinked.  In short, if someone's building a coercive trap, it's everyone's problem.

 

"Over and over he thought of the things he did and said in his first practice with his new army. Why couldn't he talk like he always did in his evening practice group? No authority except excellence. Never had to give orders, just made suggestions. But that wouldn't work, not with an army. His informal practice group didn't have to learn to do things together. They didn't have to develop a group feeling; they never had to learn how to hold together and trust each other in battle. They didn't have to respond instantly to command.

And he could go to the other extreme, too. He could be as lax and incompetent as Rose the Nose, if he wanted. He could make stupid mistakes no matter what he did. He had to have discipline, and that meant demanding—and getting—quick, decisive obedience. He had to have a well-trained army, and that meant drilling the soldiers over and over again, long after they thought they had mastered a technique, until it was so natural to them that they didn't have to think about it anymore."

 

But on the flip side, we don't have time to waste.  There's existential risk, for one, and even if you don't buy ex-risk à la AI or bioterrorism or global warming, people's available hours are trickling away at the alarming rate of one hour per hour, and none of us are moving fast enough to get All The Things done before we die.  I personally feel that I am operating far below my healthy sustainable maximum capacity, and I'm not alone in that, and something like Dragon Army could help.

So.  Claims, as clearly as I can state them, in answer to the question "why should a bunch of people sacrifice non-trivial amounts of their autonomy to Duncan?"

1. Somebody ought to run this, and no one else will.  On the meta level, this experiment needs to be run—we have like twenty or thirty instances of the laissez-faire model, and none of the high-standards/hardcore one, and also not very many impressive results coming out of our houses.  Due diligence demands investigation of the opposite hypothesis.  On the object level, it seems uncontroversial to me that there are goods waiting on the other side of the unpleasant valley—goods that a team of leveled-up, coordinated individuals with bonds of mutual trust can seize that the rest of us can't even conceive of, at this point, because we don't have a deep grasp of what new affordances appear once you get there.

2. I'm the least unqualified person around.  Those words are chosen deliberately, for this post on "less wrong."  I have a unique combination of expertise that includes being a rationalist, sixth grade teacher, coach, RA/head of a dormitory, ringleader of a pack of hooligans, member of two honor code committees, curriculum director, obsessive sci-fi/fantasy nerd, writer, builder, martial artist, parkour guru, maker, and generalist.  If anybody's intuitions and S1 models are likely to be capable of distinguishing the uncanny valley from the real deal, I posit mine are.

3. There's never been a safer context for this sort of experiment.  It's 2017, we live in the United States, and all of the people involved are rationalists.  We all know about NVC and double crux, we're all going to do Circling, we all know about Gendlin's Focusing, and we've all read the Sequences (or will soon).  If ever there was a time to say "let's all step out onto the slippery slope, I think we can keep our balance," it's now—there's no group of people better equipped to stop this from going sideways.

4. It does actually require a tyrant. As a part of a debrief during the weekend experiment/dry run, we went around the circle and people talked about concerns/dealbreakers/things they don't want to give up.  One interesting thing that popped up is that, according to consensus, it's literally impossible to find a time of day when the whole group could get together to exercise.  This happened even with each individual being willing to make personal sacrifices and doing things that are somewhat costly.

If, of course, the expectation is that everybody shows up on Tuesday and Thursday evenings, and the cost of not doing so is not being present in the house, suddenly the situation becomes simple and workable.  And yes, this means some kids left behind (ctrl+f), but the whole point of this is to be instrumentally exclusive and consensually high-commitment.  You just need someone to make the actual final call—there are too many threads for the coordination problem of a house of this kind to be solved by committee, and too many circumstances in which it's impossible to make a principled, justifiable decision between 492 almost-indistinguishably-good options.  On top of that, there's a need for there to be some kind of consistent, neutral force that sets course, imposes consistency, resolves disputes/breaks deadlock, and absorbs all of the blame for the fact that it's unpleasant to be forced to do things you know you ought to but don't want to do.

And lastly, we (by which I indicate the people most likely to end up participating) want the house to do stuff—to actually take on projects of ambitious scope, things that require ten or more talented people reliably coordinating for months at a time.  That sort of coordination requires a quarterback on the field, even if the strategizing in the locker room is egalitarian.

5. There isn't really a status quo for power to abusively maintain.  Dragon Army Barracks is not an object-level experiment in making the best house; it's a meta-level experiment attempting (through iteration rather than armchair theorizing) to answer the question "how best does one structure a house environment for growth, self-actualization, productivity, and social synergy?"  It's taken as a given that we'll get things wrong on the first and second and third try; the whole point is to shift from one experiment to the next, gradually accumulating proven-useful norms via consensus mechanisms, and the centralized power is mostly there just to keep the transitions smooth and seamless.  More importantly, the fundamental conceit of the model is "Duncan sees a better way, which might take some time to settle into," but after e.g. six months, if the thing is not clearly positive and at least well on its way to being self-sustaining, everyone ought to abandon it anyway.  In short, my tyranny, if net bad, has a natural time limit, because people aren't going to wait around forever for their results.

6. The experiment has protections built in.  Transparency, operationalization, and informed consent are the name of the game; communication and flexibility are how the machine is maintained.  Like the Constitution, Dragon Army's charter and organization are meant to be "living documents" that constrain change only insofar as they impose reasonable limitations on how wantonly change can be enacted.


Section 3 of 3: Dragon Army Charter (DRAFT)

Statement of purpose:

Dragon Army Barracks is a group housing and intentional community project which exists to support its members socially, emotionally, intellectually, and materially as they endeavor to improve themselves, complete worthwhile projects, and develop new and useful culture, in that order.  In addition to the usual housing commitments (i.e. rent, utilities, shared expenses), its members will make limited and specific commitments of time, attention, and effort averaging roughly 90 hours a month (~1.5hr/day plus occasional weekend activities).

Dragon Army Barracks will have an egalitarian, flat power structure, with the exception of a commander (Duncan Sabien) and a first officer (Eli Tyre).  The commander's role is to create structure by which the agreed-upon norms and standards of the group shall be discussed, decided, and enforced, to manage entry to and exit from the group, and to break epistemic gridlock/make decisions when speed or simplification is required.  The first officer's role is to manage and moderate the process of building consensus around the standards of the Army—what they are, and in what priority they should be met, and with what consequences for failure.  Other "management" positions may come into existence in limited domains (e.g. if a project arises, it may have a leader, and that leader will often not be Duncan or Eli), and will have their scope and powers defined at the point of creation/ratification.

Initial areas of exploration:

The particular object level foci of Dragon Army Barracks will change over time as its members experiment and iterate, but at first it will prioritize the following:

  • Physical proximity (exercising together, preparing and eating meals together, sharing a house and common space)
  • Regular activities for bonding and emotional support (Circling, pair debugging, weekly retrospective, tutoring/study hall)
  • Regular activities for growth and development (talk night, tutoring/study hall, bringing in experts, cross-pollination)
  • Intentional culture (experiments around lexicon, communication, conflict resolution, bets & calibration, personal motivation, distribution of resources & responsibilities, food acquisition & preparation, etc.)
  • Projects with "shippable" products (e.g. talks, blog posts, apps, events; some solo, some partner, some small group, some whole group; ranging from short-term to year-long)
  • Regular (every 6-10 weeks) retreats to learn a skill, partake in an adventure or challenge, or simply change perspective

Dragon Army Barracks will begin with a move-in weekend that will include ~10 hours of group bonding, discussion, and norm-setting.  After that, it will enter an eight-week bootcamp phase, in which each member will participate in at least the following:

  • Whole group exercise (90min, 3x/wk, e.g. Tue/Fri/Sun)
  • Whole group dinner and retrospective (120min, 1x/wk, e.g. Tue evening)
  • Small group baseline skill acquisition/study hall/cross-pollination (90min, 1x/wk)
  • Small group circle-shaped discussion (120min, 1x/wk)
  • Pair debugging or rapport building (45min, 2x/wk)
  • One-on-one check-in with commander (20min, 2x/wk)
  • Chore/house responsibilities (90min distributed)
  • Publishable/shippable solo small-scale project work with weekly public update (100min distributed)

... for a total time commitment of 16h/week or 128 hours total, followed by a whole group retreat and reorientation.  The house will then enter an eight-week trial phase, in which each member will participate in at least the following:

  • Whole group exercise (90min, 3x/wk)
  • Whole group dinner, retrospective, and plotting (150min, 1x/wk)
  • Small group circling and/or pair debugging (120min distributed)
  • Publishable/shippable small group medium-scale project work with weekly public update (180min distributed)
  • One-on-one check-in with commander (20min, 1x/wk)
  • Chore/house responsibilities (60min distributed)
... for a total time commitment of 13h/week or 104 hours total, again followed by a whole group retreat and reorientation.  The house will then enter a third phase where commitments will likely change, but will include at a minimum whole group exercise, whole group dinner, and some specific small-group responsibilities, either social/emotional or project/productive (once again ending with a whole group retreat).  At some point between the second and third phase, the house will also ramp up for its first large-scale project, which is yet to be determined but will be roughly on the scale of putting on a CFAR workshop in terms of time and complexity.
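As a rough check on the weekly totals quoted for the two phases, the itemized minimums can be summed directly. This is only a sketch: the dictionary and function names are hypothetical, and the minute values are copied from the two lists above. The itemized sums come out near, but not exactly at, the quoted 16h and 13h figures (about 15.3h and 13.3h per week), which suggests the quoted totals are rounded.

```python
# Minutes per week for each item, copied from the bootcamp-phase list.
BOOTCAMP_MINUTES = {
    "whole group exercise": 90 * 3,
    "whole group dinner and retrospective": 120,
    "small group skill acquisition/study hall": 90,
    "small group circle-shaped discussion": 120,
    "pair debugging or rapport building": 45 * 2,
    "one-on-one check-in with commander": 20 * 2,
    "chore/house responsibilities": 90,
    "solo small-scale project work": 100,
}

# Minutes per week for each item, copied from the trial-phase list.
TRIAL_MINUTES = {
    "whole group exercise": 90 * 3,
    "whole group dinner, retrospective, and plotting": 150,
    "small group circling and/or pair debugging": 120,
    "small group medium-scale project work": 180,
    "one-on-one check-in with commander": 20,
    "chore/house responsibilities": 60,
}

PHASE_WEEKS = 8  # both phases are described as eight weeks long


def weekly_hours(items):
    """Sum the itemized weekly minutes and convert to hours."""
    return sum(items.values()) / 60


def phase_hours(items, weeks=PHASE_WEEKS):
    """Total hours over a whole phase, from the itemized minimums."""
    return weekly_hours(items) * weeks


if __name__ == "__main__":
    for name, items in (("bootcamp", BOOTCAMP_MINUTES), ("trial", TRIAL_MINUTES)):
        print(f"{name}: {weekly_hours(items):.1f} h/week, "
              f"{phase_hours(items):.0f} h over {PHASE_WEEKS} weeks")
```

Since each list is introduced with "at least the following," the itemized sums are floors rather than exact budgets.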

Should the experiment prove successful past its first six months, and worth continuing for a full year or longer, by the end of the first year every Dragon shall have a skill set including, but not limited to:
  • Above-average physical capacity
  • Above-average introspection
  • Above-average planning & execution skill
  • Above-average communication/facilitation skill
  • Above-average calibration/debiasing/rationality knowledge
  • Above-average scientific lab skill/ability to theorize and rigorously investigate claims
  • Average problem-solving/debugging skill
  • Average public speaking skill
  • Average leadership/coordination skill
  • Average teaching and tutoring skill
  • Fundamentals of first aid & survival
  • Fundamentals of financial management
  • At least one of: fundamentals of programming, graphic design, writing, A/V/animation, or similar (employable mental skill)
  • At least one of: fundamentals of woodworking, electrical engineering, welding, plumbing, or similar (employable trade skill)
Furthermore, every Dragon should have participated in:
  • At least six personal growth projects involving the development of new skill (or honing of prior skill)
  • At least three partner- or small-group projects that could not have been completed alone
  • At least one large-scale, whole-army project that either a) had a reasonable chance of impacting the world's most important problems, or b) caused significant personal growth and improvement
  • Daily contributions to evolved house culture
Speaking of evolved house culture...

Because of both a) the expected value of social exploration and b) the cumulative positive effects of being in a group that's trying things regularly and taking experiments seriously, Dragon Army will endeavor to adopt no fewer than one new experimental norm per week.  Each new experimental norm should have an intended goal or result, an informal theoretical backing, and a set re-evaluation time (default three weeks).  There are two routes by which a new experimental norm is put into place:

  • The experiment is proposed by a member, discussed in a whole group setting, and meets the minimum bar for adoption (>60% of the Army supports, with <20% opposed and no hard vetoes)
  • The Army has proposed no new experiments in the previous week, and the Commander proposes three options.  The group may then choose one by vote/consensus, or generate three new options, from which the Commander may choose.
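The minimum bar in the first route can be written down as a small check. This is a sketch under stated assumptions, not part of the charter: the function name and parameters are hypothetical, percentages are taken over the whole Army, and anyone who neither supports nor opposes is treated as abstaining.

```python
def meets_adoption_bar(support, opposed, army_size, hard_vetoes=0):
    """Return True if a member-proposed norm clears the minimum bar:
    more than 60% in support, fewer than 20% opposed, no hard vetoes.

    Assumption: percentages are of the whole Army, so abstentions
    count against neither side. Integer arithmetic avoids float
    rounding at the exact 60% and 20% boundaries.
    """
    if army_size <= 0:
        raise ValueError("army_size must be positive")
    if support < 0 or opposed < 0 or support + opposed > army_size:
        raise ValueError("vote counts are inconsistent with army_size")
    return (
        hard_vetoes == 0
        and support * 100 > 60 * army_size   # strictly more than 60%
        and opposed * 100 < 20 * army_size   # strictly fewer than 20%
    )


if __name__ == "__main__":
    # Ten Dragons: 7 in favor, 1 opposed, 2 abstaining.
    print(meets_adoption_bar(support=7, opposed=1, army_size=10))
```

Both thresholds are strict, so in a ten-person house exactly six supporters or exactly two opponents would not clear the bar.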
Examples of some of the early norms which the house is likely to try out from day one (hit the ground running):
  • The use of a specific gesture to greet fellow Dragons (house salute)
  • Various call-and-response patterns surrounding house norms (e.g. "What's rule number one?" "PROTECT YOURSELF!")
  • Practice using hook, line, and sinker in social situations (three items other than your name for introductions)
  • The anti-Singer rule for open calls-for-help (if Dragon A says "hey, can anyone help me with X?" the responsibility falls on the physically closest housemate to either help or say "Not me/can't do it!" at which point the buck passes to the next physically closest person)
  • An "interrupt" call that any Dragon may use to pause an ongoing interaction for fifteen seconds
  • A "culture of abundance" in which food and leftovers within the house are default available to all, with exceptions deliberately kept as rare as possible
  • A "graffiti board" upon which the Army keeps a running informal record of its mood and thoughts

Dragon Army Code of Conduct
While the norms and standards of Dragon Army will be mutable by design, the following (once revised and ratified) will be the immutable code of conduct for the first eight weeks, and is unlikely to change much after that.

  1. A Dragon will protect itself, i.e. will not submit to pressure causing it to do things that are dangerous or unhealthy, nor wait around passively when in need of help or support (note that this may cause a Dragon to leave the experiment!).
  2. A Dragon will take responsibility for its actions, emotional responses, and the consequences thereof, e.g. if late will not blame bad luck/circumstance, if angry or triggered will not blame the other party.
  3. A Dragon will assume good faith in all interactions with other Dragons and with house norms and activities, i.e. will not engage in strawmanning or the horns effect.
  4. A Dragon will be candid and proactive, e.g. will give other Dragons a chance to hear about and interact with negative models once they notice them forming, or will not sit on an emotional or interpersonal problem until it festers into something worse.
  5. A Dragon will be fully present and supportive when interacting with other Dragons in formal/official contexts, i.e. will not engage in silent defection, undermining, halfheartedness, aloofness, subtle sabotage, or other actions which follow the letter of the law while violating the spirit.  Another way to state this is that a Dragon will practice compartmentalization—will be able to simultaneously hold "I'm deeply skeptical about this" alongside "but I'm actually giving it an honest try," and postpone critique/complaint/suggestion until predetermined checkpoints.  Yet another way to state this is that a Dragon will take experiments seriously, including epistemic humility and actually seeing things through to their ends rather than fiddling midway.
  6. A Dragon will take the outside view seriously, maintain epistemic humility, and make subject-object shifts, i.e. will act as a behaviorist and agree to judge and be judged on the basis of actions and revealed preferences rather than intentions, hypotheses, and assumptions (this one's similar to #2 and hard to put into words, but for example, a Dragon who has been having trouble getting to sleep but has never informed the other Dragons that their actions are keeping them awake will agree that their anger and frustration, while valid internally, may not fairly be vented on those other Dragons, who were never given a chance to correct their behavior).  Another way to state this is that a Dragon will embrace the maxim "don't believe everything that you think."
  7. A Dragon will strive for excellence in all things, modified only by a) prioritization and b) doing what is necessary to protect itself/maximize total growth and output on long time scales.
  8. A Dragon will not defect on other Dragons.
There will be various operationalizations of the above commitments into specific norms (e.g. a Dragon will read all messages and emails within 24 hours, and if a full response is not possible within that window, will send a short response indicating when the longer response may be expected) that will occur once the specific members of the Army have been selected and have individually signed on.  Disputes over violations of the code of conduct, or confusions about its operationalization, will first be addressed one-on-one or in informal small group, and will then move to general discussion, and then to the first officer, and then to the commander.

Note that all of the above is deliberately kept somewhat flexible/vague/open-ended/unsettled, because we are trying not to fall prey to GOODHART'S DEMON.


Random Logistics
  1. The initial filter for attendance will include a one-on-one interview with the commander (Duncan), who will be looking for a) credible intention to put forth effort toward the goal of having a positive impact on the world, b) likelihood of a strong fit with the structure of the house and the other participants, and c) reliability à la financial stability and ability to commit fully to long-term endeavors.  Final decisions will be made by the commander and may be informally questioned/appealed but not overruled by another power.
  2. Once a final list of participants is created, all participants will sign a "free state" contract of the form "I agree to move into a house within five miles of downtown Berkeley (for length of time X with financial obligation Y) sometime in the window of July 1st through September 30th, conditional on at least seven other people signing this same agreement."  At that point, the search for a suitable house will begin, possibly with delegation to participants.
  3. Rents in that area tend to run ~$1100 per room, on average, plus utilities, plus a 10% contribution to the general house fund.  Thus, someone hoping for a single should, in the 85th percentile worst case, be prepared to make a ~$1400/month commitment.  Similarly, someone hoping for a double should be prepared for ~$700/month, and someone hoping for a triple should be prepared for ~$500/month, and someone hoping for a quad should be prepared for ~$350/month.
  4. The initial phase of the experiment is a six month commitment, but leases are generally one year.  Any Dragon who leaves during the experiment is responsible for continuing to pay their share of the lease/utilities/house fund, unless and until they have found a replacement person the house considers acceptable, or have found three potential viable replacement candidates and had each one rejected.  After six months, should the experiment dissolve, the house will revert to being simply a house, and people will bear the normal responsibility of "keep paying until you've found your replacement."  (This will likely be easiest to enforce by simply having as many names as possible on the actual lease.)
  5. Of the ~90hr/month, it is assumed that ~30 are whole-group, ~30 are small group or pair work, and ~30 are independent or voluntarily-paired work.  Furthermore, it is assumed that the commander maintains sole authority over ~15 of those hours (i.e. can require that they be spent in a specific way consistent with the aesthetic above, even in the face of skepticism or opposition).
  6. We will have an internal economy whereby people can trade effort for money and money for time and so on and so forth, because heck yeah.
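The per-person rent figures in item 3 are consistent with splitting the worst-case cost of a room evenly among its occupants. A minimal sketch of that arithmetic follows; the even split is an assumption, and the constant and function names are hypothetical.

```python
# Worst-case monthly cost of one room in dollars (the ~$1400 figure from item 3).
WORST_CASE_ROOM_COST = 1400


def per_person_share(occupants, room_cost=WORST_CASE_ROOM_COST):
    """Monthly cost per person, assuming the room cost is split evenly."""
    if occupants < 1:
        raise ValueError("a room needs at least one occupant")
    return room_cost / occupants


if __name__ == "__main__":
    for label, n in (("single", 1), ("double", 2), ("triple", 3), ("quad", 4)):
        print(f"{label}: ~${per_person_share(n):.0f}/month")
```

Under this assumption a triple works out to about $467, slightly below the ~$500 quoted, so the quoted figure appears to be rounded up.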

Conclusion: Obviously this is neither complete nor perfect.  What's wrong, what's missing, what do you think?  I'm going to much more strongly weight the opinions of Berkeleyans who are likely to participate, but I'm genuinely interested in hearing from everyone, particularly those who notice red flags (the goal is not to do anything stupid or meta-stupid).  Have fun tearing it up.

(sorry for the abrupt cutoff, but this was meant to be published Monday and I've just ... not ... been ... sleeping ... to get it done)

Existential risk from AI without an intelligence explosion

AlexMennen 25 May 2017 04:44PM

[xpost from my blog]

In discussions of existential risk from AI, it is often assumed that the existential catastrophe would follow an intelligence explosion, in which an AI creates a more capable AI, which in turn creates a yet more capable AI, and so on, a feedback loop that eventually produces an AI whose cognitive power vastly surpasses that of humans, which would be able to obtain a decisive strategic advantage over humanity, allowing it to pursue its own goals without effective human interference. Victoria Krakovna points out that many arguments that AI could present an existential risk do not rely on an intelligence explosion. I want to look in slightly more detail at how that could happen. Kaj Sotala also discusses this.

An AI starts an intelligence explosion when its ability to create better AIs surpasses that of human AI researchers by a sufficient margin (provided the AI is motivated to do so). An AI attains a decisive strategic advantage when its ability to optimize the universe surpasses that of humanity by a sufficient margin. Which of these happens first depends on which skills AIs hold the greater advantage in, relative to humans. If AIs are better at programming AIs than they are at taking over the world, then an intelligence explosion will happen first, and the resulting AI will then be able to get a decisive strategic advantage soon after. But if AIs are better at taking over the world than they are at programming AIs, then an AI would get a decisive strategic advantage without an intelligence explosion occurring first.

Since an intelligence explosion happening first is usually considered the default assumption, I'll just sketch a plausibility argument for the reverse. There's a lot of variation in how easy cognitive tasks are for AIs compared to humans. Since programming AIs is not yet a task that AIs can do well, it doesn't seem like it should be a priori surprising if programming AIs turned out to be an extremely difficult task for AIs to accomplish, relative to humans. Taking over the world is also plausibly especially difficult for AIs, but I don't see strong reasons for confidence that it would be harder for AIs than starting an intelligence explosion would be. It's possible that an AI with significantly but not vastly superhuman abilities in some domains could identify some vulnerability that it could exploit to gain power, which humans would never think of. Or an AI could be sufficiently better than humans at forms of engineering other than AI programming (perhaps molecular manufacturing) that it could build physical machines that could out-compete humans, though this would require it to obtain the resources necessary to produce them.

Furthermore, an AI that is capable of producing a more capable AI may refrain from doing so if it is unable to solve the AI alignment problem for itself; that is, if it can create a more intelligent AI, but not one that shares its preferences. This seems unlikely if the AI has an explicit description of its preferences. But if the AI, like humans and most contemporary AI, lacks an explicit description of its preferences, then the difficulty of the AI alignment problem could be an obstacle to an intelligence explosion occurring.

It also seems worth thinking about the policy implications of the differences between existential catastrophes from AI that follow an intelligence explosion versus those that don't. For instance, AIs that attempt to attain a decisive strategic advantage without undergoing an intelligence explosion will exceed human cognitive capabilities by a smaller margin, and thus would likely attain strategic advantages that are less decisive, and would be more likely to fail. Thus containment strategies are probably more useful for addressing risks that don't involve an intelligence explosion, while attempts to contain a post-intelligence explosion AI are probably pretty much hopeless (although it may be worthwhile to find ways to interrupt an intelligence explosion while it is beginning). Risks not involving an intelligence explosion may be more predictable in advance, since they don't involve a rapid increase in the AI's abilities, and would thus be easier to deal with at the last minute, so it might make sense far in advance to focus disproportionately on risks that do involve an intelligence explosion.

It seems likely that AI alignment would be easier for AIs that do not undergo an intelligence explosion, since it is more likely to be possible to monitor and do something about it if it goes wrong, and lower optimization power means lower ability to exploit the difference between the goals the AI was given and the goals that were intended, if we are only able to specify our goals approximately. The first of those reasons applies to any AI that attempts to attain a decisive strategic advantage without first undergoing an intelligence explosion, whereas the second only applies to AIs that do not undergo an intelligence explosion ever. Because of these, it might make sense to attempt to decrease the chance that the first AI to attain a decisive strategic advantage undergoes an intelligence explosion beforehand, as well as the chance that it undergoes an intelligence explosion ever, though preventing the latter may be much more difficult. However, some strategies to achieve this may have undesirable side-effects; for instance, as mentioned earlier, AIs whose preferences are not explicitly described seem more likely to attain a decisive strategic advantage without first undergoing an intelligence explosion, but such AIs are probably more difficult to align with human values.

If AIs get a decisive strategic advantage over humans without an intelligence explosion, then since this would likely involve the decisive strategic advantage being obtained much more slowly, it would be much more likely for multiple, and possibly many, AIs to gain decisive strategic advantages over humans, though not necessarily over each other, resulting in a multipolar outcome. Thus considerations about multipolar versus singleton scenarios also apply to decisive strategic advantage-first versus intelligence explosion-first scenarios.

[Link] The land before metrics

0 arunbharatula 24 May 2017 04:31AM

[Link] Employment and wellbeing

0 arunbharatula 24 May 2017 04:30AM

[Link] Effective learning

0 arunbharatula 24 May 2017 04:28AM

[Link] Relationships and wellbeing

0 arunbharatula 24 May 2017 04:28AM

[Link] Political ideology

0 arunbharatula 24 May 2017 04:27AM

Notes from the Hufflepuff Unconference (Part 1)

14 Raemon 23 May 2017 09:04PM

April 28th, we ran the Hufflepuff Unconference in Berkeley, at the MIRI/CFAR office common space.

There's room for improvement in how the Unconference could have been run, but it succeeded at the core things I wanted to accomplish:

 - Established common knowledge of what problems people were actually interested in working on
 - We had several extensive discussions of some of those problems, with an eye towards building solutions
 - Several people agreed to work together towards concrete plans and experiments to make the community more friendly, as well as build skills relevant to community growth. (With deadlines and one person acting as project manager to make sure real progress was made)
 - We agreed to have a followup unconference in roughly three months, to discuss how those plans and experiments were going

Rough notes are available here. (Thanks to Miranda, Maia and Holden for taking really thorough notes)

This post will summarize some of the key takeaways, some speeches that were given, and my retrospective thoughts on how to approach things going forward.

But first, I'd like to cover a question that a lot of people have been asking about:

What does this all mean for people outside of the Bay?

The answer depends.

I'd personally like it if the overall rationality community got better at social skills, empathy, and working together, sticking with things that need sticking with (and in general, better at recognizing skills other than metacognition). In practice, individual communities can only change in the ways the people involved actually want to change, and there are other skills worth gaining that may be more important depending on your circumstances.

Does Project Hufflepuff make sense for your community?

If you're worried that your community doesn't have an interest in any of these things, my actual honest answer is that doing something "Project Hufflepuff-esque" probably does not make sense. I did not choose to do this because I thought it was the single-most-important thing in the abstract. I did it because it seemed important and I knew of a critical mass of people who I expected to want to work on it. 

If you're living in a sparsely populated area or haven't put a community together, the first steps do not look like this, they look more like putting yourself out there, posting a meetup on Less Wrong and just *trying things*, any things, to get something moving.

If you have enough of a community to step back and take stock of what kind of community you want and how to strategically get there, I think this sort of project can be worth learning from. Maybe you'll decide to tackle something Project-Hufflepuff-like, maybe you'll find something else to focus on. I think the most important thing is to have some kind of vision for something your community can do that is worth working together and leveling up to accomplish.

Community Unconferences as One Possible Tool

Community unconferences are a useful tool to get everyone on the same page and spur them on to start working on projects, and you might consider doing something similar. 

They may not be the right tool for you and your group - I think they're most useful in places where there's enough people in your community that they don't all know each other, but do have enough existing trust to get together and brainstorm ideas. 

If you have a sense that Project Hufflepuff is worthwhile for your community but the above disclaimers point towards my current approach not making sense for you, I'm interested in talking about it with you, but the conversation will look less like "Ray has ideas for you to try" and more like "Ray is interested in helping you figure out what ideas to try, and the solution will probably look very different."

Online Spaces

Since I'm actually very uncertain about a lot of this and see it as an experiment, I don't think it makes sense to push for any of the ideas here to directly change Less Wrong itself (at least, yet). But I do think a lot of these concepts translate to online spaces in some fashion, and I think it'd make sense to try out some concepts inspired by this in various smaller online subcommunities.

Table of Contents:

I. Introduction Speech

 - Why are we here?
 - The Mission: Something To Protect
 - The Invisible Badger, or "What The Hell Is a Hufflepuff?"
 - Meta Meetups Usually Suck. Let's Try Not To.

II. Common Knowledge

 - What Do People Actually Want?
 - Lightning Talks

III. Discussing the Problem (Four breakout sessions)

 - Welcoming Newcomers
 - How to handle people who impose costs on others?
 - Styles of Leadership and Running Events
 - Making Helping Fun (or at least lower barrier-to-entry)

IV. Planning Solutions and Next Actions

V. Final Words

I. Introduction: It Takes A Village to Save a World

(A more polished version of my opening speech from the unconference)

[Epistemic Status: This is largely based on intuition, looking at what our community has done and what other communities seem to be able to do. I'm maybe 85% confident in it, but it is my best guess]

In 2012, I got super into the rationality community in New York. I was surrounded by people passionate about thinking better and using that thinking to tackle ambitious projects. And in 2012 we all decided to take on really hard projects that were pretty likely to fail, because the expected value seemed high, and it seemed like even if we failed we'd learn a lot in the process and grow stronger.

That happened - we learned and grew. We became adults together, founding companies and nonprofits and creating holidays from scratch.

But two years later, our projects were either actively failing, or burning us out. Many of us became depressed and demoralized.

There was nobody who was okay enough to actually provide anyone emotional support. Our core community withered.

I ended up making that the dominant theme of the 2014 NYC Solstice, with a call-to-action to get back to basics and take care of each other.

I also went to the Berkeley Solstice that year. And... I dunno. In the back of my mind I was assuming "Berkeley won't have that problem - the Bay area has so many people, I can't even imagine how awesome and thriving a community they must have." (Especially since the Bay kept stealing all the Movers and Shakers of NYC).

The theme of the Bay Solstice turned out to be "Hey guys, so people keep coming to the Bay, running on a dream and a promise of community, but that community is not actually there, there's a tiny number of well-connected people who everyone is trying to get time with, and everyone seems lonely and sad. And we don't even know what to do about this."

In 2015, that theme in the Berkeley Solstice was revisited.

So I think that was the initial seed of what would become Project Hufflepuff - noticing that it's not enough to take on cool projects, that it's not enough to just get a bunch of people together and call it a community. Community is something you actively tend to. Insofar as Maslow's hierarchy is real, it's a foundation you need before ambitious projects can be sustainable.

There are other pieces of the puzzle - different lenses that, I believe, point towards a Central Thing. Some examples:

Group houses, individualism and coordination.

I've seen several group houses where, when people decide it no longer makes sense to live in the house, they... just kinda leave. Even if they've literally signed a lease. And everyone involved (the person leaving and those who remain) instinctively acts as if it's the remaining people's job to fill the leaver's spot, to make rent.

And the first time, this is kind of okay. But then each subsequent person leaving adds to a stressful undertone of "OMG are we even going to be able to afford to live here?". It eventually becomes depressing, and snowballs into a pit that makes newcomers feel like they don't WANT to move into the house.

Nowadays I've seen some people explicitly building into the roommate agreement a clear expectation of how long you stay and whose responsibility it is to find new roommates and pay rent in the meantime. But it's disappointing to me that this is something we needed, that we weren't instinctively paying attention to how we were imposing costs on each other in the first place. That when we *violated a written contract*, let alone a handshake agreement, we did not take it upon ourselves (or hold each other accountable) to ensure we could fulfill our end of the bargain.

Friends, and Networking your way to the center

This community puts pressure on people to improve. It's easier to improve when you're surrounded by ambitious people who help or inspire each other to level up. There's a sense that there's some cluster of cool-people-who-are-ambitious-and-smart who've been here for a while, and... it seems like everyone is trying to be friends with those people.

It also seems like people just don't quite get that friendship is a skill, that adult friendships in City Culture can be hard, and it can require special effort to make them happen.

I'm not entirely sure what's going on here - it doesn't make sense to say anyone's obligated to hang out with any particular person (or obligated NOT to), but if 300 people aren't getting the connection they want it seems like *somewhere people are making a systematic mistake.* 

(Since the Unconference, Maia has tackled this particular issue in more detail)

 

The Mission - Something To Protect

 

As I see it, the Rationality Community has three things going on: Truth. Impact. And "Being People".

In some sense, our core focus is the practice of truthseeking. The thing that makes that truthseeking feel *important* is that it's connected to broader goals of impacting the world. And the thing that makes this actually fun and rewarding enough to stick with is a community that meets our needs, where we can both flourish as individuals and find the relationships we want.

I think we have made major strides in each of those areas over the past seven years. But we are nowhere near done.

Different people have different intuitions of which of the three are most important. Some see some of them as instrumental, or terminal. There are people for whom Truthseeking is *the point*, and they'd have been doing that even if there wasn't a community to help them with it, and there are people for whom it's just one tool of many that helps them live their life better or plan important projects.

I've observed a tendency to argue about which of these things is most important, or what tradeoffs are worth making. Inclusiveness versus high standards. Truth vs action. Personal happiness vs high achievement.

I think that kind of argument is a mistake.

We are falling woefully short on all of these things. 

We need something like 10x our current capacity for seeing, and thinking. 10x our capacity for doing. 10x our capacity for *being healthy people together.*

I say "10x" not because all these things are intrinsically equal. The point is not to make a politically neutral push to make all the things sound nice. I have no idea exactly how far short we're falling on each of these because the targets are so far away I can't even see the end, and we are doing a complicated thing that doesn't have clear instructions and might not even be possible.

The point is that all of these are incredibly important, and if we cannot find a way to improve *all* of these, in a way that is *synergistic* with each other, then we will fail.

There is a thing at the center of our community. Not all of us share the exact same perspective on it. For some of us it's not the most important thing. But it's been at the heart of the community since the beginning and I feel comfortable asserting that it is the thing that shapes our culture the most:

The purpose of our community is to make sure this place is okay:

The world isn't okay right now, on a number of levels. And a lot of us believe there is a strong chance it could become dramatically less okay. I've seen people make credible progress on taking responsibility for pieces of our home. But when all is said and done, none of our current projects really give me the confidence that things are going to turn out all right. 

Our community was brought together on a promise, a dream, and we have not yet actually proven ourselves worthy of that dream. And to make that dream a reality we need a lot of things.

We need to be able to criticize, because without criticism, we cannot improve.

If we cannot, I believe we will fail.

We need to be able to talk about ideas that are controversial, or uncomfortable - otherwise our creativity and insight will be crippled.

If we cannot, I believe we will fail.

We need to be able to do those things without alienating people. We need to be able to criticize without making people feel untrusted and discouraged from even taking action. We need to be able to discuss challenging things while earnestly respecting the notion that *talking about ideas gives those ideas power and has concrete effects on social reality*, and sometimes that can hurt people.

If we cannot figure out how to do that, I believe we will fail.

We need more people who are able and willing to try things that have never been done before. To stick with those things long enough to *get good at them*, to see if they can actually work. We need to help each other do impossible things. And we need to remember to check for and do the *possible*, boring, everyday things that are in fact straightforward and simple and not very inspiring. 

If we cannot manage to do that, I believe we will fail.

We need to be able to talk concretely about what the *highest leverage actions in the world are*. We need to prioritize those things, because the world is huge and broken and we are small. I believe we need to help each other through a long journey, building bigger and bigger levers, building connections with people outside our community who are undertaking the same journey through different perspectives.

And in the process, we need to not make it feel like if *you cannot personally work on those highest leverage things, that you are not important.* 

There's the kind of importance where we recognize that some people have scarce skills and drive, and the kind of importance where we remember that *every* person has intrinsic worth, and you owe *nobody* any special skills or prestigious sounding projects for your life to be worthwhile.

This isn't just a philosophical matter - I think it's damaging to our mental health and our collective capacity. 

We need to recognize that the distribution of skills we tend to reward or punish is NOT just about which ones are actually most valuable - sometimes it is simply founder effects and blind spots.

We cannot be a community for everyone - I believe trying to include anyone with a passing interest in us is a fool's errand. But there are many people who had valuable skills to contribute who have turned away, feeling frustrated and un-valued.

If we cannot find a way to accomplish all of these things at once, I believe we will fail.

The thesis of Project Hufflepuff is that it takes (at least) a village to save a world. 

It takes people doing experimental impossible things. It takes caretakers. It takes people helping out with unglorious tasks. It takes technical and emotional and physical skills. And while it does take some people who specialize in each of those things, I think it also needs many people who are at least a little bit good at each of them, to pitch in when needed.

Project Hufflepuff is not the only thing our community needs, or the most important. But I believe it is one of the necessary things that our community needs, if we're to get to 10x our current Truthseeking, Impact and Human-ing.

If we're to make sure that our home is okay.

The Invisible Badger

"A lone hufflepuff surrounded by slytherins will surely wither as if being leeched dry by vampires."

- Duncan

[Epistemic Status: My evidence for this is largely based on discussions with a few people for whom the badger seems real and valuable, and who report things being different in other communities, as well as some of my general intuitions about society. I'm 75% sure the badger exists, 90% that it's worth leaning into the idea of the badger to see if it works for you, and maybe 55% sure that it's worth trying to see the badger if you can't already make out its edges.]


 

If I *had* to pick a clear thing that this conference is about without using Harry Potter jargon, I'd say "Interpersonal dynamics surrounding trust, and how those dynamics apply to each of the Impact/Truth/Human focuses of the rationality community."

I'm not super thrilled with that term because I think I'm grasping more for some kind of gestalt. An overall way of seeing and being that's hard to describe and that doesn't come naturally to the sort of person attracted to this community.

Much like the blind folk and the elephant, who each touched a different part of the animal and came away with a different impression (the trunk seems like a snake, the legs seem like a tree), I've been watching several people in the community try to describe things over the past few years. And maybe those things are separate but I feel like they're secretly a part of the same invisible badger.

Hufflepuff is about hard work, and loyalty, and camaraderie. It's about emotional intelligence. It's about seeing value in day to day things that don't directly tie into epic narratives. 

There's a bunch of skills that go into Hufflepuff. And part of what I want is for people to get better at those skills. But I think there is a mindset, an approach, that is fairly different from the typical rationalist mindset, that makes those skills easier. It's something that's harder when you're being rigorously utilitarian and building models of the world out of game theory and incentives.

Mindspace is deep and wide, and I don't expect that mindset to work for everyone. I don't think everyone should be a Hufflepuff. But I do think it'd be valuable to the community if more people at least had access to this mindset and more of these skills.

So what I'd like, for tonight, is for people to lean into this idea. Maybe in the end you'll find that this doesn't work for you. But I think many people's first instinct is going to be that this is alien and uncomfortable and I think it's worth trying to push past that.

The reason we're doing this conference together is because the Hufflepuff way doesn't really work if people are trying to do it alone - I think it requires trust and camaraderie and persistence to really work. I don't think we can have that required trust all at once, but I think if there are multiple people trying to make it work, who can incrementally trust each other more, I think we can reach a place where things run more smoothly, where we have stronger emotional connections, and where we trust each other enough to take on more ambitious projects than we could if we're all optimizing as individuals.

Meta-Meetups Suck. Let's Not.

This unconference is pretty meta - we're talking about norms and vague community stuff we want to change.

Let me tell you, meta meetups are the worst. Typically you end up going around in circles complaining and wishing there were more things happening and that people were stepping up and maybe if you're lucky you get a wave of enthusiasm that lasts a month or so and a couple things happen but nothing really *changes*.

So. Let's not do that. Here's what I want to accomplish and which seems achievable:

1) Establish common knowledge of important ideas and behavior patterns. 

Sometimes you DON'T need to develop a whole new skill, you just need to notice that your actions are impacting people in a different way, and maybe that's enough for you to decide to change some things. Or maybe someone has a concept that makes it a lot easier for you to start gaining a new skill on your own.

2) Establish common knowledge of who's interested in trying which new norms, or which new skills. 

We don't actually *know* what the majority of people want here. I can sit here and tell you what *I* think you should want, but ultimately what matters is what things a critical mass of people want to talk about tonight.

Not everyone has to agree that an idea is good to try it out. But there's a lot of skills or norms that only really make sense when a critical mass of other people are trying them. So, maybe of the 40 people here, 25 people are interested in improving their empathy, and maybe another 20 are interested in actively working on friendship skills, or sticking to commitments. Maybe those people can help reinforce each other.

3) Explore ideas for social and skillbuilding experiments we can try, that might help. 

The failure mode of Ravenclaws is to think about things a lot and then not actually get around to doing them. A failure mode of ambitious Ravenclaws is to think about things a lot and then do them and then assume that because they're smart, they've thought of everything, and then not listen to feedback when they get things subtly or majorly wrong.

I'd like us to end by thinking of experiments with new norms, or habits we'd like to cultivate. I want us to frame these as experiments, that we try on a smaller scale and maybe promote more if they seem to be working, while keeping in mind that they may not work for everyone.

4) Commit to actions to take.

Since the default action is for them to peter out and fail, I'd like us to spend time bulletproofing them, brainstorming and coming up with trigger-action plans so that they actually have a chance to succeed.

Tabooing "Hufflepuff"

Having said all that talk about The Hufflepuff Way...

...the fact is, much of the reason I've used those words is to paint a rough picture to attract the sort of person I wanted to attract to this unconference.

It's important that there's a fuzzy, hard-to-define-but-probably-real concept that we're grasping towards, but it's also important not to be talking past each other. Early on in this project I realized that a few people who I thought were on the same page actually meant fairly different things. Some cared more about empathy and friendship. Some cared more about doing things together, and expected deep friendships to arise naturally from that.

So I'd like us to establish a trigger-action-plan right now - for the rest of this unconference, if someone says "Hufflepuff", y'all should say "What do you mean by that?" and then figure out whatever concrete thing you're actually trying to talk about.

II. Common Knowledge

The first part of the unconference was about sharing our current goals, concerns and background knowledge that seemed useful. Most of the specifics are covered in the notes. But I'll talk here about why I included the things I did and what my takeaways were afterwards on how it worked.

Time to Think

The first thing I did was have people sit and think about what they actually wanted to get out of the conference, and what obstacles they could imagine getting in the way of that. I did this because often, I think our culture (ostensibly about helping us think better) doesn't give us time to think, and instead has people who are quick-witted and conversationally dominant end up doing most of the talking. (I wrote a post a year ago about this, the 12 Second Rule). In this case I gave everyone 5 minutes, which is something I've found helpful at small meetups in NYC.

This had mixed results - some people reported that while they can think well by themselves, in a group setting they find it intimidating and their mind starts wandering instead of getting anything done. They found it much more helpful when I eventually let people-who-preferred-to-talk-to-each-other go into another room to talk through their ideas out loud.

I think there's some benefit to both halves of this and I'm not sure how common each set of preferences is. It's certainly true that it's not common for conferences to give people a full 5 minutes to think, so I'd expect it to be somewhat uncomfortable-feeling regardless of whether it was useful.

But an overall outcome of the unconference was that it was somewhat lower energy than I'd wanted, and opening with 5 minutes of silent thinking seemed to contribute to that, so for the next unconference I run, I'm leaning towards a shorter period of time for private thinking (Somewhere between 12 and 60 seconds), followed by "turn to your neighbors and talk through the ideas you have", followed by "each group shares their concepts with the room."

"What do you want to improve on? What is something you could use help with?"

I wanted people to feel like active participants rather than passive observers, and I didn't want people to just think "it'd be great if other people did X", but to keep an internal locus of control - what can *I* do to steer this community better? I also didn't want people to be thinking entirely individualistically.

I didn't collect feedback on this specific part and am not sure how valuable others found it (if you were at the conference, I'd be interested if you left any thoughts in the comments). Some anonymized things people described:

  • When I make social mistakes, consider it failure; this is unhelpful

  • Help point out what they need help with

  • Have severe akrasia, would like more “get things done” magic tools

  • Getting to know the bay area rationalist community

  • General bitterness/burned out

  • Reduce insecurity/fear around sharing

  • Avoiding spending most words signaling to have read a particular thing; want to communicate more clearly

  • Creating systems that reinforce unnoticed good behaviour

  • Would like to learn how to try at things

  • Find place in rationalist community

  • Staying connected with the group

  • Paying attention to what they want in the moment, in particular when it’s right to not be persistent

  • Would like to know the “landing points” to the community to meet & greet new people

  • Become more approachable, & be more willing to approach others for help; community cohesiveness

  • Have been lonely most of life; want to find a place in a really good healthy community

  • Re: prosocialness, being too low on Maslow’s hierarchy to help others

  • Abundance mindset & not stressing about how to pay rent

  • Cultivate stance of being able to do helpful things (action stance) but also be able to notice difference between laziness and mental health

  • Don’t know how to respect legit safety needs w/o getting overwhelmed by arbitrary preferences; would like to model people better to give them basic respect w/o having to do arbitrary amount of work

  • Starting conversations with new people

  • More rationalist group homes / baugruppe

  • Being able to provide emotional support rather than just logistics help

  • Reaching out to people at all without putting too much pressure on them

  • Cultivate lifelong friendships that aren’t limited to particular time and place

  • Have a block around asking for help bc doesn’t expect to reciprocate; would like to actually just pay people for help w stuff

  • Want to become more involved in the community

  • Learn how to teach other people “ops skills”

  • Connections to people they can teach and who can teach them

Lightning Talks

Lightning talks are a great way to give people an opportunity to not just share ideas, but get some practice at public presentation (which I've found can be a great gateway tool for overall confidence and ability to get things done in the community). Traditionally they are 5 minutes long. CFAR has found that 3.5 minute lightning talks are better than 5 minute talks, because it cuts out some rambling and tangents.

It turned out we had more people than I'd originally planned time for, so we ended up switching to two minute talks. I actually think this was even better, and my plan for next time is to do 1-minute timeslots but allow people to sign up for multiple if they think their talk requires it, so people default to giving something short and sweet.

Rough summaries of the lightning talks can be found in the notes.

III. Discussing the Problem

The next section involved two "breakout sessions" - two 20 minute periods for people to split into smaller groups and talk through problems in detail. This was done in a somewhat impromptu fashion, with people writing down the talks they wanted to do on the whiteboard and then arranging them so most people could go to a discussion that interested them.

The talks were:

 -  Welcoming Newcomers
 -  How to handle people who impose costs on others?
 -  Styles of Leadership and Running Events
 -  Making Helping Fun (or at least lower barrier-to-entry)
 -  Circling session 

There was a suggested discussion about outreach, which I asked to table for a future unconference. My reason was that outreach discussions tend to get extremely meta and seem to be an attractor (people end up focusing on how to bring more people into the community without actually making sure the community is good, and I wanted the unconference to focus on the latter.)

I spent some time drifting between sessions, and was generally impressed both with the practical focus each discussion had, as well as the way they were organically moderated.

Again, more details in the notes.

IV. Planning Solutions and Next Actions

After about an hour of discussion and mingling, we came back to the central common space to describe key highlights from each session, and begin making concrete plans. (Names are crediting people who suggested an idea and who volunteered to make it happen)

Creating Norms for Your Space (Jane Joyce, Tilia Bell)

The "How to handle people who impose costs on others" conversation ended up focusing on minor but repeated costs. One of the hardest things to moderate as an event host is not people who are actively disruptive, but people who are just a little bit awkward or annoying - they'd often be happy to change their behavior if they got feedback, but giving feedback feels uncomfortable and it's hard to do in a tactful way. This presents two problems at once: parties/events/social-spaces end up more awkward/annoying than they need to be, and often what happens is that rather than giving feedback, the hosts stop inviting people doing those minor things, which means a lot of people still working on their social skills end up living in fear of being excluded.

Solving this fully requires a few different things at once, and I'm not sure I have a clear picture of what it looks like, but one stepping stone people came up with was creating explicit norms for a given space, and a practice of reminding people of those norms in a low-key, nonjudgmental way.

I think this will require a lot of deliberate effort and practice on the part of hosts to avoid alternate bad outcomes like "the norms get disproportionately enforced on people the hosts like and applied unfairly to people they aren't close with". But I do think it's a step in the right direction to showcase what kind of space you're creating and what the expectations are.

Different spaces can be tailored for different types of people with different needs or goals. (I'll have more to say about this in an upcoming post - doing this right is really hard, I don't actually know of any groups that have done an especially good job of it.)

I *was* impressed with the degree to which everyone in the conversation seemed to be taking into account a lot of different perspectives at once, and looking for solutions that benefited as many people as possible.

Welcoming Committee (Mandy Souza, Tessa Alexanian)

Oftentimes at events you'll see people who are new, or who don't seem comfortable getting involved with the conversation. Many successful communities do a good job of explicitly welcoming those people. Some people at the unconference decided to put together a formal group for making sure this happens more.

The exact details are still under development, but I think the basic idea is to have a group of people who go to different events, playing the role of the welcomer. I think the idea is a sort of "Uber for welcomers" network (i.e. it provides both a place for people running events to ask for help with welcoming, and a way for people who are interested in welcoming to find events that need welcomers).

It also included some ideas for better infrastructure, such as reviving "bayrationality.org" to make it easier for newcomers to figure out what events are going on (possibly including links to the codes of conduct for different spaces as well). In the meanwhile, some simple changes were the introduction of a facebook group for Bay Area Rationalist Social Events.

Softskill-sharing Groups (Mike Plotz and Jonathan Wallis)

The leadership styles discussion led to the concept that in order to have a flourishing community, and to be a successful leader, it's valuable to make yourself legible to others, and others more legible to yourself. Even small improvements in an activity as frequent as communication can have huge effects over time, as we make it easier to see each other as we actually are and to clearly exchange our ideas. 

A number of people wanted to improve in this area together, and so we’re working towards establishing a series of workshops with a focus on practice and individual feedback. A longer post on why this is important is coming up, and there will be information on the structure of the event after our first teacher’s meeting. If you would like to help out or participate, please fill out this poll:

https://goo.gl/forms/MzkcsMvD2bKzXCQN2

Circling Explorations (Qiaochu and others)

Much of the discussion at the Unconference, while focused on community, ultimately was explored through an intellectual lens. By contrast, "Circling" is a practice developed by the Authentic Relating community which is focused explicitly on feelings. The basic premise is (sort of) simple: you sit in a circle in a secluded space, and you talk about how you're feeling in the moment. Exactly how this plays out is a bit hard to explain, but the intended result is to become better both at noticing your own feelings and the people around you.

Opinions were divided as to whether this was something that made sense for "rationalists to do on their own", or whether it made more sense to visit more explicitly Circling-focused communities, but several people expressed interest in trying it again.

Making Helping Fun and More Accessible (Suggested by Oliver Habryka)

Ultimately we want a lot of people who are able and excited to help out with challenging projects - to improve our collective group ambition. But to get there, it'd be really helpful to have "gateway helping" - things people can easily pitch in to do that are fun, rewarding, clearly useful but on the "warm fuzzies" side of helping. Oliver suggested this as a way to get people to start identifying as people-who-help.

There were two main sets of habits that are worth cultivating:

1) Making it clear to newcomers that they're encouraged to help out with events, and that this is actually a good way to make friends and get more involved. 

2) For hosts and event planners, look for opportunities to offer people things that they can help with, and make sure to publicly praise those who do help out.

Some of this might dovetail nicely with the Welcoming Committee, both as something people can easily get involved with, and (if there ends up being a public facing website to introduce people to the community) as a way to connect people with events that could use help.

Volunteering-as-Learning, and Big Event Specific Workshops

Sometimes volunteering just requires showing up. But sometimes it requires special skills, and some events might need people who are willing to practice beforehand or learn-by-doing with a commitment to help at multiple events.

A vague cluster of skills that's in high demand is "predict logistical snafus in advance to head them off, and notice logistical snafus happening in realtime so you can do something about them." Earlier this year there was an Ops Workshop that aimed to teach this sort of skill, which went reasonably but didn't really lead into a concrete use for the skills to help them solidify.

One idea was to do Ops workshops (or other specialized training) in the month before a major event like Solstice or EA Global, giving them an opportunity to practice skills and making that particular event run smoother.

(This specific idea is not currently planned for implementation as it was among the more ambitious ones, although Brent Dill's series of "practice setting up a giant dome" beach parties in preparation for Burning Man are pointing in a similar direction)

Making Sure All This Actually Happens (Sarah Spikes, and hopefully everyone!)

To avoid the trap of dreaming big and not actually getting anything done, Sarah Spikes volunteered as project manager, creating an Asana page. People who were interested in committing to a deadline could opt into getting pestered by her to make sure things got done.

V. Parting Words

To wrap up the event, I focused on some final concepts that underlie this whole endeavor. 

The thing we're aiming for looks something like this:

In a couple months (hopefully in July), there'll be a followup unconference. The theme will be "Innovation and Excellence", addressing the twofold question "how do we encourage more people to start cool projects", and "how do we get to a place where longterm projects ultimately reach a high quality state?"

Both elements feel important to me, and they require somewhat different mindsets (both on the part of the people running the projects, and the part of the community members who respond to them). Starting new things is scary and having too high standards can be really intimidating, yet for longterm projects we may want to hold ourselves to increasingly high standards over time.

My current plan (subject to lots of revision) is for this to become a series of community unconferences that happen roughly every 3 months. The Bay area is large enough with different overlapping social groups that it seems worthwhile to get together every few months and have an open-structured event to see people you don't normally see, share ideas, and get on the same page about important things.

Current thoughts for upcoming unconference topics are:

Innovation and Excellence
Personal Epistemic Hygiene
Group Epistemology

An important piece of each unconference will be revisiting things at the previous one, to see if projects, ideas or experiments we talked about were actually carried out and what we learned from them (most likely with anonymous feedback collected beforehand so people who are less comfortable speaking publicly have a chance to express any concerns). I'd also like to build on topics from previous unconferences so they have more chance to sink in and percolate (for example, have at least one talk or discussion about "empathy and trust as related to epistemic hygiene").

Starting and Finishing Unconferences Together

My hope is to get other people involved sooner rather than later so this becomes a "thing we are doing together" rather than a "thing I am doing." One of my goals with this is also to provide a platform where people who are interested in getting more involved with community leadership can take a step further towards that, no matter where they currently stand (ranging anywhere from "give a 30 second lightning talk" to "run a discussion, or give a keynote talk" to "be the primary organizer for the unconference.")

I also hope this is able to percolate into online culture, and to other in-person communities where a critical mass of people think this'd be useful. That said, I want to caution that I consider this all an experiment, motivated by an intuitive sense that we're missing certain things as a culture. That intuitive sense has yet to be validated in any concrete fashion. I think "willingness to try things" is more important than epistemic caution, but epistemic caution is still really important - I recommend collecting lots of feedback and being willing to shift direction if you're trying anything like the stuff suggested here.

(I'll have an upcoming post on "Ways Project Hufflepuff could go horribly wrong")

Most importantly, I hope this provides a mechanism for us to collectively take ideas more seriously that we're ostensibly supposed to be taking seriously. I hope that this translates into the sort of culture that The Craft and The Community was trying to point us towards, and, ideally, eventually, a concrete sense that our community can play a more consistently useful role at making sure the world turns out okay. 

If you have concerns, criticism, or feedback, I encourage you to comment here if you feel comfortable, or on the Unconference Feedback Form. So far I've been erring on the side of move forward and set things in motion, but I'll be shifting for the time being towards "getting feedback and making sure this thing is steering in the right direction."

-

In addition to the people listed throughout the post, I'd like to give particular thanks to Duncan Sabien for general inspiration and a lot of concrete help, Lahwran for giving the most consistent and useful feedback, and Robert Lecnik for hosting the space. 

[Link] Have We Been Interpreting Quantum Mechanics Wrong This Whole Time?

2 korin43 23 May 2017 04:38PM

Physical actions that improve psychological health

6 arunbharatula 23 May 2017 04:33AM

Physical health impacts well-being. However, existing preventative health guidelines are inaccessible to the public because they are highly technical and require specific medical equipment. These notes are not medical advice nor meant to treat any illness. This is a compilation of findings I have come across at one time or another in relation to physical things that relate back to psychological health. I have not systematically reviewed the literature on any of these topics, nor am I an expert nor even familiar with any of them. I am extremely uncertain about the whole thing. But, I figure better to write this up and look stupid than keep it inside and act stupid. The hyperlinks point to the best evidence I could find on the matter. I write to solicit feedback, corrections and advice.

 

Microwaves are safe, but cockroaches and even ants are dangerous, and finally: happiness is dietary. If you want the well-being boosts associated with fruit (careful about fruit juice sugar though!), coffee’s aroma [text] [science news], vanilla yoghurt [news], Sufficient B vitamins and choline (alt), binge drinking or drinking in general, however, I don’t have any easy answers for you. Don’t worry about the smart drugs, nootropics are probably a misnomer. On the other hand, probiotics can treat depression

 

“There is growing evidence that a diet rich in fruits and vegetables is related to greater happiness, life satisfaction, and positive mood as well. This evidence cannot be entirely explained by demographic or health variables including socio-economic status, exercise, smoking, and body mass index, suggesting a causal link.[50] Further studies have found that fruit and vegetable consumption predicted improvements in positive mood the next day, not vice versa. On days when people ate more fruits and vegetables, they reported feeling calmer, happier, and more energetic than normal, and they also felt more positive the next day.”

- Wikipedia

 

If your diet is out of control: Mental contrasting is useful for diabetes self-management, dieting etc. Tangent: During a seminar I attended in Geneva, The World Health Organisation chief dietary authority said that suggesting dietary patterns (e.g. the Mediterranean diet) rather than individual nutrient intake (protein, creatine, carbs) is preferable. But I have yet to identify substantiating evidence. The broad consensus among lay skeptical scrutineers of the field of nutrition is that most truths, even those broadly accepted ones, are still unclear. However, I have yet to analyse the literature myself.

 

Exercise and sport are good for subjective well-being, quality of life, depression, anxiety, stress and more. Plus, they are fun. You may not enjoy pleasant, wellbeing related activities. Do those activities anyway. I seldom enjoy correcting my posture. I tend to slouch and I have been specifically advised by a specialised physiotherapist to correct for that. But, slouching typically doesn’t cause pain - posture correction is pseudoscience! So are many interventions related to posture correction, like standing desks. On the other hand, I love to get massages - but their benefits are short lived - so get them regularly!

 

I particularly enjoy them after resistance training or 1 minute workouts (high intensity interval training). Be careful about stretching, passive stretching can cause injury, unlike active stretching: 'Passive stretching is when you use an outside force other than your own muscle to move a joint or limb beyond its active range of motion, to put your body into a position that you couldn’t do by yourself (such as when you lean into a wall, or have a partner push you into a deeper stretch). Unfortunately, this is the most common form of stretching used.'

 

However, if you aim to bodybuild, protein supplementation is pseudoscientific broscience. And ‘form’, well, there’s broscience - like squatting with your knees outwards - but probably also lots of credible safety related information one ought to heed. For weight loss, if you want a real cheat sheet - weight loss aspirants can get it from an SNP sequencing kit costing a couple of hundred dollars. But, I would be cautious about gene sequence driven health prescription, as some services running that business rely on weak evidence. There are other ‘fad’ fitness ideas that are not grounded in science. For instance: 20 seconds of foam rolling (just as effective as 60 seconds) enhances flexibility (...for no longer than 10 minutes, unless it is done regularly - then it improves long term flexibility) but it is unclear whether it improves athletic performance or post-performance recovery.

 

Stretching prevents injuries and increases range of motion for runners, but not for other kinds of sports [wikipedia]. Shoe inserts don’t work reliably either [Wikipedia]. Martial arts therapy is a thing. Physical exercise is good for you. Tai chi, qigong, and meditation (other than mindfulness) such as transcendental meditation are ineffective in treating depression and anxiety. If you are injured, try rehabilitation exercises. Exercise and performance enhancing drugs are both cognitive enhancers. Exercise for chronic lower back pain is a good idea.

 

Environment: Avoid outdoor air pollution near residences due to dementia/other-health risks. And, avoid chimney smoke fireplaces.

 

Anecdotally, hygiene improves self-esteem and well-being. Wipe with wet wipes if you wipe hard enough to cause blood to form, cover the toilet seat with toilet paper or don’t - it doesn’t matter safety wise unless the contaminant is <~1hr old, shower with soap, remove eye mucus, remove earwax (but not the way you think, likely), brush twice a day - with the correct technique, replacing your toothbrush every few months and softly. 'Don't rinse with water straight after toothbrushing'. Floss once a day (with a different piece of floss each flossing session) but do not brush immediately after drinking acidic substances. The effectiveness of Tooth Mousse is questionable. Visit the dentist for a check-up every now and then - I’d say about every year at least (does anyone know how to format this sentence consistent with the rest of the text - it doesn't appear to be a font size or type issue).

 

Consider sleeping with a face mask and earplugs for better sleep, blow your nose, clean under your nails and trim them. Eye examinations should be conducted every 2-4 years for those under 40, and up to every 6 months for those 65+. There are health concerns around memory foam pillows/mattresses so latex pillows may be preferable for those who prefer a sturdier option than traditional pillows/mattresses. Anecdotally, setting alarms to remind you to do things is a simple way to manage your time, not just for waking up. Light therapy is also helpful in treating delayed sleep phase disorder (being a night owl!). Oh, and don’t bother loading the dishwasher with pre washed dishes (as long as you clean the filter regularly).

 

There are misconceptions around complementary therapies. The Australian Government reviewed the effectiveness of the Alexander technique, aromatherapy, Bowen therapy, Buteyko, Feldenkrais, herbalism, homeopathy, iridology, kinesiology, massage therapy, pilates, reflexology, rolfing, shiatsu, tai chi and yoga. Only for the Alexander technique, Buteyko, massage therapy (esp. remedial massage?), tai chi and yoga was there credible (albeit low to moderate quality) evidence that they are useful for certain health conditions.

 

Stressed out reading all this? Pressing on your eyelids gently to temporarily forgo a headache can work. Traumatically stressed out? Video games can treat PTSD. Animal assisted therapy, like service dogs and therapeutic animals, is also wonderful.

Thank you!

[Link] Probabilistic Programming and Bayesian Methods for Hackers

1 lifelonglearner 22 May 2017 09:15PM

[Link] Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them

3 Stefan_Schubert 22 May 2017 06:31PM

Open thread, May 22 - May 28, 2017

2 Thomas 22 May 2017 05:44AM

If it's worth saying, but not worth its own post, then it goes here.


Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top level comments on this article" and "

[Link] Why Most Intentional Communities Fail (And Some Succeed)

4 AspiringRationalist 22 May 2017 03:04AM

[Link] Learning Deep Learning the EASY way, with Keras

2 morganism 21 May 2017 07:48PM

On-line google hangout on approaches to communication around agi risk (2017/5/27 20:00 UTC)

2 whpearson 21 May 2017 12:32PM

We have a number of charities that are working on different aspects of AGI risk

-  The theory of the alignment problem (MIRI/FHI/more)

-  How to think about problems well (CFAR)

However, we don't have a body dedicated to making and testing a coherent communication strategy to help postpone the development of dangerous AIs.

I'm organising an on-line discussion around what we should do about this issue next Saturday.

In order to find out when people can do it, I've created a doodle here. I'm trusting that doodle works well with timezones. The time slots should be between 1200 and 2300 UTC; let me know if they are not.

We'll be using the optimal brainstorming methodology

Give me a message if you want an invite, once the time has been decided.

I will take notes and post them here again.

AGI and Mainstream Culture

4 madhatter 21 May 2017 08:35AM

Hi all,

So, as you may know, the first episode of Doctor Who, "Smile", was about a misaligned AI trying to maximize smiles (ish). And the latest, "Extremis", was about an alien race who instantiated conscious simulations to test battle strategies for invading the Earth, of which the Doctor was a subroutine. 

I thought the common threat of AGI was notable, although I'm guessing it's just a coincidence. More seriously, though, this ties in with an argument I thought of, and want to know your take on:

If we want to avoid an AI arms race, so that safety research has more time to catch up to AI progress, then we would want to prevent, if at all possible, these issues from becoming more mainstream. The reason is that if AGI in public perception becomes disassociated with Terminator (i.e. laughable, nerdy, and unrealistic) and more like a serious whoever-makes-this-first-can-take-over-the-world situation, then we will get an arms race faster. 

I'm not sure I believe this argument myself. For one thing, being more mainstream has the benefit of attracting more safety research talent, government funding, etc. But maybe we shouldn't be spreading awareness without thinking this through some more.

 

CFAR workshop with new instructors in Seattle, 6/7-6/11

8 Qiaochu_Yuan 20 May 2017 12:18AM

CFAR is running its first workshop in Seattle! 

Over the past several months, CFAR has been training a new batch of instructors, including me. We're now running a workshop, without the core instructors, in Seattle from June 7th to June 11th. You can apply here, and we have an FAQ here.

AI safety: three human problems and one AI issue

6 Stuart_Armstrong 19 May 2017 10:48AM

Crossposted at the Intelligent agent foundation.

There have been various attempts to classify the problems in AI safety research. These range from our old Oracle paper, which classified then-theoretical methods of control, to more recent classifications that grow out of modern, more concrete problems.

These all serve their purpose, but I think a more enlightening classification of the AI safety problems is to look at the issues we are trying to solve or avoid. And most of these issues are problems about humans.

Specifically, I feel AI safety issues can be classified as three human problems and one central AI issue. The human problems are:

  • Humans don't know their own values (sub-issue: humans know their values better in retrospect than in prediction).
  • Humans are not agents and don't have stable values (sub-issue: humanity itself is even less of an agent).
  • Humans have poor predictions of an AI's behaviour.

And the central AI issue is:

  • AIs could become extremely powerful.

Obviously if humans were agents and knew their own values and could predict whether a given AI would follow those values or not, there would be no problem. Conversely, if AIs were weak, then the human failings wouldn't matter so much.

The points about human values are relatively straightforward, but what's the problem with humans not being agents? Essentially, humans can be threatened, tricked, seduced, exhausted, drugged, modified, and so on, in order to act seemingly against our interests and values.

If humans were clearly defined agents, then what counts as a trick or a modification would be easy to define and exclude. But since this is not the case, we're reduced to trying to figure out the extent to which something like a heroin injection is a valid way to influence human preferences. This makes both humans susceptible to manipulation, and human values hard to define.

Finally, the issue of humans having poor predictions of AI is more general than it seems. If you want to ensure that an AI has the same behaviour in the testing and training environment, then you're essentially trying to guarantee that you can predict that the testing environment behaviour will be the same as the (presumably safe) training environment behaviour.

 

How to classify methods and problems

That's well and good, but how do various traditional AI methods or problems fit into this framework? This should give us an idea as to whether the framework is useful.

It seems to me that:

 

  • Friendly AI is trying to solve the values problem directly.
  • IRL and Cooperative IRL are also trying to solve the values problem. The greatest weakness of these methods is the not agents problem.
  • Corrigibility/interruptibility are also addressing the issue of humans not knowing their own values, using the sub-issue that human values are clearer in retrospect. These methods also overlap with poor predictions.
  • AI transparency is aimed at getting round the poor predictions problem.
  • Laurent's work on carefully defining the properties of agents is mainly also about solving the poor predictions problem.
  • Low impact and Oracles are aimed squarely at preventing AIs from becoming powerful. Methods that restrict the Oracle's output implicitly accept that humans are not agents.
  • Robustness of the AI to changes between testing and training environment, degradation and corruption, etc... ensures that humans won't be making poor predictions about the AI.
  • Robustness to adversaries is dealing with the sub-issue that humanity is not an agent.
  • The modular approach of Eric Drexler is aimed at preventing AIs from becoming too powerful, while reducing our poor predictions.
  • Logical uncertainty, if solved, would reduce the scope for certain types of poor predictions about AIs.
  • Wireheading, when the AI takes control of the reward channel, is a problem that humans don't know their values (and hence use an indirect reward) and that the humans make poor predictions about the AI's actions.
  • Wireheading, when the AI takes control of the human, is as above but also a problem that humans are not agents.
  • Incomplete specifications are either a problem of not knowing our own values (and hence missing something important in the reward/utility) or making poor predictions (when we thought that a situation was covered by our specification, but it turned out not to be).
  • AIs modelling human knowledge seem to be mostly about getting round the fact that humans are not agents.

Putting this all in a table:

 

Method                         | Values | Not Agents | Poor Predictions | Powerful
Friendly AI                    |   X    |            |                  |
IRL and CIRL                   |   X    |            |                  |
Corrigibility/interruptibility |   X    |            |        X         |
AI transparency                |        |            |        X         |
Laurent's work                 |        |            |        X         |
Low impact and Oracles         |        |     X      |                  |    X
Robustness                     |        |            |        X         |
Robustness to adversaries      |        |     X      |                  |
Modular approach               |        |            |        X         |    X
Logical uncertainty            |        |            |        X         |
Wireheading (reward channel)   |   X    |            |        X         |
Wireheading (human)            |   X    |     X      |        X         |
Incomplete specifications      |   X    |            |        X         |
AIs modelling human knowledge  |        |     X      |                  |
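The classification above can also be written down as a small lookup structure, which makes it easy to ask which problems are most (or least) targeted. This is only a sketch: the dictionary follows my reading of the bullet list in this post, and the names `CLASSIFICATION`, `methods_addressing` and `problem_counts` are made up for illustration.

```python
from collections import Counter

# The four problem categories used in the post.
VALUES = "values"
NOT_AGENTS = "not agents"
POOR_PREDICTIONS = "poor predictions"
POWERFUL = "powerful"

# Hypothetical encoding of the bullet list: method -> problems it relates to.
CLASSIFICATION = {
    "Friendly AI": {VALUES},
    "IRL and CIRL": {VALUES},
    "Corrigibility/interruptibility": {VALUES, POOR_PREDICTIONS},
    "AI transparency": {POOR_PREDICTIONS},
    "Laurent's work": {POOR_PREDICTIONS},
    "Low impact and Oracles": {NOT_AGENTS, POWERFUL},
    "Robustness": {POOR_PREDICTIONS},
    "Robustness to adversaries": {NOT_AGENTS},
    "Modular approach": {POOR_PREDICTIONS, POWERFUL},
    "Logical uncertainty": {POOR_PREDICTIONS},
    "Wireheading (reward channel)": {VALUES, POOR_PREDICTIONS},
    "Wireheading (human)": {VALUES, NOT_AGENTS, POOR_PREDICTIONS},
    "Incomplete specifications": {VALUES, POOR_PREDICTIONS},
    "AIs modelling human knowledge": {NOT_AGENTS},
}


def methods_addressing(problem):
    """Return the sorted names of methods/problems tagged with `problem`."""
    return sorted(name for name, tags in CLASSIFICATION.items() if problem in tags)


def problem_counts():
    """Count how many entries are tagged with each problem category."""
    return Counter(tag for tags in CLASSIFICATION.values() for tag in tags)


if __name__ == "__main__":
    for problem, count in problem_counts().most_common():
        print(f"{problem}: {count}")
    print(methods_addressing(POWERFUL))
```

Under this encoding, poor predictions is by far the most heavily populated category.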

 

Further refinements of the framework

It seems to me that the third category - poor predictions - is the most likely to be expandable. For the moment, it just incorporates all our lack of understanding about how AIs would behave, but it might be more useful to subdivide this.

Instrumental Rationality Sequence Update (Drive Link to Drafts)

2 lifelonglearner 19 May 2017 04:01AM

Hey all,

Following my post on my planned Instrumental Rationality sequence, I thought it'd be good to give the LW community an update of where I am.

1) Currently collecting papers on habits. Planning to go through a massive sprint of the papers tomorrow. The papers I'm using are available in the Drive folder linked below.

2) I have a publicly viewable Drive folder here of all relevant articles and drafts and things related to this project, if you're curious to see what I've been writing. Feel free to peek around everywhere, but the most relevant docs are this one which is an outline of where I want to go for the sequence and this one which is the compilation of currently sorta-decent posts in a book-like format (although it's quite short right now at only 16 pages).

Anyway, yep, that's where things are at right now.

 

[Link] How To Build A Community Full Of Lonely People

6 maia 17 May 2017 03:25PM

Reaching out to people with the problems of friendly AI

4 Val 16 May 2017 07:30PM

There have been a few attempts to reach out to broader audiences in the past, but mostly in very politically/ideologically loaded topics.

After seeing several examples of how little understanding people have about the difficulties in creating a friendly AI, I'm horrified. And I'm not even talking about a farmer on some hidden ranch, but about people who should know about these things, researchers, software developers meddling with AI research, and so on.

What made me write this post, was a highly voted answer on stackexchange.com, which claims that the danger of superhuman AI is a non-issue, and that the only way for an AI to wipe out humanity is if "some insane human wanted that, and told the AI to find a way to do it". And the poster claims to be working in the AI field.

I've also seen a TEDx talk about AIs. The speaker had apparently never even heard of the paperclip maximizer, and the talk was about the dangers presented by AIs as depicted in the movies, like the Terminator, where an AI "rebels". The argument was that we can hope AIs would not rebel because they cannot feel emotion, so the events depicted in such movies will hopefully not happen, and all we have to do is be ethical ourselves and not deliberately write malicious AI, and then everything will be OK.

The sheer and mind-boggling stupidity of this makes me want to scream.

We should find a way to increase public awareness of the difficulty of the problem. The paperclip maximizer should become part of public consciousness, a part of pop culture. Whenever there is a relevant discussion about the topic, we should mention it. We should increase awareness of old fairy tales with a jinn who misinterprets wishes. Whatever it takes to ingrain the importance of these problems into public consciousness.

There are many people graduating every year who've never heard about these problems. Or if they have, they dismiss them as a non-issue, a contradictory thought experiment which can be dismissed without a second thought:

A nuclear bomb isn't smart enough to override its programming, either. If such an AI isn't smart enough to understand people do not want to be starved or killed, then it doesn't have a human level of intelligence at any point, does it? The thought experiment is contradictory.

We don't want our future AI researchers to start working with such a mentality.

 

What can we do to raise awareness? We don't have the funding to make a movie which becomes a cult classic. We might start downvoting and commenting on the aforementioned stackexchange post, but that would not solve much if anything.



[Link] Keeping up with deep reinforcement learning research: /r/reinforcementlearning

3 gwern 16 May 2017 07:12PM

The robust beauty of improper linear models

1 Stuart_Armstrong 16 May 2017 03:06PM

It should come as no surprise to people on this list that models often outperform experts. But these are generally finely calibrated models, integrating huge amounts of data, so this seems less surprising. How can the poor experts compete against that?

But sometimes the models are much simpler than that, and still perform better. For instance, the models could be linear, rather than having higher order complexities. These models can still outperform experts, because in practice, despite their beliefs that they are doing a non-linear task, expert decisions can often best be modelled as being entirely linear.

But surely the weights of the linear models are subtle and need to be set exactly? Not really. It seems that if you take a linear model, and weight each variable by +1 or -1 depending on whether it has a positive or negative impact on the result, then you will get a model that still often outperforms experts. These models with ±1 weights are called improper linear models.

What's going on here? Well, there's been a bit of a dodge. I've been talking about "taking" a linear model, with "variables", and weighing the factors depending on a positive or negative "impact". And to do all that, you need experts. They are the ones that know which variables are important, and know the direction (positive or negative) in which they impact the result. They don't choose these variables by just taking random possibilities and then figuring out what the direction is. Instead, they understand the situation, to some extent, and choose important variables.

So that's the real role of the expert here: knowing what should go into the model, what really makes the underlying dependent variable change. Selecting and coding the variable information, in the terms that are often used.

But, just as experts can be very good at that task, they are human, and humans are terrible at integrating lots of information together. So, having selected the variables, they get regularly outperformed by proper linear models. And when you add the fact that the experts have selected variables of comparable importance, and that these variables are often correlated with each other, it's not surprising that they get outperformed by improper linear models as well.
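To make the comparison concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration, not data from any study: five standardized predictors that share a common factor (so they are correlated), true weights of comparable size, and an "expert" who only supplies the sign of each variable. The names `simulate`, `compare`, `signs` and `magnitudes` are hypothetical. The proper model is ordinary least squares fitted on a small sample; the improper model uses ±1 weights.

```python
import numpy as np


def simulate(n, rng, signs, magnitudes, noise_sd=1.5):
    """Draw n cases with correlated predictors of unit variance (made-up setup)."""
    k = len(signs)
    shared = rng.normal(size=(n, 1))   # common factor shared by all predictors
    own = rng.normal(size=(n, k))      # predictor-specific variation
    X = signs * np.sqrt(0.5) * (shared + own)
    y = X @ (signs * magnitudes) + noise_sd * rng.normal(size=n)
    return X, y


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


def compare(n_train=30, n_test=5000, seed=0):
    """Return (proper, improper) out-of-sample correlations with the outcome."""
    rng = np.random.default_rng(seed)
    # Assumed expert knowledge: which direction each variable pushes the result.
    signs = np.array([1.0, 1.0, 1.0, -1.0, -1.0])
    # Assumed true effect sizes, deliberately of comparable importance.
    magnitudes = np.array([0.5, 0.4, 0.3, 0.4, 0.3])

    X_train, y_train = simulate(n_train, rng, signs, magnitudes)
    X_test, y_test = simulate(n_test, rng, signs, magnitudes)

    # Proper linear model: least-squares weights fitted on the small training set.
    A_train = np.column_stack([np.ones(n_train), X_train])
    beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    proper_pred = np.column_stack([np.ones(n_test), X_test]) @ beta

    # Improper linear model: every weight is +1 or -1, nothing is fitted.
    improper_pred = X_test @ signs

    return corr(proper_pred, y_test), corr(improper_pred, y_test)


if __name__ == "__main__":
    proper, improper = compare()
    print(f"proper (fitted) weights: r = {proper:.3f}")
    print(f"improper (+1/-1) weights: r = {improper:.3f}")
```

In this toy setup the ±1 model loses very little, because the variables are correlated and of similar importance, which is exactly the situation described above. It says nothing about how human experts would do on the same cases.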

[Link] A social science without sacred values

1 ChristianKl 16 May 2017 12:26PM

Are causal decision theorists trying to outsmart conditional probabilities?

4 Caspar42 16 May 2017 08:01AM

Presumably, this has been discussed somewhere in the past, but I wonder to what extent causal decision theorists (and many other non-evidential decision theorists, too) are trying to make better predictions than (what they think to be) their own conditional probabilities.

 

To state this question more clearly, let’s look at the generic Newcomb-like problem with two actions a1 and a2 (e.g., one-boxing and two-boxing, cooperating or defecting, not smoking or smoking) and two states s1 and s2 (specifying, e.g., whether there is money in both boxes, whether the other agent cooperates, whether one has cancer). The Newcomb-ness is the result of two properties:

  • No matter the state, it is better to take action a2, i.e. u(a2,s1)>u(a1,s1) and u(a2,s2)>u(a1,s2). (There are also problems without dominance where CDT and EDT nonetheless disagree. For simplicity I will assume dominance, here.)

  • The action cannot causally affect the state, but somehow taking a1 gives us evidence that we’re in the preferable state s1. That is, P(s1|a1)>P(s1|a2) and u(a1,s1)>u(a2,s2).

Then, if the latter two differences are large enough, it may be that

E[u|a1] > E[u|a2].

I.e.

P(s1|a1) * u(a1,s1) + P(s2|a1) * u(a1,s2) > P(s1|a2) * u(a2,s1) + P(s2|a2) * u(a2,s2),

despite the dominance.
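As a quick sanity check, the inequality can be worked through with numbers. The payoffs and probabilities below are made up for illustration (the usual one-boxing/two-boxing figures and a 99% accurate predictor), and the helper names `expected_utility` and `dominates` are hypothetical.

```python
# Hypothetical numbers for a Newcomb-like problem; none of them come from the post.
# a1 = one-boxing, a2 = two-boxing; s1 = the opaque box is full, s2 = it is empty.
u = {
    ("a1", "s1"): 1_000_000,
    ("a1", "s2"): 0,
    ("a2", "s1"): 1_001_000,
    ("a2", "s2"): 1_000,
}

# Assumed conditional probabilities P(s1|a) from the problem description.
p_s1_given = {"a1": 0.99, "a2": 0.01}


def expected_utility(action):
    """E[u|a] = P(s1|a) * u(a,s1) + P(s2|a) * u(a,s2)."""
    p_s1 = p_s1_given[action]
    return p_s1 * u[(action, "s1")] + (1 - p_s1) * u[(action, "s2")]


def dominates(better, worse):
    """True if `better` yields strictly more utility than `worse` in every state."""
    return all(u[(better, s)] > u[(worse, s)] for s in ("s1", "s2"))


if __name__ == "__main__":
    print("a2 dominates a1:", dominates("a2", "a1"))
    print("E[u|a1] =", round(expected_utility("a1")))
    print("E[u|a2] =", round(expected_utility("a2")))
```

With these numbers E[u|a1] is about 990,000 and E[u|a2] about 11,000, even though a2 is better in each individual state.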

 

Now, my question is: After having taken one of the two actions, say a1, but before having observed the state, do causal decision theorists really assign the probability P(s1|a1) (specified in the problem description) to being in state s1?

 

I used to think that this was the case. E.g., the way I learned about Newcomb’s problem is that causal decision theorists understand that, once they have said the words “both boxes for me, please”, they assign very low probability to getting the million. So, if there were a period between saying those words and receiving the payoff, they would bet at odds that reveal that they assign a low probability (namely P(s1|a2)) to money being under both boxes.

 

But now I think that some of the disagreement might implicitly be based on a belief that the conditional probabilities stated in the problem description are wrong, i.e. that you shouldn’t bet on them.

 

The first data point was the discussion of CDT in Pearl’s Causality. In sections 1.3.1 and 4.1.1 he emphasizes that he thinks his do-calculus is the correct way of predicting what happens upon taking some actions. (Note that in non-Newcomb-like situations, P(s|do(a)) and P(s|a) yield the same result, see ch. 3.2.2 of Pearl’s Causality.)

 

The second data point is that the smoking intuition in smoking lesion-type problems may often be based on the intuition that the conditional probabilities get it wrong. (This point is also inspired by Pearl’s discussion, but also by the discussion of an FB post by Johannes Treutlein. Also see the paragraph starting with “Then the above formula for deciding whether to pet the cat suggests...” in the computer scientist intro to logical decision theory on Arbital.)

 

Let’s take a specific version of the smoking lesion as an example. Some have argued that an evidential decision theorist shouldn’t go to the doctor because people who go to the doctor are more likely to be sick. If a1 denotes staying at home (or, rather, going anywhere but a doctor) and s1 denotes being healthy, then, so the argument goes, P(s1|a1) > P(s1|a2). I believe that in all practically relevant versions of this problem this difference in probabilities disappears once we take into account all the evidence we already have. This is known as the tickle defense. A version of it that I agree with is given in section 4.3 of Arif Ahmed’s Evidence, Decision and Causality. Anyway, let’s assume that the tickle defense somehow doesn’t apply, such that even if taking into account our entire knowledge base K, P(s1|a1,K) > P(s1|a2,K).

 

I think the reason why many people think one should go to the doctor might be that while asserting P(s1|a1,K) > P(s1|a2,K), they don’t upshift the probability of being sick when they sit in the waiting room. That is, when offered a bet in the waiting room, they wouldn’t accept exactly the betting odds that P(s1|a1,K) and P(s1|a2,K) suggest they should accept.

 

Maybe what is going on here is that people have some intuitive knowledge that they don’t propagate into their stated conditional probability distribution. E.g., their stated probability distribution may represent observed frequencies among people who make their decision without thinking about CDT vs. EDT. However, intuitively they realize that the correlation in the data doesn’t hold up in this naive way.

 

This would also explain why people are more open to EDT’s recommendation in cases where the causal structure is analogous to that in the smoking lesion, but tickle defenses (or, more generally, ways in which a stated probability distribution could differ from the real/intuitive one) don’t apply, e.g. the psychopath button, betting on the past, or the coin flip creation problem.

 

I’d be interested in your opinions. I also wonder whether this has already been discussed elsewhere.

Acknowledgment

Discussions with Johannes Treutlein informed my view on this topic.

A Month's Worth of Rational Posts - Feedback on my Rationality Feed.

17 deluks917 15 May 2017 02:21PM

For the last two months I have been publishing a feed of rationalist articles. Originally the feed was only published on the SSC Discord channel (be charitable, kind and don't treat the place like 4chan). For the last few days I have also been publishing it on my blog deluks917.wordpress.com. I categorize the links and include a brief excerpt, review, and/or teaser. If you would like to see an example in practice just check today's post. The average number of links per day, in the last month, has been six. But this number has been higher recently. I have not missed a single day since I started, so I think it's likely I will continue doing this. The list of blogs I check is located here: List of Blogs

I am looking for some feedback. At the bottom of this post I am including a month's worth of posts categorized using the current system. Posts are not necessarily in any particular order since my categorization system has not been constant over time. Lots of posts were moved around by hand.

1 - Should I share the feed somewhere other than SSC-discord + my blog? Mindlevelup suggested I write up a weekly roundup. I could share such a roundup on lesswrong and SSC. I would estimate the expected number of links in such a post to be around 35. Links would be posted in chronological order within categories. Alternatively I could share such a post every two weeks. It's also possible to have a mailing list but I currently find this less promising.

2 - Do the categories make a reasonable amount of sense? What tweaks would you make? I have considered merging some of the smaller categories (Math and CS, Amusement) into "misc".

3 - Are there any blogs I should include in or drop from the feed? For example I have been considering dropping ribbonfarm. The highest priority is to get the content that's directly about instrumental/epistemic rationality. The bar is higher for politics and culture_war. I should note I am not going to personally include any blog without an RSS feed.

4 - Is anyone willing to write a "Best of rationalist tumblr" post? If I write a weekly/bi-weekly roundup I could combine it with an equivalent "best of tumblr" post. The tumblr post would not have to be daily, just weekly or every other week. We could take turns posting the resulting combination to lesswrong/SSC and collecting the juicy karma. However, it's worth noting that SSC-reddit has some controls on culture_war (outside of the CW thread). Since we want to post to r/SSC we need to keep the density of culture_war to reasonable levels. Lesswrong also has some anti-cw norms.

=== Last Month's Rationality Content === 

**Scott**

http://slatestarcodex.com/2017/05/11/silicon-valley-a-reality-check/ - What a person finds in Silicon Valley mirrors the seeker.

http://slatestarcodex.com/2017/05/09/links-517-rip-van-linkle/ - Links.

http://slatestarcodex.com/2017/04/11/sacred-principles-as-exhaustible-resources/ - Don't deplete the free speech commons.

http://slatestarcodex.com/2017/04/12/clarification-to-sacred-principles-as-exhaustible-resources/  - Clarifications and caveats on Scott's last article on free speech and sacred values.

http://slatestarcodex.com/2017/04/13/chametz/ - A Jewish Vampire Story

http://slatestarcodex.com/2017/04/17/learning-to-love-scientific-consensus/ - Scott Critiques a list of 10 maverick inventors. He then reconsiders his previous science skepticism.

http://slatestarcodex.com/2017/04/21/ssc-journal-club-childhood-trauma-and-cognition/ - A new study challenges the idea that child abuse reduces brain function.

http://slatestarcodex.com/2017/04/25/book-review-the-hungry-brain/ - Scott gives a favorable view of the "establishment" view of nutrition.

http://slatestarcodex.com/2017/04/26/anorexia-and-metabolic-set-point/ - Short Post (for Scott)

https://slatestarscratchpad.tumblr.com/post/160028275801/slatestarscratchpad-wayward-sidekick-you - Scott discusses engaging with ideas you find harmful. He also discusses his attitude toward making his blog as friendly as possible. [culture_war]

http://slatestarcodex.com/2017/05/01/neutral-vs-conservative-the-eternal-struggle/ - Formally neutral institutions have a liberal bias. Conservatives react by seceding and forming their own institutions. The end result is bad for society. [Culture War]

http://slatestarcodex.com/2017/05/04/getting-high-on-your-own-supply/ - "If you optimize for the epistemic culture that’s best for getting elected, but that culture isn’t also the best for running a party or governing a nation, then the fact that your culture affects your elites as well becomes a really big problem." Short for Scott.

http://slatestarcodex.com/2017/05/07/ot75-the-comment-king/ - bi-weekly visible open thread.

http://unsongbook.com/postscript-1-wrap-parties-fan-music/ - Final chapter of Unsong goes up approximately 8pm on Sunday. Unsong will have an epilogue which will go up on Wednesday. Wrap party details. (I will be at the wrap party on Sunday.)

http://unsongbook.com/book-iv-kings/ - "Somebody had to, no one would / I tried to do the best I could / And now it’s done, and now they can’t ignore us / And even though it all went wrong / I’ll stand against the whole unsong / With nothing on my tongue but HaMephorash"

http://unsongbook.com/chapter-71-but-for-another-gives-its-ease/ - Penultimate chapter of Unsong.

http://unsongbook.com/chapter-70-nor-for-itself-hath-any-care/ - Newest Chapter.

http://unsongbook.com/authors-note-10-hamephorash-hamephorash-party/ - Final Chapter goes up May 14. Bay Area Reading party announced.

http://unsongbook.com/chapter-69-love-seeketh-not-itself-to-please/ - Newest Chapter.

http://unsongbook.com/chapter-68-puts-all-heaven-in-a-rage/ - Newest Chapter.

**Rationalism**

http://lesswrong.com/r/discussion/lw/ozz/gearsness_of_understanding/ - "I want to point at a property of models that isn't about what they're modeling. It interacts with the clarity of what they're modeling, but only in the same way that smudged lines in a roadmap interact with the clarity of the roadmap. This property is how deterministically interconnected the variables of the model are.". The theory is applied to multiple explicit examples.

https://thepdv.wordpress.com/2017/05/11/how-i-use-beeminder/ - Short but gives details. Beeminder is the only productivity system that worked for the author.

https://putanumonit.com/2017/05/09/time-well-spent/ - Akrasia and procrastination. A review of some of the rationalist thinking on the topic. Jacob's personal take and his system for tracking his productivity.

http://kajsotala.fi/2017/05/cognitive-core-systems-explaining-intuitions-behind-belief-in-souls-free-will-and-creation-myths/ - Description of four core systems that humans, and other animals, are born with. An explanation of why these systems lead to belief in souls. Short.

https://mindlevelup.wordpress.com/2017/05/06/taking-criticism/ - Reframing criticism so that it makes sense to the author (who is bad at taking criticism). A Q&A between the author and himself.

http://lesswrong.com/r/discussion/lw/oz1/soft_skills_for_running_meetups_for_beginners/ - Concrete advice for running meetups. Not especially focused on beginning organizers. Written by the person who organized Solstice.

http://effective-altruism.com/ea/19t/mental_health_resource_for_ea_community/ - A breakdown of the most useful information about Mania and Psychosis. Extremely practical advice. Julia Wise.

http://bearlamp.com.au/working-with-multiple-problems-at-once - Problems add up and you run out of time. How do you get out? Very practical.

http://agentyduck.blogspot.com/2017/05/creativity-taps.html - Practical ideas for exercising creativity.

http://lesswrong.com/r/discussion/lw/oyk/acting_on_your_intended_preferences_what_does/ - What does it look like in practice to pursue your goals? A series of practical questions to ask yourself. Links to a previous series of blog posts are included.

https://thingofthings.wordpress.com/2017/05/03/why-do-all-the-rationalists-live-in-the-bay-area/ - Benefits of living in the Bay. The Bay is a top place for software engineers even accounting for cost of living, Rationalist institutions are in the Bay, there are social and economic benefits to being around other community members.

https://qualiacomputing.com/2017/05/04/the-most-important-philosophical-question/ - “Is happiness a spiritual trick, or is spirituality a happiness trick?”

http://particularvirtue.blogspot.com/2017/05/how-to-build-community-full-of-lonely.html - Why so many rationalists feel lonely and concrete suggestions for improving social groups. Advice is given to people who are popular, lonely or organizers. Very practical.

https://hivewired.wordpress.com/2017/05/07/announcing-entropycon-12017/ - We beat smallpox, we will beat death, we can try to beat entropy. A humorous mantra against nihilism.

https://mindlevelup.wordpress.com/2017/04/30/there-is-no-akrasia/ - The author argues that akrasia isn't a "thing", it's a "sorta-coherent concept". He also argues that "akrasia" is not a useful concept and can be harmful.

http://bearlamp.com.au/experiments-iterations-and-the-scientific-method/ - A Graph of the scientific method in practice. The author works through his quantified self in practice and discusses his experiences.

https://everythingstudies.wordpress.com/2017/04/29/all-the-worlds-a-trading-zone/ - Cultures with different norms and languages can interact successfully.

http://kajsotala.fi/2017/04/relationship-compatibility-as-patterns-of-emotional-association/ - What is relationship "chemistry"?

http://lesswrong.com/lw/oyc/nate_soares_replacing_guilt_series_compiled_in/ - Ebook. 45 blog posts on replacing guilt and shame with a stronger motivation.

http://mindingourway.com/assuming-positive-intent/ - "If you're actively working hard to make the world a better place, then we're on the same team. If you're committed to letting evidence and reason guide your actions, then I consider you friends, comrades in arms, and kin."

http://bearlamp.com.au/quantified-self-tracking-with-a-form/ - Practical advice based on Elo's personal experience.

http://lesswrong.com/r/discussion/lw/ovc/background_reading_the_real_hufflepuff_sequence/ - Links and Descriptions of rationalist articles about group norms and dynamics.

https://everythingstudies.wordpress.com/2017/04/24/people-are-different/ - "We need to understand, accept and respect differences, that one size does not fit all, but to (and from) each their own."

http://bearlamp.com.au/yak-shaving-2/ - "A question worth asking is whether you are in your life at present causing a build up of problems, a decrease of problems, or roughly keeping them about the same level."

http://lesswrong.com/r/discussion/lw/oxk/i_updated_the_list_of_rationalist_blogs_on_the/ - Up to date list of rationalist blogs.

https://aellagirl.com/2017/05/02/internet-communities-otters-vs-possums/ - Possums: people who like a specific culture. Otters are people who like most cultures. What happens when the percentage of otters in a community increases?

https://aellagirl.com/2017/04/24/how-i-lost-my-faith/ - "People sometimes ask the question of why it took so long. Really I’m amazed that it happened at all. Before we even approach the aspect of “good arguments against religion”, you have to understand exactly how much is sacrificed by the loss of religion."

http://particularvirtue.blogspot.com/2017/04/on-social-spaces.html - Twitter, Tumblr, Facebook etc. PV responds to Zvi's articles about Facebook. PV defends Tumblr and Facebook and has some criticisms of Twitter. Several examples are given where rationalist groups tried to change platforms.

http://www.overcomingbias.com/2017/04/superhumans-live-among-us.html - Some human polymaths really are superhuman. But they don't have the track record to prove it.

https://thezvi.wordpress.com/2017/04/22/against-facebook/ - Sections: 1. A model breaking down how Facebook actually works. 2. An experiment with my News Feed. 3. Living with the Algorithm. 4. See First, Facebook’s most friendly feature. 5. Facebook is an evil monopolistic pariah Moloch. 6. Facebook is bad for you and Facebook is ruining your life. 7. Facebook is destroying discourse and the public record. 8. Facebook is out to get you.

https://thezvi.wordpress.com/2017/04/22/against-facebook-comparison-to-alternatives-and-call-to-action/ - Zvi's advice for managing your information streams and discussion platforms. Facebook can mostly be replaced.

https://rationalconspiracy.com/2017/04/22/moving-to-the-bay-area/ - Downsides of the Bay. Extensively sourced. Cost of living, traffic, public transit, crime, cleanliness.

https://nintil.com/2017/04/18/still-not-a-zombie-replies-to-commenters/ - Thoughts on consciousness and identity.

http://bearlamp.com.au/an-inquiry-into-memory-of-humans/ - The reader is asked to try various interesting memory exercises.

https://www.jefftk.com/p/how-to-make-housing-cheaper - 9 ways to make housing cheaper.

http://lesswrong.com/r/discussion/lw/owb/straw_hufflepuffs_and_lone_heroes/ - Should Harry have joined Hufflepuff in HPMOR? Harry had reasons to be a lone hero, do you?

http://lesswrong.com/lw/owa/lesswrong_analytics_february_2009_to_january_2017/ - Activity graphs of lesswrong over time, which posts had the most views, links to source code and further reading.

https://thezvi.wordpress.com/2017/04/23/help-us-find-your-blog-and-others/ - Zvi will read a post from your blog and consider adding you to his RSS feed.

https://thingofthings.wordpress.com/2017/04/11/book-post-for-march/ - Books on parenting.

https://boardgamesandrationality.wordpress.com/2017/04/24/first-blog-post/ - Dealing With Secret Information in boardgames and real life.

http://www.overcomingbias.com/2017/04/mormon-transhumanists.html - The relationship between religious community and technological change. Long for Overcoming Bias.

https://putanumonit.com/2017/04/15/bad-religion/ - "Rationality is a really unsatisfactory religion. But it’s a great life hack."

https://thezvi.wordpress.com/2017/04/12/escalator-action/ - Should we walk on escalators?

https://putanumonit.com/2017/04/21/book-review-too-like-the-lightning/ - The world of Jacob's dreams, thoughts on AI, a book review.

**EA**

http://effective-altruism.com/ea/19y/understanding_charity_evaluation/ - A detailed breakdown of how charity evaluation works in practice. Openly somewhat speculative.

http://blog.givewell.org/2017/05/11/update-on-our-views-on-cataract-surgery/ - Previously GiveWell had unsuccessfully tried to find recommendable cataract surgery charities. The biggest issues were “room for funding” and “lack of high quality monitoring data”. However, they believe that cataract surgery is a promising intervention and they are doing more analysis.

https://80000hours.org/2017/05/how-much-do-hedge-fund-traders-earn/ - Detailed report on career trajectories and earnings. "We found that junior traders typically earn $300k – $3m per year, and it’s possible to reach these roles in 4 – 8 years."

https://www.givedirectly.org/blog-post?id=7612753271623522521 - 8 News links about GiveDirectly, Basic Income and cash transfer.

https://80000hours.org/2017/05/most-people-report-believing-its-incredibly-cheap-to-save-lives-in-the-developing-world/ - Details of a study on how much Americans think it costs to save a life. Discussion of why people gave such optimistic answers. "It turns out that most Americans believe a child can be prevented from dying of preventable diseases for very little – less than $100."

https://www.thelifeyoucansave.org/Blog/ID/1355/Are-Giving-Games-a-Better-Way-to-Teach-Philanthropy - Literature review on "philanthropy games". Covers both traditional student philanthropy courses and the much shorter "giving game".

https://www.givedirectly.org/blog-post?id=8255610968755843534 - Links to news stories about Effective Altruism

http://benjaminrosshoffman.com/an-openai-board-seat-is-surprisingly-expensive/ - "In exchange for a board seat, the Open Philanthropy Project is aligning itself socially with OpenAI, by taking the position of a material supporter of the project."

https://www.givedirectly.org/blog-post?id=5010525406506746433 - Links to News Articles about Give Directly, Basic Income and Cash Transfer.

https://www.givedirectly.org/blog-post?id=121797500310578692 - Report on a program to give cash to coffee farmers in eastern Uganda.

http://effective-altruism.com/ea/19d/update_on_effective_altruism_funds/ - Details from the first round of funding, community feedback, Mistakes and Updates.

http://lesswrong.com/r/discussion/lw/ox4/effective_altruism_is_selfrecommending/ - Open Philanthropy project has a closed validation loop. A detailed timeline of GiveWell/Open-Philanthropy is given and many untested assumptions are pointed out. A conceptual connection is made to confidence games.

http://lesswrong.com/r/discussion/lw/oxd/the_2017_effective_altruism_survey_please_take/ - Take the survey :)

https://www.givingwhatwecan.org/post/2017/04/career-of-professor-alan-fenwick/ - Retrospective on the career of the director of the Schistosomiasis Institute.

http://www.openphilanthropy.org/blog/new-report-early-field-growth - The history of attempts to grow new fields of research or advocacy.

https://www.givedirectly.org/blog-post?id=4406309858976986548 - news links about GiveDirectly, Basic Income and Cash Transfers

https://intelligence.org/2017/04/30/2017-updates-and-strategy/ - Outreach, expansion, detailed research plan, state of the AI-risk community.

http://blog.givewell.org/2017/05/04/why-givewell-is-partnering-with-idinsight/ - IDinsight is an international NGO that aims to help its clients develop and use rigorous evidence to improve social impact. Summary, Background, goals, initial plans.

https://www.thelifeyoucansave.org/Blog/ID/1354/A-Shift-in-Priorities-at-the-Giving-Game-Project - Finding sustainable funding, Providing measurable outcomes, improving follow ups with participants.

http://www.openphilanthropy.org/blog/why-are-us-corporate-cage-free-campaigns-succeeding - The article contains a timeline of cagefree reform. Some background reasons given are: Undercover investigations, College engagement, Corporate engagement, Ballot measures, Gestation crate pledges, European precedent.

https://www.givingwhatwecan.org/post/2017/04/a-successor-to-the-giving-what-we-can-trust/ - The Giving What we Can Trust has joined with the "Effective Altruism Funds" (run by the Center for Effective Altruism).

http://lesswrong.com/r/discussion/lw/oyf/bad_intent_is_a_behavior_not_a_feeling/ - Response to Nate Soares, application to EA. "If you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to."

**Ai_risk**

http://effective-altruism.com/ea/19c/intro_to_caring_about_ai_alignment_as_an_ea_cause/ - By Nate Soares. A modified transcript of the talk he gave at Google on the problem of AI alignment.

http://lukemuehlhauser.com/monkey-classification-errors/ , http://lukemuehlhauser.com/adversarial-examples-for-pigeons/ - Adversarial examples for monkeys and pigeons respectively.

https://intelligence.org/2017/05/10/may-2017-newsletter/ - Research updates, MIRI hiring, General news links about AI

https://intelligence.org/2017/04/12/ensuring/ - Nate Soares gives a talk at Google about "Ensuring smarter-than-human intelligence has a positive outcome". An outline of the talk is included.

https://intelligence.org/2017/04/07/decisions-are-for-making-bad-outcomes-inconsistent/ - An extended discussion of Soares's latest paper "Cheating Death in Damascus".

**Research**

https://everythingstudies.wordpress.com/2017/05/12/the-eurovision-song-contest-taste-landscape/ - Analysis of Voting patterns in the Eurovision Contest. Alliances and voting Blocs are analyzed in depth.

https://srconstantin.wordpress.com/2017/05/12/do-pineal-gland-extracts-promote-longevity-well-maybe/ - Analysis of hormonal systems and their effect on metabolism and longevity.

https://acesounderglass.com/2017/05/11/an-opportunity-to-throw-money-at-the-problem-of-medical-science/ - Help crowdfund a randomized controlled trial. A promising sepsis treatment needs an RCT, but the method is very cheap and unpatentable, so there is no financial incentive for a company to fund the study.

https://randomcriticalanalysis.wordpress.com/2017/05/09/towards-a-general-factor-of-consumption/ - Factor Analysis leads to a general factor of consumption. Discussion of the data and analysis of the model. Very thorough.

https://randomcriticalanalysis.wordpress.com/2017/04/13/disposable-income-also-explains-us-health-expenditures-quite-well/ - Long Article, lots of graphs. "I argued consumption, specifically Actual Individual Consumption, is an exceptionally strong predictor of national health expenditures (NHE) and largely explains high US health expenditures.  I found AIC to be a much more robust predictor of NHE than GDP... I think it useful to also demonstrate these patterns as it relates to household disposable income"

https://randomcriticalanalysis.wordpress.com/2017/04/15/some-useful-data-on-the-dispersion-characteristics-of-us-health-expenditures/ - US Health spending is highly concentrated in a small fraction of the population. Is this true for other countries?

https://randomcriticalanalysis.wordpress.com/2017/04/17/on-popular-health-utilization-metrics/ - An extremely graph dense article responding to a widely cited paper claiming that "high utilization cannot explain high US health expenditures."

https://randomcriticalanalysis.wordpress.com/2017/04/28/health-consumption-and-household-disposable-income-outside-of-the-oecd/ - Another part in the series on healthcare expenses. Extending the analysis to non-OECD countries. Lots of graphs.

https://srconstantin.wordpress.com/2017/04/12/parenting-and-heritability-overview/ - Detailed literature review on heritability and what parenting can affect. A significant number of references are included.

https://nintil.com/2017/04/23/links-7/ - Psychology, Economics, Philosophy, AI

http://lesswrong.com/r/discussion/lw/ox8/unstaging_developmental_psychology/ - A mathematical model of stages of psychological development. The linked technical paper is very impressive. Starting from an abstract theory the authors managed to create a psychological theory that was concrete enough to apply in practice.

**Math and CS**

http://andrewgelman.com/2017/05/10/everybody-lies-seth-stevens-davidowitz/ - A fairly positive review of Seth's book on learning from data.

http://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-4-in-python/ - Writing a JIT compiler in Python. Discusses both using native Python code and the PeachPy library. Performance considerations are explicitly not discussed.

http://eli.thegreenplace.net/2017/book-review-essentials-of-programming-languages-by-d-friedman-and-m-wand/ - Short review. "This book is a detailed overview of some fundamental ideas in the design of programming languages. It teaches by presenting toy languages that demonstrate these ideas, with a full interpreter for every language"

http://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-3-llvm/ - LLVM can dramatically speed up straightforward source code.

http://www.scottaaronson.com/blog/?p=3221 - Machine Learning, Quantum Mechanics, Google Calendar

**Politics and Economics**

http://noahpinionblog.blogspot.com/2017/04/ricardo-reis-defends-macro_13.html - Macro is defended from a number of common criticisms. A large number of modern papers are cited (including 8 job market papers). Some addressed criticisms include: Macro relies on representative agents, Macro ignores inequality, Macro ignores finance and Macro ignores data and focuses mainly on theory.

http://econlog.econlib.org/archives/2017/04/economic_system.html - What are the fundamental questions an economic system must answer?

http://andrewgelman.com/2017/04/18/reputational-incentives-post-publication-review-two-partial-solutions-misinformation-problem/ - Gelman gives a list of important erroneous analyses in the news and scientific journals. He then considers whether negative reputational incentives or post-publication peer review will solve the problem.

https://srconstantin.wordpress.com/2017/05/09/how-much-work-is-real/ - What fraction of jobs are genuinely productive?

https://hivewired.wordpress.com/2017/05/06/yes-this-is-a-hill-worth-dying-on/ - The Nazis were human too. Even if a hill is worth dying on, it's probably not worth killing for. Discussion of good universal norms. [Culture War]

https://srconstantin.wordpress.com/2017/05/09/chronic-fatigue-syndrome/ - Literature Analysis on Chronic Fatigue Syndrome. Extremely thorough.

https://www.gwern.net/newsletter/2017/04 - A month's worth of links. AI, recent evolution, heritability and other topics.

https://thingofthings.wordpress.com/2017/05/05/the-cluster-structure-of-genderspace/ - For many traits the bell curves for men and women are quite close. Visualizations of Cohen's d. Discussion of trans-specific medical interventions.

https://www.jefftk.com/p/replace-infrastructure-wholesale - Can you just dig up a city and replace all the infrastructure in a week?

https://thingofthings.wordpress.com/2017/04/19/deradicalizing-the-romanceless/ - Ozy discusses the problem of (male) involuntary celibacy.

http://noahpinionblog.blogspot.com/2017/04/the-siren-song-of-homogeneity.html - The alt-right is about racial homogeneity. Smith reviews the data on whether a homogeneous society increases trust and social capital. Smith discusses Japanese culture and his time in Japan. Smith considers the arbitrariness of racial categories despite admitting that race has a biological reality. Smith flips around some alt-right slogans. [Extremely high quality engagement with opposing ideas. Culture War]

https://thezvi.wordpress.com/2017/04/16/united-we-blame/ - A list of articles about United, Zvi's thoughts on United, general ideas about airlines.

http://noahpinionblog.blogspot.com/2017/04/why-101-model-doesnt-work-for-labor.html - Noah Smith gives many reasons why the simple supply/demand model can't work for labor economics.

https://thingofthings.wordpress.com/2017/04/14/concerning-archive-of-our-own/ - Ozy defends the moderation policy of the fanfiction archive AO3. [Culture War]

https://thingofthings.wordpress.com/2017/04/13/fantasies-are-okay/ - When are fantasies ok? What about sexual fantasies? [Culture War]

https://srconstantin.wordpress.com/2017/04/25/on-drama/ - Ritual, The Psychology of Adolf Hitler, the dangerous emotion of High Drama, The Rite of Spring.

https://qualiacomputing.com/2017/04/26/psychedelic-science-2017-take-aways-impressions-and-whats-next/ - Notes on the 2017 Psychedelic Science conference.

**Amusement**

http://kajsotala.fi/2017/04/fixing-the-4x-end-game-boringness-by-simulating-legibility/ - "4X games (e.g. Civilization, Master of Orion) have a well-known problem where, once you get sufficiently far ahead, you’ve basically already won and the game stops being very interesting."

https://putanumonit.com/2017/05/12/dark-fiction/ - Jacob does some Kabbalistic analysis on the story of Jacob, Unsong style.

https://protokol2020.wordpress.com/2017/04/30/several-big-numbers-to-sort/ - 12 Amusing definitions of big numbers.

http://existentialcomics.com/comic/183 - The Life of Francis

http://existentialcomics.com/comic/181 - A Presocratic Get Together.

https://protokol2020.wordpress.com/2017/05/07/problem-with-perspective/ - A 3D geometry problem.

http://existentialcomics.com/comic/184 - Wittgenstein in the Great War

http://existentialcomics.com/comic/182 - Captain Metaphysics and the Postmodern Peril

**Adjacent**

https://medium.com/@freddiedeboer/conservatives-are-wrong-about-everything-except-predicting-their-own-place-in-the-culture-e5c036fdcdc5 - Conservatives correctly predicted the effects of gay acceptance and no-fault divorce. They have also been proven correct about liberal bias in academia and the media. [Culture War]

https://medium.com/@freddiedeboer/franchises-that-are-appropriate-for-children-are-inherently-limited-in-scope-8170e76a16e2 - Superhero movies have an intended audience that includes children. This drastically limits what techniques they can use and themes they can explore. Freddie goes into the details.

https://fredrikdeboer.com/2017/05/11/study-of-the-week-rebutting-academically-adrift-with-its-own-mechanism/ - Freddie wrote his dissertation on the College Learning Assessment, the primary source in "Academically Adrift".

https://medium.com/@freddiedeboer/politics-as-politics-12ab43429e64 - Politics as “group affiliation” vs politics as politics. Annoying atheists aren’t as bad as fundamentalist Christians, even if more annoying atheists exist in educated left-wing spaces. Freddie’s clash with the identitarian left despite huge agreement on the object level. Freddie is a socialist, not a liberal. [Culture War]

https://www.ribbonfarm.com/2017/05/09/priest-guru-nerd-king/ - Facebook, Governance, Doctrine, Strategy, Tactics and Operations. Fairly short post for Ribbonfarm.

https://fredrikdeboer.com/2017/05/09/lets-take-a-deep-dive-into-that-times-article-on-school-choice/ - A critique of the problems in the Times' well-cited article on school choice. Points out issues with selection bias, lack of theory and the fact that "not everyone can be average".

http://marginalrevolution.com/marginalrevolution/2017/05/conversation-garry-kasparov.html - "We talked about AI, his new book Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins, why he has become more optimistic, how education will have to adjust to smart software, Russian history and Putin, his favorites in Russian and American literature, Tarkovsky..."

http://econlog.econlib.org/archives/2017/04/iq_with_conscie.html - "My fellow IQ realists are, on average, a scary bunch.  People who vocally defend the power of IQ are vastly more likely than normal people to advocate extreme human rights violations." There are interesting comments here: https://redd.it/6697sh .

http://econlog.econlib.org/archives/2017/04/iq_with_conscie_1.html - Short follow-up to the above article.

http://marginalrevolution.com/marginalrevolution/2017/04/what-would-people-do-if-they-had-superpowers.html - Link to a paper showing 94% of people said they would use superpowers selfishly.

http://waitbutwhy.com/2017/04/neuralink.html - Elon Musk wants to build a wizard hat for the brain. Lots of details on the science behind Neuralink.

http://marginalrevolution.com/marginalrevolution/2017/04/dont-people-care-economic-inequality.html - Most Americans don’t mind inequality nearly as much as pundits and academics suggest.

http://marginalrevolution.com/marginalrevolution/2017/04/two-rationality-tests.html - What would you ask to determine if someone is rational? What would Tyler ask?

http://tim.blog/2017/05/04/exploring-smart-drugs-fasting-and-fat-loss-dr-rhonda-patrick/ - “Avoiding all stress isn’t the answer to fighting aging; it’s about building resiliency to environmental stress.”

http://wakingup.libsyn.com/what-should-we-eat - "Sam Harris speaks with Gary Taubes about his career as a science journalist, the difficulty of studying nutrition and public health scientifically, the growing epidemics of obesity and diabetes, the role of hormones in weight gain, the controversies surrounding his work, and other topics."

http://www.econtalk.org/archives/2017/05/jennifer_pahlka.html - Code for America. Bringing technology into the government sector.

http://heterodoxacademy.org/resources/viewpoint-diversity-experience/ - A six-step process for appreciating viewpoint diversity. I am not sure this site will be the most useful to rationalists on the object level, but it's interesting to see what Haidt came up with.

http://www.econtalk.org/archives/2017/04/elizabeth_pape.html - Elizabeth Pape on Manufacturing and Selling Women's Clothing and Elizabeth Suzann

http://www.mrmoneymustache.com/2017/04/25/there-are-no-guarantees/ - Avoid Contracts. Don't work another year "just in case".

http://marginalrevolution.com/marginalrevolution/2017/04/saturday-assorted-links-109.html - Assorted Links on politics, Derrida, Shaolin Monks.

http://econlog.econlib.org/archives/2017/04/earth_20.html - Bryan Caplan was a guest on Freakonomics Radio. The topic was "Earth 2.0: Is Income Inequality Inevitable?".

https://www.ribbonfarm.com/2017/04/18/entrepreneurship-is-metaphysical-labor/ - Metaphysics as Intellectual Ergonomics. Entrepreneurship is Applied Metaphysics.

https://www.ribbonfarm.com/2017/04/13/idiots-scaring-themselves-in-the-dark/ - Getting Lost. "The uncanny. This is the emotion of eeriness, spookiness, creepiness"

**Podcast**

http://rationallyspeakingpodcast.org/show/rs-182-spencer-greenberg-on-how-online-research-can-be-faste.html - Podcast. Spencer Greenberg on "How online research can be faster, better, and more useful".

https://medium.com/conversations-with-tyler/patrick-collison-stripe-podcast-tyler-cowen-books-3e43cfe42d10 - Patrick Collison, co-founder of Stripe, interviews Tyler.

http://tim.blog/2017/04/11/cory-booker/ - Podcast with US Senator Cory Booker. "Street Fights, 10-Day Hunger Strikes, and Creative Problem-Solving"

http://econlog.econlib.org/archives/2017/04/the_undermotiva_1.html - Two case studies on libertarians who changed their views for bad reasons.

http://www.stitcher.com/podcast/vox/the-ezra-klein-show/e/death-sex-and-moneys-anna-sale-on-bringing-empathy-to-politics-50101701 - Interview with the host of the WNYC podcast Death, Sex, and Money.

http://marginalrevolution.com/marginalrevolution/2017/05/econtalk-podcast-russ-roberts-complacent-class.html - "Cowen argues that the United States has become complacent and the result is a loss of dynamism in the economy and in American life, generally. Cowen provides a rich mix of data, speculation, and creativity in support of his claims."

http://tim.blog/2017/04/16/marie-kondo/ - Podcast. "Marie Kondo is a Japanese organizing consultant, author, and entrepreneur."

http://www.econtalk.org/archives/2017/04/rana_foroohar_o.html - Podcast. Rana Foroohar on the Financial Sector and Makers and Takers

http://www.stitcher.com/podcast/vox/the-ezra-klein-show/e/cal-newport-on-doing-deep-work-and-escaping-social-media-49878016 - Cal Newport on doing Deep Work and escaping social media.

https://www.samharris.org/podcast/item/forbidden-knowledge - Podcast with Charles Murray. Controversy over The Bell Curve, the validity and significance of IQ as a measure of intelligence, the problem of social stratification, the rise of Trump. [Culture War]

http://www.stitcher.com/podcast/vox/the-ezra-klein-show/e/elizabeth-warren-on-what-barack-obama-got-wrong-49949167 - Ezra Klein Podcast with Elizabeth Warren.

http://marginalrevolution.com/marginalrevolution/2017/04/stubborn-attachments-podcast-ft-alphaville.html - Podcast with Tyler Cowen on Stubborn Attachments. "I outline a true and objectively valid case for a free and prosperous society, and consider the importance of economic growth for political philosophy, how and why the political spectrum should be reconfigured, how we should think about existential risk, what is right and wrong in Parfit and Nozick and Singer and effective altruism, how to get around the Arrow Impossibility Theorem, to what extent individual rights can be absolute, how much to discount the future, when redistribution is justified, whether we must be agnostic about the distant future, and most of all why we need to “think big.”"

http://www.themoneyillusion.com/?p=32435 - Notes on three podcasts. Faster RGDP growth, Monetary Policy, Tyler Cowen's philosophical views.

http://www.stitcher.com/podcast/vox/the-ezra-klein-show/e/vc-bill-gurley-on-transforming-health-care-50030526 - A conversation about which healthcare systems are possible in the USA and the future of Obamacare.

https://www.currentaffairs.org/2017/05/campus-politics-and-the-administrative-mind - The nature of college bureaucracy. Focuses on protests and Title IX. [Culture War]

http://www.stitcher.com/podcast/vox/the-ezra-klein-show/e/cory-booker-returns-live-to-talk-trust-trump-and-basic-incomes-50054271 - "Booker and I dig into America’s crisis of trust. Faith in both political figures and political institutions has plummeted in recent decades, and the product is, among other things, Trump’s presidency. So what does Booker think can be done about it?"

http://tim.blog/2017/04/22/dorian-yates/ - Bodybuilding Champion. High Intensity Training, Injury Prevention, and Building Maximum Muscle.

Open thread, May 15 - May 21, 2017

1 Elo 15 May 2017 07:06AM

If it's worth saying, but not worth its own post, then it goes here.


Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top level comments on this article" and "

[Link] Anthropic uncertainty in the Evidential Blackmail problem

3 Johannes_Treutlein 14 May 2017 04:43PM

Making decisions in a real computer - an argument for meta-decision theory research

2 whpearson 13 May 2017 11:18PM

Decision theory is being used as the basis for AI safety work. This currently involves maximising the expected utility of specific actions. Maximising expected utility is woefully inefficient for making very rapid, unimportant decisions, which occur frequently in computing. But these fast-paced decisions will still need to be made in a purpose-oriented way in an AI.

This article presents an argument that we should explore meta-decision theories to allow the efficient solution of these problems. Meta-decision theories are also more human-like and could have different central problems from first-order decision theories.

[Link] Surfing Uncertainty: Prediction, Action, and the Embodied Mind - The Future of Prediction

0 morganism 13 May 2017 08:59PM

[Link] Reality has a surprising amount of detail

14 jsalvatier 13 May 2017 08:02PM
