Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality”
In the days since we published our previous post, a number of people have come up to me and expressed concerns about our new mission. Several of these had the form “I, too, think that AI safety is incredibly important — and that is why I think CFAR should remain cause-neutral, so it can bring in more varied participants who might be made wary by an explicit focus on AI.”
I would here like to reply to these people and others, and to clarify what is and isn’t entailed by our new focus on AI safety.
In 2007, psychology researchers Michal Kosinski and David Stillwell released a personality-testing Facebook app called myPersonality. The app ended up being used by 4 million Facebook users, most of whom consented to having their answers to the personality questions, along with some information from their Facebook profiles, used for research purposes.
The very large sample size and the matched data from Facebook profiles make it possible to investigate many questions about personality differences that were previously inaccessible. Kosinski and Stillwell have used the dataset in a number of interesting publications, which I highly recommend.
In this post, I focus on what the dataset tells us about how big five personality traits vary by geographic region in the United States.
Drugs that affect the nervous system get administered systemically. It's easy to imagine that we could do much more if we could stimulate one nerve at a time, and in patterns designed to have particular effects on the body.
"Neural coding" can detect the nerve impulses that indicate that a paralyzed person intends to move a limb, and build prosthetics that respond to the mind the way a real limb would. A company called BrainGate is already making these. You can see a paralyzed person using a robotic arm with her mind here.
A fair number of diseases that don't seem "neurological", like rheumatoid arthritis and ulcerative colitis, can actually be treated by stimulating the vagus nerve. The nervous system is tightly associated with the immune and endocrine systems, which is probably why autoimmune diseases are so associated with psychiatric comorbidities; it also means that the nervous system might be an angle towards treating autoimmune diseases. There is a "cholinergic anti-inflammatory pathway", involving the vagus nerve, which inactivates macrophages when they're exposed to the neurotransmitter acetylcholine, and thus lessens the immune response. Turning this pathway on electronically is thus a prospective treatment for autoimmune or inflammatory diseases. Vagus nerve stimulation has been tested and found successful in rheumatoid arthritis patients, in rat models of inflammatory bowel disease, and in dog experiments on chronic heart failure; vagus nerve activity mediates pancreatitis in mice; and vagus nerve stimulation attenuates the inflammatory response (cytokine release and shock) to the bacterial poison endotoxin.
We'd need much more detailed maps of where exactly nerves innervate various organs and which neurotransmitters they use; we'd need to record patterns of neural activity to detect which nerve signals modulate which diseases and experimentally determine causal relationships between neural signals and organ functions; we'd need to build small electronic interfaces (cuffs and chips) for use on peripheral nerves; we'd need lots of improvements in small-scale and non-invasive sensor technology (optogenetics, neural dust, ultrasound and electromagnetic imaging); and we'd need better tools for real-time, quantitative measurements of hormone and neurotransmitter release from nerves and organs.
A lot of this seems to clearly need hardware and software engineers, and signal-processing/image-processing/machine-learning people, in addition to traditional biologists and doctors. In the general case, neural modulation of organ function is Big Science in the way brain mapping or genomics is. You need to know where the nerves are, and what they're doing, in real time. This is likely going to need specialized software which outpaces what labs are currently capable of.
Bioelectronics seems potentially important not just for disease treatment today, but for more speculative goals like brain uploads or intelligence enhancement. It's a locally useful step along the path of understanding what the brain is actually doing, at a finer-grained level than the connectome alone can indicate, which may very well be relevant to AI.
It's tricky for non-academic software people (like myself and many LessWrong readers) to get involved in biomedical technology, but I predict that this is going to be one of the opportunities that needs us most, and if you're interested, it's worth watching this space to see when it gets out of the stage of university labs and DARPA projects and into commercialization.
To illustrate an old point - that it's hard to distinguish between extortion and trade negotiations - here's a schematic diagram of extortion, with alternating actions by player B (blackmailer/extorter, in blue) and V (victim, in violet):
The extorter can let the default Def happen, or can instead make a threat, ending up at point T. Then the victim can resist (Res) or surrender (Sur). If the victim resists, the extorter has the option of carrying out their threat (C) or not doing so (¬C).
You need a few conditions to make this into an extortion situation:
- Sur has to be the best outcome for B (or else the extortion has no point).
- To make sure that T is a true threat, Sur has to be worse than Def for V, and C has to be worse than ¬C and Def for B.
- And to make this into extortion, C has to be worse than Def for V.
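The three conditions above can be checked mechanically once you write down payoffs for each outcome. Here is a minimal sketch; the numbers are entirely hypothetical, chosen only so that all three conditions hold:

```python
# Payoffs are (blackmailer B, victim V) for each outcome of the diagram:
# Def (default), Sur (victim surrenders), C (threat carried out),
# notC (threat made but not carried out). All values are made up.
payoffs = {
    "Def":  (0, 0),
    "Sur":  (5, -2),
    "C":    (-3, -6),
    "notC": (-1, -2),
}

def is_extortion(p):
    B, V = 0, 1
    return (
        # Sur is B's best outcome, or else threatening has no point
        all(p["Sur"][B] > p[o][B] for o in ("Def", "C", "notC"))
        # T is a true threat: Sur is worse than Def for V,
        # and carrying it out (C) is worse than notC and Def for B
        and p["Sur"][V] < p["Def"][V]
        and p["C"][B] < p["notC"][B]
        and p["C"][B] < p["Def"][B]
        # the extortion condition proper: C is worse than Def for V
        and p["C"][V] < p["Def"][V]
    )

print(is_extortion(payoffs))  # True for these payoffs
```

If the last condition fails - say, carrying out the "threat" leaves V no worse than the default - the same function classifies the situation as a trade negotiation rather than extortion.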
A mere threat doesn't make this into extorsion. Indeed, trade negotiations are a series of repeated threats from both sides, making offers with the implicit threat that they will walk away from the deal entirely if that offer is not accepted.
But if C is worse than Def for V, then this seems to be true extortion: V will end up worse off if they resist an extorter who carries out their threats, and they would have much preferred that the extorter not be able to make credible threats in the first place. And the only reason the extorter made the threats was to force the victim to surrender.
Fairness and equity
Note that this is not about fairness or niceness. It's perfectly possible to extort someone into giving you fair treatment (depending on how you see the default point, many of the boycotts during the civil rights movement would count as extortion).
The all important default
This model seems clear; so why is it so hard to identify extortion in real life? One key issue is disagreement over the default point. Consider the following situations:
- During the Cuban missile crisis, it was clear to the Americans that the default was no nuclear missiles in Cuba, and the Soviets were recklessly violating this default. To the USSR, the default was obviously that countries could station nuclear missiles on their allies' territories (as the Americans were doing in Turkey). Then, once the blockade was in place and the Soviet ships were on their way, the default seemed to be nuclear war (meaning that practically nothing could count as extortion in this situation).
- Take a management-union dispute, where management wants to cut pay. The unions can argue that this violates long-standing company policy of decent pay. Management can retort that the default is actually to be a profitable company, and that their industry is currently in decline, so declining pay should be the default. After a bit of negotiating, the two seem to reach the framework of a decent understanding - is this now the default for further negotiations?
- "You must give me something to satisfy my population!" "Now, I'm fine with this, but when our army arrives, I'm not sure I can control them." "Well, I'll try and talk to my president, but he's crazy! Throw me some sort of bone here, will you?" All these types of arguments are an attempt to shift the default in their favour. Sure, the negotiator isn't threatening you, they just need your help to contain the default behaviour of their people/army/president.
- Two people are negotiating to trade water and food. If no trade is reached, they will both die. Can "both dying" be considered a reasonable default? Or is "both trade at least enough to ensure mutual survival" a default, seeing as "both dying" is not an outcome either would ever want? Can a situation that will never realistically happen be considered a default?
- Take the purest example of blackmail (extortion via information): a photographer snaps shots of an adulterous couple, and sells the photos back to them for much money. Blackmail! But what if they just happened to be in a shot that the photographer was taking of other people? Then the photographer is suppressing the photos to help the couple, and charging a reasonable fee for that (what if the fee is unreasonable? does it make a difference?). But what if the photographer deliberately hangs around areas where trysts often happen, to get some extra cash? Or what if they don't do that deliberately, but photographers who don't hang around these areas can't turn a profit and change jobs, so that only these ones are left?
These examples should be sufficient to illustrate the scale of the problem, and also to show how defaults are often forged by implicit and explicit norms; extortion is thus clearest-cut when it also involves a norm violation, a trust violation, or other dubious behaviour. In a sense, picking a default is the important thing; picking exactly the right default is less important, since once the default is known, people can adjust their expectations and behaviour accordingly.
There’s a story I like, about this little kid who wants to be a writer. So she writes a story and shows it to her teacher.
“You misspelt the word ‘ocean’”, says the teacher.
“No I didn’t!”, says the kid.
The teacher looks a bit apologetic, but persists: “‘Ocean’ is spelt with a ‘c’ rather than an ‘sh’; this makes sense, because the ‘e’ after the ‘c’ changes its sound…”
“No I didn’t!” interrupts the kid.
“Look,” says the teacher, “I get it that it hurts to notice mistakes. But that which can be destroyed by the truth should be! You did, in fact, misspell the word ‘ocean’.”
“I did not!” says the kid, whereupon she bursts into tears, and runs away and hides in the closet, repeating again and again: “I did not misspell the word! I can too be a writer!”.
Today, my paper "Is caviar a risk factor for being a millionaire?" was published in the Christmas Edition of the BMJ (formerly the British Medical Journal). The paper is available at http://www.bmj.com/content/355/bmj.i6536 but it is unfortunately behind a paywall. I am hoping to upload an open access version to a preprint server but this needs to be confirmed with the journal first.
In this paper, I argue that the term "risk factor" is ambiguous, and that this ambiguity causes pervasive methodological confusion in the epidemiological literature. I argue that many epidemiological papers essentially use an audio recorder to determine whether a tree falling in the forest makes a sound, without being clear about which definition of "sound" they are considering.
Even worse, I argue that epidemiologists often try to avoid claiming that their results say anything about causality, by hiding behind "prediction models". When they do this, they often still control extensively for "confounding", a term which only has a meaning in causal models. I argue that this is analogous to stating that you are interested in whether trees falling in the forest cause any human to perceive the qualia of hearing, and then spending your methods section discussing whether the audio recorder was working properly.
Due to space constraints and other considerations, I am unable to state these analogies explicitly in the paper, but it does include a call for a taboo on the word risk factor, and a reference to Rationality: AI to Zombies. To my knowledge, this is the first reference to the book in the medical literature.
I will give a short talk about this paper at the Less Wrong meetup at the MIRI/CFAR office in Berkeley at 6:30pm tonight.
(I apologize for this short, rushed announcement; I was planning to post a full writeup, but I was not expecting this paper to be published for another week.)
I came across a 2015 blog post by Vitalik Buterin that contains some ideas similar to Paul Christiano's recent Crowdsourcing moderation without sacrificing quality. The basic idea in both is that it would be nice to have a panel of trusted moderators carefully pore over every comment and decide on its quality, but since that is too expensive, we can instead use some tools to predict moderator decisions, and have the trusted moderators look at only a small subset of comments in order to calibrate the prediction tools. In Paul's proposal the prediction tool is machine learning (mainly using individual votes as features), and in Vitalik's proposal it's prediction markets where people bet on what the moderators would decide if they were to review each comment.
It seems worth thinking about how to combine the two proposals to get the best of both worlds. One fairly obvious idea is to let people both vote on comments as an expression of their own opinions, and also place bets about moderator decisions, and to use ML to set baseline odds, which would reduce how much the forum would have to pay out to incentivize accurate prediction markets. The hoped-for outcome is that the ML algorithm would make correct decisions most of the time, but people can bet against it when they see it making mistakes, and moderators would review the comments that have the greatest disagreements between ML and people, or between different bettors in general. Another part of Vitalik's proposal is that each commenter has to make an initial bet that moderators would decide that their comment is good. The article notes that such a bet can also be viewed as a refundable deposit. Such forced bets / refundable deposits would help solve a security problem with Paul's ML-based proposal.
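The review-routing part of the combined scheme can be sketched very simply. In this illustration (all names, numbers, and the disagreement rule are my own assumptions, not part of either proposal), each comment carries an ML-predicted probability of moderator approval and a market price from bets on the same question, and the limited moderator budget goes to the comments where the two disagree most:

```python
# Each comment: (comment_id, ml_prob, market_price), where both numbers
# estimate the probability that moderators would approve the comment.
def review_queue(comments, budget):
    """Return the comment ids moderators should review, largest
    ML-vs-market disagreement first, up to the review budget."""
    ranked = sorted(comments, key=lambda c: abs(c[1] - c[2]), reverse=True)
    return [cid for cid, _, _ in ranked[:budget]]

comments = [
    ("a", 0.95, 0.90),  # ML and bettors agree: probably fine
    ("b", 0.80, 0.20),  # large disagreement: worth moderator attention
    ("c", 0.10, 0.15),  # both agree it's bad: hide, spot-check later
]
print(review_queue(comments, budget=1))  # ['b']
```

Many other routing rules are possible - for example, also sampling a few high-agreement comments at random so the predictors stay calibrated on easy cases.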
Are there better ways to combine these prediction tools to help with forum moderation? Are there other prediction tools that can be used instead or in addition to these?
The most useful thinking skill I've taught myself, which I think should be more widely practiced, is writing what I call "fact posts." I write a bunch of these on my blog. (I write fact posts about pregnancy and childbirth here.)
To write a fact post, you start with an empirical question, or a general topic. Something like "How common are hate crimes?" or "Are epidurals really dangerous?" or "What causes manufacturing job loss?"
It's okay if this is a topic you know very little about. This is an exercise in original seeing and showing your reasoning, not finding the official last word on a topic or doing the best analysis in the world.
Then you open up a Google doc and start taking notes.
You look for quantitative data from conventionally reliable sources. CDC data for incidences of diseases and other health risks in the US; WHO data for global health issues; Bureau of Labor Statistics data for US employment; and so on. Published scientific journal articles, especially from reputable journals and large randomized studies.
You explicitly do not look for opinion, even expert opinion. You avoid news, and you're wary of think-tank white papers. You're looking for raw information. You are taking a sola scriptura approach, for better and for worse.
And then you start letting the data show you things.
You see things that are surprising or odd, and you note that.
You see facts that seem to be inconsistent with each other, and you look into the data sources and methodology until you clear up the mystery.
You orient towards the random, the unfamiliar, the things that are totally foreign to your experience. One of the major exports of Germany is valves? When was the last time I even thought about valves? Why valves, what do you use valves in? OK, show me a list of all the different kinds of machine parts, by percent of total exports.
And so, you dig in a little bit, to this part of the world that you hadn't looked at before. You cultivate the ability to spin up a lightweight sort of fannish obsessive curiosity when something seems like it might be a big deal.
And you take casual notes and impressions (though keeping track of all the numbers and their sources in your notes).
You do a little bit of arithmetic to compare things to familiar reference points. How does this source of risk compare to the risk of smoking or going horseback riding? How does the effect size of this drug compare to the effect size of psychotherapy?
You don't really want to do statistics. You might take percents, means, standard deviations, maybe a Cohen's d here and there, but nothing fancy. You're just trying to figure out what's going on.
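The "nothing fancy" statistics mentioned above really are just a few lines of arithmetic. As an illustration (the groups and numbers here are invented), Cohen's d is just the difference in means divided by the pooled standard deviation:

```python
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d: difference of means over the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = (((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)) ** 0.5
    return (mean_a - mean_b) / pooled_sd

treated = [12, 15, 14, 13, 16]  # made-up outcome scores
control = [10, 11, 12, 10, 12]
print(round(cohens_d(treated, control), 2))  # 2.27
```

A d of around 0.2 is conventionally read as a small effect and 0.8 as a large one, which is exactly the kind of familiar reference point the comparisons above rely on.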
It's often a good idea to rank things by raw scale. What is responsible for the bulk of deaths, the bulk of money moved, etc? What is big? Then pay attention more to things, and ask more questions about things, that are big. (Or disproportionately high-impact.)
You may find that this process gives you contrarian beliefs, but often you won't, you'll just have a strongly fact-based assessment of why you believe the usual thing.
There's a quality of ordinariness about fact-based beliefs. It's not that they're never surprising -- they often are. But if you do fact-checking frequently enough, you begin to have a sense of the world overall that stays in place, even as you discover new facts, instead of swinging wildly around at every new stimulus. For example, after doing lots and lots of reading of the biomedical literature, I have sort of a "sense of the world" of biomedical science -- what sorts of things I expect to see, and what sorts of things I don't. My "sense of the world" isn't that the world itself is boring -- I actually believe in a world rich in discoveries and low-hanging fruit -- but the sense itself has stabilized, feels like "yeah, that's how things are" rather than "omg what is even going on."
In areas where I'm less familiar, I feel more like "omg what is even going on", which sometimes motivates me to go accumulate facts.
Once you've accumulated a bunch of facts, and they've "spoken to you" with some conclusions or answers to your question, you write them up on a blog, so that other people can check your reasoning. If your mind gets changed, or you learn more, you write a follow-up post. You should, on any topic where you continue to learn over time, feel embarrassed by the naivety of your early posts. This is fine. This is how learning works.
The advantage of fact posts is that they give you the ability to form independent opinions based on evidence. It's a sort of practice of the skill of seeing. They likely aren't the optimal way to get the most accurate beliefs -- listening to the best experts would almost certainly be better -- but you, personally, may not know who the best experts are, or may be overwhelmed by the swirl of controversy. Fact posts give you a relatively low-effort way of coming to informed opinions. They make you into the proverbial 'educated layman.'
Being an 'educated layman' makes you much more fertile in generating ideas, for research, business, fiction, or anything else. Having facts floating around in your head means you'll naturally think of problems to solve, questions to ask, opportunities to fix things in the world, applications for your technical skills.
Ideally, a group of people writing fact posts on related topics could learn from each other and share how they think. I have the strong intuition that this is valuable. It's a bit more active than a "journal club", and quite a bit more casual than "research". It's just the activity of learning and showing one's work in public.