Realistic epistemic expectations
When I state a position and offer evidence for it, people sometimes complain that the evidence I've given doesn't suffice to establish my position. Usually, though, I'm not trying to give a rigorous argument, and I don't claim that the evidence I provide suffices to establish the position.
My goal in these cases is to offer a high-level summary of my thinking, and to provide enough evidence so that readers have reason to Bayesian update and to find the view sufficiently intriguing to investigate further.
In general, when a position is non-obvious, a single conversation is nowhere near enough time to convince a rational person that it's very likely to be true. As Burgundy recently wrote:
When you ask Carl Shulman a question on AI, and he starts giving you facts instead of a straight answer, he is revealing part of his book. The thing you are hearing from Carl Shulman is really only the tip of the iceberg because he cannot talk fast enough. His real answer to your question involves the totality of his knowledge of AI, or perhaps the totality of the contents of his brain.
If I were to restrict myself to making claims that I could substantiate in a mere ~2 hours, that would preclude the possibility of me sharing the vast majority of what I know.
In math, one can give rigorous proofs starting from very simple axioms, as Gauss described:
I mean the word proof not in the sense of lawyers, who set two half proofs equal to a whole one, but in the sense of mathematicians, where 1/2 proof = 0, and it is demanded for proof that every doubt becomes impossible.
Even within math, as a practical matter, proofs that appear to be right are sometimes undercut by subtle errors. But outside of math, the only reliable tool that one has at one's disposal is Bayesian inference. In 2009, the charity evaluator GiveWell made very strong efforts to apply careful reasoning to identify its top-rated charity, and gave a "conservative" cost-effectiveness estimate of $545/life saved, which turned out to have been wildly optimistic. Argumentation that looks solid on the surface often breaks down on close scrutiny. This is closely related to why GiveWell emphasizes the need to look at giving opportunities from many angles, and gives more weight to robustness of evidence than to careful chains of argumentation.
Eliezer named this website Less Wrong for a reason: one can never be certain of anything, and all rational beliefs reflect degrees of confidence. I believe that discussion advances rationality the most when it involves sharing perspectives and evidence, rather than argumentation.
Taking the reins at MIRI
Hi all. In a few hours I'll be taking over as executive director at MIRI. The LessWrong community has played a key role in MIRI's history, and I hope to retain and build your support as (with more and more people joining the global conversation about long-term AI risks & benefits) MIRI moves towards the mainstream.
Below I've cross-posted my introductory post on the MIRI blog, which went live a few hours ago. The short version is: there are very exciting times ahead, and I'm honored to be here. Many of you already know me in person or through my blog posts, but for those of you who want to get to know me better, I'll be running an AMA on the effective altruism forum at 3PM Pacific on Thursday June 11th.
I extend to all of you my thanks and appreciation for the support that so many members of this community have given to MIRI throughout the years.
Strategies and tools for getting through a break up
Background:
I was very recently (3 weeks now) in a relationship that lasted for 5.5 years. My partner had been fantastic through all those years and we were suffering no conflict, no fights, no strain or tension. My partner also was prone to depression, and is/was going through an episode of depression. I am usually a major source of support at these times. Six months ago we opened our relationship. I wasn't dating anyone (mostly due to busy-ness), and my partner was, though not seriously. I felt him pulling away somewhat, which I (correctly) attributed mostly to depression and which nonetheless caused me some occasional moments of jealousy. But I was overall extremely happy with this relationship, very committed, and still very much in love as well. It was quite a surprise when my partner broke up with me one Wednesday evening.
After we had a good cry together, the next morning I woke up and immediately started researching what the literature said about breaking up. My goals were threefold:
- Stop feeling so sad in the immediate moment
- "Get over" my partner
- Internalize any gains I had made over the course of our relationship or any lessons I had learned from the break up
I made most of my gains in the first few days, by day 3 I was 50% over it. Two weeks later I was 90% over the relationship, with a few hold-over habits and tendencies (like feeling responsible for improving his emotional state) which are currently too strong but which will serve me well in our continuing friendship. My ex, on the other hand (no doubt partially due to the depression) is fine most of the time but unpredictably becomes extremely sad for hours on end. Originally this was guilt at having hurt me but now it is mostly nostalgia+isolation based. I hope to continue being close friends and I've been doing my best to support him emotionally, at the distance of a friend. Below are the states of mind and strategies that allowed me to get over it more quickly and with good personal growth.
Note: mileage may vary. I have low neuroticism and a slightly higher than average base level of happiness. You might not get over the relationship in 2 weeks, but your getting-over-it will certainly be sped up from its default speed.
Strategies (in order of importance)
1. Decide you don't want to get back in the relationship. Decide that it is over and given the opportunity, you will not get back with this person. If you were the breaker-upper, you can skip this step.
Until you can do this, it is unlikely that you will get over it. It's hard to ignore an impulse that you agree with wholeheartedly. If you're always hoping for an opportunity or an argument or a situation that will bring you back together, much of your mental energy will go towards formulating those arguments, planning for that situation, imagining that opportunity. Some of the below strategies can still be used, but spend some serious time on this first one. It's the foundation of everything else. There are some facts that can help you convince the logical part of your brain that this is the correct attitude.
- People in on-and-off relationships are less satisfied, feel more anxiety about their relationship status, and continue to cycle on-and-off even after couples add additional constraints like cohabitation or marriage
- People in tumultuous relationships are much less happy than singles
- Wanting to stay in a relationship is reinforced by many biases (status quo bias, ambiguity effect, choice-supportive bias, loss aversion, mere-exposure effect, ostrich effect). For someone to break through all those biases and end things, they must be extremely unhappy. If your continuing relationship makes someone you love extremely unhappy, it is a disservice to them to capitalize on those biases in a moment of weakness and return to the relationship.
- Being in a relationship with someone who isn't excited about and pleased by you is settling for an inferior quality of relationship. The amazing number of date-able people in the world means settling for this is not an optimal decision. Contrast this to a tribal situation where replacing a lost mate was difficult or impossible. All these feelings of wanting to get back together evolved in a situation of scarcity, but we live in a world of plenty.
- Intermittent rewards are the most powerful, so an on-again-off-again relationship has the power to make you commit to things you would never commit to given a new relationship. The more hot-and-cold your partner is, the more rewarding the relationship seems and the less likely you are to be happy in the long term. Only you can end that tantalizing possibility of intermittent rewards by resolving not to partake if the opportunity arises.
- Even if some extenuating circumstance could explain away their intention to break up (depression, bipolar disorder, long distance, etc.), it is belittling to your ex-partner to try to invalidate their stated feelings. Do not fall into the trap of feeling that you know more about a person's inner state than they do. Take it at face value and act accordingly. Even if this is only a temporary state of mind for them, they are likely to be in the same state of mind again at some point.
2. Talk to other people about the good things that came of your break-up. (This can also help you arrive at #1, not wanting to get back together)
I speculate that the benefits from this come from three places. First, talking about good things makes you notice good things, and talking with a positive attitude makes you feel positive. Second, it re-emphasizes to your brain that losing your significant other does not mean losing your social support network. Third, it acts as a mild commitment mechanism - it would be a loss of face to go on about how great you're doing outside the relationship and later have to explain you jumped back in at the first opportunity.
You do not need to be purely positive. If you are feeling sadness, it sometimes helps to talk about this. But don't dwell only on the sadness when you talk. When I was talking to my very close friends about all aspects of my feelings, I still tried to say two positive things for every negative thing. For example: "It was a surprise, which was jarring and unpleasant and upended my life plans in these ways. But being a surprise, I didn't have time to dread and dwell on it beforehand. And breaking up sooner is preferable to a long decline in happiness for both parties, so it's better to break up as soon as it becomes clear to either party that the path is headed downhill, even if it is surprising to the other party."
Talk about the positives as often as possible without alienating people. The people you talk to do not need to be serious close friends. I spent a collective hour and a half talking to two OKCupid dates about how many good things came from the break up. (Both dates had been scheduled before actually breaking up, both people had met me once prior, and both dates went surprisingly well due to sympathy, escalating self-disclosure, and positive tone. I signaled that I am an emotionally healthy person dealing well with an understandably difficult situation.)
If you feel that you don't have any candidates for good listeners, either because the break up was due to some mistake or infidelity of yours, or because you are socially isolated/anxious, writing is an effective alternative to talking. Study participants recovered more quickly when they spent 15 minutes writing about the positive aspects of their break up, and participants with three 15-minute sessions did better still. And it can benefit anyone to keep a running list of positives to bring up in conversation.
3. Create a social support system
Identify who in your social network can still be relied on as a confidant and/or a neutral listener. You would be surprised at who still cares about you. In my breakup, my primary confidant was my ex's cousin, who also happens to be my housemate and close friend. His mom and best friend, both in other states, also made the effort to inquire about my state of mind. Most of the time, even people who you consider your partner's friends still feel enough allegiance to you and enough sympathy to be good listeners and through listening they can become your friends.
If you don't currently have a support system, make one! OKCupid is a great resource for meeting friends outside of just dating, and people are way way more likely to want to meet you if you message them with a "just looking for friends" type message. People you aren't currently close to but who you know and like can become better friends if you are willing to reveal personal/vulnerable stories. Escalating self-disclosure+symmetrical vulnerability=feelings of friendship. Break ups are a great time for this to happen because you've got a big vulnerability, and one which almost everyone has experienced. Everyone has stories to share and advice to give on the topic of breaking up.
4. Intentionally practice differentiation
One of the most painful parts of a break up is that so much of your sense-of-self is tied into your relationship. You will be basically rebuilding your sense of self. Depending on the length and the committed-ness of the relationship, you may be rebuilding it from the ground up. Think of this as an opportunity. You can rebuild it in any way you desire. All the things you used to like before your relationship, all the interests and hobbies you once cared about, those can be reincorporated into your new, differentiated sense of self. You can do all the things you once wished you did.
Spend at least 5 minutes thinking about what your best self looks like. What kind of person do you wish to be? This is a great opportunity to make some resolutions. Because you have a fresh start, and because these resolutions are about self-identification, they are much more likely to stick. Just be sure to frame them in relation to your sense-of-self: not 'I will exercise,' but 'I'm a fit, active person, the kind of person who exercises'; not 'I want to improve my Spanish fluency,' but 'I'm a Spanish-speaking polyglot, the kind of person who is making a big effort to become fluent.'
Language is also a good tool to practice differentiation. Try not to use the words "we," "us," or "our," even in your head. From now on, it is "s/he and I," "me and him/her," or "mine and his/hers." Practice using the word "ex" a lot. Memories are re-formulated and overwritten each time we revisit them, so in your memories make sure to think of you two as separate, independent people and not as a unit.
5. Make use of the following mental frameworks to re-frame your thinking:
Over the relationship vs. over the person
You do not have to stop having romantic, tender, or lustful feelings about your ex to get over the relationship. Those types of feelings are not easily controlled. But you can have those same feelings for good friends or crushes without them destroying your ability to have a meaningful platonic relationship, so why should this be any different?
Being over the relationship means:
- Not feeling as though you are missing out on being part of a relationship.
- Not dwelling/ruminating/obsessing about your ex-partner (this includes positive, negative, and neutral thoughts alike: "they're so great," "I hate them and hope they die," and "I wonder what they are up to").
- Not wishing to be back with your ex-partner.
- Not making plans that include consideration of your ex-partner because these considerations are no longer important (this includes considerations like "this will make him/her feel sorry I'm gone," or "this will show him/her that I'm totally over it")
- Being able to interact with people without your ex-partner at your side and not feel weird about it, especially during things you used to do together (e.g. a shared hobby or a party).
- In very lucky peaceful-breakup situations, being able to interact with your ex-partner and maybe even their current romantic interests without it being too horribly weird and unpleasant.
On the other hand, being over a person means experiencing no pull towards that person, romantic, emotional, or sexual. If your break up was messy, you can be over the person without being over the relationship. This is often when people turn to messy and unsatisfying rebound relationships. It is far far more important to be over the relationship, and some of us (me included) will just have to make peace with never being over the person, with the help of knowing that having a crush on someone does not necessarily have the power to make you miserable or destroy your friendship.
Obsessive thinking and cravings
If you used a brain scanner to look at a person who has been recently broken up with, and then you used the same brain scanner to look at someone who recently sobered up from an addictive drug, their brain activity would be very similar. So similar, in fact, that some neurologists speculate that addiction hijacks the circuits for romantic obsession (there is a very plausible evolutionary reason for romantic obsession to exist in early human tribal societies. Addiction, less so).
In cases of addiction/craving, you can't just force your mind to stop thinking thoughts you don't like. But you can change your relationship with those thoughts. Recognize when they happen. Identify them as a craving rather than a true need. Recognize that, when satisfied, cravings temporarily diminish and then grow stronger (you've rewarded your brain for that behavior). These are thoughts without substance. The impulse they drive you towards will increase, rather than decrease, unpleasant feelings.
When I first broke up, I had a couple very unpleasant hours of rumination, thinking uncontrollably about the same topics over and over despite those topics being painful. At some point I realized that continuing to merely think about the break up was also addictive. My craving circuits just picked the one set of thoughts I couldn't argue against so that my brain could go on obsessively dwelling without me being able to pull a logic override. These thoughts SEEM like goal oriented thinking, they FEEL productive, but they are a wolf in sheep's clothing.
In my specific case, my brain was concern trolling me. Concern trolling on the internet is when someone expresses sympathy and concern while actually having ulterior motives (eg on a body-positive website, fat shaming with: "I'm so glad you're happy but I'm concerned that people will think less of you because of your weight"). In my case, I was worrying about my ex's depression and his state of mind, which are very hard thoughts to quash. Empathy and caring are good, right? And he really was going through a hard time. Maybe I should call and check up on him.... My brain was concern trolling me.
Depending on how your relationship ended, your brain could be trolling in other ways. Flaming seems to be a popular set of unstoppable thoughts. If you can't argue with the thought that the jerk is a horrible person, then THAT is the easiest way for your brain's addictive circuits to happily go on obsessing about this break up. Nostalgia is also a popular option. If the memories were good, then it's hard to argue with those thoughts. If you're a well-trained rationalist, you might notice that you are feeling confused and then burn up many brain cycles trying to resolve your confusion by making sense of the break up, despite it not being a rational matter. Your addictive circuits can even hijack good rationalist habits. Other common ruminations are problem solving, simulating possible futures, regret, and counter-factual thinking.
As I said, you can't force these parts of your brain to just shut up. That's not how craving works. But you can take away their power by recognizing that all your ruminating is just these circuits hijacking your normal thought process. Say to yourself, "I'm feeling an urge to call and yell at him/her, but so what. It's just a meaningless craving."
What you lose
There is a great sense of loss that comes with the end of a relationship. For some people, it is a similar feeling to actually being in mourning. Revisiting memories becomes painful, things you used to do together are suddenly tinged with sadness.
I found it helpful to think of my relationship as a book. A book with some really powerful life-changing passages in the early chapters, a good rising action, great characters. A book which made me a better person by reading it. But a book with a stupid deus ex machina ending that totally invalidated the foreshadowing in the best passages. Finishing the book can be frustrating and saddening, but the first chapters of the book still exist. Knowing that the ending sucks isn't going to stop the first chapters from being awesome and entertaining and powerful. And I could revisit those first chapters any time I liked. I could just read my favorite parts without needing to read the whole stupid ending.
You don't lose your memories. You don't lose your personal growth. Any gains you made while you were with someone, anything new that they introduced you to, or helped you to improve on, or nagged at you till you had a new better habit, you get to keep all of those. That show you used to watch together, it is still there and you still get to watch it and care about it without him/her. The bar you used to visit together is still there too. All those photos are still great pictures of both of you in interesting places. Depending on the situation of the break up, your mutual friends are still around. Even your ex still exists and is still the same person you liked before, and breaking up doesn't mean you'll never see them again unless that's what you guys want/need.
The only thing you definitely lose at the end of a relationship is the future of that relationship. You are losing something that hasn't happened yet, something which never existed. The only thing you are losing is what you imagined someday having. It's something similar to the endowment effect: you assumed this future was yours so you assigned it a lot of value. But it never was yours, you've lost something which doesn't exist. It's still a painful experience, but realizing all of this helped me a lot.
Additional Reading:
http://wiki.lesswrong.com/wiki/Dealing_with_a_Major_Personal_Crisis
Addendum:
Comparisons and self-esteem:
Brains are built to compare and optimize, so one difficult problem I've faced in the months after the break up was seeing my ex date other people. I had trouble because my unconscious impulse is to think "he has chosen them over me." This thinking pattern is instant, unconscious, and hard to break. And it comes with a big hit to either self esteem or my willingness to humanize these actual humans he is dating.
It was helpful to remind myself that the break up occurred because the relationship was broken. There is a heavy opportunity cost to dating someone with whom it can never work out or with whom you are not happy. That opportunity cost is the freedom to seek a better relationship. So I shouldn't be comparing myself to any flesh-and-blood person. He chose opportunity and freedom over me, and it's just not possible to compare yourself to a concept like that in a way that makes sense. The people that come as a result of that choice are irrelevant.
Milestones:
It took me 2 weeks to be over this particular relationship, it took me a month and a half to not wish I was in some relationship, to get excited and happy about being single. It was 3 months before dating and experiencing new people started to sound like it might be fun/interesting.
Long Tail of Sadness:
During the period after the break up, for about 3 months, I had to be extra careful to have enough sleep, drink enough water, get sunshine, eat enough, and meditate. If my physical state was normal, I almost always felt great, acted normal, and rarely thought about my ex. But if I let myself get into a physical state which would normally cause a generalized bad mood, I would more often find myself ruminating on the break up. Sleep is medicine.
Concept Safety: Producing similar AI-human concept spaces
I'm currently reading through some relevant literature for preparing my FLI grant proposal on the topic of concept learning and AI safety. I figured that I might as well write down the research ideas I get while doing so, so as to get some feedback and clarify my thoughts. I will be posting these in a series of "Concept Safety"-titled articles.
A frequently-raised worry about AI is that it may reason in ways which are very different from us, and understand the world in a very alien manner. For example, Armstrong, Sandberg & Bostrom (2012) consider the possibility of restricting an AI via "rule-based motivational control" and programming it to follow restrictions like "stay within this lead box here", but they raise worries about the difficulty of rigorously defining "this lead box here". To address this, they go on to consider the possibility of making an AI internalize human concepts via feedback, with the AI being told whether or not some behavior is good or bad and then constructing a corresponding world-model based on that. The authors are however worried that this may fail, because
Humans seem quite adept at constructing the correct generalisations – most of us have correctly deduced what we should/should not be doing in general situations (whether or not we follow those rules). But humans share a common genetic design, which the OAI would likely not have. Sharing, for instance, derives partially from genetic predisposition to reciprocal altruism: the OAI may not integrate the same concept as a human child would. Though reinforcement learning has a good track record, it is neither a panacea nor a guarantee that the OAI's generalisations agree with ours.
Addressing this, a possibility that I raised in Sotala (2015) was that the concept-learning mechanisms in the human brain might actually be relatively simple, and that we could replicate the human concept learning process by replicating those rules. I'll start this post by discussing a closely related hypothesis: that given a specific learning or reasoning task and a certain kind of data, there is an optimal way to organize the data that will naturally emerge. If this were the case, then AI and human reasoning might naturally tend to learn the same kinds of concepts, even if they were using very different mechanisms. Later on in the post, I will discuss how one might try to verify that similar representations had in fact been learned, and how to set up a system to make them even more similar.
Word embedding
A particularly fascinating branch of recent research relates to the learning of word embeddings, which are mappings of words to very high-dimensional vectors. It turns out that if you train a system on one of several kinds of tasks, such as being able to classify sentences as valid or invalid, this builds up a space of word vectors that reflects the relationships between the words. For example, there seems to be a male/female dimension to words, so that there's a "female vector" that we can add to the word "man" to get "woman" - or, equivalently, which we can subtract from "woman" to get "man". And it so happens (Mikolov, Yih & Zweig 2013) that we can also get from the word "king" to the word "queen" by adding the same vector to "king". In general, we can (roughly) get to the male/female version of any word vector by adding or subtracting this one difference vector!
Why would this happen? Well, a learner that needs to classify sentences as valid or invalid needs to classify the sentence "the king sat on his throne" as valid while classifying the sentence "the king sat on her throne" as invalid. So including a gender dimension on the built-up representation makes sense.
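The analogy arithmetic can be made concrete with a toy sketch. The 2-D vectors below are hand-picked for illustration, standing in for learned embeddings; real word vectors are learned from data, have hundreds of dimensions, and satisfy such analogies only approximately:

```python
import numpy as np

# Toy 2-D embeddings chosen by hand so that the "female direction"
# (woman - man) is the same offset for the royal pair.
vectors = {
    "man":   np.array([1.0, 1.0]),
    "woman": np.array([1.0, 3.0]),
    "king":  np.array([5.0, 1.0]),
    "queen": np.array([5.0, 3.0]),
}

def nearest(query, exclude):
    """Return the vocabulary word whose vector is closest to `query`."""
    best, best_dist = None, float("inf")
    for word, vec in vectors.items():
        if word in exclude:
            continue
        dist = np.linalg.norm(vec - query)
        if dist < best_dist:
            best, best_dist = word, dist
    return best

# king - man + woman: add the female difference vector to "king".
analogy = vectors["king"] - vectors["man"] + vectors["woman"]
print(nearest(analogy, exclude={"king", "man", "woman"}))  # → queen
```

Real systems run the same kind of query with cosine similarity over vocabularies of hundreds of thousands of words, excluding the query words themselves just as above.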
But gender isn't the only kind of relationship that gets reflected in the geometry of the word space. Here are a few more:
[Image: examples of word-pair relationships captured as difference vectors in the embedding space.]
It turns out (Mikolov et al. 2013) that with the right kind of training mechanism, a lot of relationships that we're intuitively aware of become automatically learned and represented in the concept geometry. And like Olah (2014) comments:
It’s important to appreciate that all of these properties of W are side effects. We didn’t try to have similar words be close together. We didn’t try to have analogies encoded with difference vectors. All we tried to do was perform a simple task, like predicting whether a sentence was valid. These properties more or less popped out of the optimization process.
This seems to be a great strength of neural networks: they learn better ways to represent data, automatically. Representing data well, in turn, seems to be essential to success at many machine learning problems. Word embeddings are just a particularly striking example of learning a representation.
It gets even more interesting, for we can use these for translation. Since Olah has already written an excellent exposition of this, I'll just quote him:
We can learn to embed words from two different languages in a single, shared space. In this case, we learn to embed English and Mandarin Chinese words in the same space.
We train two word embeddings, Wen and Wzh in a manner similar to how we did above. However, we know that certain English words and Chinese words have similar meanings. So, we optimize for an additional property: words that we know are close translations should be close together.
Of course, we observe that the words we knew had similar meanings end up close together. Since we optimized for that, it’s not surprising. More interesting is that words we didn’t know were translations end up close together.
In light of our previous experiences with word embeddings, this may not seem too surprising. Word embeddings pull similar words together, so if an English and Chinese word we know to mean similar things are near each other, their synonyms will also end up near each other. We also know that things like gender differences tend to end up being represented with a constant difference vector. It seems like forcing enough points to line up should force these difference vectors to be the same in both the English and Chinese embeddings. A result of this would be that if we know that two male versions of words translate to each other, we should also get the female words to translate to each other.
Intuitively, it feels a bit like the two languages have a similar ‘shape’ and that by forcing them to line up at different points, they overlap and other points get pulled into the right positions.
After this, it gets even more interesting. Suppose you had this space of word vectors, and then you also had a system which translated images into vectors in the same space. If you have images of dogs, you put them near the word vector for dog. If you have images of Clippy you put them near word vector for "paperclip". And so on.
You do that, and then you take some class of images the image-classifier was never trained on, like images of cats. You ask it to place the cat-image somewhere in the vector space. Where does it end up?
You guessed it: in the rough region of the "cat" words. Olah once more:
This was done by members of the Stanford group with only 8 known classes (and 2 unknown classes). The results are already quite impressive. But with so few known classes, there are very few points to interpolate the relationship between images and semantic space off of.
The Google group did a much larger version – instead of 8 categories, they used 1,000 – around the same time (Frome et al. (2013)) and has followed up with a new variation (Norouzi et al. (2014)). Both are based on a very powerful image classification model (from Krizhevsky et al. (2012)), but embed images into the word embedding space in different ways.
The results are impressive. While they may not get images of unknown classes to the precise vector representing that class, they are able to get to the right neighborhood. So, if you ask it to classify images of unknown classes and the classes are fairly different, it can distinguish between the different classes.
Even though I've never seen an Aesculapian snake or an Armadillo before, if you show me a picture of one and a picture of the other, I can tell you which is which because I have a general idea of what sort of animal is associated with each word. These networks can accomplish the same thing.
These algorithms made no attempt to be biologically realistic in any way. They didn't try to classify data the way the brain does: they just tried classifying data using whatever worked. And it turned out that this was enough to start constructing a multimodal representation space where a lot of the relationships between entities were similar to the way humans understand the world.
How useful is this?
"Well, that's cool", you might now say. "But those word spaces were constructed from human linguistic data, for the purpose of predicting human sentences. Of course they're going to classify the world in the same way as humans do: they're basically learning the human representation of the world. That doesn't mean that an autonomously learning AI, with its own learning faculties and systems, is necessarily going to learn a similar internal representation, or to have similar concepts."
This is a fair criticism. But it is mildly suggestive of the possibility that an AI that was trained to understand the world via feedback from human operators would end up building a similar conceptual space. At least assuming that we chose the right learning algorithms.
When we train a language model to classify sentences by labeling some of them as valid and others as invalid, there's a hidden structure implicit in our answers: the structure of how we understand the world, and of how we think of the meaning of words. The language model extracts that hidden structure and begins to classify previously unseen things in terms of those implicit reasoning patterns. Similarly, if we gave an AI feedback about what kinds of actions counted as "leaving the box" and which ones didn't, there would be a certain way of viewing and conceptualizing the world implied by that feedback, one which the AI could learn.
Comparing representations
"Hmm, maaaaaaaaybe", is your skeptical answer. "But how would you ever know? Like, you can test the AI in your training situation, but how do you know that it's actually acquired a similar-enough representation and not something wildly off? And it's one thing to look at those vector spaces and claim that there are human-like relationships among the different items, but that's still a little hand-wavy. We don't actually know that the human brain does anything remotely similar to represent concepts."
Here we turn, for a moment, to neuroscience.
Multivariate Cross-Classification (MVCC) is a clever neuroscience methodology used for figuring out whether different neural representations of the same thing have something in common. For example, we may be interested in whether the visual and tactile representation of a banana have something in common.
We can test this by having several test subjects look at pictures of objects such as apples and bananas while sitting in a brain scanner. We then feed the scans of their brains into a machine learning classifier and teach it to distinguish between the neural activity of looking at an apple, versus the neural activity of looking at a banana. Next we have our test subjects (still sitting in the brain scanners) touch some bananas and apples, and ask our machine learning classifier to guess whether the resulting neural activity is the result of touching a banana or an apple. If the classifier - which has not been trained on the "touch" representations, only on the "sight" representations - manages to achieve a better-than-chance performance on this latter task, then we can conclude that the neural representation for e.g. "the sight of a banana" has something in common with the neural representation for "the touch of a banana".
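The cross-classification logic above can be sketched with synthetic data. In this toy version (the "neural activity" is entirely made up: sight and touch trials of the same object share a common component plus a modality-specific offset), a classifier trained only on sight trials is scored on touch trials:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy multivariate cross-classification: synthesize "neural activity" in
# which the sight and touch of the same object share a common component,
# train on the sight trials only, and test on the touch trials.
rng = np.random.default_rng(0)
n_voxels, n_trials = 50, 40
apple_pattern = rng.normal(size=n_voxels)    # shared "apple" component
banana_pattern = rng.normal(size=n_voxels)   # shared "banana" component
sight_offset = rng.normal(0, 0.3, size=n_voxels)  # modality-specific signal
touch_offset = rng.normal(0, 0.3, size=n_voxels)

def trials(pattern, modality_offset):
    return pattern + modality_offset + rng.normal(0, 0.8, size=(n_trials, n_voxels))

X_sight = np.vstack([trials(apple_pattern, sight_offset),
                     trials(banana_pattern, sight_offset)])
X_touch = np.vstack([trials(apple_pattern, touch_offset),
                     trials(banana_pattern, touch_offset)])
y = np.array([0] * n_trials + [1] * n_trials)  # 0 = apple, 1 = banana

clf = LogisticRegression(max_iter=1000).fit(X_sight, y)
print("cross-modal accuracy:", clf.score(X_touch, y))  # well above 0.5 chance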
A particularly fascinating experiment of this type is that of Shinkareva et al. (2011), who showed their test subjects both the written words for different tools and dwellings, and, separately, line-drawing images of the same tools and dwellings. A machine-learning classifier was both trained on image-evoked activity and made to predict word-evoked activity and vice versa, and achieved a high accuracy on category classification for both tasks. Even more interestingly, the representations seemed to be similar between subjects. Training the classifier on the word representations of all but one participant, and then having it classify the image representation of the left-out participant, also achieved a reliable (p<0.05) category classification for 8 out of 12 participants. This suggests a relatively similar concept space between humans of a similar background.
We can now hypothesize some ways of testing the similarity of the AI's concept space with that of humans. Possibly the most interesting one might be to develop a translation between a human's and an AI's internal representations of concepts. Take a human's neural activation when they're thinking of some concept, and then take the AI's internal activation when it is thinking of the same concept, and plot them in a shared space similar to the English-Mandarin translation. To what extent do the two concept geometries have similar shapes, allowing one to take a human's neural activation of the word "cat" to find the AI's internal representation of the word "cat"? To the extent that this is possible, one could probably establish that the two share highly similar concept systems.
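One rough way to make the "similar shapes" comparison quantitative is Procrustes analysis, which finds the best rotation and scaling aligning one point set to another and reports the residual disparity (0 means identical shape). The concept vectors below are invented; in the proposed experiment they would stand in for human neural activations and AI-internal activations over the same concepts:

```python
import numpy as np
from scipy.spatial import procrustes

# Sketch: compare a "human" concept geometry to two candidate "AI"
# geometries, one that is a rotated/rescaled copy plus a little noise,
# and one with no shared shape at all. All points are invented.
human = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 1.5]])

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
rng = np.random.default_rng(1)
ai_similar = 2.0 * human @ R.T + rng.normal(0, 0.02, human.shape)
ai_random = rng.normal(size=human.shape)

_, _, d_similar = procrustes(human, ai_similar)
_, _, d_random = procrustes(human, ai_random)
print(d_similar, d_random)  # d_similar is near 0, d_random is much larger
```

A low disparity would suggest that one geometry can be carried onto the other, i.e. that a human's "cat" activation could be used to locate the AI's "cat" representation.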
One could also try to more explicitly optimize for such a similarity. For instance, one could train the AI to make predictions of different concepts, with the additional constraint that its internal representation must be such that a machine-learning classifier trained on a human's neural representations will correctly identify concept-clusters within the AI. This might force internal similarities on the representation beyond the ones that would already be formed from similarities in the data.
Next post in series: The problem of alien concepts.
16 types of useful predictions
How often do you make predictions (either about future events, or about information that you don't yet have)? If you're a regular Less Wrong reader you're probably familiar with the idea that you should make your beliefs pay rent by saying, "Here's what I expect to see if my belief is correct, and here's how confident I am," and that you should then update your beliefs accordingly, depending on how your predictions turn out.
And yet… my impression is that few of us actually make predictions on a regular basis. Certainly, for me, there has always been a gap between how useful I think predictions are, in theory, and how often I make them.
I don't think this is just laziness. I think it's simply not a trivial task to find predictions to make that will help you improve your models of a domain you care about.
At this point I should clarify that there are two main goals predictions can help with:
- Improved Calibration (e.g., realizing that I'm only correct about Domain X 70% of the time, not 90% of the time as I had mistakenly thought).
- Improved Accuracy (e.g., going from being correct in Domain X 70% of the time to being correct 90% of the time)
If your goal is just to become better calibrated in general, it doesn't much matter what kinds of predictions you make. So calibration exercises typically grab questions with easily obtainable answers, like "How tall is Mount Everest?" or "Will Don Draper die before the end of Mad Men?" See, for example, the Credence Game, Prediction Book, and this recent post. And calibration training really does work.
But even though making predictions about trivia will improve my general calibration skill, it won't help me improve my models of the world. That is, it won't help me become more accurate, at least not in any domains I care about. If I answer a lot of questions about the heights of mountains, I might become more accurate about that topic, but that's not very helpful to me.
So I think the difficulty in prediction-making is this: The set {questions whose answers you can easily look up, or otherwise obtain} is a small subset of all possible questions. And the set {questions whose answers you care about} is also a small subset of all possible questions. And the intersection between those two subsets is much smaller still, and not easily identifiable. As a result, prediction-making tends to seem too effortful, or not fruitful enough to justify the effort it requires.

But the intersection's not empty. It just requires some strategic thought to determine which answerable questions have some bearing on issues you care about, or -- approaching the problem from the opposite direction -- how to take issues you care about and turn them into answerable questions.
I've been making a concerted effort to hunt for members of that intersection. Here are 16 types of predictions that I personally use to improve my judgment on issues I care about. (I'm sure there are plenty more, though, and hope you'll share your own as well.)
- Predict how long a task will take you. This one's a given, considering how common and impactful the planning fallacy is.
Examples: "How long will it take to write this blog post?" "How long until our company's profitable?"
- Predict how you'll feel in an upcoming situation. Affective forecasting – our ability to predict how we'll feel – has some well known flaws.
Examples: "How much will I enjoy this party?" "Will I feel better if I leave the house?" "If I don't get this job, will I still feel bad about it two weeks later?"
- Predict your performance on a task or goal.
One thing this helps me notice is when I've been trying the same kind of approach repeatedly without success. Even just the act of making the prediction can spark the realization that I need a better game plan.
Examples: "Will I stick to my workout plan for at least a month?" "How well will this event I'm organizing go?" "How much work will I get done today?" "Can I successfully convince Bob of my opinion on this issue?"
- Predict how your audience will react to a particular social media post (on Facebook, Twitter, Tumblr, a blog, etc.).
This is a good way to hone your judgment about how to create successful content, as well as your understanding of your friends' (or readers') personalities and worldviews.
Examples: "Will this video get an unusually high number of likes?" "Will linking to this article spark a fight in the comments?"
- When you try a new activity or technique, predict how much value you'll get out of it.
I've noticed I tend to be inaccurate in both directions in this domain. There are certain kinds of life hacks I feel sure are going to solve all my problems (and they rarely do). Conversely, I am overly skeptical of activities that are outside my comfort zone, and often end up pleasantly surprised once I try them.
Examples: "How much will Pomodoros boost my productivity?" "How much will I enjoy swing dancing?"
- When you make a purchase, predict how much value you'll get out of it.
Research on money and happiness shows two main things: (1) as a general rule, money doesn't buy happiness, but also that (2) there are a bunch of exceptions to this rule. So there seems to be lots of potential to improve your prediction skill here, and spend your money more effectively than the average person.
Examples: "How much will I wear these new shoes?" "How often will I use my club membership?" "In two months, will I think it was worth it to have repainted the kitchen?" "In two months, will I feel that I'm still getting pleasure from my new car?"
- Predict how someone will answer a question about themselves.
I often notice assumptions I've been making about other people, and I like to check those assumptions when I can. Ideally I get interesting feedback both about the object-level question, and about my overall model of the person.
Examples: "Does it bother you when our meetings run over the scheduled time?" "Did you consider yourself popular in high school?" "Do you think it's okay to lie in order to protect someone's feelings?"
- Predict how much progress you can make on a problem in five minutes.
I often have the impression that a problem is intractable, or that I've already worked on it and have considered all of the obvious solutions. But then when I decide (or when someone prompts me) to sit down and brainstorm for five minutes, I am surprised to come away with a promising new approach to the problem.
Example: "I feel like I've tried everything to fix my sleep, and nothing works. If I sit down now and spend five minutes thinking, will I be able to generate at least one new idea that's promising enough to try?"
- Predict whether the data in your memory supports your impression.
Memory is awfully fallible, and I have been surprised at how often I am unable to generate specific examples to support a confident impression of mine (or how often the specific examples I generate actually contradict my impression).
Examples: "I have the impression that people who leave academia tend to be glad they did. If I try to list a bunch of the people I know who left academia, and how happy they are, what will the approximate ratio of happy/unhappy people be?"
"It feels like Bob never takes my advice. If I sit down and try to think of examples of Bob taking my advice, how many will I be able to come up with?"
- Pick one expert source and predict how they will answer a question.
This is a quick shortcut to testing a claim or settling a dispute.
Examples: "Will Cochrane Medical support the claim that Vitamin D promotes hair growth?" "Will Bob, who has run several companies like ours, agree that our starting salary is too low?"
- When you meet someone new, take note of your first impressions of him. Predict how likely it is that, once you've gotten to know him better, you will consider your first impressions of him to have been accurate.
A variant of this one, suggested to me by CFAR alum Lauren Lee, is to make predictions about someone before you meet him, based on what you know about him ahead of time.
Examples: "All I know about this guy I'm about to meet is that he's a banker; I'm moderately confident that he'll seem cocky." "Based on the one conversation I've had with Lisa, she seems really insightful – I predict that I'll still have that impression of her once I know her better."
- Predict how your Facebook friends will respond to a poll.
Examples: I often post social etiquette questions on Facebook. For example, I recently did a poll asking, "If a conversation is going awkwardly, does it make things better or worse for the other person to comment on the awkwardness?" I confidently predicted most people would say "worse," and I was wrong.
- Predict how well you understand someone's position by trying to paraphrase it back to him.
The illusion of transparency is pernicious.
Examples: "You said you think running a workshop next month is a bad idea; I'm guessing you think that's because we don't have enough time to advertise, is that correct?"
"I know you think eating meat is morally unproblematic; is that because you think that animals don't suffer?"
- When you have a disagreement with someone, predict how likely it is that a neutral third party will side with you after the issue is explained to her.
For best results, don't reveal which of you is on which side when you're explaining the issue to your arbiter.
Example: "So, at work today, Bob and I disagreed about whether it's appropriate for interns to attend hiring meetings; what do you think?"
- Predict whether a surprising piece of news will turn out to be true.
This is a good way to hone your bullshit detector and improve your overall "common sense" models of the world.
Examples: "This headline says some scientists uploaded a worm's brain -- after I read the article, will the headline seem like an accurate representation of what really happened?"
"This viral video purports to show strangers being prompted to kiss; will it turn out to have been staged?"
- Predict whether a quick online search will turn up any credible sources supporting a particular claim.
Example: "Bob says that watches always stop working shortly after he puts them on – if I spend a few minutes searching online, will I be able to find any credible sources saying that this is a real phenomenon?"
I have one additional, general thought on how to get the most out of predictions:
Rationalists tend to focus on the importance of objective metrics. And as you may have noticed, a lot of the examples I listed above fail that criterion. For example, "Predict whether a fight will break out in the comments? Well, there's no objective way to say whether something officially counts as a 'fight' or not…" Or, "Predict whether I'll be able to find credible sources supporting X? Well, who's to say what a credible source is, and what counts as 'supporting' X?"
And indeed, objective metrics are preferable, all else equal. But all else isn't equal. Subjective metrics are much easier to generate, and they're far from useless. Most of the time it will be clear enough, once you see the results, whether your prediction basically came true or not -- even if you haven't pinned down a precise, objectively measurable success criterion ahead of time. Usually the result will be a common sense "yes," or a common sense "no." And sometimes it'll be "um...sort of?", but that can be an interestingly surprising result too, if you had strongly predicted the results would point clearly one way or the other.
Along similar lines, I usually don't assign numerical probabilities to my predictions. I just take note of where my confidence falls on a qualitative "very confident," "pretty confident," "weakly confident" scale (which might correspond to something like 90%/75%/60% probabilities, if I had to put numbers on it).
There's probably some additional value you can extract by writing down quantitative confidence levels, and by devising objective metrics that are impossible to game, rather than just relying on your subjective impressions. But in most cases I don't think that additional value is worth the cost you incur from turning predictions into an onerous task. In other words, don't let the perfect be the enemy of the good. Or in other other words: the biggest problem with your predictions right now is that they don't exist.
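The bookkeeping for a qualitative scale like this is cheap. A minimal sketch, assuming the rough 90%/75%/60% mapping suggested above (the sample prediction log is invented): record each prediction as a (confidence label, came true?) pair, then compare each bucket's hit rate to its implied probability.

```python
# Implied probabilities for each qualitative confidence label (an
# assumption, per the rough mapping suggested in the text).
PROBS = {"very": 0.90, "pretty": 0.75, "weakly": 0.60}

# Invented sample log of (confidence label, whether the prediction came true).
log = [("very", True), ("very", True), ("very", False), ("very", True),
       ("pretty", True), ("pretty", False), ("pretty", True),
       ("weakly", True), ("weakly", False), ("weakly", False)]

def hit_rates(log):
    rates = {}
    for label in PROBS:
        outcomes = [hit for lab, hit in log if lab == label]
        rates[label] = sum(outcomes) / len(outcomes)
    return rates

for label, rate in hit_rates(log).items():
    print(f"{label}: implied {PROBS[label]:.0%}, actual {rate:.0%}")
```

If "very confident" predictions only come true 75% of the time, as in this made-up log, that's a calibration signal worth noticing, no objective metrics required.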
Twenty basic rules for intelligent money management
1. Start investing early in life.
The power of compound interest means you will have much more money at retirement if you start investing early in your career. For example, imagine that at age eighteen you invest $1,000 and earn an 8% return per year. At age seventy you will have $54,706. In contrast, if you make the same investment at age fifty you will have a paltry $4,661 when you turn seventy.
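The arithmetic behind those two figures is just the compound-interest formula, FV = P(1 + r)^n:

```python
# Future value of a one-time investment compounding annually.
def future_value(principal, rate, years):
    return principal * (1 + rate) ** years

print(round(future_value(1000, 0.08, 70 - 18)))  # 54706: invested at eighteen
print(round(future_value(1000, 0.08, 70 - 50)))  # 4661: invested at fifty
```

Thirty-two extra years of compounding multiply the final sum by more than eleven times.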
Many people who haven't saved for retirement panic upon reaching middle age. So if you are young, don't think that saving today will help you only when you retire; know that such savings will also give you greater peace of mind when you turn forty.
When evaluating potential marriage partners give bonus points to those who have a history of saving. Do this not because you want to marry into wealth, but because you should want to marry someone who has discipline, intelligence and foresight.
[LINK] Terry Pratchett is dead
I'm sure I'm not the only one who greatly admired him. The theme of his stories was progress; they were set in a fantasy world, it's true, but one that was frequently a direct analogy to our own past, and where the golden age was always right now. The recent books made this ever more obvious.
We have lost a great man today, but it's the way he died that makes me uncomfortable. Terry Pratchett had early-onset Alzheimer's, and while I doubt it would have mattered, he couldn't have chosen cryonics even if he wanted to. He campaigned for voluntary euthanasia in cases like his. I will refrain from speculating on whether his unexpected death was wholly natural; whether it was or wasn't, I can't see this having a better outcome. In short...
There is, for each of us, a one-ninth chance of developing Alzheimer's if we live long enough. Many of us may have relatives that are already showing signs, and in the current regime these relatives cannot be cryonically stored even if they wish to try; by the time they die, there will be little purpose in doing so. For cryonics to help for neurodegenerative disorders, it needs to be applied before they become fatal.
Is there anything we can do to change that? Are there countries in which that generalisation is false?
[POLL] LessWrong group on YourMorals.org (2015)
The regular research has had interesting results like showing a distinct pattern of cognitive traits and values associated with libertarian politics, but there's no reason one can't use it for investigating LWers in more detail; for example, going through the results, "we can see that many of us consider purity/respect to be far less morally significant than most", and we collectively seem to have Conscientiousness issues. (I also drew on it recently for a gay marriage comment.) If there were more data, it might be interesting to look at the results and see where LWers diverge the most from libertarians (the mainstream group we seem most psychologically similar to), but unfortunately for a lot of the tests, there's too little to bother with (LW n<10). Maybe more people could take it.
Big 5: http://www.yourmorals.org/bigfive_process.php
(You can see some of my results at http://www.gwern.net/Links#profile )
Report -- Allocating risk mitigation across time
I've just released a Future of Humanity Institute technical report, written as part of the Global Priorities Project.
Abstract:
This article is about priority-setting for work aiming to reduce existential risk. Its chief claim is that, all else being equal, we should prefer work earlier and prefer to work on risks that might come early. This is because we are uncertain about when we will have to face different risks, because we expect diminishing returns of extra work, and because we expect that more people will work on these risks in the future.
I explore this claim both qualitatively and with explicit models. I consider its implications for two questions: first, “When is it best to do different kinds of work?”; second, “Which risks should we focus on?”.
As a major application, I look at the case of risk from artificial intelligence. The best strategies for reducing this risk depend on when the risk is coming. I argue that we may be underinvesting in scenarios where AI comes soon even though these scenarios are relatively unlikely, because we will not have time later to address them.
You can read the full report here: Allocating risk mitigation across time.
Announcing the Complice Less Wrong Study Hall
(If you're familiar with the backstory of the LWSH, you can skip to paragraph 5. If you just want the link to the chat, click here: LWSH on Complice)
The Less Wrong Study Hall was created as a tinychat room in March 2013, following Mqrius and ShannonFriedman's desire to create a virtual context for productivity. In retrospect, I think it's hilarious that a bunch of the comments ended up being a discussion of whether LW had the numbers to get a room that consistently had someone in it. The funny part is that they were based around the assumption that people would spend about 1h/day in it.
Once it was created, it was so effective that people started spending their entire day doing pomodoros in the LWSH (32 minutes work + 8 minutes break), and now often even stay logged in while doing chores away from their computers, just for the cadence of focus and the sense of company. So there's almost always someone there, and often 5-10 people.
A week in, a call was put out for volunteers to program a replacement for the much-maligned tinychat. As it turns out though, video chat is a hard problem.
So nearly 2 years later, people are still using the tinychat.
But a few weeks ago, I discovered that you can embed the tinychat applet into an arbitrary page. I immediately set out to integrate LWSH into Complice, the productivity app I've been building for over a year, which counts many rationalists among its alpha & beta users.
The focal point of Complice is its today page, which consists of a list of everything you're planning to accomplish that day, colorized by goal. Plus a pomodoro timer. My habit for a long time has been to have this open next to LWSH. So what I basically did was integrate these two pages. On the left, you have a list of your own tasks. On the right, a list of other users in the room, with whatever task they're doing next. Then below all of that, the chatroom.
(Something important to note: I'm not planning to point existing Complice users, who may not be LWers, at the LW Study Hall. Any Complice user can create their own coworking room by going to complice.co/createroom)
With this integration, I've solved many of the core problems that people wanted addressed for the study hall:
- an actual ding sound beyond people typing in the chat
- synchronized pomodoro time visibility
- pomos that automatically start, so breaks don't run over
- Intentions — what am I working on this pomo?
- a list of what other users are working on
- the ability to show off how many pomos you've done
- better welcoming & explanation of group norms
There are a couple other requested features that I can definitely solve but decided could come after this launch:
- rooms with different pomodoro durations
- member profiles
- the ability to precommit to showing up at a certain time (maybe through Beeminder?!)
The following points were brought up in the Programming the LW Study Hall post or on the List of desired features on the github/nnmm/lwsh wiki, but can't be fixed without replacing tinychat:
- efficient with respect to bandwidth and CPU
- page layout with videos lined up down the left for use on the side of monitors
- chat history
- encryption
- everything else that generally sucks about tinychat
It's also worth noting that if you were to think of the entirety of Complice as an addition to LWSH... well, it would definitely look like feature creep, but at any rate there would be several other notable improvements:
- daily emails prompting you to decide what you're going to do that day
- a historical record of what you've done, with guided weekly, monthly, and yearly reviews
- optional accountability partner who gets emails with what you've done every day (the LWSH might be a great place to find partners!)
(This article posted to Main because that's where the rest of the LWSH posts are, and this represents a substantial update.)
