The trouble with Bayes (draft)
Prerequisites
This post requires some knowledge of Bayesian and Frequentist statistics, as well as probability. It is intended to explain one of the more advanced concepts in statistical theory--Bayesian non-consistency--to non-statisticians, and although the level required is much less than would be required to read some of the original papers on the topic[1], some considerable background is still required.
The Bayesian dream
Bayesian methods are enjoying a well-deserved growth of popularity in the sciences. However, most practitioners of Bayesian inference, including most statisticians, see it as a practical tool. Bayesian inference has many desirable properties for a data analysis procedure: it allows for intuitive treatment of complex statistical models, which include models with non-iid data, random effects, high-dimensional regularization, covariance estimation, outliers, and missing data. Problems which have been the subject of Ph. D. theses and entire careers in the Frequentist school, such as mixture models and the many-armed bandit problem, can be satisfactorily handled by introductory-level Bayesian statistics.
A more extreme point of view, the flavor of subjective Bayes best exemplified by Jaynes' famous book [2], and also by a sizable contingent of philosophers of science, elevates Bayesian reasoning to the methodology for probabilistic reasoning, in every domain, for every problem. One merely needs to encode one's beliefs as a prior distribution, and Bayesian inference will yield the optimal decision or inference.
To a philosophical Bayesian, the epistemological grounding of most statistics (including "pragmatic Bayes") is abysmal. The practice of data analysis is either dictated by arbitrary tradition and protocol, or consists of users creatively employing a "toolbox" of methods justified by a mixture of incompatible theoretical principles: the minimax principle, invariance, asymptotics, maximum likelihood or *gasp* "Bayesian optimality." The result: a million possible methods exist for any given problem, and a million interpretations exist for any data set, all depending on how one frames the problem. Given one million different interpretations for the data, which one should *you* believe?
Why the ambiguity? Take the textbook problem of determining whether a coin is fair or weighted, based on the data obtained from, say, flipping it 10 times. Keep in mind, a principled approach to statistics decides the rule for decision-making before you see the data. So, what rule would you use for your decision? One rule is, "declare it's weighted, if either 10/10 flips are heads or 0/10 flips are heads." Another rule is, "always declare it to be weighted." Or, "always declare it to be fair." All in all, there are 11 possible outcomes (supposing we only care about the total number of heads) and therefore there are 2^11 possible decision rules. We can probably rule out most of them as nonsensical, like, "declare it to be weighted if 5/10 are heads, and fair otherwise," since 5/10 seems like the fairest outcome possible. But among the remaining possibilities, there is no obvious way to choose the "best" rule. After all, the performance of the rule, defined as the probability you will make the correct conclusion from the data, depends on the unknown state of the world, i.e. the true probability of flipping heads for that particular coin.
The Bayesian approach "cuts" the Gordian knot of choosing the best rule, by assuming a prior distribution over the unknown state of the world. Under this prior distribution, one can compute the average performance of any decision rule, and choose the best one. For example, suppose your prior is that with probability 99.9999%, the coin is fair. Then the best decision rule would be to "always declare it to be fair!"
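To make this concrete, here is a small sketch (the candidate rules, and the prior's "weighted" alternative of p = 0.9, are invented for illustration) that scores a few decision rules by their probability of reaching the correct conclusion, first as a function of the true heads-probability, and then averaged under the 99.9999%-fair prior:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n flips."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def accuracy(rule, p, n=10, fair_p=0.5):
    """Probability that `rule` (the set of head-counts on which we declare
    'weighted') reaches the correct conclusion when the true heads
    probability is p."""
    truth_weighted = (p != fair_p)
    correct = 0.0
    for k in range(n + 1):
        says_weighted = k in rule
        if says_weighted == truth_weighted:
            correct += binom_pmf(k, n, p)
    return correct

extreme = {0, 10}               # declare weighted only on 0/10 or 10/10 heads
always_weighted = set(range(11))
always_fair = set()

# Performance depends on the unknown state of the world p:
for p in (0.5, 0.9):
    print(p, accuracy(extreme, p), accuracy(always_fair, p))

# Average performance under a prior: fair with probability 0.999999,
# else (an assumption of this sketch) a coin with p = 0.9.
prior_fair = 0.999999
def bayes_score(rule):
    return (prior_fair * accuracy(rule, 0.5)
            + (1 - prior_fair) * accuracy(rule, 0.9))

best = max([extreme, always_weighted, always_fair], key=bayes_score)
print(best == always_fair)      # the prior makes "always declare fair" optimal
```

With the prior mass so overwhelmingly on "fair," the rule that never declares the coin weighted wins the average, exactly as claimed above.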
The Bayesian approach gives you the optimal decision rule for the problem, as soon as you come up with a model for the data and a prior for your model. But when you are looking at data analysis problems in the real world (as opposed to a probability textbook), the choice of model is rarely unambiguous. Hence, for me, the standard Bayesian approach does not go far enough--if there are a million models you could choose from, you still get a million different conclusions as a Bayesian.
Hence, one could argue that a "pragmatic" Bayesian who thinks up a new model for every problem is just as epistemologically suspect as any Frequentist. Only in the strongest form of subjective Bayesianism can one escape this ambiguity. The subjective Bayesian dream is to start out in life with a single model. A single prior. For the entire world. This "world prior" would contain the entirety of one's own life experience, and the grand total of human knowledge. Surely, writing out this prior is impossible. But the point is that a true Bayesian must behave (at least approximately) as if they were driven by such a universal prior. In principle, having such a universal prior (at least conceptually) solves the problem of choosing models and priors for problems: the priors and models you choose for particular problems are determined by the posterior of your universal prior. For example, why did you decide on a linear model for your economics data? It's because according to your universal posterior, your particular economic data is well-described by such a model with high probability.
The main practical consequence of the universal prior is that your inferences in one problem should be consistent with your inferences in another, related problem. Even if the subjective Bayesian never writes out a "grand model", their integrated approach to data analysis for related problems still distinguishes their approach from the piecemeal approach of frequentists, who tend to treat each data analysis problem as if it occurs in an isolated universe. (So I claim, though I cannot point to any real example of such a subjective Bayesian.)
Yet, even if the subjective Bayesian ideal could be realized, many philosophers of science (e.g. Deborah Mayo) would consider it just as ambiguous as non-Bayesian approaches, since even if you have an unambiguous procedure for forming personal priors, your priors are still going to differ from mine. I don't consider this a defect, since my worldview necessarily does differ from yours. My ultimate goal is to make the best decision for myself. That said, such egocentrism, even if rationally motivated, may indeed be poorly suited for a collaborative enterprise like science.
For me, the far more troublesome objection to the "Bayesian dream" is the question, "How would you actually go about constructing this prior that represents all of your beliefs?" Looking in the Bayesian literature, one does not find any convincing examples of any user of Bayesian inference managing to actually encode all (or even a tiny portion) of their beliefs in the form of the prior--in fact, for the most part, we see alarmingly little thought or justification being put into the construction of the priors.
Nevertheless, I myself remained one of these "hardcore Bayesians", at least from a philosophical point of view, ever since I started learning about statistics. My faith in the "Bayesian dream" persisted even after spending three years in the Ph. D. program at Stanford (a department with a heavy bias towards Frequentism) and even after I personally started doing research on frequentist methods. (I see frequentist inference as a poor man's approximation to the ideal Bayesian inference.) Though I was aware of the Bayesian non-consistency results, I largely dismissed them as mathematical pathologies. And while we were still a long way from achieving universal inference, I held the optimistic view that improved technology and theory might one day finally make the "Bayesian dream" achievable. However, I could not find a way to ignore one particular example on Wasserman's blog [3], due to its relevance to very practical problems in causal inference. Eventually I thought of an even simpler counterexample, which devastated my faith in the possibility of constructing a universal prior. Perhaps a fellow Bayesian can find a solution to this quagmire, but I am not holding my breath.
The root of the problem is the extreme degree of ignorance we have about our world, the degree of surprisingness of many true scientific discoveries, and the relative ease with which we accept these surprises. If we consider this behavior rational (which I do), then the subjective Bayesian is obligated to construct a prior which captures this behavior. Yet, the diversity of possible surprises the model must be able to accommodate makes it practically impossible (if not mathematically impossible) to construct such a prior. The alternative is to reject all possibility of surprise, and refuse to update any faster than a universal prior would (extremely slowly), which strikes me as a rather poor epistemological policy.
In the rest of the post, I'll motivate my example, sketch out a few mathematical details (explaining them as best I can to a general audience), then discuss the implications.
Introduction: Cancer classification
Biology and medicine are currently adapting to the wealth of information we can obtain by using high-throughput assays: technologies which can rapidly read the DNA of an individual, measure the concentration of messenger RNA, metabolites, and proteins. In the early days of this "large-scale" approach to biology which began with the Human Genome Project, some optimists had hoped that such an unprecedented torrent of raw data would allow scientists to quickly "crack the genetic code." By now, any such optimism has been washed away by the overwhelming complexity and uncertainty of human biology--a complexity which has been made clearer than ever by the flood of data--and replaced with a sober appreciation that in the new "big data" paradigm, making a discovery becomes a much easier task than understanding any of those discoveries.
Enter the application of machine learning to this large-scale biological data. Scientists take these massive datasets containing patient outcomes, demographic characteristics, and high-dimensional genetic, neurological, and metabolic data, and analyze them using algorithms like support vector machines, logistic regression and decision trees to learn predictive models to relate key biological variables, "biomarkers", to outcomes of interest.
To give a specific example, take a look at this abstract from the Shipp et al. paper on predicting survival for cancer patients [4]:
Diffuse large B-cell lymphoma (DLBCL), the most common lymphoid malignancy in adults, is curable in less than 50% of patients. Prognostic models based on pre-treatment characteristics, such as the International Prognostic Index (IPI), are currently used to predict outcome in DLBCL. However, clinical outcome models identify neither the molecular basis of clinical heterogeneity, nor specific therapeutic targets. We analyzed the expression of 6,817 genes in diagnostic tumor specimens from DLBCL patients who received cyclophosphamide, adriamycin, vincristine and prednisone (CHOP)-based chemotherapy, and applied a supervised learning prediction method to identify cured versus fatal or refractory disease. The algorithm classified two categories of patients with very different five-year overall survival rates (70% versus 12%). The model also effectively delineated patients within specific IPI risk categories who were likely to be cured or to die of their disease. Genes implicated in DLBCL outcome included some that regulate responses to B-cell−receptor signaling, critical serine/threonine phosphorylation pathways and apoptosis. Our data indicate that supervised learning classification techniques can predict outcome in DLBCL and identify rational targets for intervention.
The term "supervised learning" refers to any algorithm for learning a predictive model for predicting some outcome Y(could be either categorical or numeric) from covariates or features X. In this particular paper, the authors used a relatively simple linear model (which they called "weighted voting") for prediction.
A linear model is fairly easy to interpret: it produces a single "score variable" via a weighted average of a number of predictor variables. Then it predicts the outcome (say "survival" or "no survival") based on a rule like, "Predict survival if the score is larger than 0." Yet, far more advanced machine learning models have been developed, including "deep neural networks" which are winning all of the image recognition and machine translation competitions at the moment. These "deep neural networks" are especially notorious for being difficult to interpret. Along with similarly complicated models, neural networks are often called "black box models": although you can get miraculously accurate answers out of the "box", peering inside won't give you much of a clue as to how it actually works.
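As a minimal sketch of how such a linear "score" classifier operates (the gene-expression values and weights below are invented for illustration, not the actual Shipp et al. "weighted voting" weights):

```python
import numpy as np

# Hypothetical expression levels for 4 genes in 3 patients (rows),
# and hypothetical weights learned from a training sample.
X = np.array([[1.2, -0.3, 0.8, 2.1],
              [-0.5, 1.7, -1.1, 0.2],
              [0.3, 0.4, 0.1, -0.9]])
w = np.array([0.5, -1.0, 0.7, 0.3])

score = X @ w                        # one weighted-average score per patient
prediction = np.where(score > 0, "survival", "no survival")
print(score)
print(prediction)
```

The whole model is the weight vector `w` plus the threshold at 0, which is what makes a linear classifier easy to inspect, in contrast to the "black box" models discussed above.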
Now it is time for the first thought experiment. Suppose a follow-up paper to the Shipp paper reports dramatically improved prediction for survival outcomes of lymphoma patients. The authors of this follow-up paper trained their model on a "training sample" of 500 patients, then used it to predict the five-year outcome of chemotherapy patients, on a "test sample" of 1000 patients. It correctly predicts the outcome ("survival" vs "no survival") on 990 of the 1000 patients.
Question 1: what is your opinion on the predictive accuracy of this model on the population of chemotherapy patients? Suppose that publication bias is not an issue (the authors of this paper designed the study in advance and committed to publishing) and suppose that the test sample of 1000 patients is "representative" of the entire population of chemotherapy patients.
Question 2: does your judgment depend on the complexity of the model they used? What if the authors used an extremely complex and counterintuitive model, and cannot even offer any justification or explanation for why it works? (Nevertheless, their peers have independently confirmed the predictive accuracy of the model.)
A Frequentist approach
The Frequentist answer to the thought experiment is as follows. The accuracy of the model is a probability p which we wish to estimate. The number of successes on the 1000 test patients is Binomial(p, 1000). Based on the data, one can construct a confidence interval: say, we are 99% confident that the accuracy is above 83%. What does 99% confident mean? I won't try to explain, but simply say that in this particular situation, "I'm pretty sure" that the accuracy of the model is above 83%.
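The "83%" above is an illustrative figure; as one concrete (and deliberately conservative) recipe, a Hoeffding-style bound gives a one-sided 99% lower confidence bound from nothing but the error count. This is a sketch of one such interval, not the only or sharpest frequentist construction:

```python
from math import log, sqrt

successes, n = 990, 1000
p_hat = successes / n            # observed accuracy 0.99

# Hoeffding's inequality: P(p_hat - p >= t) <= exp(-2 n t^2).
# Setting exp(-2 n t^2) = 0.01 yields a one-sided 99% lower bound.
alpha = 0.01
t = sqrt(log(1 / alpha) / (2 * n))
lower_bound = p_hat - t
print(round(lower_bound, 4))     # roughly 0.942
```

Tighter exact constructions (e.g. Clopper-Pearson) exist; the point is only that the frequentist guarantee takes this "confident the accuracy exceeds some bound" form.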
A Bayesian approach
The Bayesian interjects, "Hah! You can't explain what your confidence interval actually means!" He puts a uniform prior on the probability p. The posterior distribution of p, conditional on the data, is Beta(991, 11). This gives a 99% credible interval that p is in [0.978, 0.995]. You can actually interpret the interval in probabilistic terms, and it gives a much tighter interval as well. Seems like a Bayesian victory...?
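The numbers above are easy to check numerically; here is a Monte Carlo sketch (a Beta quantile function would give the interval directly, and the quoted [0.978, 0.995] rounds from those exact quantiles):

```python
import numpy as np

rng = np.random.default_rng(0)

# Uniform prior + 990 successes and 10 failures out of 1000
# => posterior is Beta(991, 11).
draws = rng.beta(991, 11, size=1_000_000)
lo, hi = np.quantile(draws, [0.005, 0.995])
print(lo, hi)   # a central 99% credible interval for p
```

The Monte Carlo quantiles agree with the quoted interval to about three decimal places.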
A subjective Bayesian approach
As I have argued before, a Bayesian approach which comes up with a model after hearing about the problem is bound to suffer from the same inconsistency and arbitrariness as any non-Bayesian approach. You might assume a uniform distribution for p in this problem... but what if yet another paper comes along with a similar prediction model? You would need a joint distribution for the current model and the new model. What if a theory comes along that could help explain the success of the current method? The parameter p might take on a new meaning in this context.
So as a subjective Bayesian, I argue that slapping a uniform prior on the accuracy is the wrong approach. But I'll stop short of actually constructing a Bayesian model of the entire world: let's say we want to restrict our attention to this particular issue of cancer prediction. We want to model the dynamics behind cancer and cancer treatment in humans. Needless to say, the model is still ridiculously complicated. However, I don't think it's out of reach of the efforts of a well-funded, large collaborative effort of scientists.
Roughly speaking, the model can be divided into a distribution over theories of human biology, and conditional on the theory of biology, a coarse-grained model of an individual patient. The model would not include every cell, every molecule, etc., but it would contain many latent variables in addition to the variables measured in any particular cancer study. Let's call the variables actually measured in the study, X, and also the survival outcome, Y.
Now here is the epistemologically correct way to answer the thought experiment. Take a look at the X's and Y's of the patients in the training and test set. Update your probabilistic model of human biology based on the data. Then take a look at the actual form of the classifier: it's a function f() mapping X's to Y's. The accuracy of the classifier is no longer a parameter: it's a quantity Pr[f(X) = Y] which has a distribution under your posterior. That is, for any given "theory of human biology", Pr[f(X) = Y] has a fixed value: now, over the distribution of possible theories of human biology (based on the data of the current study as well as all previous studies and your own beliefs), Pr[f(X) = Y] has a distribution, and therefore an average. But what will this posterior give you? Will you get something similar to the interval [0.978, 0.995] you got from the "practical Bayes" approach?
Who knows? But I would guess in all likelihood not. My guess is that you would get a very different interval from [0.978, 0.995], because in this complex model there is no direct link between the empirical success rate of prediction and the quantity Pr[f(X) = Y]. But my intuition for this comes from the following simpler framework.
A non-parametric Bayesian approach
Instead of reasoning about a grand Bayesian model of biology, I now take a middle ground, and suggest that while we don't need to capture the entire latent dynamics of cancer, we should at the very least try to include the X's and the Y's in the model, instead of merely abstracting the whole experiment as a Binomial trial (as did the frequentist and the pragmatic Bayesian). Hence we need a prior over joint distributions of (X, Y). And yes, I do mean a prior distribution over probability distributions: we are saying that (X, Y) has some unknown joint distribution, which we treat as being drawn at random from a large collection of distributions. This is therefore a non-parametric Bayes approach: the term non-parametric means that the number of parameters in the model is not finite.
Since in this case Y is a binary outcome, a joint distribution can be decomposed as a marginal distribution over X, and a function g(x) giving the conditional probability that Y=1 given X=x. The marginal distribution is not so interesting or important for us, since it simply reflects the composition of the population of patients. For the purpose of this example, let us say that the marginal is known (e.g., a finite distribution over the population of US cancer patients). What we want to know is the probability of patient survival, and this is given by the function g(X) for the particular patient's X. Hence, we will mainly deal with constructing a prior over g(x).
To construct a prior, we need to think of intuitive properties of the survival probability function g(x). If x is similar to x', then we expect the survival probabilities to be similar. Hence the prior on g(x) should be over random, smooth functions. But we need to choose the smoothness so that the prior does not consist of almost-constant functions. Suppose for now that we choose a particular class of smooth functions (e.g. functions with a certain Lipschitz norm) and choose our prior to be uniform over functions of that smoothness. We could go further and put a prior on the smoothness hyperparameter, but for now we won't.
Now, although I assert my faithfulness to the Bayesian ideal, I still want to think about how whatever prior we choose would allow us to answer some simple thought experiments. Why is that? I hold that the ideal Bayesian inference should capture and refine what I take to be "rational behavior." Hence, if a prior produces irrational outcomes, I reject that prior as not reflecting my beliefs.
Take the following thought experiment: we simply want to estimate the expected value of Y, E[Y]. Hence, we draw 100 patients independently with replacement from the population and record their outcomes: suppose the sum is 80 out of 100. The Frequentist (and pragmatic Bayesian) would end up concluding that with high probability/confidence/whatever, the expected value of Y is around 0.8, and I would hold that an ideal rationalist would come to a similar belief. But what would our non-parametric model say? We would draw a random function g(x) conditional on our particular observations: we get a quantity E[g(X)] for each instantiation of g(x), and the distribution of E[g(X)]'s over the posterior allows us to make credible intervals for E[Y].
But what do we end up getting? One of two things happens. Either you choose too little smoothness, and E[g(X)] ends up concentrating at around 0.5, no matter what data you put into the model. This is the phenomenon of Bayesian non-consistency, and a detailed explanation can be found in several of the listed references; but to put it briefly, sampling at a few isolated points gives you too little information on the rest of the function. This example is not as pathological as the ones used in the literature: if you sample infinitely many points, you will eventually get the posterior to concentrate around the true value of E[Y], but all the same, the convergence is ridiculously slow. Or alternatively, you choose a super-high smoothness, and the posterior of E[g(X)] has a nice interval around the sample value just like in the Binomial example. But now if you look at your posterior draws of g(x), you'll notice the functions are basically constants. Putting a prior on smoothness doesn't change things: the posterior on smoothness doesn't move, since you don't actually have enough data to determine the smoothness of the function. The posterior average of E[g(X)] is no longer always 0.5: it gets a little bit affected by the data, since within the 10% of posterior mass corresponding to the smooth prior, the average of E[g(X)] is responding to the data. But you are still almost as slow as before in converging to the truth.
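The dilemma already shows up in the prior, before any data arrives. Here is a simulation sketch, with Gaussian process draws squashed through a logistic function standing in for "uniform over smooth functions" (the lengthscales and squashing are my invented stand-ins, not a canonical construction):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)

def sample_mean_of_g(lengthscale, n_draws=200):
    """Draw random functions g = sigmoid(3f), where f is a Gaussian process
    with an RBF kernel of the given lengthscale, and return E[g(X)] for
    each draw (X uniform on the grid)."""
    K = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / lengthscale**2)
    L = np.linalg.cholesky(K + 1e-6 * np.eye(len(x)))  # jitter for stability
    f = L @ rng.standard_normal((len(x), n_draws))     # one draw per column
    g = 1 / (1 + np.exp(-3 * f))                       # squash into (0, 1)
    return g.mean(axis=0)                              # E[g(X)] per draw

rough = sample_mean_of_g(lengthscale=0.01)   # very wiggly prior draws
smooth = sample_mean_of_g(lengthscale=1.0)   # nearly constant prior draws

print(rough.std(), smooth.std())
```

With the wiggly prior, E[g(X)] is glued near 0.5 across draws (the wiggles average out), so no modest dataset can move it far; with the very smooth prior, E[g(X)] spreads over (0, 1) but the draws are essentially constant functions. This mirrors the two failure modes described above.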
At the time that I started thinking about the above "uniform sampling" example, I was still convinced of a Bayesian resolution. Obviously, using a uniform prior over smooth functions is too naive: you can tell by seeing that the prior distribution of E[g(X)] is already highly concentrated around 0.5. How about a hierarchical model, where first we draw a parameter p from the uniform distribution, and then draw g(x) from the uniform distribution over smooth functions with mean value equal to p? This gets you non-constant g(x) in the posterior, while your posteriors of E[g(X)] converge to the truth as quickly as in the Binomial example. Arguing backwards, I would say that such a prior comes closer to capturing my beliefs.
But then I thought, what about more complicated problems than computing E[Y]? What if you have to compute the expectation of Y conditional on some complicated function of X taking on a certain value: i.e. E[Y|f(X) = 1]? In the frequentist world, you can easily estimate E[Y|f(X)=1] by rejection sampling: get a sample of individuals, and average the Y's of the individuals whose X's satisfy f(X) = 1. But how could you formulate a prior that has the same property? For a finite collection of functions, say {f1,...,f100}, you might be able to construct a prior for g(x) so that the posterior for E[g(X)|fi(X) = 1] converges to the truth for every i in {1,...,100}. I don't know how to do so, but perhaps you know. But the frequentist intervals work for every function f! Can you construct a prior which can do the same?
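Here is what the frequentist rejection-sampling recipe looks like in a toy version of this setup (the population, the survival function g, and the classifier f below are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def g(x):
    """Hypothetical true survival probability as a function of x."""
    return 0.2 + 0.6 * x

def f(x):
    """An arbitrary classifier-like function of the covariate."""
    return (x > 0.7).astype(int)

# Sample a population, keep only individuals with f(X) = 1,
# and average their outcomes.
X = rng.uniform(0, 1, size=100_000)
Y = rng.binomial(1, g(X))
estimate = Y[f(X) == 1].mean()

# For this g and f, E[Y | f(X)=1] = 0.2 + 0.6 * E[X | X > 0.7] = 0.71.
print(estimate)
```

The same three lines work no matter which f you plug in, which is exactly the universality that a single fixed prior on g struggles to reproduce.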
I am happy to argue that a true Bayesian would not need consistency for every possible f in the mathematical universe. It is cool that frequentist inference works for such a general collection: but it may well be unnecessary for the world we live in. In other words, there may be functions f which are so ridiculous, that even if you showed me that empirically, E[Y|f(X)=1] = 0.9, based on data from 1 million patients, I would not believe that E[Y|f(X)=1] was close to 0.9. It is a counterintuitive conclusion, but one that I am prepared to accept.
Yet, the set of f's which are not so ridiculous, which in fact I might accept as reasonable based on conventional science, may be so large as to render impossible the construction of a prior which could accommodate them all. And the Bayesian dream makes the far stronger demand that our prior capture not just our current understanding of science, but match the flexibility of rational thought itself. I hold that given the appropriate evidence, rationalists can be persuaded to accept truths which they could not even imagine beforehand. Thinking about how we could possibly construct a prior to mimic this behavior, the Bayesian dream seems distant indeed.
Discussion
To be updated later... perhaps responding to some of your comments!
[1] Diaconis and Freedman, "On the Consistency of Bayes Estimates"
[2] ET Jaynes, Probability: the Logic of Science
[3] https://normaldeviate.wordpress.com/2012/08/28/robins-and-wasserman-respond-to-a-nobel-prize-winner/
[4] Shipp et al. "Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning." Nature
[Link] Rationality and Mental Illness in the Huffington Post
Just published an article in The Huffington Post about using rationality-informed strategies to manage my mental illness. Hope this helps people think more rationally about this topic.
The application of the secretary problem to real life dating
The following problem is best described not by me, but by Wikipedia:
https://en.wikipedia.org/wiki/Secretary_problem
Although there are many variations, the basic problem can be stated as follows:
There is a single secretarial position to fill.
There are n applicants for the position, and the value of n is known.
The applicants, if seen altogether, can be ranked from best to worst unambiguously.
The applicants are interviewed sequentially in random order, with each order being equally likely.
Immediately after an interview, the interviewed applicant is either accepted or rejected, and the decision is irrevocable.
The decision to accept or reject an applicant can be based only on the relative ranks of the applicants interviewed so far.
The objective of the general solution is to have the highest probability of selecting the best applicant of the whole group. This is the same as maximizing the expected payoff, with payoff defined to be one for the best applicant and zero otherwise.
Application
After reading that you can probably see the application to real life. A series of assumptions follows; some are fair, some will not be representative of you. I am going to try to name them all as I go so that you can adapt them for yourself. Assume that you plan to have children, and that, like billions of humans so far, you will probably do so within a monogamous marriage (the set of assumptions does not break down for poly relationships or relationship anarchy, but it gets more complicated). These assumptions help us populate the secretary problem with numbers, in relation to dating for the purpose of having children.
If you assume that a biological female's clock ends at 40 (in that it's hard, and not healthy for the baby, if you try to have a kid past that age), that is effectively the end of the pure and simple biological purpose of relationships (environment, IVF and adoption aside for a moment; yes, there are a few more years on that).
For the purpose of this exercise – as a guy – you can add a few years for the potential age gap you would tolerate. (i.e. my parents are 7 years apart, but that seems like a big understanding and maturity gap – they don't even like the same music), I personally expect I could tolerate an age gap of 4-5 years.
If you make the assumption that you start your dating life around the ages of 16-18, that gives you about 22-24 years [40-18=22, up to 40-16=24] (+5 for me as a male) of expected dating potential time.
If you estimate the number of kids you want to have, and count either:
3 years for each kid OR
2 years for each kid (+1 kid – AKA 2 years)
(Twins will throw this number off, but estimate that they take longer to recover from, or more time raising them to manageable age before you have time to have another kid)
My worked example is myself – as a child of 3, with two siblings of my own I am going to plan to have 3 children. Or 8-9 years of child-having time. If we subtract that from the number above we end up with 11-16 (16-21 for me being a male) years of dating time.
Also if you happen to know someone with a number of siblings (or children) and a family dynamic that you like; then you should consider that number of children for yourself. Remember that as a grown-up you are probably travelling through the world with your siblings beside you. Which can be beneficial (or detrimental) as well, I would be using the known working model of yourself or the people around you to try to predict whether you will benefit or be at a disadvantage by having siblings. As they say; You can't pick your family - for better and worse. You can pick your friends, if you want them to be as close as a default family - that connection goes both ways - it is possible to cultivate friends that are closer than some families. However you choose to live your life is up to you.
Assume that once you find the right person - getting married (the process of organising a wedding from the day you have the engagement rings on fingers); and falling pregnant (successfully starting a viable pregnancy) takes at least a year. Maybe two depending on how long you want to be "we just got married and we aren't having kids just yet". It looks like 9-15 (15-20 for male adjusted) years of dating.
With my 9-15 years; I estimate a good relationship of working out whether I want to marry someone, is between 6 months and 2 years, (considering as a guy I will probably be proposing and putting an engagement ring on someone's finger - I get higher say about how long this might take than my significant other does.), (This is about the time it takes to evaluate whether you should put the ring on someone's finger). For a total of 4 serious relationships on the low and long end and 30 serious relationships on the upper end. (7-40 male adjusted relationships)
Of course that's not how real life works. Some relationships will be longer and some will be shorter. I am fairly confident that all my relationships will fall around those numbers.
I have a lucky circumstance; I have already had a few serious relationships (substitute your own numbers in here). With my existing relationships I can estimate how long I usually spend in a relationship: ((2 years + 6 years + 2 months + 2 months) / 4 ≈ 2.1 years). Which is to say that I probably have a maximum total of around 7-15 relationships before I gotta stop expecting to have kids, or start compromising on having 3 kids.
A solution to the secretary equation
A known solution, which selects the best possible candidate more often than any other rule of this kind, is to try out the first n/e candidates (roughly 37% of them), then choose the next candidate who is better than all of those. For my numbers that means going through 3-7 relationships and then choosing the next relationship that is better than all the ones before.
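A quick simulation of the 1/e rule (a sketch; candidate quality is just a random ordering of ranks, with n = 15 picked to match the numbers above):

```python
import random

random.seed(0)

def secretary_trial(n):
    """Run one round of the 1/e rule on a random ordering of n candidates;
    return True if the best candidate (rank n-1) was chosen."""
    ranks = list(range(n))               # n-1 is the best candidate
    random.shuffle(ranks)
    cutoff = round(n / 2.718281828)      # reject the first ~n/e candidates
    best_seen = max(ranks[:cutoff], default=-1)
    for r in ranks[cutoff:]:
        if r > best_seen:
            return r == n - 1            # stop at the first improvement
    return ranks[-1] == n - 1            # forced to take the last candidate

n = 15
trials = 50_000
wins = sum(secretary_trial(n) for _ in range(trials))
print(wins / trials)                     # close to the theoretical ~0.39
```

Even under the optimal rule, you only end up with the very best candidate a bit more than a third of the time, which motivates the caveats below.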
I don't quite like that. The catch is the chance that the best candidate falls inside the observation phase: in that case no later candidate ever beats your benchmark, and the strategy walks you all the way to the last person in the set and settles on them - which is another opportunity cost risk. What if they are rubbish? Compromise on the age gap, the number of kids, or the partner's quality... And that chance is not small: the best candidate lands in the first 1/e of trials roughly 1/e of the time regardless of how big the set is (for a set of 7 with 2 observed, 2/7 ≈ 29%).
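The 1/e rule above is easy to check with a quick Monte Carlo sketch. Candidate qualities are drawn at random, and the numbers it produces are simulation estimates, not guarantees:

```python
import random

def one_over_e_rule(n, trials=200_000, seed=0):
    """Classic secretary strategy: observe the first n/e candidates,
    then take the first later candidate who beats all of them
    (you are stuck with the last candidate if nobody ever does)."""
    rng = random.Random(seed)
    r = max(1, int(n / 2.718281828))  # length of the observation phase
    picked_best = forced_settle = 0
    for _ in range(trials):
        scores = [rng.random() for _ in range(n)]
        threshold = max(scores[:r])
        for i in range(r, n):
            if scores[i] > threshold or i == n - 1:
                picked_best += scores[i] == max(scores)
                # settled: reached the end without anyone beating the phase
                forced_settle += i == n - 1 and scores[i] <= threshold
                break
    return picked_best / trials, forced_settle / trials

print(one_over_e_rule(7))  # roughly (0.41, 0.29)
```

For a set of 7 this picks the best candidate about 41% of the time, and about 29% of the time it force-settles on the last person - exactly the runs where the best candidate was burned in the observation phase.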
Opportunity cost
Each further relationship might cost you another 2 years and leave you further out of touch with the next generation (kids these days!). I tend to think about how old I will be when my kids are 15-20 - am I growing rapidly out of touch with the next younger generation? Two years is a very big opportunity cost: another 2 years could see you successfully running a startup and achieving lifelong stability, at the cost of the opportunity to have another kid. I don't say this to crush you with fear of inaction, but it should factor in along with the other details of your situation.
A solution to the risk of having the best candidate in your test phase, or to the risk of lost opportunity, is to lower the bar: instead of choosing the next candidate who is better than all the candidates so far, choose the next candidate who is better than 90% of them. Incidentally this probably happens in real life quite often, in a stroke of "you'll do"...
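The lowered bar can be simulated too. For a pool as small as 7 a 90% bar collapses into the strict rule, so this sketch uses a larger pool of 30 to make the effect visible; the 90% figure comes from the paragraph above, everything else (random qualities, pool size) is an assumption for illustration:

```python
import random

def secretary(n, bar=1.0, trials=20_000, seed=1):
    """Observe the first n/e candidates, then accept the first later
    candidate who beats at least a fraction `bar` of everyone seen so
    far. bar=1.0 is the classic rule, bar=0.9 the lowered one."""
    rng = random.Random(seed)
    r = max(1, int(n / 2.718281828))
    picked_best = forced_settle = 0
    for _ in range(trials):
        scores = [rng.random() for _ in range(n)]
        for i in range(r, n):
            beaten = sum(s < scores[i] for s in scores[:i])
            if beaten >= bar * i or i == n - 1:
                picked_best += scores[i] == max(scores)
                forced_settle += i == n - 1 and beaten < bar * i
                break
    return picked_best / trials, forced_settle / trials

strict = secretary(30, bar=1.0)
lowered = secretary(30, bar=0.9)
```

In runs like this the lowered bar sharply cuts the forced-settle rate, typically at some cost to how often you land the absolute best - which is exactly the trade being described.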
Where it breaks down
Real life is more complicated than that. I would like to think that subsequent relationships will not suffer the stupid mistakes of the last ones. There is also the potential opportunity cost of exploration: the more time you spend looking for different partners, the more you risk losing your early soul mate, or wasting time looking for a better one when you could follow a "good enough" policy. No one likes to know they are "good enough", but we do race the clock in our lifetimes. Life is what happens when you are busy making plans.
As anyone with experience will know, we probably test and rule out bad partners in a single conversation, where we don't even get as far as a date, or don't last more than a week (i.e. the experience set is growing through various means).
People have a tendency to overrate the quality of a relationship while they are in it, versus the ones that already failed.
Did I do something wrong?
“I got married early - did I do something wrong (or irrational)?”
No. Equations are not real life. It might have been nice to have the equation, but you obviously didn't need it. Also, this equation assumes monogamous, sequential relationships. In real life people have overlapping relationships: you can date a few people at once, and you can be poly. These are all factors that change the simple assumptions of the equation.
Where does the equation stop working?
Real life is hard. It doesn't fall neatly into line, it’s complicated, it’s ugly, it’s rough and smooth and clunky. But people still get by. Don’t be afraid to break the rule.
Disclaimer: If this equation is the only thing you are using to evaluate a relationship - it’s not going to go very well for you. I consider this and many other techniques as part of my toolbox for evaluating decisions.
Should I break up with my partner?
What? No! Following an equation is not a good way to live your life.
Does your partner make you miserable? Then yes you should break up.
Do you feel like they are not ready to have kids yet while you want to settle down? Tough call. Even if you were both agents following the equation, an equation is not real life. Go by your brain; go by your gut. Don't go by just one equation.
Expect another post soon about reasonable considerations that should be made when evaluating relationships.
The given problem assumes that you can evaluate partners the way the secretary problem expects. Humans are not all that strategic and can't really do that. This is why the world will not perfectly follow this equation. Life is complicated; there are several metrics that make a good partner, and they don't always trade off against one another.
----------
Meta: writing time - 3 hours over a week; 5+ conversations with people about the idea, bothering a handful of programmers and mathematicians for commentary on my thoughts, and generally a whole bunch of fun talking about it. This post was started on the slack channel when someone asked a related question.
My table of contents for other posts in my series.
Let me know if this post was helpful, whether it worked for you, and why or why not.
The Temptation to Bubble
"Never discuss religion or politics."
I was raised in a large family of fundamentalist Christians. Growing up in my house, where discussing politics and religion was the main course of life, the above proverb was said often -- as an expression of regret, shock, or self-flagellation. Later, the experience impressed on me a deep lesson about the bubbling-up that even intelligent and rational people fall into. And I... I am often tempted, so tempted, to give in.
Religion and political identity were the languages of love in my house. Affirming the finer points of a friend's identical values was a natural ritual, like sharing coffee or a meal together, and so soothing we attributed the afterglow to God himself. We can use some religious nonsense to illustrate, but please keep in mind, there's a much more interesting point here than "certain religious views are wrong".
A point of controversy was an especially excellent topic of mutual comfort. How could anyone else be *so* stupid as to believe we came from monkeys and monkeys came from *nothing*! that exploded a gazillion years ago, especially given all the young earth creation evidence that they stubbornly ignored. They obviously just wanted to sin and needed an excuse. Agreeing about something like this, you both felt smarter than the hostile world, and you had someone to help defend you against that hostility. We invented byzantine scaffolding for our shared delusions to keep the conversation interesting and to agree with each other in ever more creative ways. We esteemed each other, and ourselves, much more.
This safety bubble from the real world would allow denial of anything too painful. Losing a loved one to cancer? God will heal them. God mysteriously decided not this time? They're in Heaven. Did your incredible stupidity lose you your job, your wife, your reputation? God would forgive you and rescue you from the consequences. You could probably find a Bible verse to justify anything you're doing. Ironically, this artificial shell of safety, which kept us from ever facing the pain and finality reality often has, made us all the more fragile inside. The bubble became necessary to psychologically survive.
In this flow of happy mirror-neuron dances, a minor disagreement felt like a slap. The shock afterward burned harder than a hand-print across the face.
Twenty-five years and what seems like 86 billion light-years of questioning, testing, and learning later, I can see that even beyond religion, people fall into bubbles easily. The political conservatives only post articles from conservative blogs. The liberals post from liberal news sources. Neither has ever gone hunting on the opposing side for ways to test their own beliefs, even once. Ever debate someone over a bill that they haven't even read? All their info comes from the pravda wing of their preferred political party / street gang; none of it is first-hand knowledge. They're in a bubble.
Three of the most popular religions that worship the same God will each tell you the others are counterfeits, despite the shared moral codes, values, rituals and traditions. Apple fanboys wholesale swallowed the lies about their OS and machines being immune to viruses, without ever having read one article of an IT security blog. It's not just confirmation bias at work: people live in an artificial bubble of information sources that affirm their identity, soothe their egos, and never test any idea they have. Scientific controversies create bubbles no less. But it doesn't even take a controversy, just a preferred source of information -- news, blogs, books, authors. Even when such sources attempt to present an idea or argument from those who disagree, they do not present it with sufficient force.
Even Google will gladly do this for you, customizing your search results by location, demographic, past searches, and so on to filter out things you may not want to see - a convenient invisible bubble even if you don't want it!
If you're rational, there's daily work to break the bubbles by actually looking for ways to test the beliefs you care about. The more you care about them, the more they should be tested.
Problem is, the bigger our information sharing capabilities are, the harder it is to find quality information. Facebook propaganda posts get repeated over and over. Re-tweets. Blog reposts. Academic "science" papers that have never been replicated, but are in the news headlines everywhere. The more you actually dig into the agitprop looking for a few gems, or at least sources of interesting information, the more you realize even the questions have been framed wrongly, especially over controversial things. Until I have searched for high-quality evidence about a thing, I resign myself to "no opinion" - at least until I care enough to do the work.
And now you don't fit in anyone's bubble. Not in politics, not in religion, not even in technical arenas where people bubble up also. Take politics ... it's not that I'm a liberal and I miss the company of my conservative friends, or the other way around. Like the "underground man" I feel I actually understand the values and arguments from both sides, leading to wanting to tear the whole system apart and invent new ways or angles of addressing the problems.
But try to have a conversation, for example, about the trade-offs of the huge military superiority the US has created: the costs and murder versus eventually conceding dominance to who knows whom - as they say, you either wear the merciless boot or live with it on your neck. Approach the topic this way, and you may be seen as a weak peacenik who dishonors our hero troops, or as a monster who gladly trades blood for oil; you're not even understood as having no firm conclusion.
Okay, so don't throw your pearls before swine you say. But you know, you're going to have to do it quite a few times just to find out where the pig-pen ends and information close to the raw sources and unbiased data begin. If you want to hear interesting new ideas from other minds, you're going to have to accept that they are biased and often come from inside their bubble. If you want to test your own beliefs, actively seek to disprove what you think, you will have to wade through oceans of bullshit and agitprop to find the one pearl that shifts your awareness. There is no getting around the work.
Then there are these kinds of situations: my father has also left the fundamentalist fold, but he has gone deeply into New Age mysticism instead of the more skeptical route I've taken. I really want to preserve our closeness and friendship. I know I can't change his mind, but he really likes to talk about this stuff, so to stay close I should really try hard to understand his perspective and ideas. But when I ask him to define terms like "higher consciousness", or to explain experiences of "higher awareness", or try to understand the predictions about the human "evolutionary" steps coming up, he falls back to "it can't be described" or "it's beyond our present intelligence to grasp" or even "beyond rational thought to understand". So I can artificially nod along, not understanding a damn word of it, or I can try to get some kind of hook into his ideas and totally burst his bubble without even trying. Bursting someone's bubble is not cool. If you burst their bubble, they will cry, if only inwardly. Burst their bubble, and they will try to burst yours - not to help you, but from pain.
Problem is, while trying to burst your own bubble, you end up breaking everyone else's bubbles left and right.
There is the temptation to seek out your own bubble just for temporary comfort ... just how many skeptical videos about SpiritScience or creationism or religion am I going to watch? The scale of evidence is already tipped so far, investing more time to learn more details that nudge it 0.0001% toward 100% isn't about anything other than emotional soothing. Emotional soothing is dangerous; it's reinforcing my bubbles that I will now have to work all the harder to burst, to test, and to train myself to have no emotional investment in any provisional belief.
But it is so, so tempting, when you see yet another propaganda post for the republicrips or bloodocrat gang, vast scientific conspiracy posts, watch your friends and family shut down mid-conversation, so tempting to go read another Sagan book that teaches me nothing new but makes me feel good about my current provisional beliefs. It's tempting to think about blocking friends who run a pravda outlet over facebook, or even shut down your facebook account. It's tempting to give up on family in their own bubble and artificially nod along to concepts that have no meaning.
To some extent, I am even giving in by writing this... I would like many other rationalists to feel the same way, affirm my perspective, and struggle with this too - and that reinforces my bubble, doesn't it? There are probably psychological limits and needs that make some degree of it unavoidable. We're compelled to eat, but if we give ourselves over to that instinct without regard or care, it will eventually kill us.
Don't bubble, don't give into the temptation, keep working to burst the bubbles that accrete around you. It's exhausting, it's painful, and it's the only thing keeping your eyes open to reality.
And friend, as you need it here and there, come here and I'll agree with you about something we both already have mountains of evidence for and almost none against. ;)
[LINK] Deep Learning Machine Teaches Itself Chess in 72 Hours
Lai has created an artificial intelligence machine called Giraffe that has taught itself to play chess by evaluating positions much more like humans and in an entirely different way to conventional chess engines.
Straight out of the box, the new machine plays at the same level as the best conventional chess engines, many of which have been fine-tuned over many years. On a human level, it is equivalent to FIDE International Master status, placing it within the top 2.2 percent of tournament chess players.
The technology behind Lai’s new machine is a neural network. [...] His network consists of four layers that together examine each position on the board in three different ways.
The first looks at the global state of the game, such as the number and type of pieces on each side, which side is to move, castling rights and so on. The second looks at piece-centric features such as the location of each piece on each side, while the final aspect is to map the squares that each piece attacks and defends.
[...]
Lai generated his dataset by randomly choosing five million positions from a database of computer chess games. He then created greater variety by adding a random legal move to each position before using it for training. In total he generated 175 million positions in this way.
[...]
One disadvantage of Giraffe is that neural networks are much slower than other types of data processing. Lai says Giraffe takes about 10 times longer than a conventional chess engine to search the same number of positions.
But even with this disadvantage, it is competitive. “Giraffe is able to play at the level of an FIDE International Master on a modern mainstream PC,” says Lai. By comparison, the top engines play at super-Grandmaster level.
[...]
Ref: arxiv.org/abs/1509.01549 : Giraffe: Using Deep Reinforcement Learning to Play Chess
Film about Stanislav Petrov
I searched around but didn't see any mention of this. There's a film being released next week about Stanislav Petrov, the man who saved the world.
The Man Who Saved the World
http://www.imdb.com/title/tt2277106/
Due for limited theatrical release in the USA on 18 September 2015.
http://themanwhosavedtheworldmovie.com/#seethemovie
Will show in New York, Los Angeles, Detroit, Portland.
Previous discussion of Stanislav Petrov:
http://lesswrong.com/lw/jq/926_is_petrov_day/
Flowsheet Logic and Notecard Logic
(Disclaimer: The following perspectives are based on my experience with policy debate, which is fifteen years out of date. The meta-level point should stand regardless.)
If you are not familiar with U.S. high school debate club ("policy debate" or "cross-examination debate"), here is the gist of it: two teams argue over a topic, and a judge determines who has won.
When we get into the details, there are a lot of problems with the format. Almost everything wrong with policy debate appears in this image:

This is a "flowsheet", and it is used to track threads of argument across the successive epochs of the debate round. The judge and the debaters keep their own flowsheets to make sense of what's going on.
I am sure that there is a skillful, positive way of using flowsheets, but I have never seen it used in any way other than the following:
After the Affirmative side lays out their proposal, the Negative throws out a shotgun blast of more-or-less applicable arguments drawn from their giant plastic tote containing pre-prepared arguments. The Affirmative then counters the Negative's arguments using their own set of pre-prepared counter-arguments. Crucially, all of the Negative arguments must be met. Look at the Flowsheet image again, and notice how each "argument" has an arrow which carries it rightward. If any of these arrows make it to the right side of the page - the end of the round - without being addressed, then the judge will typically consider the round to be won by the side who originated that arrow.
So it doesn't actually matter if an argument receives a good counterargument. It only matters that the other team has addressed it appropriately.
Furthermore, merely addressing the argument with an ad hoc counterargument is usually not sufficient. If the Negative makes an argument which contains five separate logical fallacies, and the Affirmative points all of these out and then moves on, the judge may not actually consider the Negative argument to have been refuted - because the Affirmative did not cite any Evidence.
Evidence, in policy debate, is a term of art, and it means "something printed out from a reputable media source and taped onto a notecard." You can't say "water is wet" in a policy debate round without backing it up with a notecard quoting a news source corroborating the wetness of water. So, skillfully pointing out those logical fallacies is meaningless if you don't have the Evidence to back up your claims.
Skilled policy debaters can be very good - impressively good - at the mental operations of juggling all these argument threads in their minds and pulling out the appropriate notecard evidence. My entire social circle in high school was composed of serious debaters, many of whom were brilliant at it.
Having observed some of these people for the ensuing decade, I sometimes suspect that policy debate damaged their reasoning ability. If I were entirely simplistic about it, I would say that policy debate has destroyed their ability to think and argue rationally. These people essentially still argue the same way, by mental flowsheet, acting as though argument can proceed only via notecard exchange. If they have addressed an argument, they consider it to be refuted. If they question an argument's source ("Wikipedia? Really?"), they consider it to be refuted. If their opponent ignores one of their inconsequential points, they consider themselves to have won. They do not seem to possess any faculty for discerning whether or not one argument actually defeats another. It is the equivalent of a child whose vision of sword fighting is focused on the clicking together of the blades, with no consideration for the intent of cutting the enemy.
Policy debate is to actual healthy argumentation as checkers is to actual warfare. Key components of the object being gamified are ignored or abstracted away until the remaining simulacrum no longer represents the original.
I actually see Notecard Logic and Flowsheet Logic everywhere. That's why I have to back off from my assertion that policy debate destroyed anybody's reasoning ability - I think it may have simply reinforced and hypertrophied the default human argumentation algorithm.
Flowsheet Logic is the tendency to think that you have defeated an argument because you have addressed it. It is the overall sense that you can't lose an argument as long as none of your opponent's statements go unchallenged, even if none of your challenges are substantial/meaningful/logical. It is the belief that if you can originate more threads of argument against your opponent than they can fend off, you have won, even if none of your arguments actually matters individually. I see Flowsheet Logic tendencies expressed all the time.
Notecard Logic is the tendency to treat evidence as binary. Either you have evidence to back up your assertion - even if that evidence takes the form of an article from [insert partisan rag] - or else you are just "making things up to defend your point of view". There is no concession to Bayesian updating, credibility, or degrees of belief in Notecard Logic. "Bob is a flobnostic. I can prove this because I can link you to an article that says it. So what if I can't explain what a flobnostic is." I see Notecard Logic tendencies expressed all the time.
Once you have developed a mental paintbrush handle for these tendencies, you may see them more as well. This awareness should allow you to discern more clearly whether you - or your interlocutor - or someone else entirely - is engaging in these practices. Hopefully this awareness paints a "negative space" of superior argumentation for you.
Proper posture for mental arts
I'd like to start by way of analogy. I think it'll make the link to rationality easier to understand if I give context first.
I sometimes teach the martial art of aikido. The way I was originally taught, you had to learn how to "feel the flow of ki" (basically life energy) through you and from your opponent, and you had to make sure that your movements - both physical and mental - were such that your "ki" would blend with and guide the "ki" of your opponent. Even after I stopped believing in ki, though, there were some core elements of the art that I just couldn't do, let alone teach, without thinking and talking in terms of ki flow.
A great example of this is the "unbendable arm". This is a pretty critical thing to get right for most aikido techniques. And it feels really weird. Most people when they first get it think that the person trying to fold their arm isn't actually pushing because it doesn't feel like effort to keep their arm straight. Many students (including me once upon a time) end up taking this basic practice as compelling proof that ki is real. Even after I realized that ki wasn't real, I still had to teach unbendable arm this way because nothing else seemed to work.
…and then I found anatomical resources like Becoming a Supple Leopard.
It turns out that the unbendable arm works when:
- your thoracic spine is in a non-kyphotic position
- your head isn't hanging forward (which would mimic the thoracic tension of kyphosis)
- your shoulder is rolled back and down enough for the part of your clavicle immediately above the sternoclavicular joint to stick out a bit (see here)
- your shoulder has slight tension in it from holding your elbow in a pointing-down position
That's it. If you do this correctly, you can relax most of your other arm muscles and still be able to resist pretty enormous force on your arm.
Why, you might ask? Well, from what I have gathered, this lets you engage your latissimus dorsi (pretty large back muscles) in stabilizing your elbow. There's also a bit of strategy where you don't actually have to fully oppose the arm-bender's strength; you just have to stabilize the elbow enough to be able to direct the push-down-on-elbow force into the push-up-on-wrist force.
But the point is, by understanding something about proper posture, you can cut literally months of training down to about ten minutes.
To oversimplify it a little bit, there are basically three things to get right about proper posture for martial arts (at least as I know them):
- You need to get your spine in the right position and brace it properly. (For the most part and for most people, this means tucking your pelvis, straightening your thoracic spine a bit, and tensing your abs a little.)
- You need to use your hip and shoulder ball-and-socket joints properly. (For the most part this seems to mean using them instead of your spine to move, and putting torque in them by e.g. screwing your elbow downward when reaching forward.)
- You need to keep your tissue supple & mobile. (E.g., tight hamstrings can pull your hips out of alignment and prevent you from using your hip joints instead of your mid-lumbar spine (i.e. waist) to bend over. Also, thoracic inflexibility usually locks people in thoracic kyphosis, making it extremely difficult to transfer force effectively between their lower body and their arms.)
My experience is that as people learn how to feel these three principles in their bodies, they're able to correct their physical postures whenever they need to, rather than having to wait for my seemingly magical touch to make an aikido technique suddenly really easy.
It's worth noting that this is mostly known, even in aikido dojos ("training halls"). They just phrase it differently and don't understand the mechanics of it. They'll say things like "Don't bend over; the other guy can pull you down if you do" and "Let the move be natural" and "Relax more; let ki flow through you freely."
But it turns out that getting the mechanical principles of posture down makes basically all the magic of aikido something even a beginner can learn how to see and correct.
A quick anecdote along these lines, which despite being illustrative, you should take as me being a bit of an idiot:
I once visited a dojo near the CFAR office. That night they were doing a practice basically consisting of holding your partner's elbow and pulling them to the ground. It works by a slight shift sideways to cause a curve in the lumbar spine, cutting power between their lower and upper bodies. Then you pull straight down and there's basically nothing they can do about it.
However, the lesson was in terms of feeling ki flow, and the instruction was to pull straight down. I was feeling trollish and a little annoyed about the wrongness and authoritarian delivery of the instruction, so I went to the instructor and asked: "Sensei, I see you pulling slightly sideways, and I had perhaps misheard the instructions to be that we should pull straight down. Should I be pulling slightly sideways too?"
At which point the sensei insisted that the verbal instructions were correct, concentrated on preventing the sideways shift in his movements, and obliterated his ability to demonstrate the technique for the rest of the night.
Brienne Yudkowsky has a lovely piece in which she refers to "mental postures". I highly recommend reading it. She does a better job of pointing at the thing than I think I would do here.
…but if you really don't want to read it just right now, here's the key element I'll be using: There seems to be a mental analog to physical posture.
We've had quite a bit of analogizing rationality as a martial art here. So, as a martial arts practitioner and instructor with a taste of the importance of deeply understanding body mechanics, I really want to ask: What, exactly, are the principles of good mental posture for the Art of Rationality?
In the way I'm thinking of it, this isn't likely to be things like "consider the opposite" or "hold off on proposing solutions". I refer to things of this breed as "mental movements" and think they're closer to the analogs of individual martial techniques than they are principles of mental orientation.
That said, we can look at mental movements to get a hint about what a good mental posture might do. In the body, good physical posture gives you both more power and more room for error: if you let your hands drift behind your head in a shihonage, having a flexible thoracic spine and torqued shoulders and braced abs can make it much harder for your opponent to throw you to the ground even though you've blundered. So, by way of analogy, what might an error in attempting to (say) consider the opposite look like, and what would a good "mental posture" be that would make the error matter less?
(I encourage you to think on your own about an answer for at least 60 seconds before corrupting your mind with my thoughts below. I really want a correct answer here, and I doubt I have one yet.)
…
…
…
When I think of how I've messed up in attempts to consider the opposite, I can remember several instances when my tone was dutiful. I felt like I was supposed to consider the opinion that I disagreed with or didn't want to have turn out to be true. And yet, it felt boring or like submitting or something like that to really take that perspective seriously. I felt like I was considering the opposite roughly the same way a young child replies to their parent saying "Now say that you're sorry" with an almost sarcastic "I'm sorry."
What kind of "mental posture" would have let me make this mistake and yet still complete the movement? Or better yet, what mental posture would have prevented the mistake entirely? At this point I intuit that I have an answer but it's a little tricky for me to articulate. I think there's a way I can hold my mind that makes the childish orientation to truth-seeking matter less. I don't do it automatically, much like most people don't automatically sit up straight, but I sort of know how to see my grasping at a conclusion as overreaching and then… pause and get my mental feet under my mental hips before I try again.
I imagine that wasn't helpful - but I think we have examples of good and bad mental posture in action. In attachment theory, I think that the secure attachment style is a description of someone who is using good mental posture even when in mentally/emotionally threatening situations, whereas the anxious and avoidant styles are descriptions of common ways people "tense up" when they lose good mental posture. I also think there's something interesting in how sometimes when I'm offended I get really upset or angry, and sometimes the same offense just feels like such a small thing - and sometimes I can make the latter happen intentionally.
The story I described above of the aikido sensei I trolled also highlights something that I think is important. In this case, although he didn't get very flustered, he couldn't change what he was doing. He seemed mentally inflexible, like the cognitive equivalent of someone who can't usefully block an overhead attack because of a stiff upper back restricting his shoulder movement. I feel like I've been in that state lots of times, so I feel like I can roughly imagine how my basic mental/emotional orientation to my situation and way of thinking would have to be in order to have been effective in his position right then - and why that can be tricky.
I don't feel like I've adequately answered the question of what good mental posture is yet. But I feel like I have some intuitions - sort of like being able to talk about proper posture in terms of "good ki flow". But I also notice that there seem to be direct analogs of the three core parts of good physical posture that I mentioned above:
- Have a well-braced "spine". Based on my current fledgling understanding, this seems to look something like taking a larger perspective, like imagining looking back at this moment 30 years hence and noticing what does and does not matter. (I think that's akin to tucking your hips, which is a movement in service of posture but isn't strictly part of the posture.) I imagine this is enormously easier when one has a well-internalized sense of something to protect.
- Move your mind in strong & stable ways, rather than losing "spine". I think this can look like "Don't act while triggered", but it's more a warning not to try to do heavy cognitive work while letting your mental "spine" "bend". Instead, move your mind in ways that you would upon reflection want your mind to move, and that you expect to be able to bear "weight".
- Make your mind flexible. Achieve & maintain full mental range of movement. Don't get "stiff", and view mental inflexibility as a risk to your mental health.
All three of these are a little hand-wavy. That third one in particular I haven't really talked about much - in part because I don't really know how to work on that well. I have some guesses, and I might write up some thoughts about that later. (A good solution in the body is called "mobilization", basically consisting of pushing on tender/stiff spots while you move the surrounding joints through their maximal range of motion.) Also, I don't know if there are more principles for the mind than these three, or if these three are drawing too strongly on the analogy and are actually a little distracting. I'm still at the stage where, for mental posture, I keep wanting to say the equivalent of "relax more and let ki flow."
A lot of people say I have excellent physical posture. I think I have a reasonably clear idea of how I made my posture a habit. I'd like to share that because I've been doing the equivalent in my mind for mental posture and am under the impression that it's getting promising results.
I think my physical practice comes down to three points:
- Recognize that having good posture gives you superpowers. It's really hard to throw me down, and I can pretty effortlessly pull people to the ground. A lot of that is martial skill, but a huge chunk of it is just that good posture gives me excellent leverage. This transfers to being able to lift really heavy things and move across the room very efficiently and quickly when needed. This also gives me a pretty big leg up on learning physical skills. Recognizing that these were things I'd gain from learning good posture gave me a lot of drive to stick to my practice.
- Focus on how the correct posture feels, and exactly how it's different from glitchy posture. I found it super-important to notice that my body feels different in specific ways when my shoulders are in the right position versus when they're too far forward or back. Verbal instructions like "Pull shoulders back" don't work nearly as well as the feeling in the body.
- Choose one correction at a time, and always operate from that posture, pausing and correcting yourself when you're about to slip up. Getting good shoulder posture required that I keep my shoulders back all the time. When I would reach for water, I'd notice when my shoulder was in the too-far-forward position, and then pull back and fix my shoulder position before trying again. This sometimes required trying at very basic tasks several times, often quite slowly, until I could get it right each time.
Although I didn't add this until quite late, I would now add a fourth point when giving advice on getting good physical posture: make sure to mobilize the parts of your body that are either (a) preventing you from moving into a good position or (b) requiring you to be very stiff or tense to hold that position. The trouble is, I know how to do that for the body, but I'm not as sure about how to do that for the mind.
But the three bullet points above are instructions that I can follow with respect to mental posture, I think.
So, to the extent that that seems possible for you, I invite you to try to do the same - and let me know how it goes.
Personal story about benefits of Rationality Dojo and shutting up and multiplying
My wife and I have been going to the Ohio Rationality Dojo for a few months now. It was started by Raelifin, who has substantial expertise in probabilistic thinking and Bayesian reasoning, and I wanted to share how the dojo helped us make a rational decision about house shopping. We were comparing two houses. We had an intuitive favorite (170 on the image) but decided to compare it to our second favorite (450) by actually shutting up and multiplying, based on exercises we did as part of the dojo.
What we did was go through each part of the house and multiply the value of that part to us by how much we would actually use it, keeping separate scores for the two of us (A for my wife, Agnes Vishnevkin, and G for me, Gleb Tsipursky, on the image). When we compared the totals, 450 came out way ahead. It was hard to update our beliefs, but we did it, and we are now orienting toward that house as our primary choice. Rationality for the win!
Here is the image of our back-of-the-napkin calculations.
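The calculation itself is simple enough to sketch in code. The feature names and scores below are purely illustrative stand-ins (our actual numbers are on the napkin image), but the structure is the same: for each part of the house, multiply its value by its expected use, sum the products, and compare the totals across houses.

```python
# Hedged sketch of the "shut up and multiply" house comparison.
# All feature names and scores here are made-up illustrations,
# NOT the actual numbers from our napkin calculation.

def house_score(features):
    """Sum of value * use over every (value, use) pair for one person."""
    return sum(value * use for value, use in features.values())

# features: {feature_name: (value 1-10, expected use 1-10)}
house_170 = {"kitchen": (8, 9), "yard": (6, 3), "office": (4, 2)}
house_450 = {"kitchen": (7, 9), "yard": (8, 6), "office": (9, 8)}

print(house_score(house_170))  # total for house 170
print(house_score(house_450))  # total for house 450
```

In the real exercise we computed one such total per person per house and compared them side by side; the point is just that a five-minute multiplication can overturn an intuitive favorite.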
Why people want to die
Over and over again, someone says that living for a very long time would be a bad thing, and then some futurist tries to persuade them that their reasoning is faulty. The futurist tells them that they think that way now, but that they'll change their minds when they're older.
The thing is, I don't see that happening. I live in a small town full of retirees, and those few I've asked about it are waiting for death peacefully. When I ask them about their ambitions, or things they still want to accomplish, they have none.
Suppose that people mean what they say. Why do they want to die?