While you can't fool your logical brain, you don't need to if you want to hold a false belief that makes you happy. The brain is compartmentalized and often doesn't update what feels intuitively true to you, or what you base your actions on, just because you learned a fact. This sentence: "You can't know the consequences of being biased, until you have already debiased yourself" strikes me as the hardest to believe. Reading about a bias and considering its consequences, especially in an academic frame of mind, does NOT debias you. That requires applying it to your life and reasoning, recognizing when you are biased, sometimes even training and conditioning to change how you think. If, after learning about a bias, I rationally decided that I wanted to keep it, I would just shelve it in my memory as academic trivia irrelevant to daily life, and I would stay just as biased as before with regard to what I do and how I feel.
Does Kolmogorov complexity imply a bound on self-improving AI?
The Kolmogorov complexity ("K") of a string ("S") is the size of the smallest Turing machine that outputs that string. If a Turing machine (equivalently, by the Church-Turing thesis, any AI) has size smaller than K, then no matter how much it rewrites its own code, it won't be able to output S. To be specific, it can of course output S while enumerating all possible strings, but it won't be able to single out S and output it exclusively among the options available. Now suppose that S is the source code for an intelligence strictly better than all those with complexity <K. We are then left with 3 options:
- The space of all maximally intelligent minds has an upper bound on complexity, and we have already reached it.
- The universe contains new information that can be used to build minds of greater complexity, or:
- There are levels of intelligence that are impossible for us to reach.
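To make the enumeration point above concrete, here is a minimal Python sketch. It is only an illustration, not a computation of K (which is uncomputable): the function names are hypothetical, and a binary alphabet is assumed. The enumerator itself is tiny, but to single out one particular S you also have to supply its position in the enumeration, and that position takes about as many bits to write down as S itself.

```python
from itertools import count, product


def enumerate_strings(alphabet="01"):
    """Yield every finite string over `alphabet`, shortest first."""
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)


def index_of(target, alphabet="01"):
    """Return the 0-based position of `target` in the enumeration."""
    for i, s in enumerate(enumerate_strings(alphabet)):
        if s == target:
            return i


def bits_to_select(target, alphabet="01"):
    """Number of bits needed to write down the index that picks out `target`."""
    return index_of(target, alphabet).bit_length()


if __name__ == "__main__":
    s = "1011"
    # The index is roughly as long as the string it selects.
    print(len(s), index_of(s), bits_to_select(s))
```

So "enumerator plus index" is a valid description of S, but for a string with no shorter description it is not meaningfully smaller than S, which is why enumeration does not get around the bound.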
Is my brain a utility minimizer? Or, the mechanics of labeling things as "work" vs. "fun"
I recently encountered something that is, in my opinion, one of the most absurd failure modes of the human brain. I first noticed it while introspecting on useful things that I enjoy doing, such as programming and writing. My enjoyment of the activity doesn't seem to help much when it comes to motivation for earning income. This was not boredom from too much programming, as it did not affect my interest in personal projects. What it seemed to be was the brain sorting activities into "work" and "fun" boxes. On one memorable occasion, after taking a break because I was exhausted with work, I entertained myself by programming some more, this time on a personal hobby project (as a freelancer, I pick the projects I work on, so this is not from being told what to do). I was relaxing by doing the exact same thing that had exhausted me in the first place.
The absurdity of this becomes evident when you think about what distinguishes "work" and "fun" in this case, which is added value. Nothing changes about the activity except the addition of more utility, so a "work" strategy always dominates a "fun" strategy, assuming the activity is the same. If you are having fun doing something, handing you some money can't make you worse off. Yet making the outcome better makes you avoid it. This means the brain is adopting a strategy that has a (side?) effect of minimizing future utility, and it seems to be utility in general and not just money: as anyone who took a class in an area that personally interested them knows, other benefits like grades recreate this effect just as well. This is why I think this is among the most absurd biases. I can understand akrasia, wanting the happiness now and hyperbolically discounting what happens later, or biases that make something seem like the best option when it really isn't. But knowingly punishing what brings happiness just because it also benefits you in the future? It's like the discounting curve dips into the negative region. I would really like to learn where the dividing line is between the kinds of added value that create this effect and the ones that don't (money obviously does, and immediate enjoyment obviously doesn't). Currently I'm led to believe that the difference is present utility vs. future utility (as I mentioned above), or final vs. instrumental goals; please correct me if I'm wrong here.
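As a rough sketch of why ordinary discounting can't explain this, assume the common hyperbolic form, value = amount / (1 + k * delay). The function names, the rate k, and all the numbers below are made up for illustration. Under this form a future reward can shrink toward zero but never goes negative, so adding it to an activity you already enjoy can only raise the total.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Present value of `amount` received after `delay`, hyperbolically discounted."""
    return amount / (1.0 + k * delay)


def total_value(enjoyment_now, future_reward, delay, k=1.0):
    """Immediate enjoyment plus the discounted value of a later reward."""
    return enjoyment_now + hyperbolic_value(future_reward, delay, k)


if __name__ == "__main__":
    # Same activity, same enjoyment; "work" only adds a payment 30 days out.
    fun = total_value(enjoyment_now=10.0, future_reward=0.0, delay=30)
    work = total_value(enjoyment_now=10.0, future_reward=100.0, delay=30)
    print(fun, work)
    assert work > fun  # discounting shrinks the bonus, it never flips its sign
```

For "work" to come out below "fun" here, the discounted term would have to be negative, which this form of discounting never produces for a positive reward.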
This effect has been studied in psychology, where it is called the overjustification effect. The name comes from the leading theory, which explains it in terms of the brain assuming that the motivation comes from the instrumental gain instead of the direct enjoyment, and then reducing the motivation accordingly. This would suggest that the brain has trouble seeing a goal as being both instrumental and final, and for some reason the instrumental side always wins in a conflict. However, the explanation in terms of self-perception bothers me a little, since I find it hard to believe that a recent creation like self-perception can override something as ancient and low-level as enjoyment of final goals. I searched LessWrong for discussions of the overjustification effect, and the ones I found discussed it in the context of self-perception, not decision-making and motivation. It is the latter that I wanted to ask for your thoughts on.
Awesome post, but somebody should do the pessimist version, rewriting various normal facets of the human condition as horrifying angsty undead curses.
The Curse of Downregulation: Sufferers of this can never live "happily ever after", for anything that gives them joy, done often enough, will become mundane and boring. Someone who is afflicted could have the great luck to earn a million a day, and after a year they will be filled with despair and envy at their neighbor who is making two million, no happier than they would be in poverty.
I will help a suffering thing if it benefits me to help it, or if the social contract requires me to. Otherwise I will walk away.
I adopted this cruel position after going through one long relationship where I constantly demanded emotional "help" from the girl, then another relationship soon afterwards where the girl constantly demanded similar "help" from me. Both those situations felt so sick that I finally understood: participating in any guilt-trip scenario makes you a worse person, no matter whether you're tripping or being tripped. And it also makes the world worse off: being openly vulnerable to guilt-tripping encourages more guilt-tripping all around.
So relax and follow your own utility - this will incentivize others to incentivize you to help them, so everyone will treat you well, and you'll treat them well in advance for the same reason.
People who require help can be divided into those who are capable of helping themselves and those who are not. A position like yours expresses the value preference that sacrificing the good of the latter group is better than letting the first group get unpaid rewards, in all cases. For me it's not that simple: the choice depends on the proportion of the groups, the cost to me and society, and just how much good is being sacrificed. To take an extreme example, I would save someone's life even if this encouraged other people to be less careful in protecting theirs.