Risk Contracts: A Crackpot Idea to Save the World
Time start: 18:17:30
I
This idea is probably going to sound pretty crazy. As far as seemingly crazy ideas go, it's high up there. But I think it is interesting enough to at least amuse you for a moment, and upon consideration your impression might change. (Maybe.) And as a benefit, it offers some insight into AI problems if you are into that.
(This insight into AI may or may not be new. I am not an expert on AI theory, so I wouldn't know. It's elementary, so probably not new.)
So here it goes, in short form on which I will expand in a moment:
To manage global risks to humanity, they can be captured in "risk contracts", freely tradeable on the market. Risk contracts would serve the same role as CO2 emissions contracts, which can likewise be traded, and ensure that the global norm is not exceeded as long as everyone plays by the rules.
So e.g. if I want to run a dangerous experiment that might destroy the world, it's totally OK as long as I can purchase enough of a risk budget. Pretty crazy, isn't it?
As an added bonus, a risk contract can take into account the risk of someone else breaking the terms of the contract. When you transfer your rights to global risk, the contract obliges you to reduce the amount you transfer by your uncertainty about whether the other party can fulfill all the obligations that come with such a contract. And if you don't have enough risk budget for this, you cannot transfer to that person.
II
Let's go a little bit more into detail about a risk contract. Note that this is supposed to illustrate the idea, not be a final say on the shape and terms of such a contract.
Just to give you some idea, here are some example rules (with lots of room to specify them more clearly etc., it's really just so that you have a clearer idea of what I mean by a "risk contract"):
- My initial risk budget is 5 * 10^-12 chance of destroying the world. I am going to track this budget and do everything in my power to make sure that it never goes below 0.
- For every action (or set of correlated actions) I take, I will subtract the probability that those actions destroy the world from my budget (using simple subtraction unless correlation between actions is very high).
- If I transfer my budget to an agent who is going to decide about its actions independently from me, I will first pay the cost from my budget for the probability that this agent might not keep the terms of the contract. I will use my best conservative estimates, and refuse the transaction if I cannot keep the risk within my budget.
- Any event in which a risk contract on world destruction is breached will count against my budget as if it were equivalent to actually destroying the world.
- Whenever I create a new intelligent agent, I will transfer some risk budget to that agent, according to the rules above.
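To make the bookkeeping concrete, here is a minimal sketch of the ledger the rules above describe. All numbers and names are my own illustrative choices, and the exact pricing of the counterparty-breach penalty is just one possible reading of the transfer rule, not part of the proposal:

```python
# Illustrative sketch of a risk budget ledger following the rules above.
# All class/parameter names and the breach-penalty formula are assumptions.

class RiskBudget:
    def __init__(self, budget=5e-12):
        self.budget = budget

    def spend(self, action_risk):
        # Rule 2: subtract the probability that the action destroys the world.
        if action_risk > self.budget:
            raise ValueError("refuse action: not enough risk budget")
        self.budget -= action_risk

    def transfer(self, amount, p_breach):
        # Rule 3: pay extra for the chance the recipient breaks the contract.
        # Here the penalty is the transferred amount scaled by a conservative
        # breach probability -- one possible way to price it.
        cost = amount * (1.0 + p_breach)
        if cost > self.budget:
            raise ValueError("refuse transfer: not enough risk budget")
        self.budget -= cost
        return RiskBudget(amount)

me = RiskBudget()
child = me.transfer(1e-12, p_breach=0.01)
```

Note that refusing the action or transfer outright (rather than going negative) is exactly the "never goes below 0" rule from the first bullet.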
III
Of course, the application of this could be wider than just an AI which might recursively self-improve - some more "normal" human applications could be risk management in a company or government, or even using risk contracts as an internal currency to make better decisions.
I admit though, that the AI case is pretty special - it gives an opportunity to actually control the ability of another agent to keep a risk contract that we are giving to them.
It is an interesting calculation to estimate roughly what the costs of keeping a risk contract are in the recursive AI case, under a lot of simplifying assumptions. Assume that the risk of a child AI going off the rails can be reduced by a constant factor (e.g. cut in half) by putting in an additional unit of safety work. Also assume the chain of child AIs might continue indefinitely, and no later AI will assume it has a finite ending. Then if the chain has no branches, we are basically reduced to a geometric series: the risk budget of a child AI is always the same fraction of its parent's budget. That means we need a linearly increasing amount of work on safety at each step. That in turn means that the total amount of work on safety is quadratic in the number of steps (child AIs).
Time end: 18:52:01
Writing stats: 21 wpm, 115 cpm (previous: 30/167, 33/183, 23/128)
Against Amazement
Time start: 20:48:35
I
The feelings of wonder, awe, amazement. It's a very human experience, and it is processed in the brain as a type of pleasure. In fact, judging by the amount of "5 photos you wouldn't believe" and similar clickbait on the Internet, it functions as a mildly addictive drug.
If I proposed that there is something wrong with those feelings, I would soon be drowned in criticism, with people pointing out that I'm suggesting we all become straw Vulcans, and that there is nothing wrong with subjective pleasure obtained cheaply and at no harm to anyone else.
I do not disagree with that. However, caution is required here, if one cares about epistemic purity of belief. Let's look at why.
II
Stories are supposed to be more memorable. Do you like stories? I'm sure you do. So consider a character, let's call him Jim.
Jim is very interested in technology and computers, and he is checking news sites every day when he comes to work in the morning. Also, Jim has read a number of articles on LessWrong, including the one about noticing confusion.
He cares about improving his thinking, so when he first read about noticing confusion on the 5-second level, he decided he wanted to apply it in his life. He had a few successes, and while it's not perfect, he feels he is on the right track to notice more often when he has wrong models of the world.
A few days later, he opens his favorite news feed at work, and there he sees the following headline:
"AlphaGo wins 4-1 against Lee Sedol"
He goes on to read the article, and finds himself quite elated after he learns the details. 'It's amazing that this happened so soon! And most experts apparently thought it would happen in more than a decade, hah! Marvelous!'
Jim feels pride and wonder at the achievement of Google DeepMind engineers... and it is his human right to feel it, I guess.
But is Jim forgetting something?
III
Yes, I know that you know. Jim is feeling amazed, but... has he forgotten the lesson about noticing confusion?
There is a significant obstacle to Jim applying his "noticing confusion" in the situation described above: his internal experience has very little to do with feelings of confusion.
His world in this moment is dominated by awe, admiration etc., and those feelings are pleasant. It is not at all obvious that this inner experience corresponds to an inaccurate model of the world he had before.
Even worse - improving his model's predictive power would result in less pleasant experiences of wonder and amazement in the future! (Or would it?) So if Jim decides to update, he is basically robbing himself of the pleasures of life, that are rightfully his. (Or is he?)
Time end: 21:09:50
(Speedwriting stats: 23 wpm, 128 cpm, previous: 30/167, 33/183)
Neutralizing Physical Annoyances
Once in a while, I learn something about a seemingly unrelated topic - such as freediving - and I take away some trick that is well known and "obvious" in that topic, but is generally useful and NOT known by many people outside. Case in point, you can use equalization techniques from diving to remove pressure in your ears when you descend in a plane or a fast lift. I also give some other examples.
Ears
Reading about a few equalization techniques took me maybe 5 minutes, and after reading this passage once I was able to successfully use the "Frenzel Maneuver":
The technique is to close off the vocal cords, as though you are about to lift a heavy weight. The nostrils are pinched closed and an effort is made to make a 'k' or a 'guh' sound. By doing this you raise the back of the tongue and the 'Adam's Apple' will elevate. This turns the tongue into a piston, pushing air up.
(source: http://freedivingexplained.blogspot.com.mt/2008/03/basics-of-freediving-equalization.html)
Hiccups
A few years ago, I started regularly doing deep relaxations after yoga. At some point, I learned how to relax my throat in such a way that the air can freely escape from the stomach. Since then, whenever I start hiccuping, I relax my throat and the hiccups stop immediately in all cases. I am now 100% hiccup-free.
Stiff Shoulders
I've spent a few hours with a friend who does massage, and they taught me some basics. After that, it became natural for me to self-massage my shoulders after a lot of sitting work etc. I can't imagine living without this anymore.
Other?
If you know more, please share!
Willpower Schedule
TL;DR: your level of willpower depends on how much willpower you expect to need (hypothesis)
Time start: 21:44:55 (this is my third exercise in speed writing a LW post)
I.
There is a lot of controversy about how our level of willpower is affected by various factors, including doing "exhausting" tasks beforehand, as well as being told that willpower is a resource that depletes easily (or that it doesn't), etc.
(sorry, I can't go look for references - that would break the speedwriting exercise!)
I am not going to repeat the discussions that already cover those topics; however, I have a new tentative model which (I think) fits the existing data very well, is easy to test, and supersedes all previous models that I have seen.
II.
The idea is very simple, but before I explain it, let me give a similar example from a different aspect of our lives. The example is going to be concerned with, uh, poo.
Have you ever noticed that (if you have a sufficiently regular lifestyle), conveniently you always feel that you need to go to the toilet at times when it's possible to do so? Like for example, how often do you need to go when you are on a bus, versus at home or work?
The function of your bowels is regulated by reading subconscious signals about your situation - e.g. if you are stressed, you might become constipated. But it is not only that - there is a way in which it responds to your routines, and what you are planning to do, not just the things that are already affecting you.
Have you ever had the experience of a background thought popping up in your mind that you might need to go within the next few hours, but the time was not convenient, so you told that thought to hold it a little bit more? And then it did just that?
III.
The example from the previous section, though possibly quite POOrly chosen (sorry, I couldn't resist), shows something important.
Our subconscious reactions and "settings" of our bodies can interact with our conscious plans in a "smart" way. That is, they do not have to wait to see the effects of what you are doing, to adjust to it - they can pull information from your conscious plans and adjust *before*.
And this is, more or less, the insight that I have added to my current working theory of willpower. It is not very complicated, but perhaps non-obvious. Sufficiently non-obvious that I don't think anyone has suggested it before, even after seeing experimental results that match this excellently.
IV.
To be more accurate, I claim that how much willpower you will have depends on several important factors, such as your energy and mood, but it also depends on how much willpower you expect to need.
For example, if you plan to have a "rest day" and not do any serious work, you might find that you are much less *able* to do work on that day than usual.
It's easy enough to test - so instead of arguing this theoretically, please do just that - give it a test. And make sure to record your levels of willpower several times a day for some time - you'll get some useful data!
Time end: 22:00:53. Statistics: 534 words, 2924 characters, 15.97 minutes, 33.4 wpm, 183.1 cpm
Non-Fiction Book Reviews
Time start 13:35:06
For another exercise in speed writing, I wanted to share a few book reviews.
These are fairly well known, however there is a chance you haven't read all of them - in which case, this might be helpful.
Good and Real - Gary Drescher ★★★★★
This is one of my favourite books ever. It goes over a lot of philosophy, while showing a lot of clear thinking and meta-thinking. It would be the number one replacement for Eliezer's meta-philosophy, had that not existed. The writing style and language are somewhat obscure, but this book is too brilliant to be spoiled by that. The biggest takeaway is the analysis of the ethics of non-causal consequences of our choices, which is something that has actually changed how I act in my life, and I have not seen any similar argument in other sources that would do the same. This book changed my intuitions so much that I now pay $100 in counterfactual mugging without a second thought.
59 Seconds - Richard Wiseman ★★★
A collection of various tips and tricks, directly based on studies. The strength of the book is that it gives easy but detailed descriptions of lots of studies, and that makes it very fun to read. Can be read just to check out the various psychology results in an entertaining format. The quality of the advice is disputable, and it is mostly the kind of advice that only applies to small things and does not change much in what you do even if you somehow manage to use it. But I still liked this book, and it managed to avoid saying anything very stupid while saying a lot of things. It counts for something.
What You Can Change and What You Can't - Martin Seligman ★★★
It is heartwarming to see that the author puts his best effort towards figuring out which psychology treatments work and which don't, as well as building more general models of how people work that can predict which treatments have a chance in the first place. Not all of the content is necessarily your best guess after updating on new results (the book is quite old). However, if you are starting out, this book will serve excellently as your prior, on which you can update after checking out the new results. And in some cases, it is amazing that the author was right 20 years ago and mainstream psychology has STILL not caught up (like the whole bullshit "go back to your childhood to fix your problems" approach, which is in wide use today and not bothered at all by such things as "checking facts").
Thinking, Fast and Slow - Daniel Kahneman ★★★★★
A classic, and I want to mention it just in case. It is too valuable not to read. Period. It turns out some of the studies the author used for his claims have later been found not to replicate. However, the details of those results are not (at least for me) the selling point of this book. The biggest thing is the author's mental toolbox for self-analysis and analysis of biases, as well as the concepts he created to describe the mechanisms of intuitive judgement. Learn to think like the author, and you are 10 years ahead in your study of rationality.
Crucial Conversations - Al Switzler, Joseph Grenny, Kerry Patterson, Ron McMillan ★★★★
I almost dropped this book. When I saw the style, it reminded me so much of the crappy self-help books without actual content. But fortunately I read on a little more, and it turns out that even though the style is the same throughout the book and it has little content for the amount of text you read, it is still an excellent book. How is that possible? Simple: it only tells you a few things, but the things it tells you are actually important, they work, and they are amazing when you put them into practice. Also on the concept and analysis side there is precious little, but who cares as long as there are some things that are "keepers". The authors spend most of the book hammering the same point over and over, which is "conversation safety". And it is still a good book: if you get this one simple point, then you have learned more than you might from reading 10 other books.
How to Fail at Almost Everything and Still Win Big - Scott Adams ★★★
I don't agree with much of the stuff in this book, but that's not the point here. The author says what he thinks, and he himself encourages you to pass it through your own filters. About one third of the book I thought was obviously true; for another third, I had strong evidence that the author made a mistake or got confused about something; and the remaining third gave me new ideas, or points of view that I could use to produce more ideas for my own use. This felt kind of like having a conversation with an intelligent person you might know who has different ideas from you. It was a healthy ratio of agreement and disagreement, the kind that leads to progress for both people. Except of course in this case the author did not benefit, but I did.
Time end: 14:01:54
Total time to write this post: 26 minutes 48 seconds
Average writing speed: 31.2 words/minute, 169 characters/minute
The same data calculated for my previous speed-writing post: 30.1 words/minute, 167 characters/minute
[CORE] Concepts for Understanding the World
Background:
I've recently been doing a big project to increase my scholarship and modeling power for both rationality and traditional "serious" topics. One thing I found very useful is taking notes with a clear structure.
The structure I'm using currently is as follows:
- write down useful concepts,
- write down (as a separate category) useful heuristics & things to do in various situations,
- do not write facts, opinions or anything else (I rely on unaided memory to get more filtering).
Heuristic: learn concepts before facts!
Note that you can be mistaken about facts, but you can't harm your epistemology by learning concepts. Even if a concept turns out to be useless or misleading, you are better off knowing about it, understanding how it's misleading, and being able to avoid the trap when you see it.
Let's share concepts!
Please give (at a minimum) a name and a reference (link). A short description in plain language is also welcome.
A Very Concrete Model of Learning From Regrets
Warning 1: This post is written in the form of Java-like pseudocode.
If you have no knowledge of programming, you might have trouble understanding it.
(If you do, it still does not guarantee you will understand, but your chances are better.)
Warning 2: I have more than moderate, but less than high, confidence that this model is approximately correct.
It doesn't mean that my or anyone's brain works exactly in the way shown in the code, but rather that the flow of data in the brain is approximately as if it were using such an algorithm.
The word "approximately" includes stuff I don't (yet) know about, but also stuff I didn't include below to keep it simple.
I wrote this specifically for regrets, but processing of positive memories seems to have similar mechanics (with different constants).
Warning 3: There is little chance of finding any existing studies/data etc. that could directly validate or invalidate this model. (However if you know of any, I'm all ears.)
There might be some stuff that is correlated, so if you know of something, mention it too.
class Brain
{
    ...

    // This represents a memory about a single event
    class Memory
    {
        ...
        float associatedEmotions; // positive or negative
    }

    // Your brain keeps track of this
    private Map<Memory, Float> memoriesRequireProcessing = new Map<>();

    // Add new stuff to the queue
    private void somethingHappened(Memory newMemory)
    {
        float affect = getAffectOfSituation(newMemory);
        newMemory.associatedEmotions = affect * 0.5;
        if (Math.abs(affect) > 0.1)
            memoriesRequireProcessing.add(newMemory, Math.abs(affect));
    }

    // You have no control over how this works,
    // but you can influence the confidence parameter
    // (mostly indirectly, a little bit directly)
    protected void learnedMyLesson(Memory m, float confidence)
    {
        float previousValue = memoriesRequireProcessing.get(m);
        float nextValue = previousValue * (1.0 - confidence);
        if (nextValue > 0.1)
            memoriesRequireProcessing.set(m, nextValue);
        else
            memoriesRequireProcessing.remove(m);
    }

    // You can consciously override this and do something else
    //
    // @return: judgement of success or failure
    protected float ruminateOnMemory(Memory m)
    {
        // Depends on the situation, but the default is
        // relatively low confidence
        learnedMyLesson(m, 0.1);
        // Substitute affect for judgement of success
        return getAffectOfSituation(m);
    }

    // This prompts some thoughts about a memory
    private void rememberAbout(Memory m)
    {
        feelEmotion(m.associatedEmotions);
        float judgement = ruminateOnMemory(m);
        m.associatedEmotions =
            0.9 * m.associatedEmotions
            + 0.2 * judgement;
    }

    // Your brain does this all the time
    private void onIdle()
    {
        while (memoriesRequireProcessing.thereIsALotOfShit())
        {
            // Choose some memory paired with a high value
            Memory next = memoriesRequireProcessing.choose();
            rememberAbout(next);
        }
        ...
    }

    ...
}

Notes on Imagination and Suffering
Time: 22:56:47
I
This is going to be an exercise in speed writing a LW post.
Not writing posts at all seems to be worse than writing poorly edited posts.
It is currently hard for me to do anything that even resembles actual speed writing: even as I type this sentence, I have a very hard-to-resist urge to check it for grammar mistakes and make small corrections/improvements before I've even finished typing.
But to reduce the burden of writing, I predict it is going to be highly useful to develop the ability of actually writing a post as fast as I can type, without going back.
If this proves to have acceptable results, you can expect more regular posts from me in the future.
And possibly, if I develop the habit of writing regularly, I'll finally get to describing some of the topics on which I have (what I believe are) original and sizable clusters of knowledge, which is not easily available somewhere else.
But for now, just some thoughts on a very particular aspect of modelling how human brains think about a very particular thing.
This thing is immense suffering.
Time: 23:03:18
(Still slow!)
II
You might have heard this or similar from someone, possibly more than once in your life:
"you have no idea how I feel!"
or
"you can't even imagine how I feel!"
For me, this kind of phrase has always had the ring of a challenge. I have a potent imagination, and non-negligible experience in the affairs of humans. Therefore, I am certainly able to imagine how you feel, am I not?
Not so fast.
(Note added later: as Gram_Stone mentions, these kinds of statements tend to be used in epistemically unsound arguments, and as such can be presumed to be suspicious; however here, I am more concerned with the fact of the matter of how imagination works.)
Let's back up a little bit and recount some simple observations about imagining numbers.
You might be able to imagine and hold the image of five, six, nine, or even sixteen apples in your mind.
If I tell you to imagine something more complex, like pointed arrows arranged in a circle, you might be able to imagine four, or six, or maybe even eight of them.
If your brain is constructed differently from mine, you might easily go higher with the numbers.
But at some fairly small number, your mental machinery simply no longer has the capacity to imagine more shapes.
III
However, if I tell you that "you can't even imagine 35 apples!" it is obviously not an insult or a challenge, and what is more:
"imagining 35 apples" is NOT EQUAL to "comprehending in every detail what 35 apples are"
That is, depending on how good your knowledge of natural numbers is (say, you passed the first year of primary school), you can analyse the situation of "35 apples" in every possible way, and imagine it partially - but not all of it at the same time.
Directly imagining apples is very similar to actually experiencing apples in your life, but it has a severe limitation.
You can experience 35 apples in your life, but you can't imagine all of them at once even if you saw them 3 seconds ago.
Meta: I think I'm getting better at not stopping when I write.
Time: 23:13:00
IV
But, you ask, what is the point of writing all this obvious stuff about apples?
Well, if you move to more emotionally charged topics, like someone's emotions, it is much harder to think about the situation in a clear way.
And if you have a clear model of how your brain processes this information, you might be able to respond in a more effective way.
In particular, you might be saved from feeling guilty or inadequate about not being able to imagine someone's feelings or suffering.
It is a simple fact about your brain that it has a limited capability to imagine emotion.
And especially with suffering, the amount of suffering you are able to experience IS OF A COMPLETELY DIFFERENT ORDER OF MAGNITUDE than the amount you are able to imagine, even with the best intentions and knowledge.
However, can you comprehend it?
V
From this model, it is also immediately obvious that the same thing happens when you think about your own suffering in the past.
We know generally that humans can't remember their emotions very well, and their memories don't correlate very well with reported experience-in-the-moment.
Based on my personal experience, I'll tentatively make some bolder claims.
If you have suffered a tremendous amount, and then enough time has passed to "get over it", your brain is not only unable to imagine how much you have suffered in the past:
it is also unable to comprehend the amount of suffering.
Yes, even if it's your own suffering.
And what is more, I propose that the exact mechanism of "getting over something" is more or less EQUIVALENT to losing the ability to comprehend that suffering.
The same would (I expect) hold in case of getting better after severe PTSD etc.
VI
So in this sense, a person telling you "you cannot even imagine how I feel" is also right under a less literal interpretation of their statement.
If you are a mentally healthy individual, not suffering from any major traumas etc., I suggest your brain literally has a defense mechanism (protecting your precious mental health) that makes it impossible for you not only to imagine, but also to fully comprehend, the amounts of suffering you are being told about.
Time: 23:28:04
Publish!
Secret Rationality Base in Europe
In short, I'm wondering what place/group/organisation/activity could do for rationality in Europe what Berkeley does for rationality in the US?
Soon, we'll have LWCW in Berlin, which I hope will be an occasion to do some networking among people who think seriously about developing rationality communities. But in the meantime, let's do some brainstorming.
Important note: in comments to this post, please use only consequentialist language. For example, say "If we decided for the base to be on Malta, then X would happen" instead of "I think it should be in Malta, because..."
- What would happen if the rationality base was located in [insert specific city/country]?
- What could such a place offer to you now, that would make you consider a temporary/permanent move?
- What would happen if the European rationality community efforts were centered around some particular research topic (e.g. AI)?
- Is there something you can think of that would speed up community-building in Europe?
Of course, share anything else that you think is relevant to the topic.
Also, see you all in Berlin :)
Geometric Bayesian Update
Today, I present to you Bayes' theorem like you have never seen it before.
Take a moment to think: how would you calculate a Bayesian update using only basic geometry? I.e., you are given (as line segments) a prior P(H), and also P(E | H) and P(E | ~H) (or their ratio). How do you get P(H | E) only by drawing straight lines on paper?
Can you think of a way that would be possible to implement using a simple mechanical instrument?
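As a reference point, here is the number the geometric construction has to produce, computed the ordinary numeric way (this is not the geometric method itself, so no spoilers):

```python
# Plain numeric Bayesian update from P(H), P(E|H), P(E|~H).

def posterior(p_h, p_e_given_h, p_e_given_not_h):
    joint_h = p_h * p_e_given_h              # P(H) * P(E|H)
    joint_not_h = (1.0 - p_h) * p_e_given_not_h  # P(~H) * P(E|~H)
    return joint_h / (joint_h + joint_not_h)     # P(H|E)

print(posterior(0.5, 0.8, 0.2))  # 0.8
```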
It just so happens that today I noticed a very neat way to do this.
Have fun with this GeoGebra worksheet.
And here's a static image version if the live demo doesn't work for you:

Your math homework is to find a proof that this is indeed correct.
Hint: Vg'f cbffvoyr gb qb guvf ryrtnagyl naq jvgubhg nal pnyphyngvbaf, whfg ol ybbxvat ng engvbf bs nernf bs inevbhf gevnatyrf.
Please post answers in rot13, so that you don't spoil the fun for others who want to try.
Edit: For reference, here's a pictograph version of the diagram that came up later as a follow-up to this comment.
