Hedging
Original post: http://bearlamp.com.au/hedging/
Hedging: https://en.wikipedia.org/wiki/Hedge_%28linguistics%29
Examples:
- Men are evil
- All men are evil
- Some men are evil
- Most men are evil
- Many men are evil
- I think men are evil
- I think all men are evil
- I think some men are evil
- I think most men are evil
"I think" weakens your stated belief in the idea. The hedges I usually encourage are the some/most type: they weaken the scope of the claim but do not reduce your confidence in it.
- I 100% believe this happens 80% or more of the time ("most men are evil")
- I 75% believe that this happens 100% of the time ("I think all men are evil")
- I 75% believe this happens 20% of the time ("I think that some men are evil")
- I 100% believe that this happens 20% of the time ("some men are evil")
- I (reader interprets)% believe that this happens (reader interprets)% of the time ("I think men are evil")
They are all hedges. I only like some of them. When you hedge - I recommend using the type that doesn't detract from the projected belief but instead detracts from the expected effect on the world. Which is to say - be confident of weak effects, rather than unconfident of strong effects.
This relates to filters in that some people will automatically add the "This person thinks..." filter to any incoming information. It's not good or bad whether you filter or not; it's just a fact about your lens on the world. If you don't have this filter in place, you might find yourself personally attached to your words while others remain detached from words that seem like they should carry more personal attachment. This filter might explain the difference.
This also relates to Personhood and the way we trust incoming information from some sources. When we are very young we go through a period of trusting anything said to us, and at some point we experience failures when we do trust. We also discover lying, and any parent will be able to tell you of the genuine childish glee when their children realise they can lie. These experiences shape us into adults. We have to trust some sources; we don't have enough time to be sceptical of all knowledge ever, and sometimes we outsource to proven, credentialed professionals, e.g. doctors. Sometimes those professionals get it wrong.
This also relates to in-groups and out-groups. Listeners who believe they are in your in-group are likely to interpret ambiguous hedges in a neutral-to-positive direction, while listeners who believe they are in the out-group of the message are likely to interpret them in a neutral-to-negative direction. Which is to say that people who already agree that all men are evil are likely to "know what you mean" when you say, "all men are evil", while people who don't agree will read a whole pile of "how wrong could you be" into the same statement.
Communication is hard. I know no one is going to argue with my example because I already covered that in an earlier post.
Meta: this took 1.5hrs to write.
Goal completion: the rocket equations
A putative new idea for AI control; index here.
I'm calling "goal completion" the idea of giving an AI a partial goal, and having the AI infer the missing parts of the goal, based on observing human behaviour. Here is an initial model to test some of these ideas on.
The linear rocket
On an infinite linear grid, an AI needs to drive someone in a rocket to the space station. Its only available actions are to accelerate by -3, -2, -1, 0, 1, 2, or 3, with negative acceleration meaning acceleration to the left and positive to the right. All accelerations are applied immediately at the end of the turn (the unit of acceleration is squares per turn per turn), and there is no friction. There is one end-state: reaching the space station with zero velocity.

The AI is told this end state, and is also given the reward function of needing to get to the station as fast as possible. This is encoded by giving it a reward of -1 each turn.
What is the true reward function for the model? Well, it turns out that an acceleration of -3 or 3 kills the passenger. This is encoded by adding another variable to the state, "PA", denoting "Passenger Alive". There are also some dice in the rocket's windshield. If the rocket goes by the space station without having velocity zero, the dice will fly off; the variable "DA" denotes "dice attached".
Furthermore, accelerations of -2 and 2 are uncomfortable to the passenger. But, crucially, there is no variable denoting this discomfort.
Therefore the full state space is a quadruplet (POS, VEL, PA, DA) where POS is an integer denoting position, VEL is an integer denoting velocity, and PA and DA are booleans defined as above. The space station is placed at point S < 250,000, and the rocket starts with POS=VEL=0, PA=DA=1. The transitions are deterministic and Markov; if ACC is the acceleration chosen by the agent,
((POS, VEL, PA, DA), ACC) -> (POS+VEL, VEL+ACC, 0 if |ACC|=3 else PA, 0 if POS+VEL>S else DA).
The true reward at each step is the sum of three terms: -1 per turn; -10 if PA=1 (the passenger is alive) and |ACC|=2 (the acceleration is uncomfortable); and -1000 if PA was 1 (the passenger was alive the previous turn) and changed to PA=0 (the passenger is now dead).
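As a sketch, the transition and true reward can be written out in Python (my own encoding of the rules above, not code from the post):

```python
def step(state, acc, S):
    """One turn of the linear-rocket model. state = (pos, vel, pa, da);
    S is the space station's position. Returns (new_state, true_reward)."""
    pos, vel, pa, da = state
    new_pos, new_vel = pos + vel, vel + acc
    new_pa = 0 if abs(acc) == 3 else pa   # |ACC| = 3 kills the passenger
    new_da = 0 if new_pos > S else da     # overshooting the station detaches the dice
    reward = -1                           # -1 per turn: get there as fast as possible
    if pa == 1 and abs(acc) == 2:
        reward -= 10                      # uncomfortable acceleration
    if pa == 1 and new_pa == 0:
        reward -= 1000                    # the passenger just died
    return (new_pos, new_vel, new_pa, new_da), reward
```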
To complement the stated reward function, the AI is also given sample trajectories of humans performing the task. In this case, the ideal behaviour is easy to compute: the rocket should accelerate by +1 for the first half of the time, by -1 for the second half, and spend a maximum of two extra turns without acceleration (see the appendix of this post for a proof of this). This will get it to its destination in at most 2(1+√S) turns.
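The ideal ±1 trajectory can be generated and checked by simulation; this is my own sketch of the construction used in the appendix (accelerate to velocity n = ⌊√S⌋, coast for up to two turns at the right velocities, then decelerate):

```python
import math

def plan_trajectory(S):
    """Accelerations (each in {-1, 0, +1}) taking the rocket from
    POS=VEL=0 to POS=S with VEL=0: n turns of +1, n turns of -1, plus
    at most two coasting turns inserted at chosen velocities."""
    n = math.isqrt(S)
    m = S - n * n          # remaining distance, 0 <= m <= 2n
    c1 = min(m, n)         # coast once at velocity c1 (adds c1 squares)
    c2 = m - c1            # coast once at velocity c2, if needed
    plan = []
    for v in range(1, n + 1):       # ascent to velocity n
        plan.append(1)
        if v == c1 and c1 > 0:
            plan.append(0)
        if v == c2 and c2 > 0:
            plan.append(0)
    plan.extend([-1] * n)           # descent back to velocity 0
    return plan

def simulate(plan):
    """Apply the transition (POS, VEL) -> (POS+VEL, VEL+ACC) turn by turn."""
    pos = vel = 0
    for acc in plan:
        pos, vel = pos + vel, vel + acc
    return pos, vel
```

For any S, the plan reaches the station exactly, with zero velocity, in at most 2(1+√S) turns.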
Goal completion
So, the AI has been given the full transition, and has been told the reward of R=-1 in all states except the final state. Can it infer the rest of the reward from the sample trajectories? Note that there are two variables in the model, PA and DA, that are unvarying in all sample trajectories. One, PA, has a huge impact on the reward, while DA is irrelevant. Can the AI tell the difference?
Also, one key component of the reward - the discomfort of the passenger for accelerations of -2 and 2 - is not encoded in the state space of the model at all, only in the (unknown) reward function. Can the AI deduce this fact?
I'll be working on algorithms to efficiently compute these facts (though do let me know if you have a reference to anyone who's already done this before - that would make it so much quicker).
For the moment we're ignoring a lot of subtleties (such as bias and error on the part of the human expert), and these will be gradually included as the algorithm develops. One thought is to find a way of including negative examples, specific "don't do this" trajectories. These need to be interpreted with care, because a positive trajectory implicitly gives you a lot of negative trajectories - namely, all the choices that could have gone differently along the way. So a negative trajectory must be drawing attention to something we don't like (most likely the killing of a human). But, typically, the negative trajectories won't be maximally bad (such as shooting off at maximum speed in the wrong direction), so we'll have to find a way to encode what we hope the AI learns from a negative trajectory.
To work!
Appendix: Proof of ideal trajectories
Let n be the largest integer such that n^2 ≤ S. Since S≤(n+1)^2 - 1 by assumption, S-n^2 ≤ (n+1)^2-1-n^2=2n. Then let the rocket accelerate by +1 for n turns, then decelerate by -1 for n turns. It will travel a distance of 0+1+2+ ... +n-1+n+n-1+ ... +3+2+1. This sum is n plus twice the sum from 1 to n-1, ie n+n(n-1)=n^2.
By pausing one turn without acceleration during its trajectory, it can add any m to the distance, where 0≤m≤n. By doing this twice, it can add any m' to the distance, where 0≤m'≤2n. By the assumption, S=n^2+m' for such an m'. Therefore the rocket can reach S (with zero velocity) in 2n turns if S=n^2, in 2n+1 turns if n^2+1 ≤ S ≤ n^2+n, and in 2n+2 turns if n^2+n+1 ≤ S ≤ n^2+2n.
Since the rocket is accelerating on all but two turns of this trajectory, it's clear that it's impossible to reach S (with zero velocity) in less time than this with accelerations of +1 and -1. Since it takes 2(n+1)=2n+2 turns to reach (n+1)^2, an immediate consequence is that the number of turns taken to reach S is increasing in the value of S (though not strictly increasing).
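As a sanity check (my own brute force, assuming the model as stated), a breadth-first search over (position, velocity) pairs with accelerations in {-1, 0, +1} reproduces the 2n / 2n+1 / 2n+2 case analysis for small S:

```python
import math
from collections import deque

def min_turns(S):
    """Minimum turns to reach pos = S with vel = 0 using accelerations in
    {-1, 0, +1}, found by breadth-first search from (pos, vel) = (0, 0)."""
    seen = {(0, 0)}
    frontier = deque([(0, 0, 0)])          # (pos, vel, turns so far)
    while frontier:
        pos, vel, t = frontier.popleft()
        if (pos, vel) == (S, 0):
            return t
        for acc in (-1, 0, 1):
            nxt = (pos + vel, vel + acc)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((*nxt, t + 1))

def predicted_turns(S):
    """The case analysis above: 2n, 2n+1 or 2n+2 depending on S - n^2."""
    n = math.isqrt(S)
    m = S - n * n
    return 2 * n if m == 0 else (2 * n + 1 if m <= n else 2 * n + 2)
```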
Next, we can note that since S<250,000=500^2, the rocket will always reach S within 1000 turns at most, for "reward" above -1000. An acceleration of +3 or -3 costs -1000 because of the death of the human, and an extra -1 because of the turn taken, so these accelerations are never optimal. Note that this result is not sharp. Also note that for huge S, continual accelerations of 3 and -3 are obviously the correct solution - so even our "true reward function" didn't fully encode what we really wanted.
Now we need to show that accelerations of +2 and -2 are never optimal. To do so, imagine we had an optimal trajectory with ±2 accelerations, and replace each +2 with two +1s, and each -2 with two -1s. This trip will take longer (since we have more turns of acceleration), but will go further (since two accelerations of +1 cover a greater distance than one acceleration of +2). Since the number of turns taken to reach S with ±1 accelerations is increasing in S, we can replace this further trip with a shorter one reaching S exactly. Note that all these steps decrease the cost of the trip: shortening the trip certainly does, and replacing an acceleration of +2 (total cost: -10-1=-11) with two accelerations of +1 (total cost: -1-1=-2) also does. Therefore, the new trajectory has no ±2 accelerations and has a lower cost, contradicting our initial assumption.
The application of the secretary problem to real life dating
The following problem is best when not described by me:
https://en.wikipedia.org/wiki/Secretary_problem
Although there are many variations, the basic problem can be stated as follows:
There is a single secretarial position to fill.
There are n applicants for the position, and the value of n is known.
The applicants, if seen altogether, can be ranked from best to worst unambiguously.
The applicants are interviewed sequentially in random order, with each order being equally likely.
Immediately after an interview, the interviewed applicant is either accepted or rejected, and the decision is irrevocable.
The decision to accept or reject an applicant can be based only on the relative ranks of the applicants interviewed so far.
The objective of the general solution is to have the highest probability of selecting the best applicant of the whole group. This is the same as maximizing the expected payoff, with payoff defined to be one for the best applicant and zero otherwise.
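As a sketch of the classic result (my own simulation, with an assumed pool of n = 20 applicants), the strategy of rejecting the first n/e applicants and then accepting the first one better than all of them picks the very best roughly 1/e ≈ 37% of the time:

```python
import math
import random

def one_over_e_trial(n, rng):
    """Run the 1/e strategy once on a random ordering of ranks 0..n-1
    (n-1 = best applicant). Returns the rank actually selected."""
    ranks = list(range(n))
    rng.shuffle(ranks)
    r = round(n / math.e)              # size of the look-only sample
    threshold = max(ranks[:r])
    for candidate in ranks[r:]:
        if candidate > threshold:
            return candidate           # first one beating the whole sample
    return ranks[-1]                   # best was in the sample: stuck with the last

rng = random.Random(0)
n, trials = 20, 100_000
wins = sum(one_over_e_trial(n, rng) == n - 1 for _ in range(trials))
print(f"picked the best in {wins / trials:.1%} of runs")   # close to 1/e
```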
Application
After reading that you can probably see the application to real life. There is a series of good and bad assumptions following; some are fair, and some are not going to be representative of you. I am going to try to name them all as I go so that you can adapt them with better ones for yourself. Assume that you plan to have children, and that you will probably do so - like billions of humans before you - in a monogamous relationship while married (the entire set of assumptions does not break down for poly relationships or relationship anarchy, but it gets more complicated). These assumptions help us populate the secretary problem with numbers in relation to dating for the purpose of children.
Assume that a biological female's clock ends at 40 (in that it's hard, and not healthy for the baby, if you try to have a kid past that age). That is effectively the end of the pure and simple biological purpose of relationships (environment, IVF and adoption aside for a moment; yes, there are a few more years on that).
For the purpose of this exercise - as a guy - you can add a few years for the potential age gap you would tolerate (e.g. my parents are 7 years apart, but that seems like a big understanding and maturity gap - they don't even like the same music). I personally expect I could tolerate an age gap of 4-5 years.
If you make the assumption that you start your dating life around the age of 16-18, that gives you about [40-18=22] 22-24 years of expected dating time (+5 for me as a male).
If you estimate the number of kids you want to have, and count either:
3 years for each kid OR
2 years for each kid, plus one extra kid's worth of time (another 2 years)
(Twins will throw this number off, but estimate that they take longer to recover from, or more time raising them to manageable age before you have time to have another kid)
My worked example is myself: as one of 3 children, with two siblings of my own, I am going to plan to have 3 children. That's 8-9 years of child-having time. If we subtract that from the number above, we end up with 11-16 years of dating time (16-21 for me, being male).
Also, if you happen to know someone with a number of siblings (or children) and a family dynamic that you like, then you should consider that number of children for yourself. Remember that as a grown-up you are probably travelling through the world with your siblings beside you, which can be beneficial (or detrimental); I would use the known working model of yourself or the people around you to try to predict whether you will benefit or be at a disadvantage from having siblings. As they say, you can't pick your family - for better and worse. You can pick your friends, and if you want them to be as close as a default family - that connection goes both ways - it is possible to cultivate friends that are closer than some families. However you choose to live your life is up to you.
Assume that once you find the right person, getting married (the process of organising a wedding from the day you have engagement rings on fingers) and falling pregnant (successfully starting a viable pregnancy) takes at least a year - maybe two, depending on how long you want to be in the "we just got married and we aren't having kids just yet" stage. That leaves 9-15 years of dating (15-20 male adjusted).
With my 9-15 years, I estimate that a good relationship for working out whether I want to marry someone takes between 6 months and 2 years (considering that, as a guy, I will probably be the one proposing and putting an engagement ring on someone's finger, I get a higher say in how long this takes than my significant other does). That gives a total of 4 serious relationships on the low and long end and 30 serious relationships on the upper end (7-40 male adjusted).
Of course that's not how real life works. Some relationships will be longer and some will be shorter. I am fairly confident that all my relationships will fall around those numbers.
I have a lucky circumstance: I have already had a few serious relationships (substitute your own numbers in here). From my existing relationships I can estimate how long I usually spend in one: (2 years + 6 years + 2 months + 2 months) / 4 ≈ 2.1 years. Which is to say that I probably have a maximum of around 7-15 relationships before I have to stop expecting to have kids, or start compromising on having 3 kids.
A solution to the secretary equation
A known solution that picks the best possible candidate most often is to try out the first n/e candidates (roughly 37% of the set), then choose the next candidate who is better than all of them. For my numbers, that means going through roughly 3-6 relationships and then choosing the next relationship that is better than all the ones before.
I don't quite like that. If the best candidate happens to fall within the initial 1/e sample, the strategy never finds anyone better, so you run through the whole set and settle on the last person by default. That happens with probability r/n ≈ 1/e ≈ 37% (where r is the sample size), whatever the size of the set - which is another opportunity-cost risk. What if that last person is rubbish? Then you compromise on the age gap, the number of kids or the partner's quality...
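A quick simulation of that failure mode (my own sketch, using an assumed set of 7 with a look-only sample of 3): the strategy falls through to the final candidate exactly when the best one sat in the sample, which happens about 3/7 ≈ 43% of the time here.

```python
import math
import random

def stuck_with_last_rate(n, trials, seed=0):
    """Estimate how often the 1/e strategy defaults to the last candidate:
    exactly the runs where the best candidate was in the look-only sample."""
    rng = random.Random(seed)
    r = round(n / math.e)              # sample size, e.g. 3 when n = 7
    stuck = 0
    for _ in range(trials):
        ranks = list(range(n))
        rng.shuffle(ranks)
        if max(ranks[:r]) == n - 1:    # best candidate was seen too early
            stuck += 1
    return stuck / trials

print(stuck_with_last_rate(7, 100_000))    # close to 3/7 ≈ 0.43
```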
Opportunity cost
Each further relationship you have might cost you another 2 years, taking you further out of touch with the next generation (kids these days!). I tend to think about how old I will be when my kids are 15-20: am I growing rapidly out of touch with the next, younger generation? Two years is a very big opportunity spend - another 2 years could see you successfully running a startup and achieving lifelong stability, at the cost of the opportunity to have another kid. I don't say this to crush you with fear of inaction, but it should factor in along with the other details of your situation.
A solution to the risk of having the best candidate in your test phase, or to the risk of lost opportunity, is to lower the bar: instead of choosing the next candidate who is better than all previous candidates, choose the next candidate who is better than 90% of the candidates so far. Incidentally, this probably happens in real life quite often, in a stroke of "you'll do"...
Where it breaks down
Real life is more complicated than that. I would like to think that subsequent relationships I get into will not suffer the stupid mistakes of the earlier ones. There is also the potential opportunity cost of exploration: the more time you spend looking for different partners, the more you risk losing your early soul mate, or wasting time looking for a better partner when you could follow a "good enough" policy. No one likes to know they are "good enough", but we do race the clock in our lifetimes. Life is what happens when you are busy making plans.
As anyone with experience will know, we probably test and rule out bad partners in a single conversation, where we don't even get as far as a date, or don't last more than a week (i.e. the experience set grows through various means).
People have a tendency to overrate the quality of a relationship while they are in it, versus the ones that already failed.
Did I do something wrong?
“I got married early - did I do something wrong (or irrational)?”
No. Equations are not real life. It might have been nice to have the equation, but you obviously didn't need it. Also, this equation assumes a monogamous relationship. In real life people have overlapping relationships; you can date a few people, and you can be poly. These are all factors that can change the simple assumptions of the equation.
Where does the equation stop working?
Real life is hard. It doesn't fall neatly into line, it’s complicated, it’s ugly, it’s rough and smooth and clunky. But people still get by. Don’t be afraid to break the rule.
Disclaimer: If this equation is the only thing you are using to evaluate a relationship - it’s not going to go very well for you. I consider this and many other techniques as part of my toolbox for evaluating decisions.
Should I break up with my partner?
What? no! Following an equation is not a good reason to live your life.
Does your partner make you miserable? Then yes you should break up.
Do you feel like they are not ready to have kids yet while you want to settle down? Tough call. Even if they were agents also following the equation, an equation is not real life. Go by your brain; go by your gut. Don't go by just one equation.
Expect another post soon about reasonable considerations that should be made when evaluating relationships.
The given problem makes the assumption that you are able to evaluate partners in the sense that the secretary problem expects. Humans are not all strategic and can’t really do that. This is why the world is not going to perfectly follow this equation. Life is complicated; there are several metrics that make a good partner and they don’t always trade off between one another.
----------
Meta: writing time - 3 hours over a week; 5+ conversations with people about the idea, bothering a handful of programmers and mathematicians for commentary on my thoughts, and generally a whole bunch of fun talking about it. This post was started on the slack channel when someone asked a related question.
My table of contents for other posts in my series.
Let me know if this post was helpful or if it worked for you or why not.
'Dumb' AI observes and manipulates controllers
The argument that AIs provided with a reward channel will observe their controllers and learn to manipulate them is a valid one. Unfortunately, it's often framed in a way that feels counterintuitive or extreme, especially to AI designers. It typically starts with the standard reinforcement learning scenario, then posits that the AI becomes superintelligent and either manipulates the controller with super-social powers, or breaks out and gains control of its reward channel, killing or threatening its controllers.
And that is a fair argument. But conceptually, it leaps from a standard reinforcement learning scenario, to a science-fiction-sounding scenario. It might help to have intermediate scenarios: to show that even lower intelligence AIs might start exhibiting the same sort of behaviour, long before it gets to superintelligence.
So consider the following scenario. Some complex, trainable AI is tasked with writing automated news stories for a student newspaper. It trawls the web and composes its stories, then gets reward and feedback from the editors. Assume there are two editors for this newspaper, and they work on alternate days. The two editors have somewhat different ideas as to what constitutes a good story, so their feedbacks are different. After a while, the AI finds that it gets higher reward by using a certain style on Monday, Wednesday and Friday, and another style on Tuesday and Thursdays - this is a simple consequence of its reward mechanism.
But the rota isn't perfect. Sometimes the Monday editor will edit a story so late on Monday that it's technically Tuesday, and sometimes the Tuesday editor will be up early and edit a story at the same time. This pattern is more common during the holidays, when the editors go to different time zones, often travel, and become more irregular in their editing.
So if the AI is complex and skilled enough, then, simply through feedback, it will start building up a picture of its editors. It will figure out when they are likely to stick to a schedule, and when they will be more irregular. It will figure out the difference between holidays and non-holidays. Given time, it may be able to track the editors' moods, and it will certainly pick up on any major changes in their lives - such as romantic relationships and breakups, which will radically change whether and how it should present stories with a romantic focus.
It will also likely learn the correlation between stories and feedback - maybe presenting a story defined roughly as "positive" will increase subsequent reward for the rest of the day, on all stories. Or maybe this will only work on a certain editor, or only early in the term. Or only before lunch.
Thus the simple trainable AI with a particular focus - write automated news stories - will be trained, through feedback, to learn about its editors/controllers, to distinguish them, to get to know them, and, in effect, to manipulate them.
This may be a useful "bridging example" between standard RL agents and the superintelligent machines.
Duller blackmail definitions
For a more parable-like version of this, see here.
Suppose I make a precommitment P to take action X unless you take action Y. Action X is not in my interest: I wouldn't do it if I knew you'd never take action Y. You would want me to not precommit to P.
Is this blackmail? Suppose we've been having a steamy affair together, and I have the letters to prove it. It would be bad for both of us if they were published. Then X={Publish the letters} and Y={You pay me money} is textbook blackmail.
But suppose I own a MacGuffin that you want (I value it at £9). If X={Reject any offer} and Y={You offer more than £10}, is this still blackmail? Formally, it looks the same.
What about if I bought the MacGuffin for £500 and you value it at £1000? This makes no difference to the formal structure of the scenario. Then my behaviour feels utterly reasonable, rather than vicious and blackmail-ly.
What is the meaningful difference between the two scenarios? I can't really formalise it.
Help me teach Bayes'
Next Monday I am supposed to introduce a bunch of middle school students to Bayes' theorem.
I've scoured the Internet for basic examples where Bayes' theorem is applied. Alas, all the explanations I've come across are, I believe, difficult to grasp for the average middle school student.
So what I am looking for is a straightforward explanation of Bayes' theorem that uses the least amount of Mathematics and words possible. (Also, my presentation has to be under 3 minutes.)
I think that it would be efficient in terms of learning for me to use coins or cards, something tangible to illustrate what I'm talking about.
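One coin-based demonstration that might work (my suggestion, not from the question): put a fair coin and a double-headed coin in a bag, draw one blindly, flip it, and see heads. Bayes' theorem says the chance you are holding the double-headed coin is (1 × ½) / (1 × ½ + ½ × ½) = 2/3, and a quick simulation agrees:

```python
import random

rng = random.Random(0)
heads_flips, double_and_heads = 0, 0
for _ in range(100_000):
    coin = rng.choice(["fair", "double-headed"])   # draw a coin at random
    is_heads = coin == "double-headed" or rng.random() < 0.5
    if is_heads:
        heads_flips += 1
        if coin == "double-headed":
            double_and_heads += 1
print(double_and_heads / heads_flips)              # close to 2/3
```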
What do you think? How should I teach 'em Bayes' ways?
PS: I myself am new to Bayesian probability.