There is also a winner's-curse risk: if a person seems too good, they could have some hidden disadvantage, or they may leave me quickly because they have many better options than me. This puts a cap on how far above the median I should look. Therefore the first few attempts have to establish the median level of the people available to me.
Another problem is that every trial run has a cost, such as years and money spent. If I search for too long, I will spend less time with my final partner.
In this post, I analyze a multidimensional version of the dating problem (more commonly called the secretary problem). If you are familiar with this problem you can search for “For this post, I want to make a slight change to the model” and skip to that. Otherwise, here is an intro to the usual dating problem:
Suppose you are a serially monogamous dater and your goal is to eventually find someone to marry. Then you will have to make decisions of the type: should I settle with my current partner or reject them in the hope of finding someone better?
This is the setup for the dating problem. In this model, we assume there is some maximal number, n, of partners you can meet. Here “meet” can mean different things. You could, for example, interpret it to mean “going on a single date” or “dating them for a year”. Depending on the interpretation, n will take very different values. The model does not take into account that information is gradually revealed over time.
As soon as you have “met” someone, you know how good they are compared to previous people you have met, but you have no further information about how good they are compared to people you have not met yet. You will meet the partners in completely random order. You have to reject one partner before you can meet another one, and you can never go back to a person you have rejected. How do you maximize the probability of settling with the best partner?
The optimal strategy is to reject the first k people, where k is around n/e, and e is the base of the natural logarithm. After this exploration phase, you settle down with the next person you meet who is better than all previous partners.
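The strategy above can be checked with a small simulation. This is only a sketch: the function names (`secretary_trial`, `success_rate`), the choice n = 100, and the number of trials are assumptions made for illustration. Partners are represented by their ranks, since only relative comparisons matter in the model.

```python
import math
import random


def secretary_trial(n, k, rng):
    """Run one trial; return True if the strategy settles with the best of n."""
    # Ranks 0..n-1 in random order; rank n-1 is the best partner.
    order = list(range(n))
    rng.shuffle(order)
    # Exploration phase: reject the first k and remember the best of them.
    best_seen = max(order[:k]) if k > 0 else -1
    for quality in order[k:]:
        if quality > best_seen:
            # First person better than everyone met so far: settle here.
            return quality == n - 1
    # The best partner was among the first k, so we never settle.
    return False


def success_rate(n, k, trials=20000, seed=0):
    """Estimate the probability of settling with the best partner."""
    rng = random.Random(seed)
    wins = sum(secretary_trial(n, k, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    n = 100
    k = round(n / math.e)  # about 37 for n = 100
    print(k, success_rate(n, k))
```

With k near n/e the estimate should land close to 1/e, roughly 0.37. Trying other values of k is an easy way to see the success rate fall off on either side.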
Two types of failures can happen when following this algorithm:
For this post, I want to make a slight change to the model: we have an infinite sequence of potential partners (yeah, I know, Tinder has messed up my world model) of independent, identically distributed random attractiveness. You have dated the first k and plan to settle down with the next person you meet who is better than all previous partners. You know there are better potential partners out there, but all you care about is that you settle down with someone better than all the previous ones. Now there is only one type of failure:
a. You run out of time before you meet someone better than the first k.
Here, running out of time can mean dying or getting too old to have kids (in case you want kids). If you only have time to meet n people, this is just a new version of failure 1 above, since the best partner among the n you met will have been among the first k. Again the probability of this failure is the same: k/n. Although the probability of eventually settling is 1, the expected time before settling is given by
$$\frac{k}{k}+\frac{k}{k+1}+\frac{k}{k+2}+\dots$$
so it is infinite.
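To see why this expectation is infinite: the term with denominator m is the probability that the best of the first m people was among the first k, i.e. that you still have not settled after m meetings. A short sketch of the divergence follows, where the cutoff N is introduced here only for illustration:

```latex
% N is a hypothetical cutoff; the bound compares the sum with an integral.
\sum_{m=k}^{N} \frac{k}{m}
  \;\ge\; k \int_{k}^{N+1} \frac{dx}{x}
  \;=\; k \ln\!\left(\frac{N+1}{k}\right)
  \;\longrightarrow\; \infty \qquad (N \to \infty).
```

So the partial sums grow without bound, although only logarithmically in N.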
Some comments on the assumptions
This is clearly an unrealistic model:
In this post, I want to look at a different assumption of the model.
The multidimensional problem
Previously, we assumed that there is a single number (or at least an element of a totally ordered set) that defines the attractiveness of a partner. In reality, different people have different strengths and weaknesses. Let's assume that there are two parameters you care the most about, X and Y. These parameters can be anything, concrete or abstract, objective or subjective: for example, how good a partner you think they would be overall, how good a parent you think they will be, how much you enjoy spending time with them, how good your sex with them is, how well-aligned your values are, or how good-looking they are.
An interesting case is when X and Y are different estimates of the same variable. For example, X = how good you feel about this partner on average when you wake up together in the morning, Y = how good you feel about this partner on average when spending time with their family.
Suppose your settling strategy is to reject the first person and afterward settle down with the first person you meet who is better at both X and Y than any previous partner you have met. What will happen?
We still have the same failure type as before
a. You run out of time before meeting the first person who satisfies the stopping criterion.
But now there is also a new type of failure.
b. Even if you had time to meet infinitely many partners, you would never settle.
How likely is this? It depends if X and Y are correlated. First, let us assume that they are independent.
We have $P(X_2>X_1)=P(Y_2>Y_1)=1/2$ and, by independence, $P(X_2>X_1 \text{ and } Y_2>Y_1)=1/4$. So the probability that you will not settle with person 2 is 3/4. Similarly, for person 3 the probability that you won’t settle (given that you haven’t settled before meeting this person) is 8/9, and for person n it is $\frac{n^2-1}{n^2}=\frac{(n-1)(n+1)}{n^2}$. We can now compute the probability that you would never settle even given infinite time:
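$$
\begin{aligned}
&\frac{3}{4}\cdot\frac{8}{9}\cdot\frac{15}{16}\cdot\ldots \\
&= \frac{1\cdot 3}{2\cdot 2}\cdot\frac{2\cdot 4}{3\cdot 3}\cdot\frac{3\cdot 5}{4\cdot 4}\cdot\ldots \\
&= \left(\frac{1}{2}\cdot\frac{3}{2}\right)\cdot\left(\frac{2}{3}\cdot\frac{4}{3}\right)\cdot\left(\frac{3}{4}\cdot\frac{5}{4}\right)\cdot\ldots \\
&= \frac{1}{2}\cdot\left(\frac{3}{2}\cdot\frac{2}{3}\right)\cdot\left(\frac{4}{3}\cdot\frac{3}{4}\right)\cdot\left(\frac{5}{4}\cdot\frac{4}{5}\right)\cdot\ldots \\
&= \frac{1}{2}\cdot 1\cdot 1\cdot 1\cdot\ldots \\
&= \frac{1}{2}
\end{aligned}
$$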
This is if you consider settling already after k=1. If you only consider settling after having rejected k partners, the probability that you will never settle with this strategy becomes $\frac{k}{k+1}$.
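The general-k value can be obtained by the same telescoping argument, starting the product at person k+1. A sketch, with the cutoff N introduced here only to make the limit explicit:

```latex
% N is a hypothetical cutoff; both sub-products telescope.
\prod_{n=k+1}^{N} \frac{(n-1)(n+1)}{n^{2}}
  = \left(\prod_{n=k+1}^{N} \frac{n-1}{n}\right)
    \left(\prod_{n=k+1}^{N} \frac{n+1}{n}\right)
  = \frac{k}{N}\cdot\frac{N+1}{k+1}
  \;\longrightarrow\; \frac{k}{k+1} \qquad (N \to \infty).
```

Setting k = 1 recovers the value 1/2 from the computation above.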
If you consider a higher number of parameters than two, your odds become a lot worse. Here is a table of the probabilities for a given number of parameters and the number of exes after which you consider settling. The rows are the number of parameters (p) and the columns are the number of exes you have already rejected (k).
The math also gets worse. I did not find a simple formula for the values for more than two parameters. I simply ended the computation after 999 meetings.
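A truncated computation of this kind could look roughly like the sketch below. It assumes independent parameters, so that person n is a record in all p of them with probability 1/n^p, and it stops the product at meeting 999. The function name and the ranges of p and k printed are my own choices for illustration, not necessarily what was used for the table.

```python
def never_settle_probability(p, k, last_meeting=999):
    """Probability of not settling with any of persons k+1..last_meeting.

    Assumes p independent parameters: person n is a record in all of them
    with probability 1 / n**p, independently of earlier records.
    """
    prob = 1.0
    for n in range(k + 1, last_meeting + 1):
        prob *= 1.0 - 1.0 / n ** p
    return prob


if __name__ == "__main__":
    # Rows: number of parameters p. Columns: number of exes k.
    for p in range(2, 5):
        row = [never_settle_probability(p, k) for k in range(1, 6)]
        print(p, " ".join(f"{value:.3f}" for value in row))
```

For p = 2 the truncated product can be compared with the exact value k/(k+1), which gives a feel for how little is lost by stopping at 999.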
For correlated parameters, I haven't tried to find a closed-form formula. Instead, I simulated 20,000 sequences, each of up to 200,000 meetings. I assumed all parameters were normally distributed with variance 1 and covariance of either 0.2, 0.5, or 0.8. Here are the tables.
0.2:
0.5:
0.8:
The table for 0.8 is probably not relevant for most pairs of traits people care about, but it could be relevant for the case where the parameters are estimates of the same variable.
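A simulation of the correlated case could be sketched as follows. This is not the original simulation code: the one-factor construction used to get equal covariance between all pairs of parameters, the function names, and the much smaller default sizes (2,000 sequences of up to 200 meetings) are assumptions chosen so the example runs quickly.

```python
import math
import random


def draw_partner(p, rho, rng):
    """Draw p standard-normal parameters with pairwise covariance rho.

    Assumption: a one-factor model, sqrt(rho) * shared + sqrt(1 - rho) * own,
    which gives variance 1 and covariance rho for 0 <= rho <= 1.
    """
    shared = rng.gauss(0.0, 1.0)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    return [a * shared + b * rng.gauss(0.0, 1.0) for _ in range(p)]


def settles(p, k, rho, meetings, rng):
    """Return True if someone after the first k beats every record so far."""
    best = [-math.inf] * p
    for n in range(1, meetings + 1):
        partner = draw_partner(p, rho, rng)
        if n > k and all(v > b for v, b in zip(partner, best)):
            return True
        # Not settling: update the per-parameter records and move on.
        best = [max(v, b) for v, b in zip(partner, best)]
    return False


def never_settle_rate(p, k, rho, sequences=2000, meetings=200, seed=0):
    """Fraction of simulated sequences in which you never settle."""
    rng = random.Random(seed)
    failures = sum(not settles(p, k, rho, meetings, rng) for _ in range(sequences))
    return failures / sequences


if __name__ == "__main__":
    for rho in (0.0, 0.2, 0.5, 0.8):
        print(rho, never_settle_rate(p=2, k=1, rho=rho))
```

Because the number of meetings is capped, the result slightly overestimates the probability of never settling. With rho = 0 it can be compared against the independent case computed earlier.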
More about assumptions
These probabilities are under the assumption that you first choose the parameters X and Y that you care about and then start dating.
If instead you first find a life partner and only afterward choose which parameters to look at, you will of course be likely to find many parameters where they are the best you have ever met, since people vary in thousands of ways.
If instead you already have been dating, and then pick parameters that you have been missing in your partners, and then start looking for a partner who is record high on all those parameters, your odds may be better than suggested above. Conversely, if you pick some parameters that one of your exes was good at, your odds would be lower.
We assumed that the values of a partner's parameters were all revealed at the same time. Instead, we could consider a model where they are only revealed one at a time: if a partner's X is not record high, you reject them before learning about their Y. In this model, you have probability 1 of eventually meeting someone who has a better X than anyone else you have met and a higher Y than anyone whose Y you have had revealed. However, the expected number of such record (X, Y)s among the first n people you meet now grows much more slowly. It grows as ln(ln(n)) instead of as ln(n) in the one-parameter model. Similarly, for 3 parameters, the expected number of records now grows as ln(ln(ln(n))), and so on. Even if you are only looking for one record, this could take a long time to find.
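The sequential-reveal model can be explored with a short simulation. The sketch below assumes X and Y are independent and draws them uniformly, which is enough because only the ordering of values matters. It keeps counting records instead of stopping at the first one, and the function names are made up for this example.

```python
import random


def count_records(n, rng):
    """Return (#X-records, #sequential (X, Y)-records) among n partners."""
    best_x = float("-inf")
    best_revealed_y = float("-inf")
    x_records = 0
    xy_records = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x <= best_x:
            continue  # Rejected before their Y is revealed.
        best_x = x
        x_records += 1
        # Y is revealed only now; compare with previously revealed Ys.
        if y > best_revealed_y:
            xy_records += 1
        best_revealed_y = max(best_revealed_y, y)
    return x_records, xy_records


def mean_records(n, trials=200, seed=0):
    """Average both record counts over several simulated sequences."""
    rng = random.Random(seed)
    total_x = total_xy = 0
    for _ in range(trials):
        x_rec, xy_rec = count_records(n, rng)
        total_x += x_rec
        total_xy += xy_rec
    return total_x / trials, total_xy / trials


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, mean_records(n))
```

Comparing the two averages as n grows by factors of 100 shows how much more slowly the (X, Y) record count increases than the X record count.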
So what to do?
What is the takeaway life advice from this?