Epistemic Status: Quite confident (80%?) about the framework being very useful for the subject of free will. Pretty confident (66%?) about the framework being useful for meta-ethics. Hopeful (33%) that I am using it to bring out directionally true statements about what my CEV would be in worlds where we have not yet found objective value.
Most discussions about free will and meaning seem to miss what I understand to be the point. Rather than endlessly debating the metaphysics, we should focus on the decision-theoretic implications of our uncertainty. Here's how I[1] think we can do that, using an abstracted Pascal's Wager.
Free Will: A Pointless Debate
People argue endlessly about whether we have Free Will, bringing up quantum mechanics, determinism, compatibilism, blah, blah (blah). But, regardless of whether we have it or not:
In worlds where we have no free will:
Our beliefs about free will don't matter (we'll do whatever we were determined to do)
Our beliefs about what our actions should be don't matter and can't be changed (they were predetermined)
In worlds where we have free will:
Our beliefs about free will will affect what we will do (and what we will will change what will happen) (possibly to a Will) (unless our will won't work)
Our choices compound through causality, affecting countless other conscious creatures
Therefore, if we have free will, believing in it and acting accordingly is incredibly valuable. If we don't have free will, nothing we (choose to) believe matters anyway. The expected value clearly points towards acting as if we have free will, even if we assign it a very low probability (I don't think too much about what numbers should be here[2] but estimate it at 5-20%).
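To make the dominance explicit, here is a minimal sketch of the payoff calculation. The numbers are placeholders I made up: V just stands for "whatever value deliberate action creates in worlds where choices are real", and fatalism is scored as forgoing that value.

```python
# Toy expected-value table for the free will wager (illustrative numbers only).
# In no-free-will worlds our beliefs change nothing, so every strategy scores 0 there.
# V is a placeholder for the value our deliberate choices create *if* choices are real.

V = 100.0  # made-up placeholder, not a measured quantity

for p in (0.05, 0.10, 0.20):  # the rough 5-20% range from above
    ev_act_as_if = p * V + (1 - p) * 0.0   # plan, take responsibility, believe choices matter
    ev_fatalism = p * 0.0 + (1 - p) * 0.0  # shrug and coast, forgoing V even if it was on offer
    print(f"P(free will)={p:.2f}  act-as-if: {ev_act_as_if:6.1f}  fatalism: {ev_fatalism:6.1f}")
```

Any nonzero probability of free will and any positive V makes acting-as-if (weakly) dominant, which is the whole point; the exact numbers never matter.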
Meaning and All That Is Valuable:
I've found I am unfortunately sympathetic to some nihilistic arguments:
Whether through personal passing, civilizational collapse, or the heat death of the universe, all information about our subjective experiences, be they bliss, bitterness, fulfillment, or failure, will eventually be lost.
Further, even if it truly is just about living in the moment, the things we as humans value - our goals, our emotions, our moral intuitions - are merely mesa-optimizations that contributed to reproductive fitness in our ancestral environment. It would be remarkably convenient if these specific things happened to be what is actually good in any grand, universal sense.
Nevertheless, by applying similar reasoning (Pascal's Wager) to the question of meaning/value, I can both integrate what seem to me to be very strong arguments in favor of nihilism and remain within a moral framework that is clearly better (it still prescribes action/actually says something).[3]
Consider two possibilities:
There exists something (objective value) universally compelling that any sufficiently advanced mind (/Bayesian agent) would recognize as valuable - something possibly beyond our evolutionary happenstance and/or something timeless[4]
There is nothing like objective value
If nothing is objectively valuable:
Whether we search for objective value doesn't matter (nothing matters)
Whether we find it doesn't matter (it doesn't exist)
How we spend our time doesn't matter (nothing matters)[5]
If objective value exists:
Whether we search for it matters enormously
Whether we find it matters enormously
How we spend our time is significantly more important and potentially value-creating than in the world above
By objective value, I mean something that Bayesian agents would inevitably converge on valuing through Aumann's Agreement Theorem - regardless of their starting points. While we don't know what this is yet (or if it exists), the convergence property is what makes it "objective" rather than just subjective preference. I can imagine this might include conscious experience as a component, but I remain quite uncertain.
The expected value calculation here is clear: we should act as if objective value exists and try to find it. The downside of being wrong is nothing (literally), while the upside of being right is vast!
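The same shape of calculation, sketched with invented numbers (p_exists, q_find, and W are all mine and purely illustrative); the only load-bearing feature is that the "nothing is valuable" branch contributes zero to both strategies:

```python
# Toy expected-value sketch for the objective-value wager (illustrative numbers only).
# W: placeholder payoff of actually locating objective value.
# q_find: placeholder chance of finding it, given that it exists and that we search.
# If objective value doesn't exist, searching and not searching both score 0.

W = 1_000.0

for p_exists in (0.01, 0.10, 0.50):
    for q_find in (0.05, 0.50):
        ev_search = p_exists * q_find * W + (1 - p_exists) * 0.0
        ev_ignore = 0.0  # never search, never find; but also lose nothing if nothing exists
        print(f"P(exists)={p_exists:.2f}  P(find|search)={q_find:.2f}  "
              f"search: {ev_search:7.1f}  ignore: {ev_ignore:4.1f}")
```

As long as W is positive and both probabilities are nonzero, searching dominates; the wager does not need the probabilities to be large, only the downside to be genuinely zero.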
What Does This Mean in Practice?
Given this framework, what should we actually do?
Instead of getting lost in meandering metaphysical debates about free will and value, we should act as if we have agency - make plans, take responsibility, and believe our choices matter. The alternative is strictly worse on decision-theoretic grounds.
Further, we should try to maximize our odds of finding what is objectively valuable. Currently, I think this is best achieved by:
Acting to ensure our survival[6] (hence a high priority on x-risk reduction)[7]
Creating more intelligent, creative, and philosophically yearning beings with lots of technology
Turning ourselves (and those we interact with) into a species (or several) of techno-philosophers, pondering what is valuable, until we are certain we have either found objective value or will never find it.
Getting rationalists to dress/present better[8]
On a less grand note, I expect we should maintain ethical guardrails towards minimizing suffering and maximizing happiness, as suffering matters in many plausible theories of value (wow, that is quite convenient for me as a human). Additionally, humans do knowledge-work better when they're both happy and not getting tortured.
The beauty of this approach is that it works regardless of whether we're right about the underlying metaphysics. If we're wrong, we lose nothing. If we're right, we gain everything.
Meta Note: I've been wanting to write a post about this for a while, but never got around to writing it by myself. What I did here was have Claude interrogate me for a while about "What is your [my] world model", then had it propose things it thought I would enjoy writing a blog post about, then write a rough draft of this. I've since edited it a decent bit, and gotten feedback from real live humans. I'd love meta-level commentary on the output of this process/on Claude's and my writing.
See section "Free Will: A Pointless Debate"
I know, I know, KO'ing such a weak opponent is like bragging about how you're stronger than your grandparent.
I'd say defining objective value is my best guess as the weak point of this section.
Conveniently, in this world my attempts at humor won't matter either!
Gosh darned corrigibility worming its way into everything!
While finding objective value is our most important task, we're forced to prioritize technical problems (like AI alignment and other less pressing x-risk prevention) because we're too close to killing all the smart, creative, philosophically yearning beings we know of. We must ensure survival before we can properly pursue the interim goal.
This creates a kind of nested Pascal's Wager:
We must bet on the existence of objective value
To have any chance of finding it, we must bet on our survival
To survive, we must bet on solving certain technical problems first
Silly Claude, why did this end up here? Oh well, I guess I better justify it: We (aspiring rationalists) have a lot of thoughts that we have good reason to believe would lead to a better world if more people internalized them. Most of the world, including most of the important people in the world, cares about appearances (silly mesa-optimizers)! Putting a small amount of effort into how you look (possibly: getting a haircut, wearing clothes that fit +-10%, trying to avoid sweatpants and printed t-shirts, other stuff you might know to be applicable) helps get people to take you more seriously.