This is a special post for quick takes by Morphism. Only they can create top-level comments.

People often say things like "do x. Your future self will thank you." But I've found that I very rarely actually thank my past self after x has been done and I've reaped the benefits of x.

This quick take is a preregistration: for the next month, I will thank my past self more when I reap the benefits of a sacrifice of their immediate utility.

E.g. when I'm stuck in bed because the activation energy to leave is too high, and then I overcome that, go for a run, and feel a lot more energized, I'll look back and say "Thanks, 7 am Morphism!"

(I already do this sometimes, but I will now make a TAP out of it, which will probably cause me to do it more often.)

Then I will make a full post describing in detail what I did and what (if anything) changed about my ability to sacrifice short-term gains for greater long-term gains, along with plausible theories w/ probabilities on the causal connection (or lack thereof), as well as a list of potential confounders.

Of course, it is possible that I completely fail to even install the TAP. I don't think that's very likely, because I'm #1-prioritizing my own emotional well-being right now (I'll shift focus back onto my world-saving pursuits once I'm more stably not depressed). In that case, I will not write a full post, because the experiment would not even have been done. I will instead just make a comment on this shortform to that effect.

I'm subscribing to replies and rooting for you!

Contrary to what the current wiki page says, Simulacrum levels 3 and 4 are not just about ingroup signalling. See these posts and more, as well as Baudrillard's original work if you're willing to read dense philosophy.

Here is an example where levels 3 and 4 don't relate to ingroups at all, which I think may be more illuminating than the classic "lion across the river" example:

Alice asks "Does this dress make me look fat?" Bob says "No."

Depending on the simulacrum level of Bob's reply, he means:

  1. "I believe that the dress does not make you look fat."
  2. "I want you to believe that the dress does not make you look fat, probably because I want you to feel good about yourself."
  3. "Niether you nor I are autistic truth-obsessed rationalists, and therefore I recognize that you did not ask me this question out of curiosity as to whether or not the dress makes you look fat. Instead, due to frequent use of simulacrum level 2 to respond to these sorts of queries in the past, a new social equilibrium has formed where this question and its answer are detached from object-level truth, instead serving as a signal that I care about your feelings. I do care about your feelings, so I play my part in the signalling ritual and answer 'No.'"
  4. "Similar to 3, except I'm a sociopath and don't necessarily actually care about your feelings. Instead, I answer 'No' because I want you to believe that I care about your feelings."

Here are some potentially better definitions, of which the group association definitions are a clear special case:

  1. Communication of object-level truth.

  2. Optimization over the listener's belief that the speaker is communicating on simulacrum level 1, i.e. desire to make the listener believe what the speaker says.

These are the standard old definitions. The transition from 1 to 2 is pretty straightforward. When I use 2, I want you to believe I'm using 1. This is not necessarily lying. It is more like Frankfurt's bullshit: I care about the effects of this belief on the listener, regardless of its underlying truth value. This is often (naively considered) prosocial; see this post for some examples.

Now, the transition from 2 to 3 is a bit tricky. Level 3 is a result of a social equilibrium that emerges after communication in that domain gets flooded by prosocial level 2. Eventually, everyone learns that these statements are not about object-level reality, so communication on levels 1 and 2 becomes futile. Instead, we have:

  3. Signalling of some trait or bid associated with historical use of simulacrum level 2.

E.g. that Bob cares about Alice's feelings, in the case of the dress, or that I'm with the cool kids that don't cross the river, in the case of the lion. Another example: bids to hunt stag.

3 to 4 is analogous to 1 to 2.

  4. Optimization over the listener's belief that the speaker is communicating on simulacrum level 3, i.e. desire to make the listener believe that the speaker has the trait signalled by simulacrum level 3 communication (i.e. the trait that was historically associated with prosocial level 2 communication).

Like with the jump from 1 to 2, the jump from 3 to 4 has the quality of bullshit, not necessarily lies. Speaker intent matters here.

If you're thinking without writing, you only think you're thinking.

-Leslie Lamport

This seems... straightforwardly false. People think in various different modalities. Translating that modality into words is not always trivial. Even if by "writing" Lamport means any form of recording thoughts, this still seems false. Oftentimes, an idea incubates in my head for months before I find a good way to represent it as words or math or pictures or anything else.

Also, writing and thinking are separate (albeit closely related) skills, especially when you take "writing" to mean writing for an audience, so the thesis of this Paul Graham post is also false. I've been thinking reasonably well for about 16 years, and only recently have I started gaining much of an ability to write.

Are Lamport and Graham just wordcels committing the typical mind fallacy, or is there more to this that I'm not seeing? What's the steelman of this claim that good thinking == good writing?

cqb:

I'm not really sure if I'm talking past you in this or not, but I wrote it all out already so I'm going to post it.

I think I found the context of the quote. I'm reasonably certain it's not meant to be taken literally. It illustrates that, when used skillfully, writing can enhance one's thinking in such a way that it outstrips the performance of thought without the assistance of writing.

You have to think before you write, and then you have to read what you wrote and think about it. And you have to keep rewriting, re-reading and thinking, until it’s as good as you can make it, even when writing an email or a text.

You're right that you can pretty clearly practice thinking without the assistance of writing, but writing gives you the constraint of having to form your thoughts into concise and communicable language, which pure thinking doesn't provide. Pure thought only needs to be legible to yourself, and repeating the same thought over and over with zero iteration isn't naturally penalized by the format.

... revising shouldn’t be the art of modifying the presentation of an idea to be more convincing. It should be the art of changing the idea itself to be closer to the truth, which will automatically make it more convincing.


Often times, an idea incubates in my head for months before I find a good way to represent it as words or math or pictures or anything else.

This points to a pretty valuable insight. A thought isn't always ready to be rigorously iterated upon. And rigorous iteration is what writing is both a good tool and a good training method for. You can use pure thought for rigorous iteration, but writing provides an advantage that our brains alone can't.

Writing gives us an expansion of working memory. I think this is the most significant thing writing does to enhance thought. Objects in our working memory last only 2-30 seconds, and we can keep only 5-9 unrelated objects in working memory at a time. This seems quite limited. With writing, we can dump them onto the page and recall them as needed.

Graham's claim that people who aren't writing aren't thinking is clearly false. People were thinking long before writing existed. But I do think writing is at least a good tool for significantly improving our thought processes. The words of Evan Chen sum it up better than I can:

The main purpose of writing is not in fact communication, at least not if you’re interested in thinking well. Rather, the benefits (at least the ones I perceive) are

  • Writing serves as an external memory, letting you see all your ideas and their connections at once, rather than trying to keep them in your head.
  • Explaining the ideas forces you to think well about them, the same way that teaching something is only possible with a full understanding of the concept.
  • Writing is a way to move closer to the truth, rather than to convince someone what the truth is.

Formalizing Placebomancy

I propose the following desideratum for self-referential doxastic modal agents (agents that can think about their own beliefs), where □P represents "I believe P", (W | P) represents the agent's world model conditional on P, and ≻ is the agent's preference relation:

Positive Placebomancy: For any proposition P, the agent concludes P from □P → P, if (W | P) ≻ (W | ¬P).

In natural English: the agent believes that hyperstitions which benefit it if true are true.

"The placebo effect works on me when I want it to".

A real-life example: In this sequence post, Eliezer Yudkowsky advocates using positive placebomancy on "I cannot self-deceive".

I would also like to formalize a notion of "negative placebomancy" (doesn't believe hyperstitions that don't benefit it), "total placebomancy" (believes hyperstitions iff they are beneficial), "group placebomancy" (believes group hyperstitions that are good for everyone in the group, conditional on all other group members having group placebomancy or similar), and generalizations to probabilistic self-referential agents (like "ideal fixed-point selection" for logical inductor agents).
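Tentatively, the first two of these might be written in the same notation as rough sketches (to be pinned down properly in the full post):

```latex
% Rough sketches in the same notation (requires amsmath and amssymb).

% Negative Placebomancy: the agent does not adopt a hyperstition that does not
% benefit it.
\[
\text{If } \Box P \rightarrow P \text{ and } (W \mid P) \nsucc (W \mid \neg P),
\text{ then the agent does not conclude } P.
\]

% Total Placebomancy: given a hyperstition, the agent concludes P exactly when
% the P-world is preferred.
\[
\text{Given } \Box P \rightarrow P:\quad
\text{the agent concludes } P \iff (W \mid P) \succ (W \mid \neg P).
\]
```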

I will likely cover all of these in a future top-level post, but I wanted to get this idea out into the open now because I keep finding myself wanting to reference it in conversation.

Edit log:

  • 2024-12-08 rephrased the criterion to be an inference rule rather than an implication. Also made a minor grammar edit.

Can you clarify the Positive Placebomancy axiom?

Does it bracket as:

For any proposition P, The agent concludes P from (□P → P if (W | A) ≻ (W | ¬A)) .

or as:

For any proposition P, (The agent concludes P from □P → P) if (W | A) ≻ (W | ¬ A) .

And what is the relationship between P and A? Should A be P?

Oops, that was a typo. Fixed now, and added a comma to clarify that I mean the latter.

Edit: There are actually many ambiguities with the use of these words. This post is about one specific ambiguity that I think is often overlooked or forgotten.

The word "preference" is overloaded (and so are related words like "want"). It can refer to one of two things:

  • How you want the world to be, i.e. your terminal values, e.g. "I prefer worlds in which people don't needlessly suffer."
  • What makes you happy, e.g. "I prefer my ice cream in a waffle cone."

I'm not sure how we should distinguish these. So far, my best idea is to call the former "global preferences" and the latter "local preferences", but that clashes with the pre-existing notion of locality of preferences as the quality of terminally caring more about people/objects closer to you in spacetime. Does anyone have a better name for this distinction?

I think we definitely need to distinguish them, however, because they often disagree, and most "values disagreements" between people are just disagreements in local preferences, and so could be resolved by considering global preferences.

I may write a longpost at some point on the nuances of local/global preference aggregation.

Example: Two alignment researchers, Alice and Bob, both want access to a limited supply of compute. The rest of this example is left as an exercise.

I think you are missing an even more confusing meaning: "preference" can also mean what you actually choose.

In the VNM axioms, "agent prefers A to B" literally means "agent chooses A over B". It's confusing because when we talk about human preferences we usually mean mental states, not their behavioral expressions.

This is indeed a meaningful distinction! I'd phrase it as:

  • Values about what the entire cosmos should be like
  • Values about what kind of places one wants one's (future) selves to inhabit (eg, in an internet-like upload-utopia, "what servers does one want to hang out on")

"Global" and "local" is not the worst nomenclature. Maybe "global" vs "personal" values? I dunno.

my best idea is to call the former "global preferences" and the latter "local preferences", but that clashes with the pre-existing notion of locality of preferences as the quality of terminally caring more about people/objects closer to you in spacetime

I mean, it's not unrelated! One can view a utility function with both kinds of values as a combination of two utility functions: the part that only cares about the state of the entire cosmos and the part that only cares about what's around them (see also "locally-caring agents").

(One might be tempted to say "consequentialist" vs "experiential", but I don't think that's right — one can still value contact with reality in their personal/local values.)

There are lots of different dimensions on which these vary. I'd note that one is purely imaginary (no human has actually experienced anything like that) while the second is a prediction strongly based on past experience. One is far-mode (non-specific in experience, scope, or timeframe) and the other near-mode (specific, with well-understood steps to achieve it).

Does using the word "values" not sufficiently distinguish from "preferences" for you?

JBlack:

The second type of preference seems to apply to anticipated perceptions of the world by the agent - such as the anticipated perception of eating ice cream in a waffle cone. It doesn't have to be so immediately direct, since it could also apply to instrumental goals such as doing something unpleasant now for expected improved experiences later.

The first seems to be more like a "principle" than a preference, in that the agent is judging outcomes on the principle of whether needless suffering exists in them, regardless of whether that suffering has any effect on the agent at all.

To distinguish them, we could imagine a thought experiment in which such a person could choose to accept or deny some ongoing benefit for themselves that causes needless suffering on some distant world, and they will have their memory of the decision and any psychological consequences of it immediately negated regardless of which they chose.

It's even worse than that. Maybe I would be happier with my ice cream in a waffle cone the next time I have ice cream, but actually this is just a specific expression of being happier eating a variety of tasty things over time and it's just that I haven't had ice cream in a waffle cone for a while. The time after that, I will likely "prefer" something else despite my underlying preferences not having changed. Or something even more complex and interrelated with various parts of history and internal state.

It may be better to distinguish between instances of "preferences" that are specific to a given internal state and history, and an agent's general mapping over all internal states and histories.