All of AnnaSalamon's Comments + Replies

I got to the suggestion by imagining: suppose you were about to quit the project and do nothing.  And now suppose that instead of that, you were about to take a small amount of relatively inexpensive-to-you actions, and then quit the project and do nothing.  What're the "relatively inexpensive-to-you actions" that would most help?

Publishing the whole list, without precise addresses or allegations, seems plausible to me.

I guess my hope is: maybe someone else (a news story, a set of friends, something) would help some of those on the list to take i... (read more)

5Chris Monteiro
To identify a person internationally, a name isn't enough; you must also supply an address or social media links. I've performed medium-level OSINT on most people, so I annotate a fair bit of extra info internally. I do have tentative plans to publish in a highly redacted format, such as: 'A <seriousness level> plot where someone <did/didn't pay> to <kill/beat/harm> a <number of persons> of <genders> in <city/state/location + country>, who appears to be <relationship-details> and <any other key details>, who can be found via <address only/social media>, which <has/hasn't> been reported to <le agency/media/other>'. Honestly, it's depressing reading through the cases to the level where I can write this up. I have the payer status, crime type, location, address, report details and social info mostly normalised, but I would have to parse all cases again to create this. To my initial point, this is harmful to me. (And anyone.)
6Dagon
This was my thinking as well.  On further reflection, and based on OP's response, I realize there IS a balance that's unclear.  The list contains some false-positives.  This is very likely just by the nature of things - some are trolls, some are pure fantasy, some will have moved on, and only a very few are real threats.   So the harm of making a public, anonymous, accusation and warning is definitely nonzero - it escalates tension for a situation that has passed.  The harm of failing to do so in the real cases is also nonzero, but I expect many of the putative victims know they have a stalker or deranged enemy who'd wish them dead, and the information is "just" that this particular avenue has been explored.   That balance is difficult.  I philosophically lean toward "open is better than secret, and neither is as good as organized curation and controlled disclosure".  Since there's no clear interest by authorities, I'd publish.  And probably I'd do so anonymously as I don't want the hassle of having potential murderers know about me.

Maybe some of those who received the messages were more alert to their surroundings after receiving it, even if they weren't sure it was real and didn't return the phone/email/messages?

I admit this sounds like a terrible situation.

3Chris Monteiro
Maybe, but I have been contacted by people who have received the scam email before. It's true that some people reached out AFTER the podcast aired, only then taking it seriously. I am partially able to leverage it for credibility also. Ultimately, significant effort is required to contact people, and then furthermore to provide full context and a risk assessment, after which they typically require support taking the issue through multiple law enforcement agencies; that is, if they don't turn to violence themselves, which has happened on at least one occasion :(

Gotcha.  No idea if this is a good or bad idea, but: what are your thoughts on dumping an edited version of it onto the internet, including names, photos and/or social media links, and city/country but not precise addresses or allegations?

3Chris Monteiro
But how are people supposed to react to such framing? Also, some orders are limited to just a name / address etc., whereas others plot graphic torture for weeks and months.

Can you notify the intended victims?  Or at least the more findable intended victims?

4Chris Monteiro
Not really. Journalists spent months doing this via phone, email and messages and were ignored. Also, there are literal 'I am a hitman hired to kill you, pay me money to stop' scams that exist.
  • A man being deeply respected and lauded by his fellow men, in a clearly authentic and lasting way, seems to be a big female turn-on. Way way way bigger effect size than physique best as I can tell.
    • …but the symmetric thing is not true! Women cheering on one of their own doesn't seem to make men want her more. (Maybe something else is analogous, the way female "weight lifting" is beautification?)

My guess at the analogous thing: women being kind/generous/loving seems to me like a thing many men have found attractive across times and cultures, and seems to me ... (read more)

Steven Byrnes wrote:

 "For example, I expect that AGIs will be able to self-modify in ways that are difficult for humans (e.g. there’s no magic-bullet super-Adderall for humans), which impacts the likelihood of your (1a)."

My (1a) (and related (1b)), for reference:

(1a) “You” (the decision-maker process we are modeling) can choose anything you like, without risk of losing control of your hardware.  (Contrast case: if the ruler of a country chooses unpopular policies, they are sometimes ousted.  If a human chooses dieting/unrewarding problems/s

... (read more)

I just paraphrased the OP for a friend who said he couldn't decipher it.  He said it helped, so I'm copy-pasting here in case it clarifies for others.

I'm trying to say:

A) There're a lot of "theorems" showing that a thing is what agents will converge on, or something, that involve approximations ("assume a frictionless plane") that aren't quite true.

B) The "VNM utility theorem" is one such theorem, and involves some approximations that aren't quite true.  So does e.g. Steve Omohundro's convergent instrumental drives, the "Gandhi folk theorems" sho... (read more)

There is a problem that, other things equal, agents that care about the state of the world in the distant future, to the exclusion of everything else, will outcompete agents that lack that property. This is self-evident, because we can operationalize “outcompete” as “have more effect on the state of the world in the distant future”.

I am not sure about that!

One way this argument could fail: maybe agents who  care exclusively about the state of the world in the distant future end up, as part of their optimizing, creating other agents who care ... (read more)

2Steven Byrnes
To complement that list, Superintelligence chapter 7 lists four types of “situations in which an agent can best fulfill its final goals by intentionally changing them” (which is pretty similar to your “creating other agents who care in different ways from that”):
* “social signaling” & “social preferences”—basically, maybe there are other powerful agents around who possess some mind-reading capability, including your (1c)
* “preferences concerning own goal content” (“for example, the agent might have a final goal to become the type of agent that is motivated by certain values rather than others (such as compassion rather than comfort)”)
* “storage [or processing] costs”, which we should probably broaden to ‘practical considerations about the algorithm actually working well in practice’, and then it would probably include your mathematician example and your (1a, 1b, 2, 4).

Your (3) would be kinda “maybe there was never a so-called ‘final goal’ in the first place”, which is a bit related to the second bullet point, or maybe we should just say that Bostrom overlooks it. (Or maybe he talks about it somewhere else in the book? I forget.)

I’d guess that the third bullet point is less likely to be applicable to powerful AGIs, than to humans. For example, I expect that AGIs will be able to self-modify in ways that are difficult for humans (e.g. there’s no magic-bullet super-Adderall for humans), which impacts the likelihood of your (1a).

or more centrally, long after I finish the course of action.

I don't understand why the more central thing is "long after I finish the course of action" as opposed to "in ways that are clearly 'external to' the process called 'me', that I used to take the actions."

4Steven Byrnes
Hmm, yeah that too. What I had in mind was the idea that “consequentialist” usually has a connotation of “long-term consequentialist”, e.g. taking multiple actions over time that consistently lead to something happening. For example: * Instrumental convergence doesn’t bite very hard if your goals are 15 seconds in the future. * If an AI acts to maximize long-term paperclips at 4:30pm, and to minimize long-term paperclips at 4:31pm, and to maximize them at 4:32pm, etc., and to minimize them at 4:33pm, etc., then we wouldn’t intuitively think of that AI as a consequentialist rational agent, even if it is technically a consequentialist rational agent at each moment.

I was trying to explain to Habryka why I thought (1), (3) and (4) are parts of the assumptions under which the VNM utility theorem is derived.

I think all of (1), (2), (3) and (4) are part of the context I've usually pictured in understanding VNM as having real-world application, at least.  And they're part of this context because I've been wanting to think of a mind as having persistence, and persistent preferences, and persistent (though rationally updated) beliefs about what lotteries of outcomes can be chosen via particular physical actions, and st... (read more)

I... don't think I'm taking the hidden order of the universe non-seriously.  If it matters, I've been obsessively rereading Christopher Alexander's "The nature of order" books, and trying to find ways to express some of what he's looking at in LW-friendly terms; this post is part of an attempt at that.  I have thousands and thousands of words of discarded drafts about it.

Re: why I think there might be room in the universe for multiple aspirational models of agency, each of which can be self-propagating for a time, in some contexts: Biology and cu... (read more)

2plex
Link is broken

I agree.  I love "Notes on the synthesis of form" by Christopher Alexander, as a math model of things near your vase example.

I agree with your claim that VNM is in some ways too lax.

vNM is .. too restrictive ... [because] vNM requires you to be risk-neutral. Risk aversion violates preferences being linear in probability ... Many people desperately want risk aversion, but that's not the vNM way.

Do many people desperately want to be risk averse about the probability a given outcome will be achieved?  I agree many people want to be loss averse about e.g. how many dollars they will have.  Scott Garrabrant provides an example in which a couple wishes to be fair to its membe... (read more)

5niplav
I'm… pretty sure that something like the certainty effect is really important to people, and I'd count that as a type of risk aversion. Often that takes the form of violating continuity and lexically preferring options with certainty over lotteries with non-{0, 1} probabilities. The issue may also partially lie with Bayesianism, where you can never update to (or away from) certainty that you actually have got The Good Thing, Here (or avoided That Bad Thing since it's definitely Not Here). And that can also connect to some of the lack of green in optimizers, because they can never be sure that they have actually got The Good Thing (being certain that at least one paperclip is right here, for real, at least for now). Instead they strive to update ever closer to that certainty, gaining ever more marginal utility since that's marginally more valuable under vNM. Humans and animals, on the other hand, have a mode where they sometimes either round the probability up to 1 (or down to 0) or act as if there is no marginally increasing utility from increasing the probability of Good Thing. So (I think) that they by default perform mild optimization.
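To make the certainty effect concrete, here is a minimal sketch (the Prelec-style weighting function and the gamma value are illustrative assumptions, not anything from niplav's comment) contrasting a valuation that is linear in probability with one that treats the last step to certainty as disproportionately valuable:

```python
import math

def vnm_value(p, utility):
    """Expected utility: linear in the probability p of getting the outcome."""
    return p * utility

def weighted_value(p, utility, gamma=0.6):
    """Prelec-style probability weighting; gamma=0.6 is an illustrative choice.
    Near-certain probabilities get under-weighted, so the final step to p=1
    feels disproportionately valuable (the certainty effect)."""
    if p in (0.0, 1.0):
        w = p
    else:
        w = math.exp(-((-math.log(p)) ** gamma))
    return w * utility

for p in (0.5, 0.9, 0.99, 1.0):
    print(p, vnm_value(p, 100), round(weighted_value(p, 100), 1))
# Under the linear (VNM) valuation, going from 0.99 to 1.0 is worth the same
# as any other 0.01 of probability. Under the weighted valuation, that last
# step to certainty is worth several times more, which is the certainty
# effect in spirit (though not a strictly lexical preference for certainty).
```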

The VNM axioms refer to an "agent" who has "preferences" over lotteries of outcomes.  It seems to me this is challenging to interpret if there isn't a persistent agent, with a persistent mind, who assigns Bayesian subjective probabilities to outcomes (which I'm assuming it has some ability to think about and care about, i.e. my (4)), and who chooses actions based on their preferences between lotteries.  That is, it seems to me the axioms rely on there being a mind that is certain kinds of persistent/unaffected.

Do you (habryka) mean there's a new ... (read more)

4Mateusz Bagiński
Right but if there isn't a persistent agent with a persistent mind, then we no longer have an entity to which predicates of rationality apply (at least in the sense that the term "rationality" is usually understood in this community). Talking about it in terms of "it's no longer vNM-rational" feels like saying "it's no longer wet" when you change the subject of discussion from physical bodies to abstract mathematical structures. Or am I misunderstanding you?

The standard dutch-book arguments seem like pretty good reason to be VNM-rational in the relevant sense.

I mean, there are arguments about as solid as the “VNM utility theorem” pointing to CDT, but CDT is nevertheless not always the thing to aspire to, because CDT is based on an assumption/approximation that is not always a good-enough approximation (namely, CDT assumes our minds have no effects except via our actions, eg it assumes our minds have no direct effects on others’ predictions about us).

Some assumptions the VNM utility theorem is based on, that I... (read more)
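For reference, the money-pump at the core of the standard Dutch-book argument habryka mentions; a minimal sketch, with the preference cycle and the fee as illustrative assumptions:

```python
# An agent with cyclic preferences A > B > C > A pays a small fee for each
# "upgrade" and ends up holding what it started with, strictly poorer.

fee = 0.01
prefers = {("A", "B"), ("B", "C"), ("C", "A")}  # (x, y) means x is preferred to y

def accepts_trade(offered, held):
    return (offered, held) in prefers

holding, money = "C", 10.0
for offered in ["B", "A", "C", "B", "A", "C"]:  # the bookie cycles its offers
    if accepts_trade(offered, holding):
        holding, money = offered, money - fee

print(holding, round(money, 2))  # "C" again, but 0.06 poorer
```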

8habryka
(I don't think any of these, except 2, are things that the VNM axioms rely on. The rest seem totally compatible to me. I agree that 2 is interesting, and I've liked Scott Garrabrant's exploration of the stuff)

Seems helpful for understanding how believing-ins get formed by groups, sometimes.

"Global evaluation" isn't exactly what I'm trying to posit; more like a "things bottom-out in X currency" thing.

Like, in the toy model about $ from Atlas Shrugged, an heir who spends money foolishly eventually goes broke, and can no longer get others to follow their directions.  This isn't because the whole economy gets together to evaluate their projects.  It's because they spend their currency locally on things again and again, and the things they bet on do not pay off, do not give them new currency.

I think the analog happens in me/others: I'll get excited about some topic, pursue it for a while, get back nothing, and decide the generator of that excitement was boring after all.
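A minimal sketch of this toy model (all numbers and the staking rule are illustrative assumptions): internal parts as wealth-weighted bettors, with no global evaluation step anywhere:

```python
parts = {"curiosity_about_X": 10.0, "old_obsession_Y": 10.0}
payoff_rate = {"curiosity_about_X": 1.5, "old_obsession_Y": 0.0}  # return per unit staked

for _ in range(5):                       # five rounds of pursuing each interest
    for name in parts:
        stake = 0.5 * parts[name]        # each part risks half its currency
        parts[name] += stake * (payoff_rate[name] - 1.0)

print({k: round(v, 2) for k, v in parts.items()})
# The pursuit that pays off keeps gaining influence; the one that "gets back
# nothing" dwindles, i.e. "decide the generator of that excitement was boring
# after all", with no economy-wide evaluation required anywhere.
```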

2Matt Goldenberg
ah that makes sense

in my mind this isn't resources flowing to elsewhere, it's either:
1. An emotional learning update
2. A part of you that hasn't been getting what it wants speaking up.

Hmm.  Under your model, are there ways that parts gain/lose (steam/mindshare/something)?

7Viliam
There is an ACX article on "trapped priors", which in the Ayn Rand analogy would be... uhm, dunno. The idea is that a subagent can make a self-fulfilling prophecy like "if you do X, you will feel really bad". You use some willpower to make yourself do X, but the subagent keeps screaming at you "now you will feel bad! bad!! bad!!!" and the screaming ultimately makes you feel bad. Then the subagent says "I told you so" and collects the money. The business analogy could be betting on company internal prediction market, where some employees figure out that they can bet on their own work ending up bad, and then sabotage it and collect the money. And you can't fire them, because HR does not allow you to fire your "best" employees (where "best" is operationalized as "making excellent predictions on the internal prediction market").
2Matt Goldenberg
in my model that happens through local updates, rather than a global system

for instance, if i used my willpower to feel my social anxiety completely (instead of the usual strategy of suppression) while socializing, i might get some small or large reconsolidation updates to the social anxiety, such that that part thinks it's needed in less situations or not at all

alternatively, the part that has the strategy of going to socialize and feeling confident may gain some more internal evidence, so it wins the internal conflict slightly more (but the internal conflict is still there and causes a drain)

i think the sort of global evaluation you're talking about is pretty rare, though something like it can happen when someone e.g. reaches a deep state of love through meditation, and then is able to access lots of their unloved parts that are downstream TRYING to get to that love and suddenly a big shift happens to whole system simultaneously

(another type of global reevaluation can take place through reconsolidating deep internal organizing principles like fundamental ontological constraints or attachment style)

Does it feel to you as though your epistemic habits / self-trust / intellectual freedom and autonomy / self-honesty takes a hit here?

4jefftk
I think it's a pretty weak hit, though not zero. There are so many things I want to look into that I don't have time for that having this as another factor in my prioritization doesn't feel very limiting to my intellectual freedom. I do think it is good to have a range of people in society who are taking a range of approaches, though!

Fair point; I was assuming you had the capacity to lie/omit/deceive, and you're right that we often don't, at least not fully.

I still prefer my policy to the OPs, but I accept your argument that mine isn't a simple Pareto improvement.

Still:

  • I really don't like letting social forces put "don't think about X" flinches into my or my friends' heads; and the OPs policy seems to me like an instance of that;
  • Much less importantly: as an intelligent/self-reflective adult, you may be better at hiding info if you know what you're hiding, compared to if you have guesse
... (read more)
2jefftk
In many cases I have guesses, but because I just have vague impressions they're all very speculative. This is consistent with being able to say "I haven't looked into it" and "I really don't know", and because these are all areas where the truth is not decision relevant it's been easy to leave it at that. Perhaps people notice I have doubts, but at least in my social circles that's acceptable if not made explicit.

I don't see advantage to remaining agnostic, compared to:

1) Acquire all the private truth one can.

Plus:

2) Tell all the public truth one is willing to incur the costs of, with priority for telling public truths about what one would and wouldn't share (e.g. prioritizing to not pose as more truth-telling than one is).

--

The reason I prefer this policy to the OP's "don't seek truth on low-import highly-politicized matters" is that I fear not-seeking-truth begets bad habits.  Also I fear I may misunderstand how important things are if I allow politics to in... (read more)

[anonymous]146

One of the advantages to remaining agnostic comes from the same argument that users put forth in the comment sections on this very site way back in the age of the Sequences (I can look up the specific links if people really want me to, they were in response to the Doublethink Sequence) for why it's not necessarily instrumentally rational for limited beings like humans to actually believe in the Litany of Tarski: if you are in a precarious social situation, in which retaining status/support/friends/resources is contingent on you successfully signaling to yo... (read more)

6Ben Pace
Also, a norm of "allowing people to keep their beliefs private on subjects they feel a lot of pressure on" gives space for people to gather information personally without needing to worry about the pressures on them from their society.

Yes, this is a good point, relates to why I claimed at top that this is an oversimplified model.  I appreciate you using logic from my stated premises; helps things be falsifiable.

It seems to me:

  • Somehow people who are in good physical health wake up each day with a certain amount of restored willpower.  (This is inconsistent with the toy model in the OP, but is still my real / more-complicated model.)
  • Noticing spontaneously-interesting things can be done without willpower; but carefully noticing superficially-boring details and taking notes in hop
... (read more)
4Kaj_Sotala
This fits in with opportunity cost-centered and exploration-exploitation -based views of willpower. Excessive focus on any one task implies that you are probably hitting diminishing returns while accumulating opportunity costs for not doing anything else. It also implies that you are probably strongly in "exploit" mode and not doing much exploring. Under those models, accumulating mental fatigue acts to force some of your focus to go to tasks that feel more intrinsically enjoyable rather than duty-based, which tends to correlate with things like exploration and e.g. social resource-building. And your willpower gets reset during the night so that you could then go back to working on those high-opportunity cost exploit tasks again. I think those models fit together with yours.
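One way to make the opportunity-cost framing concrete; the functional forms and constants below are illustrative assumptions rather than anyone's actual model:

```python
import math

def value_of_day(duty_hours):
    duty_value = 10 * math.log(1 + duty_hours)   # diminishing returns on the grind
    fatigue_cost = 0.5 * duty_hours ** 1.5       # grows faster than linearly
    explore_value = 3 * (8 - duty_hours)         # what the remaining hours buy
    return duty_value + explore_value - fatigue_cost

best = max(range(9), key=value_of_day)
print(best, round(value_of_day(best), 1))        # interior optimum, not all 8 hours
# Willpower "running out" plays the role of the fatigue term here: it pushes
# the day toward a mixed allocation rather than pure exploitation of one task.
```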

Thanks for asking.  The toy model of “living money”, and the one about willpower/burnout, are meant to appeal to people who don’t necessarily put credibility in Rand; I’m trying to have the models speak for themselves; so you probably *are* in my target audience.  (I only mentioned Rand because it’s good to credit models’ originators when using their work.)

Re: what the payout is:

This model suggests what kind of thing an “ego with willpower” is — where it comes from, how it keeps in existence:

  • By way of analogy: a squirrel is a being who turns acor
... (read more)
5Ustice
Thanks for clarifying! Willpower is a tricky concept. I’ve suffered from depression at times, where getting out of bed felt like a huge exertion of emotional energy. Due to my tenuous control over my focus with ADHD, I often have to repeat in my head what I’m doing so I don’t forget in the middle of it. I’ve also put in 60-hour weeks writing code, both because I’ve had serious deadlines, but also because time disappeared as I got so wrapped up in it. I’ve stayed on healthy diets for years without problem, and had times where I slipped back to high sugar foods.

All of these are examples of what people refer to as willpower (or lack thereof). Most of them are from times in my life where I haven’t felt really in control. This is especially true regarding memory. It’s not uncommon for me to realize as I am putting my groceries away that I didn’t get the one item I really needed (and have to go back). That said, I’m pretty good at grit: I’m willing to put in the work, despite hardships and obstacles. I’m also good at leading by example. I’ll fight the good fight, when needed.

All of these different features of me and my brain are wrapped up in the concept of willpower. Each of them is a mixture of conscious and unconscious patterns of behavior (including cognitive). It’s this distinction that makes me look askance at the concept of willpower. It’s too wrapped up in moral judgement.

I wasn’t diagnosed with ADHD until after my son was. I lived with a lot of guilt and shame because I interpreted the things I struggled with as moral failings, because I just lacked the willpower. Then I saw how many people struggled with the same sorts of things I did. It was really weird learning that so many things I previously would have described as negative personality traits of mine turned out to be what happens when someone has this quirk in their brain that me and my son have. Now, I don’t carry that guilt. Now, I know that despite my best efforts, tools, and practices, there

I mean, I see why a party would want their members to perceive the other party's candidate as having a blind spot.  But I don't see why they'd be typically able to do this, given that the other party's candidate would rather not be perceived this way, the other party would rather their candidate not be perceived this way, and, naively, one might expect voters to wish not to be deluded.  It isn't enough to know there's an incentive in one direction; there's gotta be more like a net incentive across capacity-weighted players, or else an easier time creating appearance-of-blindspots vs creating visible-lack-of-blindspots, or something.  So, I'm somehow still not hearing a model that gives me this prediction.

2deepthoughtlife
To be pedantic, my model is pretty obvious, and clearly gives this prediction, so you can't really say that you don't see a model here, you just don't believe the model. Your model with  extra assumptions doesn't give this prediction, but the one I gave clearly does. You can't find a person this can't be done to because there is something obviously wrong with everyone? Things can be twisted easily enough. (Offense is stronger than defense here.) If you didn't find it, you just didn't look hard/creatively enough. Our intuitions against people tricking us aren't really suitable defense against sufficiently optimized searching. (Luckily, this is actually hard to do so it is pretty confined most of the time to major things like politics.) Also, very clearly, you don't actually have to convince all that many people for this to work! If even 20% of people really bought it, those people would probably vote and give you an utter landslide if the other side didn't do the same thing (which we know they do, just look at how divisive candidates obviously are!)

You raise a good point that Susan’s relationship to Tusan and Vusan is part of what keeps her opinions stuck/stable.

But I’m hopeful that if Susan tries to “put primary focal attention on where the scissors comes from, and how it is working to trick Susan and Robert at once”, this’ll help with her stuckness re: Tusan and Vusan.  Like, it’ll still be hard, but it’ll be less hard than “what if Robert is right” would be.

Reasons I’m hopeful:

I’m partly working from a toy model in which (Susan and Tusan and Vusan) and (Robert and Sobert and Tobert) all used ... (read more)

4TsviBT
I think you might have been responding to which I did say in a parenthetical, but I was mainly instead saying Your reasons for hope make sense. Anecdata: In my case it would be mostly a hope, not a memory. E.g. I don't remember a time when "I understand what you're saying, but..." was a credible statement... Maybe it never was? E.g. I don't remember a time when I would expect people to be sufficiently committed to computing "what would work for everyone to live together" that they kept doing so in political contexts.

I don't follow this model yet.  I see why, under this model, a party would want the opponent's candidate to enrage people / have a big blind spot (and how this would keep the extremes on their side engaged), but I don't see why this model would predict that they would want their own candidate to enrage people / have a big blind spot.

3deepthoughtlife
So, my model isn't about them making their candidate that way, it is the much more obvious political move... make your opponent as controversial as possible. There is something weird / off / wrong about your opponent's candidate, so find out things that could plausibly make the electorate think that, and push as hard as possible. I think they're good enough at it. Or, in other words, try to find the best scissors statements about your opponent, where 'best' is determined both in terms of not losing your own supporters, and in terms of losing your opponent possible supporters. This is often done as a psyop on your own side, to make them not understand why anyone could possibly support said person. That said, against the simplified explanation I presented in my initial comment, there is also the obvious fact I didn't mention that the parties themselves have a certain culture, and that culture will have blindspots which they don't select along, but the other party does. Since the selection optimizes hard for what the party can see, that makes the selected bad on that metric, and even pushes out the people that can see the issue making it even blinder.
4Nick_Tarleton
It sounds to me like the model is 'the candidate needs to have a (party-aligned) big blind spot in order to be acceptable to the extremists(/base)'. (Which is what you'd expect, if those voters are bucketing 'not-seeing A' with 'seeing B'.) (Riffing off from that: I expect there's also something like, Motive Ambiguity-style, 'the candidate needs to have some, familiar/legible(?), big blind spot, in order to be acceptable/non-triggering to people who are used to the dialectical conflict'.)

Thanks; I love this description of the primordial thing, had not noticed this this clearly/articulately before, it is helpful.

Re: why I'm hopeful about the available levers here: 

I'm hoping that, instead of Susan putting primary focal attention on Robert ("how can he vote this way, what is he thinking?"), Susan might be able to put primary focal attention on the process generating the scissors statements: "how is this thing trying to trick me and Robert, how does it work?"

A bit like how a person watching a commercial for sugary snacks, instead of putt... (read more)

4TsviBT
This is clarifying... Does it actually have much to do with Robert? Maybe it would be more helpful to talk with Tusan and Vusan, who are also A-blind, B-seeing, candidate Y supporters. They're the ones who would punish non-punishers of supporting candidate X / talking about A. (Which Susan would become, if she were talking to an A-seer without pushing back, let alone if she could see into her A-blindspot.) You could talk to Robert about how he's embedded in threats of punishment for non-punishment of supporting candidate Y / talking about B, but that seems more confusing? IDK.

Or: by seeing themselves, and a voter for the other side, as co-victims of an optical illusion, designed to trick each of them into being unable to find another's areas of true seeing.  And by working together to figure out how the illusion works, while seeing it as a common enemy.

But my specific hypothesis here is that the illusion works by misconstruing the other voter's "Robert can see a problem with candidate Y" as "Robert can't see the problem with candidate X", and that if you focus on trying to decode the first the illusion won't kick in as much. 

By parsing the other voter as "against X" rather than "for Y", and then inquiring into how they see X as worth being against, and why, while trying really hard to play taboo and avoid ontological buckets.


Huh.  Is your model that surpluses are all inevitably dissipated in some sort of waste/signaling cascade?  This seems wrong to me but also like it's onto something.

4tailcalled
And I guess I should say, I have a more sun-oriented and less competition-oriented view. A surplus (e.g. in energy from the sun or negentropy from the night) has a natural "shape" (e.g. trees or solar panels) that the surplus dissipates into. There is some flexibility in this shape that leaves room for choice, but a lot less than rationalists usually assume.
4tailcalled
Kind of. First, the big exception: If you manage to enforce global authoritarianism, you can stockpile surplus indefinitely, basically tiling the world with charged-up batteries. But what's the point of that? Secondly, "waste/signaling cascade" is kind of in the eye of the beholder. If a forest is standing in some region, is it wasting sunlight that could've been used on farming? Even in a very literal sense, you could say the answer is yes since the trees are competing in a zero-sum game for height. But without that competition, you wouldn't have "trees" at all, so calling it a waste is a value judgement that trees are worthless. (Which of course you are entitled to make, but this is clearly a disagreement with the people who like solarpunk.) But yeah, ultimately I'm kind of thinking of life as entropy maximization. The surplus has to be used for something, the question is what. If you've got nothing to use it for, then it makes sense for you to withdraw, but then it's not clear why to worry that other people are fighting over it.

I like your conjecture about Susan's concern about giving Robert steam.

I am hoping that if we decode the meme structure better, Susan could give herself and Robert steam re: "maybe I, Susan, am blind to some thing, B, that matters" without giving steam to "maybe A doesn't matter, maybe Robert doesn't have a blind spot there."  Like, maybe we can make a more specific "try having empathy right at this part" request that doesn't confuse things the same way.  Or maybe we can make a world where people who don't bother to try that look like schmucks who aren't memetically savvy, or something.  I think there might be room for something like this?

4Nathan Helm-Burger
I think Tsvi has a point that empathy towards a group that has some good-but-puzzling views and some clearly-bad views is a tricky thing to manage internally. I think an easier step in this direction is to approach the problem more analytically. This is why I feel such affection for the Intellectual Turing Test. You can undertake the challenge of fully understanding someone else's viewpoint without needing to emotionally commit to it. It can be a purely intellectual challenge. Sometimes, as I try to write an ITT for a view I feel is overall incorrect, I sneer at the view a bit in my head. I don't endorse that; I think ideally one approaches the exercise in an emotionally neutral way. Nevertheless, it is a much easier step from being strongly set against a view to trying to tackle the challenge of fully understanding it. Going the further step to empathy for (parts of) the other view is a much harder step to take.

IIUC, I agree with your vision being desirable. (And, IDK, it's sort of plausible that you can basically do it with a good toolbox that could be developed straightforwardly-ish.)

But there might be a gnarly, fundamental-ish "levers problem" here:

  • It's often hard to do [the sort of empathy whereby you see into your blindspot that they can see]
  • without also doing [the sort of empathy that leads to you adopting some of their values, or even blindspots].

(A levers problem is analogous to a buckets problem, but with actions instead of beliefs. You have an avai... (read more)

If we can get good enough models of however the scissors-statements actually work, we might be able to help more people be more in touch with the common humanity of both halves of the country, and more able to heal blind spots.

E.g., if the above model is right, maybe we could tell at least some people "try exploring the hypothesis that Y-voters are not so much in favor of Y, as against X -- and that you're right about the problems with Y, but they might be able to see something that you and almost everyone you talk to is systematically blinded to about X."... (read more)

I think this idea is worth exploring. The first bit seems pretty easy to convey and get people to listen to:

"try exploring the hypothesis that Y-voters are not so much in favor of Y, as against X -- and that you're right about the problems with Y...

But the second bit

... but they might be able to see something that you and almost everyone you talk to is systematically blinded to about X."

sounds like a very bitter pill to swallow, and therefore hard to get people to listen to.

I think motivated reasoning effects turn our attention quickly away from ideas we t... (read more)

I don't know the answer, but it would be fun to have a twitter comment with a zillion likes asking Sam Altman this question.  Maybe someone should make one?

6Arjun Panickssery
https://x.com/panickssery/status/1792586407623393435

I've bookmarked this; thank you; I expect to get use from this list.

Resonating from some of the OP:

Sometimes people think I have a “utility function” that is small and is basically “inside me,” and that I also have a set of beliefs/predictions/anticipations that is large, richly informed by experience, and basically a pointer to stuff outside of me.

I don’t see a good justification for this asymmetry.

Having lived many years, I have accumulated a good many beliefs/predictions/anticipations about outside events: I believe I’m sitting at a desk, that Biden is president, that 2+3=5, and so on and so on.  These beliefs came... (read more)

And this requires what I've previously called "living from the inside," and "looking out of your own eyes," instead of only from above. In that mode, your soul is, indeed, its own first principle; what Thomas Nagel calls the "Last Word." Not the seen-through, but the seer (even if also: the seen).

 

I like this passage! It seems to me that sometimes I (perceive/reason/act) from within my own skin and perspective: "what do I want now? what's most relevant? what do I know, how do I know it, what does it feel like, why do I care? what even am I, this proce... (read more)

I love that book!  I like Robin's essays, too, but the book was much easier for me to understand.  I wish more people would read it, would review it on here, etc.

3xpym
That's probably Kevin's touch. Robin has this almost inhuman detachment, which on the one hand allows him to see things most others don't, but on the other makes communicating them hard, whereas Kevin managed to translate those insights into engaging humanese. Any prospective "rationality" training has to comprehensively grapple with the issues raised there, and as far as I can tell, they don't usually take center stage in the publicized agendas.

A related tweet by Qiaochu:

(I don't necessarily agree with QC's interpretation of what was going on as people talked about "agency" -- I empathize some, but empathize also with e.g. Kaj's comment in a reply that Kaj doesn't recognize this at all from Kaj's 2018 CFAR mentorship training, and did not find pressures there to coerce particular kinds of thinking).

My point in quoting this is more like: if people don't have much wanting of their own, and are immersed in an ambient culture that has opinions on what they should "want," experiences such as QC's seem sorta l... (read more)

8Elizabeth
I'm not sure if this is a disagreement or supporting evidence, but: I remember you saying you didn't want to teach SPARC kids too much [word similar to agency but not quite that. Maybe good at executing plans?], because they'd just use it to [coerce] themselves more. This was definitely before covid, maybe as far back as 2015 or 2016. I'm almost certain it was before QC even joined CFAR. It was a helpful moment for me. 

Some partial responses (speaking only for myself):

1.  If humans are mostly a kludge of impulses, including the humans you are training, then... what exactly are you hoping to empower using "rationality training"?  I mean, what wants-or-whatever will they act on after your training?  What about your "rationality training" will lead them to take actions as though they want things?  What will the results be?

1b.  To illustrate what I mean: once I taught a rationality technique to SPARC high schoolers (probably the first year of SPARC, ... (read more)

4Wei Dai
To give a straight answer to this, if I was doing rationality training (if I was agenty enough to do something like that), I'd have the goal that the trainees finish the training with the realization that they don't know what they want or don't currently want anything, but they may eventually figure out what they want or want something, and therefore in the interim they should accumulate resources/optionality, avoid doing harm (things that eventually might be considered irreversibly harmful), and push towards eventually figuring out what they want. And I'd probably also teach a bunch of things to mitigate the risk that the trainees too easily convince themselves that they've figured out what they want.
4TekhneMakre
It makes sense, but I think it's missing that adults who try to want in the current social world get triggered and/or traumatized as fuck because everyone else is behaving the way you describe.
9AnnaSalamon
A related tweet by Qiaochu: (I don't necessarily agree with QC's interpretation of what was going on as people talked about "agency" -- I empathize some, but empathize also with e.g. Kaj's comment in a reply that Kaj doesn't recognize this at all from Kaj's 2018 CFAR mentorship training, and did not find pressures there to coerce particular kinds of thinking). My point in quoting this is more like: if people don't have much wanting of their own, and are immersed in an ambient culture that has opinions on what they should "want," experiences such as QC's seem sorta like the thing to expect.  Which is at least a bit corroborated by QC reporting it.

I'm trying to build my own art of rationality training, and I've started talking to various CFAR instructors about their experiences – things that might be important for me to know but which hadn't been written up nicely before.

Perhaps off topic here, but I want to make sure you have my biggest update if you're gonna try to build your own art of rationality training.

It is, basically: if you want actual good to result from your efforts, it is crucial to build from and enable consciousness and caring, rather than to try to mimic their functionality.

If you're... (read more)

I'm not Critch, but to speak my own defense of the numeracy/scope sensitivity point:

IMO, one of the hallmarks of a conscious process is that it can take different actions in different circumstances (in a useful fashion), rather than simply doing things the way that process does it (following its own habits, personality, etc.).  ("When the facts change, I change my mind [and actions]; what do you do, sir?")

Numeracy / scope sensitivity is involved in, and maybe required for, the ability to do this deeply (to change actions all the way up to one's entire... (read more)

I am pretty far from having fully solved this problem myself, but I think I'm better at this than most people, so I'll offer my thoughts.

My suggestion is to not attempt to "figure out goals and what to want," but to "figure out blockers that are making it hard to have things to want, and solve those blockers, and wait to let things emerge."

Some things this can look like:

  1.  Critch's "boredom for healing from burnout" procedures.  Critch has some blog posts recommending boredom (and resting until quite bored) as a method for recovering one's ability
... (read more)

Okay, maybe?  But I've also often been "real into that" in the sense that it resolves a dissonance in my ego-structure-or-something, or in the ego-structure-analog of CFAR or some other group-level structure I've been trying to defend, and I've been more into "so you don't get to claim I should do things differently" than into whether my so-called "goal" would work.  Cf "people don't seem to want things."

The specific operation that happened was applying OODA loops to the concept of OODA loops.

I love this!

Surprise 4: How much people didn't seem to want things

And, the degree to which people wanted things was even more incoherent than I thought. I thought people wanted things but didn't know how to pursue them. 

[I think Critch trailed off here, but implication seemed to be "basically people just didn't want things in the first place"]

 

I concur.  From my current POV, this is the key observation that should've, and should still, instigate a basic attempt to model what humans actually are and what is actually up in today's humans.  It's too b... (read more)

3xpym
What do people here think about Robin Hanson's view, for example as elaborated by him and Kevin Simler in the book Elephant in the Brain? I've seen surprisingly few mentions/discussions of this over the years in the LW-adjacent sphere, despite Hanson being an important forerunner of the modern rationalist movement. One of his main theses, that humans are strategic self-deceivers, seems particularly important (in the "big if true" way), yet downplayed/obscure.
4Raemon
On one hand, when you say it like that, it does seem pretty significant. I'm not sure I think there's that much confusion to explain? Like, my mainline story here is:
1. Humans are mostly a kludge of impulses which vary in terms of how coherent / agentic they are. Most of them have wants that are fairly basic, and don't lend themselves super well to strategic thinking. (I think most of them also consider strategic thinking sort of uncomfortable/painful.) This isn't that weird, because, like, having any kind of agency at all is an anomaly. Most animals have only limited agency and wanting-ness.
2. There's some selection effect where the people who might want to start Rationality Orgs are more agentic, have more complex goals, and find deliberate thinking about their wants and goals more natural/fun/rewarding.
3. The "confusion" is mostly a typical mind error on the part of people like us, and if you look at evolution the state of most humans isn't actually that weird or surprising.

Perhaps something I'm missing or confused about is what exactly Critch (or you, if applicable?) means by "people don't seem to want things." I maybe am surprised that the filtering effect of people who showed up at CFAR workshops or similar still didn't want things. Can you say a bit more about what you've experienced, and what felt surprising or confusing about it?

I'm curious to hear how you arrived at the conclusion that a belief is a prediction. 

I got this in part from Eliezer's post Make your beliefs pay rent in anticipated experiences.  IMO, this premise (that beliefs should try to be predictions, and should try to be accurate predictions) is one of the cornerstones that LessWrong has been based on.

2jeffreycaruso
I just read the post that you linked to. He used the word "prediction" one time in the entire post so I'm having trouble understanding how that was meant to be an answer to my question. Same with that it's a cornerstone of LessWrong, which, for me, is like asking a Christian why they believe in God, and they answer, because the Bible tells me so. Is a belief a prediction?  If yes, and a prediction is an act of forecasting, then there must be a way to know if your prediction was correct or incorrect. Therefore, maybe one requirement for a belief is that it's testable, which would eliminate all of our beliefs in things unseen.  Maybe there are too many meanings assigned to just that one word - belief. Perhaps instead of it being a verb, it should be a preposition attached to a noun; i.e., a religious belief, a financial belief, etc. Then I could see a class of beliefs that were predictive versus a different class of beliefs that were matters of faith.

I love this post.  (Somehow only just read it.)

My fav part: 
>  In the context of quantilization, we apply limited steam to projects to protect ourselves from Goodhart. "Full steam" is classically rational, but we do not always want that. We might even conjecture that we never want that. 

To elaborate a bit:

It seems to me that when I let projects pull me insofar as they pull me, and when I find a thing that is interesting enough that it naturally "gains steam" in my head, it somehow increases the extent to which I am locally immune fro... (read more)
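For readers who haven't met quantilization: a minimal sketch, where the action space, the proxy utility, and q are illustrative assumptions (and the base distribution is taken to be uniform, so quantilizing reduces to sampling uniformly from the top q-fraction):

```python
import random

def quantilize(actions, utility, q=0.1, rng=random):
    """Sample uniformly from the top q-quantile of actions by proxy utility,
    rather than taking the single argmax; one formalization of "limited steam"
    that bounds how hard the proxy gets Goodharted."""
    ranked = sorted(actions, key=utility, reverse=True)
    top = ranked[: max(1, int(q * len(ranked)))]
    return rng.choice(top)

actions = list(range(100))   # stand-in action space

def proxy(a):                # Goodhart-able proxy utility
    return a

print(quantilize(actions, proxy, q=0.1))   # some action from the top ten, not always 99
```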

Oh, man, yes, I hadn't seen that post before and it is an awesome post and concept.  I think maybe "believing in"s, and prediction-market-like structures of believing-ins, are my attempt to model how Steam gets allocated.

9niplav
Several disjointed thoughts, all exploratory.

I have the intuition that "believing in"s are how allocating Steam feels from the inside, or that they are the same thing. This makes sense if "believing in"s are useful for intra- and inter-agent coordination, which is the thing people accumulate to go on stag hunts together. Coordinating with your future self, in this framework, requires the same resource as coordinating with other agents similar to you along the relevant axes right now (or across time).

Steam might be thought of as a scalar quantity assigned to some action or plan, which changes depending on the actions being executed or not. Steam is necessarily distinct from probability or utility because if you start making predictions about your own future actions, your belief estimation process (assuming it has some influence on your actions) has a fixed point in predicting the action will not be carried out, and then intervening to prevent the action from being carried out. There is also another fixed point in which the agent is maximally confident it will do something, and then just does it, but it can't be persuaded not to do it.[1] As stated in the original post, steam helps solve the procrastination paradox.

I have the intuition that one can relate the (changes) in steam/utility/probability to each other. Assuming utility is high:
1. If actions/plans are performed, their steam increases
2. If actions/plans are not performed, steam decreases
3. If steam decreases slowly and actions/plans are executed, increase steam
4. If steam decreases quickly and actions/plans are not executed, decrease steam even more quickly(?)
5. If actions/plans are completed, reduce steam

If utility decreases a lot, steam only decreases a bit (hence things like sunk costs). Differential equations look particularly useful for talking more rigorously about this kind of thing. Steam might also be related to how cognition on a particular topic gets started; to avoid th
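A loose sketch of the enumerated dynamics above, with the rule shapes and constants as illustrative assumptions:

```python
def update_steam(steam, acted, utility, completed=False):
    if completed:
        return 0.2 * steam                # rule 5: completion releases most steam
    steam *= 1.2 if acted else 0.8        # rules 1-4: acting feeds steam, neglect drains it
    steam += 0.05 * utility               # utility moves steam only a little (sunk-cost stickiness)
    return steam

steam = 1.0
for acted in [True, True, False, False, False, True]:
    steam = update_steam(steam, acted, utility=1.0)
    print(round(steam, 2))
```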