All of metachirality's Comments + Replies

Am I correct in assuming you don't think one should give the money in the counterfactual mugging?

I don't know what the first part of your comment is trying to say. I agree that counterfactual mugging isn't a thing that happens. That's why it's called a thought experiment.

I'm not quite sure what the last paragraph is trying to say either. It sounds somewhat similar to a counter-argument I came up with (which I think is pretty decisive), but I can't be certain what you actually meant. In any case, there is the obvious counter-counter-argument that in the counterfactual mugging, the agent in the heads branch and the agent in the tails branch are not quite identical either: one has seen the coin land on heads and the other has seen the coin land on tails.

2JBlack
Regarding the first paragraph: every purported rational decision theory maps actions to expected values. In most decision theory thought experiments, the agent is assumed to know all the conditions of the scenario, and so they can be taken as absolute facts about the world, leaving only the unknown random variables to feed into the decision-making process. In the Counterfactual Mugging, that is explicitly not true; the scenario states as much. So it's not enough to ask what a rational agent with full knowledge of the rest of the scenario should do. That's irrelevant. We know the setup as omniscient outside observers, but the agent in question knows only what the mugger tells them. If they believe it, then there is a reasonable argument that they should pay up, but there is nothing given in the scenario that makes it rational to believe the mugger. The prior evidence is massively against believing the mugger. Any decision theory that ignores this is broken.

Regarding the second paragraph: yes, indeed there is that additional argument against paying up, and rationality does not preclude accepting that argument. Some people do in fact use exactly that argument even in this very much weaker case. It's just a billion times stronger in the "Bob could have been Alice instead" case and makes rejecting the argument untenable.
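(To make the prior-dependence point concrete, here is a minimal sketch with assumed numbers — the standard $100/$10,000 counterfactual-mugging stakes and a made-up credence that the mugger is honest. It is an illustration, not JBlack's calculation.)

```python
def ev_of_committing_to_pay(p_honest: float, cost: float = 100.0, prize: float = 10_000.0) -> float:
    """Expected value of being the sort of agent who pays, evaluated before the coin flip."""
    # If the mugger is a truthful Omega: heads (p=0.5) pays out the prize,
    # tails (p=0.5) costs the $100. If the mugger is a scammer: committing
    # to pay just loses the $100 whenever you are asked.
    ev_if_honest = 0.5 * prize - 0.5 * cost
    ev_if_scam = -cost
    return p_honest * ev_if_honest + (1 - p_honest) * ev_if_scam

print(ev_of_committing_to_pay(0.5))    # ~2425: paying looks good if you half-trust the mugger
print(ev_of_committing_to_pay(0.001))  # ~-95: paying looks bad under a scam-heavy prior
```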

I came up with an argument for alignment by default.

In the counterfactual mugging scenario, a rational agent gives the money, even though they never see themselves benefitting from it. Before the coin flip, the agent would want to self-modify to give the money in order to maximize the expected value; therefore, the only reflectively stable option is to give the money.

Now imagine instead of a coin flip, it's being born as one of two people: Alice, who wants to not be murdered for 100 utils, and Bob, who wants to murder Alice for 1 util. As with the counterfactual mu...

3JBlack
Counterfactual mugging is a mug's game in the first place - that's why it's called a "mugging" and not a "surprising opportunity". The agent doesn't know that Omega actually flipped a coin, would have paid them counterfactually if they were the sort of person to pay in this scenario, would have flipped the coin at all in that case, etc. The agent can't know these things, because the scenario specifies that they have no idea that Omega does any such thing or even that Omega existed before being approached.

So a relevant rational decision-theoretic parameter is an estimate of how much such an agent would benefit, on average, if asked for money in such a manner. A relevant prior is "it is known that there are a lot of scammers in the world who will say anything to extract cash vs zero known cases of trustworthy omniscient beings approaching people with such deals". So the rational decision is "don't pay" except in worlds where the agent does know that omniscient trustworthy beings vastly outnumber untrustworthy beings (whether omniscient or not), and those omniscient trustworthy beings are known to make these sorts of deals quite frequently.

Your argument is even worse. Even broad decision theories that cover counterfactual worlds, such as FDT and UDT, still answer the question "what decision benefits agents identical to Bob the most across these possible worlds, on average?" Bob does not benefit at all in a possible world in which Bob was Alice instead. That's nonexistence, not utility.

Sounds like synesthesia

I fear that, while it might be a good idea to discourage LSD, it would make things even worse to discourage transitioning.

Highly Advanced Epistemology 101?

Probably doesn't change much, but janus' Claude-generated comment was the first mention of Claude acting like a base model on LW.

It ought to be a top-level post on the EA forum as well.

2habryka
(Someone is welcome to link-post it, but indeed I am somewhat hoping to avoid posting over there as much, as I find it reliably stressful in mostly unproductive ways.)

Well that's because it's meant to be quantifying over linear equations. and are not meant to be replaced but and are.

i is often used as an index in math, similar to how it is used as an index in for loops.
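(As a toy illustration of my own, not from the parent comment: the same i as a loop index in Python.)

```python
# `i` as a loop index, mirroring the summation index in math notation.
total = 0
for i in range(1, 11):  # i runs over 1..10, as in summing over i from 1 to 10
    total += i
print(total)  # 55
```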

What would an event optimized for this sort of thing look like?

3Joseph Miller
Unconferences are a thing for this reason

Why not generate it after it's posted publicly?

6Raemon
Reasoning is:

* Currently it takes 40-60 seconds to generate jargon (we've experimented with ways of trimming that down but it's gonna be at least 20 seconds)
* I want authors to actually review the content before it goes live.
* Once authors publish the post, I expect very few of them to go back and edit it more.
* If it happens automagically during draft saving, then by the time you get to "publish post", there's a natural step where you look at the autogenerated jargon, check if it seems reasonable, approve the ones you like and then hit "publish"
* Anything that adds friction to this process I expect to dramatically reduce how often authors bother to engage with it.

Aaaa! I'm used to Arial or whatever Windows' default display font is. The larger stroke weight is rather uncomfortable to me.

4habryka
We previously had Calibri for Windows (indeed a very popular Windows system font). Gill Sans (which we now ship to all operating systems) is a quite popular macOS and iOS system font. I currently think there are some weird rendering issues on Windows, but if those are fixed, my guess is you would get used to it quickly enough. Gill Sans is not a rare font on the internet.

Yarvin was not part of the CCRU. I think Land and Yarvin only became associates post-CCRU.

1yams
updated, thanks!

Maybe make a post on the EA forum?

It seems like if the SCP hypothesis is true, block characters should cause it to act strangely.

5Lao Mein
It does!

'What is \'████████\'?\n\nThis term comes from the Latin for "to know". It'

'What is \'████████\'?\n\n"████████" is a Latin for "I am not",'

Putting it in the middle of code causes it to sometimes spontaneously switch to an SCP story:

' for i in █████.\n\n"I\'m not a scientist!"\n\n- Dr'

' for i in █████,\n\n[REDACTED]\n\n[REDACTED]\n\n[REDACTED] [REDACTED]\n\n[REDACTED]'

Does it not have any sort of metadata telling you where it comes from?

My only guess is that some of it is probably metal lyrics.

Is this an LLM generation or part of the training data?

2Lao Mein
This is from OpenWebText, a recreation of GPT-2's training data. "@#&" [token 48193] occurred in 25 out of 20610 chunks. 24 of these were profanity censors ("Everyone thinks they’re so f@#&ing cool and serious") and only contained a single instance, while the other was the above text (occurring 3299 times!), which was probably used to make the tokenizer but removed from the training data. I still don't know what the hell it is. I'll post the full text if anyone is interested.
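(For anyone who wants to poke at this themselves: a minimal sketch, assuming the `tiktoken` package and its GPT-2 encoding; the chunk list below is placeholder data, not OpenWebText.)

```python
import tiktoken

# Load the GPT-2 BPE tokenizer and see what string token id 48193 decodes to.
enc = tiktoken.get_encoding("gpt2")
print(repr(enc.decode([48193])))

# Count how many text chunks contain that token id. Placeholder chunks here;
# the real check would iterate over OpenWebText shards.
chunks = [
    "Everyone thinks they're so f@#&ing cool and serious",
    "a chunk with no profanity censor at all",
]
hits = sum(1 for chunk in chunks if 48193 in enc.encode(chunk))
print(f"{hits} of {len(chunks)} chunks contain token 48193")
```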

I don't see how 3 follows.

1Hudjefa
Yes, I have the same difficulty. However, sources indicate that Socrates/Plato/others didn't brush it aside as inconsequential. I tried googling, but haven't found anything that could be considered a solution.

That's because we don't have the intelligence to exterminate ants (without causing more problems).

On another note, if an artificial superintelligence needed a human for something, it would probably be able to find someone it could convince on the spot, no pre-built religion needed.

1p4rziv4l
*probably. Maybe it'll start looking for people who are pre-aligned. Religion is also a useful single word, which carries the most meaning per bit to a normie. Maybe just enough to make them take it seriously. I believe there is something to be taken seriously about it.

We have nothing to offer. Anything we can do, an artificial superintelligence can do better, with space and energy and atoms we irritatingly take up.

0p4rziv4l
That's pretty pessimistic. I am looking for things I could do to help Superintelligence. Crucially, we won't understand why they need us to do the things they ask us to do. Ants take up a lot of space, yet we don't systematically hunt them down; they are pretty orthogonal to our values. We find cats and dogs friendly and worthwhile. However, wolves and sabertooth tigers are gone.
Answer by metachirality30

Why would we want to worship AI?

-2p4rziv4l
Because Superintelligence is more powerful than us. If you can't beat them, join them. Maybe Superintelligence will help us terraform Mars if we also perform some favors. Worshipping is a provocative way of saying aligning ourselves with Superintelligence's goals.

I think the thing that actually makes people more rational is thinking of these ideas as principles you can apply to your own life rather than as abstract notions, which is hard to communicate in a Wikipedia page about Dutch books.

1Closed Limelike Curves
Sure, but you gotta start somewhere, and a Wikipedia article would help.

Emmett Shear might also count, but he might merely be rationalist-adjacent.

IMO trying the problem yourself before researching it makes you appreciate what other people have already done even more. It's pretty easy to fall victim to hindsight bias if you haven't experienced the difficulty of actually getting anywhere.

they figure out planting and then rationally collaborate with each other?

I feel like they would end up converging on the same problems that plague human sociality.

I think asociality might prevent the development of altruistic ethics.

Also it's hard to see how an asocial species would develop civilization.

1[anonymous]
same, but not sure; i was in the process of adding a comment about that: "they figure out planting and then rationally collaborate with each other?" these might depend on 'degree of (a)sociality'. hard for me to imagine a fully asocial species, though they might exist and i'd be interested to see examples. chatgpt says..

This reminds me of Moravec's paradox.

You should read Greg Egan's excellent novel Permutation City.

1VictorLJZ
Will do, have heard great things about it!

I think working on safety roles at capabilities orgs is mostly mutually exclusive with a pause, so I don't think this is that remarkable.

Answer by metachirality20

Sorta? Usually the idea is that the presence or absence of hardware determines the anthropic probability of being that conscious process; otherwise you would expect to be some random, arbitrary Boltzmann-brain-like consciousness.

Also this is an immediate corollary of the mathematical universe hypothesis, which says our universe is a mathematical structure.

I feel like you're not giving enough credit to Greg Egan since he came up with all the philosophy himself.

2Yair Halberstadt
Possibly, but some of the missteps just feel too big to ignore. Like what on earth is going on in the second half of the book?

Actually, we should hope that LW is very wrong about AI and alignment is easy.

I remember going to a city and seeing someone on the subway loudly threatening nonexistent people. I wasn't scared, I just felt bad that in all likelihood, the world had failed this person through no fault of their own.

I like this format and framing of "90% of what matters" and someone should try doing it with other subjects.

Decision theory/trade reasons

I think this still means MIRI is correct when it comes to the expected value, though.

4ryan_greenblatt
If you're a longtermist, sure. If you just want to survive, not clearly.

The thing that got me was Pause AI trying to build a coalition with people who are against AI art. I don't really have anything against the idea of a pause, but Pause AI seems a bit simulacrum level 2 for me.

I don't think I'm really looking for something like that, since it doesn't touch on the perception of music as much as it does the reasons why we have it.

3Ben Pace
I did find it and we sent him an email, hope he reads it and joins :)

Sure, I just prefer a native bookmarking function.

I wish I could bookmark comments/shortform posts.

2faul_sname
Yes, that would be cool. Next to the author name of a post or comment, there's a post-date/time element that looks like "1h 🔗". That is a copyable/bookmarkable link.

You can actually use this to do the Sleeping Beauty experiment IRL and thereby test SIA vs SSA. Unfortunately, you can only get results if you're the one being put under.

This sort of raises the question of why we don't observe other companies assassinating whistleblowers.

2lc
Robin Hanson has apparently asked the same thing. It seems like such a bizarre question to me:

* Most people do not have the constitution or agency for criminal murder
* Most companies do not have secrets large enough that assassinations would reduce the size of their problems on expectation
* Most people who work at large companies don't really give a shit if that company gets fined or into legal trouble, and so they don't have the motivation to personally risk anything organizing murders to prevent lawsuits

I think there should be a way to find the highest rated shortform posts.

9habryka
You can! Just go to the all-posts page, sort by year, and the highest-rated shortform posts for each year will be in the Quick Takes section.

I like to phrase it as "the path to simplicity involves a lot of detours." Yes, Newtonian mechanics doesn't account for the orbit of Mercury, but it turned out there was an even simpler, more parsimonious theory, general relativity, waiting for us.

We don't actually know if it's GPT-4.5 for sure. It could be an alternative training run that preceded the current version of ChatGPT-4, or even a different model entirely.

2faul_sname
It might be informative to try to figure out when its knowledge cutoff is (right now I can't do so, as it's at its rate limit).

I think it disambiguates by saying it's specifically a crux as in "double crux".

7Arjun Panickssery
If I understand the term "double crux" correctly, to say that something is a double crux is just to say that it is "crucial to our disagreement."

Copied from a reply on lukehmiles' short form:

The hypothesis I would immediately come up with is that less traditionally masculine AMAB people are inclined towards less physical pursuits.

If it is related to IQ, however, this is less plausible, although perhaps some sort of selection effect is happening here.
