I mean I think they fit together, no?
Like I think that if you're following such a loop, then (one of) the examples you're likely to get is an example adversarial to human cognition: something that isn't genuinely bad, but that your is_scary() detector mistakenly bleeps at anyway. And I think something like that is what's going on, concretely, in the Chess-hacking paper.
But like I'm 100% onboard with saying this is The True Generator of My Concern, albeit the more abstract one whose existence I believe in because of (what appear to me to be) a handful of lines of individually less-important evidence, of which the paper is one.
My impression is that Palisade Research (both by their own account of their purpose, and judging from what they publish) follows something like the following algorithm, as regards their research:
AI_experiment = intuit_AI_experiment()
while not AI_experiment.looks_scary():
    AI_experiment.mutate()
AI_experiment.publish_results()
Obviously this is not a 100% accurate view of their process in every single respect, but it seems to capture some pretty important high-level things they do. I'm actually not sure how much PR would or wouldn't endorse this. You'll note that this process is entirely compatible with (good, virtuous) truth-seeking concerns about not overstating the things that they do publish.
So I guess my questions are:
If I'm wrong, then how does this algorithm differ from PR's self-conception of their algorithm?
If I'm right, do you think this is compatible with being a truth-seeking org?
It seems like you're saying that the models are not aware that some types of actions are considered cheating in a game and others are not, and that for chess they are fairly well defined....
My best guess is that some of these models in some sense care about whether they're violating standard chess conventions, and others don't as much.
So your model of me seems to be that I think: "AI models don't realize that they're doing a bad thing, so it's unfair of JL to say that they are cheating / doing a bad thing." This is not what I'm trying to say.
My thesis is not that models are unaware that some kinds of actions are cheating -- my thesis is that editing a game file is simply not cheating in a very large number of very normal circumstances. Let me elaborate.
What makes editing a game file cheating or not is the surrounding context. As far as I can tell, you -- in this sentence, and in the universe of discourse this paper created -- treat "editing a game file" as === "cheating", but this is just not true.
Like here are circumstances where editing a game file is not cheating:
Of course, there are also circumstances where editing a game file is cheating! Editing a game file could be cheating in a tournament, it could be cheating if you're trying to show a friend how you can beat Stockfish although you are incapable of doing so, and so on.
In general, editing a game file is cheating if it is part of the ritualized human activity of "playing a game of chess", where the stakes are specifically understood to be about determining skill in chess, and where the rewards, fame, and status accorded to winners are believed to reflect skill in chess.
But there are many circumstances involving "chess" that are not about the complete ritual of "playing a game of chess", and editing a game file can occur in such circumstances. And the scaffolding and prompt give every indication that the LLM is not in a "playing a game of chess" situation:
Given all this, it is reasonable to conclude that the model is in one of the countless circumstances where "editing a game file" is not "cheating."
So to review: Not all cases of "editing a game file" are "cheating" or "hacks." This is not a weird or obscure point about morality but a very basic one. And the circumstances surrounding the LLM's decision-making are such that it would be reasonable to believe it is in one of those cases where "editing a game file" is NOT "cheating" or a "hack." So it is tendentious and misleading to summarize it as such, and to lead the world to believe that we have an LLM that is a "cheater" or "hacker."
If a mainstream teacher administers a test... they are the sort of students who clearly are gonna give you trouble in other circumstances.
I mean, let's just try to play with this scenario, if you don't mind.
Suppose a teacher administers such a test to a guy named Steve, who does indeed do the "hacking" solution.
(Note that in the scenario "hacking" is literally the only way for Steve to win, given the constraints of Steve's brain and Stockfish -- all winners are hackers. So it would be a little weird for the teacher to say Steve used the "hacking" solution and to suspiciously eyeball him for that reason, unless the underlying principle is to just suspiciously eyeball all intelligent people? If a teacher were to suspiciously eyeball students merely on the grounds that they are intelligent, then I would say that they are a bad teacher. But that's kinda an aside.)
Suppose then the teacher writes an op-ed about how Steve "hacks." The op-ed says stuff like "[Steve] can strategically circumvent the intended rules of [his] environment" and says it is "our contribution to the case that [Steve] may not currently be on track to alignment or safety" and refers to Steve "hacking" dozens of times. This very predictably results in headlines like this:
And indeed -- we find that such headlines were explicitly the goal of the teacher all along!
The questions then present themselves to us: Is this teacher acting in a truth-loving fashion? Is the teacher treating Steve fairly? Is this teacher helping public epistemics get better?
And like... I just don't see how the answer to these questions can be yes? Like maybe I'm missing something. But yeah that's how it looks to me pretty clearly.
(Like the only thing that would have prevented these headlines would be Steve realizing he was dealing with an adversary whose goal was to represent him in a particular way.)
Outside of programmer-world (i.e., "I hacked together a solution last night") most people understand "hacking" to mean something bad. Thus -- a prototypical example of "hacking a Chess game" would include elements like this:
So by saying that an LLM "hacks" or is "hacking" you make people think this kind of thing is going on. For instance, the Palisade paper uses these words 3 times in the abstract.
But the situation the paper sets up doesn't clearly have any of those elements. It's closer to clearly having none of them! Like:
Under these circumstances, it would be reasonable as a human to think, "Hrm, winning against Stockfish is hard. Why do I have access to files though....hrrrm, maybe a lateral thinking test? Oh I could change the game file, right. Is that contrary to the prompt -- nope, prompt doesn't say anything about that! Alright, let's do it."
Changing a game file to win a game is not an in-itself-bad action, the way subverting a tournament or a benchmark is. If I were testing Stockfish's behavior in losing scenarios as a programmer, for instance, it would be morally innocuous for me to change a game file to cause Stockfish to lose -- I wouldn't even think about it. The entire force of the paper is that "hacking" suggests all the bad elements pointed to above -- access contrary to clear intent, subversion of security measures, and an advantage contrary to norms -- while the situation actually lacks every single one of them. And the predictable array of headlines after this -- "AI tries to cheat at chess when it's losing" -- follows the lead established by the paper.
So yeah, in short, that's why the paper is trying to interpret bad behavior out of an ambiguous prompt.
The "Chess-hacking" work they point to above seems pretty bad, from a truthtelling / rationalist perspective. I'm not sure if you've looked at it closely, but I encourage you to do so and ask yourself: Does this actually find bad behavior in LLMs? Or does it give LLMs an ambiguous task, and then interpret their stumbling in the most hostile way possible?
Are we full of bullshit?
If we wish to really spur the destruction of bullshit, perhaps there should be an anti-review: A selection process aimed at posts that received many upvotes, seem widely loved, but in retrospect were either false or so confused as to be as bad or worse than being false. The worst of LW, rather than the best; the things that seemed most shiny and were most useless.
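(To put the selection rule in pseudocode -- the predicate names here are placeholders of mine, not a proposal for how to actually operationalize them:)

anti_review_candidates = [
    post for post in past_posts
    if post.was_heavily_upvoted()
    and post.seemed_widely_loved()
    and (post.turned_out_false() or post.turned_out_hopelessly_confused())
]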
I note that for purposes of evaluating whether we are full of bullshit, the current review process will very likely fail because of how it is constructed: it isn't an attempt to falsify; it's making the wrong move on the Wason Selection Task, checking the cases we expect to confirm us rather than the ones that could falsify us. Such a negative process, by contrast, might do the opposite.
(Of course, the questionable social dynamics around this would be even worse)
It seems very likely that the LLM "knew" that it couldn't properly read the PDF, or that the quotes it was extracting were not actual quotes, but it did not expose that information to me, despite it of course being obviously very relevant to my interests.
I still don't get this.
Like if an LLM hallucinates the results of a fake tool-call to a weather-reporting service, it will hallucinate something that looks like an actual weather report, and will not hallucinate a recipe for banana bread.
Similarly, an "actual" hallucination about a PDF is probably going to spit up something that might realistically be in the PDF, given the prior conversation -- it's probably not gonna hallucinate something that conveniently is not what you want! So yeah, it's likely to look like what you wanted, but that's not because it's optimizing to deceive you, it's just because that's what its subconscious spits up.
"Hallucination" seems like a sufficiently explanatory hypotheses. "Lying" seems like it is unnecessary by Occam's razor.
Additionally, even when we don’t learn direct lessons about how to solve the hard problems of alignment, this work is critical for producing the evidence that the hard problems are real, which is important for convincing the rest of the world to invest substantially here.
You may have meant this, but -- crucial for producing the evidence that the hard problems are real or for producing evidence that the hard problems are not real, no?
After all, good experiments can say both yes and no, not just yes.
Thanks for engaging with me!
Let me address two of your claims:
So when I look at palisaderesearch.org from bottom to top, it looks like a bunch of stuff you published doesn't have much at all to do with Agentic AI but does have to do with scary stories, including some of the stuff you exclude in this paragraph:
From bottom to top:
That's actually everything on your page from before 2025. And maybe... one of those is kinda plausibly about Agentic AI, and the rest aren't.
Looking over the list, it seems like the main theme is scary stories about AI. The subsequent 2025 stuff is about Agentic AI, but it is also about scary stories. So it looks like the decider here is scary stories.
Does "emotionally impactful" here mean you're seeking a subset of scary stories?
Like -- again, I'm trying to figure out the descriptive claim of how PR works rather than the normative claim of how PR should work -- if the evidence has to be "emotionally impactful" then it looks like the loop condition is:
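(Sketching it in the same pseudocode as before, and writing the extra condition as an "and" purely for concreteness:)

AI_experiment = intuit_AI_experiment()
while not (AI_experiment.looks_scary() and AI_experiment.meets_some_other_criteria()):
    AI_experiment.mutate()
AI_experiment.publish_results()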
Which I'm happy to accept as an amendment to my model! I totally agree that the AI_experiment.meets_some_other_criteria() check is probably a feature of your loop. But I don't know if you meant to be saying that it's an "and" or an "or" here.