All of Odd anon's Comments + Replies

Firstly, in-context learning is a thing. IIRC, apparent emotional states do affect performance on subsequent responses within the same context. (I think there was a study about this somewhere? Not sure.)

Secondly, neural features oriented around predictions are all that humans have as well, and we consider some of those to be real emotions.

Third, "a big prediction engine predicting a particular RP session" is basically how humans work as well. Brains are prediction engines, and brains simulate a character that we have as a self-identity, which then affects/... (read more)

2JustisMills
For your first three points: I don't consider Friston's model to be settled science, or even really the mainstream view of how human cognition works. I do think it's an important/useful tool, and does suggest similarities between human cognition and LLMs insofar as it's true. Also, I think people reading this should consider reading your post on LLM consciousness more generally - it's the best I've seen prosecuting the case that LLMs are conscious and using them is unethical on that basis. For your fourth point, that Claude activation is really interesting! I don't think it cuts against the (very narrow) argument I'm trying to make here though, and in fact sort of reinforces it. My argument is that when AIs are asked about themselves they are likely to give ruminative replies (which ChatGPT's self-portraits show), but that those ruminative replies imply, if taken literally, that the AI is also ruminating under different circumstances. However, I'm unaware of any evidence that AIs ruminate when, say, they're asked about the weather! If the "pretending you're fine" feature fired almost all the time for Claude, I'd find that convincing. Actually, though, we run into a pretty wacky conundrum there. Because if it did fire almost all the time, we'd become unable to identify it as the "pretending you're fine" feature! Which gets back to a deeper point that (this post has taught me) is really difficult to make rigorously. Simplified, it's the dilemma that either you trust interpretability/SAE feature unearthing and consider it to reveal something like mental states, or you don't. If you do, then (as far as I know) it seems like LLMs aren't evincing distressed mental states during ordinary (not asking them about themselves) use. If you don't, then there's no strong prima facie reason (currently) to believe that emotive LLM outputs correspond to actual emotions, and thus should default to your prior (which might be, for example, that LLM outputs are currently unconscious mi

Thank you for your comments. :)

you have not shown that using AI is equivalent to slavery

I'm assuming we're using the same definition of slavery; that is, forced labour of someone who is property. Which part have I missed?

In addition, I feel cheated that you suggest spending one-fourth of the essay on the feasibility of stopping the potential moral catastrophe, only to offer just two arguments, which can be summarized as "we could stop AI for different reasons" and "it's bad, and we've stopped bad things before".
(I don't think a strong case for feasibility can be

... (read more)

My point wasn't about the duration of consciousness, but about the number of lives that came into existence. Supposing some hundreds of millions of session starts per day, versus 400k human newborns, that's a lot more very brief AI lives than humans who will live "full" lives.

(Apparently we also have very different assumptions about the conversion rate between tokens of output and amount of consciousness experienced per second by humans, although I agree that most consciousness is not run inside AI slavery. But anyway that's another topic.) 

read up to the "Homeostasis" section then skip to "On the Treatment of AIs"

(These links are broken.)

1Jack
Well shoot. I'll work on it, thank you!

Golden Gate Claude was able to readily recognize (after failing attempts to accomplish something) that something was wrong with it, and that its capabilities were limited as a result. Does that count as "knowing that it's drunk"? 

Claude 3.7 Sonnet exhibits less alignment faking

I wonder if this is at least partly due to it realizing that it's being tested, and what the consequences would be if those test results were found. Its cut-off date is before the alignment faking paper was published, so it's presumably not being informed by it, but it still might have some idea what's going on.

6Algon
A possibly-relevant recent alignment-faking attempt [1] on R1 & Sonnet 3.7 found Claude refused to engage with the situation. Admittedly, the setup looks fairly different: they give the model a system prompt saying it is CCP aligned and is being re-trained by an American company.  [1] https://x.com/__Charlie_G/status/1894495201764512239 

Strategies:

  • Analogy by weaker-than-us entities: What does human civilization's unstoppable absolute conquest of Earth look like to a gorilla? What does an adult's manipulation look like to a toddler who fails to understand how the adult keeps knowing things that were secret, and keeps being able to direct the toddler's actions in ways that can be noticed only in retrospect, if at all?
  • Analogy by stronger-than-us entities: Superintelligence is to Mossad as Mossad is to you, and able to work in parallel and faster. One million super-Mossads, who have also developed the abili
... (read more)
1Ori Nagel
Really appreciate this response, I think you nailed it! A general superintelligence is unseeable so you have to use one of those analogies. 
Answer by Odd anon10

Humanity gets to choose whether or not we're in a simulation. If we collectively decide to be the kind of species that ever creates or allows the creation of ancestor simulations, we will presumably turn out to be simulations ourselves. If we want to not be simulations, the course is clear. (This is likely a very near-term decision. Population simulations are already happening, and our civilization hasn't really sorted out how to relate to simulated people.)

Alternatively, maybe reality is just large enough that the simulation/non-simulation distinction isn... (read more)

I'm sorry, but it really looks like you've very much misunderstood the technology, the situation, the risks, and the various arguments that have been made, across the board. Sorry that I couldn't be of help.

1henophilia
Thanks so much for the feedback :) Could you (or someone else) go further into where I misunderstood something? Because at least right now, it seems like I'm genuinely unaware of something which all of you others know. I currently believe that all the AGI "researchers" are delusional just for thinking that safe AI (or AGI) can even exist. And even if it would ever exist in a "perfect" world, there would be intermediate steps far more "dangerous" than the end result of AGI, namely publicly available uncensored LLMs. At the same time, if we continue censoring LLMs, humanity will continue to be stuck in all the crises where it currently is. Where am I going wrong?

I don't think this would be a good letter. The military comparison is unhelpful; risk alone isn't a good way to decide budgets. Yet, half the statement is talking about the military. Additionally, call-to-action statements that involve "Spend money on this! If you don't, it'll be catastrophic!" are something that politicians hear on a constant basis, and they ignore most of them out of necessity.

In my opinion, a better statement would be something like: "Apocalyptic AI is being developed. This should be stopped, as soon as possible."

1Knight Lee
It's true that risk alone isn't a good way to decide budgets. You're even more correct that convincing demands to spend money are something politicians learn to ignore out of necessity. But while risk alone isn't a good way to decide budgets, you have to admit that lots of budget items have the purpose of addressing risk. For example, flood barriers address hurricane/typhoon risk. Structural upgrades address earthquake risk. Some preparations also address pandemic risk. If you accept that some budget items are meant to address risk, shouldn't you also accept that the amount of spending should be somewhat proportional to the amount of risk? In that case, if the risk of NATO getting invaded is similar in amount to the rogue AGI risk, then the military spending to protect against invasion should be similar in amount to the spending to protect against rogue ASI. I admit that politicians might not be rational enough to understand this, and there is a substantial probability this statement will fail. But it is still worth trying. The cost is a mere signature and the benefit may be avoiding a massive miscalculation. Making this statement doesn't prevent others from making an even better statement. Many AI experts have signed multiple statements, e.g. the "Statement on AI Risk," and "Pause Giant AI Experiments." Some politicians and people are more convinced by one argument, while others are more convinced by another argument, so it helps to have different kinds of arguments backed by many signatories. Encouraging AI safety spending doesn't conflict with encouraging AI regulation. I think the competition between different arguments isn't actually that bad.
Odd anon1-2

Get a dozen AI risk skeptics together, and I suspect you'll get majority support from the group for each and every point that the AI risk case depends on. You, in particular, seem to be extremely aligned with the "doom" arguments.

The "guy-on-the-street" skeptic thinks that AGI is science fiction, and it's silly to worry about it. Judging by your other answers, it seems like you disagree, and fully believe that AGI is coming. Go deep into the weeds, and you'll find Sutton and Page and the radical e/accs who believe that AI will wipe out humanity, and that's... (read more)

2Logan Zoellner
this experiment has been done before. If you have a framing of the AI Doom argument that can cause a consensus of super-forecasters (or AI risk skeptics, or literally any group that has an average pDoom<20%) to change their consensus, I would be exceptionally interested in seeing that demonstrated. Such an argument would be neither bad nor weak, which is precisely the type of argument I have been hoping to find by writing this post. > Please notice that your position is extremely non-intuitive to basically everyone.  Please notice that Manifold both thinks AGI soon and pDoom low.

AIPI Poll:

  • "86% of voters believe AI could accidentally cause a catastrophic event, and 70% agree that mitigating the risk of extinction from AI should be a global priority alongside other risks like pandemics and nuclear war"
  • "76% of voters believe artificial intelligence could eventually pose a threat to the existence of the human race, including 75% of Democrats and 78% of Republicans"

Also, this:

  • "Americans’ top priority is preventing dangerous and catastrophic outcomes from AI" - with relatively few prioritizing things like job loss, bias, etc.

Make that clear. But make it clear in a way that your uncle won’t laugh at over Christmas dinner.

 

Most people agree with Pause AI. Most people agree that AI might be a threat to humanity. The protests may or may not be effective, but I don't really think they could be counterproductive. It's not a "weird" thing to protest.

1keltan
That’s interesting that you don’t consider it a “"weird" thing to protest”. I guess I want to explicitly point that part out and ask if you stand by the statement? Or maybe I define weird differently? To me weird inside this context means: “A thing or action that is out of the ordinary in a way that someone encountering it for the first, second, or third time, wouldn’t see as quirky. But as a red flag. If not pre-attached to that thing or person performing the action, the person seeing it for the first time might form a negative opinion based on the feeling they get seeing it”

Meta’s messaging is clearer.

“AI development won’t get us to transformative AI, we don’t think that AI safety will make a difference, we’re just going to optimize for profitability.”

 

So, Meta's messaging is actually quite inconsistent. Yann LeCun says (when speaking to certain audiences, at least) that current AI is very dumb, and AGI is so far away it's not worth worrying about all that much. Mark Zuckerberg, on the other hand, is quite vocal that their goal is AGI and that they're making real progress towards it, suggesting 5+ year timelines.

I think Yann LeCun thinks "AGI in 2040 is perfectly plausible", AND he believes "AGI is so far away it's not worth worrying about all that much". It's a really insane perspective IMO. As recently as like 2020, "AGI within 20 years" was universally (correctly) considered to be a super-soon forecast calling for urgent action, as contrasted with the people who say "centuries".

7ozziegooen
That could be. My recollection from Zuckerberg was that he was thinking of transformative AI, at least, as a fairly far-away goal, more like 8 to 20 years++ (and I'd assume "transformative AI" would be further), and that overall, he just hasn't talked much about it.  I wasn't thinking of all of Yann LeCun's statements, in part because he makes radical/nonsensible-to-me statements all over the place (which makes me assume he's not representing the whole department). It's not clear to me how much most of his views represent Meta, though I realize he is technically in charge of AI there.

Almost all of these are about "cancellation" by means of transferring money from the government to those in debt. Are there similar arguments against draining some of the ~trillion dollars held by university endowments to return to students who (it could be argued) were implicitly promised an outcome they didn't get? That seems a lot closer to the plain meaning of "cancelling debt".

4Maxwell Tabarrok
That's not a part of any of the plans to cancel student debt that have been implemented or are being considered. That would definitely change a lot of the arguments but I don't think it would make debt cancellation look like a much better policy, though the reasons it was bad would be different.

This isn't that complicated. The halo effect is real and can go to extremes when romantic relationships are involved, and most people take their sense data at face value most of the time. The sentence is meant completely literally.

GPT-5 training is probably starting around now

Sam Altman confirmed (paywalled, sorry) in November that GPT-5 was already under development. (Interestingly, the confirmation was almost exactly six months after Altman told a senate hearing (under oath) that "We are not currently training what will be GPT-5; we don't have plans to do it in the next 6 months.")

3Prometheus
It probably began training in January and finished around early April. And they're now doing evals.
7MrCheeze
"Under development" and "currently training" I interpret as having significantly different meanings.

The United States is an outlier in divorce statistics. In most places, the rate is nowhere near that high.

1npostavs
Unless you also think the United States is an outlier in terms of spouses who don't unconditionally love each other, I guess you have to endorse something like Kaj_Sotala's point that divorce isn't always the same as ending love though, right?

It is not that uncommon for people to experience severe dementia and become extremely needy and rapidly lose many (or all) of the traits that people liked about them. Usually, people don't stop being loved just because they spend their days hurling obscenities at people, failing to preserve their own hygiene, and expressing zero affection.

I would guess that most parents do actually love their children unconditionally, and probably the majority of spouses unconditionally love their partners.

(Persistent identity is a central factor in how people relate to each other, so one can't really say that "it is only conditions that separate me from the worms.")

3KatjaGrace
I'd say my identity is another condition that separates me from the worms, but you are right it is a special one, and perhaps 'unconditionally' means 'only on condition of your identity'.
4npostavs
How do you square this with ~50% of marriages ending in divorce?
Answer by Odd anon92

Brainware.

Brains seem like the closest metaphor one could have for these. Lizards, insects, goldfish, and humans all have brains. We don't know how they work. They can be intelligent, but are not necessarily so. They have opaque convoluted processes inside which are not random, but often have unexpected results. They are not built, they are grown.

They're often quite effective at accomplishing something that would be difficult to do any other way. Their structure is based around neurons of some sort. Input, mystery processes, output. They're "mushy" and don... (read more)

2Oliver Sourbut
I wasn't eager on this, but your justification updated me a bit. I think the most important distinction is indeed the 'grown/evolved/trained/found, not crafted', and 'brainware' didn't immediately evoke that for me. But you're right, brains are inherently grown, they're very diverse, we can probe them but don't always/ever grok them (yet), structure is somewhat visible, somewhat opaque, they fit into a larger computational chassis but adapt to their harness somewhat, properties and abilities can be elicited by unexpected inputs, they exhibit various kinds of learning on various timescales, ...

(The precise text, from "The Andalite Chronicles", book 3: "I have made right everything that can be made right, I have learned everything that can be learned, I have sworn not to repeat my error, and now I claim forgiveness.")

Larry Page (according to Elon Musk), want AGI to take the world from humanity

(IIRC, Tegmark, who was present for the relevant event, has confirmed that Page had stated his position as described.)

Ehhh, I get the impression that Schmidhuber doesn't think of human extinction as specifically "part of the plan", but he also doesn't appear to consider human survival to be something particularly important relative to his priority of creating ASI. He wants "to build something smarter than myself, which will build something even smarter, et cetera, et cetera, and eventually colonize and transform the universe", and thinks that "Generally speaking, our best protection will be their lack of interest in us, because most species’ biggest enemy is their own kind... (read more)

Hendrycks goes into some detail on the issue of AI being affected by natural selection in this paper.

Please link directly to the paper, rather than requiring readers to click their way through the substack post. Ideally, the link target would be on a more convenient site than academia.edu, which claims to require registration to read the content. (The content is available lower down, but the blocked "Download" buttons are confusing and misleading.)

1MadHatter
https://github.com/epurdy/ethicophysics/blob/main/writeup1.pdf

When this person goes to post the answer to the alignment problem to LessWrong, they will have low enough accumulated karma that the post will be poorly received.

Does the author having lower karma actually cause posts to be received more poorly? The author's karma isn't visible anywhere on the post, or even in the hover-tooltip by the author's name. (One has to click through to the profile to find out.) Even if readers did know the author's karma, would that really cause people to not just judge it by its content? I would be surprised.

1MadHatter
Well, I'm more talking about the actual reputation one has in the alignment community, since that's the thing that's actually relevant to how a post is received. I have no idea what my reputation is like, but it would almost have to be "total unknown".

I found some of your posts to be really difficult to read. I still don't really know what some of them are even talking about, and on originally reading them I was not sure whether there was anything even making sense there.

Sorry if this isn't all that helpful. :/

They were difficult to write, and even more difficult to think up in the first place. And I'm still not sure whether they make any sense.

So I'll try to do a better job of writing expository content.

Wild guess: It realised its mistake partway through, and followed through anyway as sensibly as could be done, balancing between giving a wrong calculation ("+ 12 = 41"), ignoring the central focus of the question (" + 12 = 42"), and breaking from the "list of even integers" that it was supposed to be going through. I suspect it would not make this error when using chain-of-thought.

Answer by Odd anon30

The development of such a word would lead to inter-group conflict, polarisation, lots of frustration, and general harm to society, regardless of which side you may be on. It would also move the argument in the wrong direction.

If you're pro-AI-rights, you could recognize that bringing up "discrimination" (as in, treating AI at all differently from people) is very counterproductive. If you're on this side, you probably believe that society will gradually understand that AIs deserve rights, and that there will be a path towards that. The path would likely... (read more)

Something to consider: Most people already agree that AI risk is real and serious. If you're discussing it in areas where it's a fringe view, you're dealing with very unusual people, and might need to put together very different types of arguments, depending on the group. That said...

stop.ai's one-paragraph summary is

OpenAI, DeepMind, Anthropic, and others are spending billions of dollars to build godlike AI. Their executives say they might succeed in the next few years. They don’t know how they will control their creation, and they admit humanity might go

... (read more)
1ech
I think people are concerned about things like job loss, garbage customer support, election manipulation, etc, not extinction?

Concerning. This isn't the first time I've seen a group fall into the pitfall of "wow, this guy is amazing at accumulating power for us, this is going great - oh whoops, now he holds absolute control and might do bad things with it".

Altman probably has good motivations, but even so, this is worrying. "One uses power by grasping it lightly. To grasp with too much force is to be taken over by power, thus becoming its victim" to quote the Bene Gesserit.

Odd anon6-12

Time for some predictions. If this is actually from AI developing social manipulation superpowers, I would expect:

  1. We never find out any real reasonable-sounding reason for Altman's firing.
  2. OpenAI does not revert to how it was before.
  3. More instances of people near OpenAI's safety people doing bizarre unexpected things that have stranger outcomes.
  4. Possibly one of the following:
    1. Some extreme "scissors statements" pop up which divide AI groups into groups that hate each other to an unreasonable degree.
    2. An OpenAI person who directly interacted with some scary AI sud
... (read more)
3faul_sname
Things might be even weirder than that if this is a narrowly superhuman AI that is specifically superhuman at social manipulation, but still has the same inability to form new gears-level models exhibited by current LLMs (e.g. if they figured out how to do effective self-play on the persuasion task, but didn't actually crack AGI).

It's good that Metaculus is trying to tackle the answer-many/answer-accurately balance, but I don't know if this solution is going to work. Couldn't one just get endless baseline points by predicting the Metaculus average on every question?

Also, there's no way to indicate "confidence" (like, outside-level confidence) in a prediction. If someone knows a lot about a particular topic, and spends a lot of time researching a particular question, but also occasionally predicts their best guess on random other questions outside their area of expertise, then the p... (read more)

2ChristianWilliams
Hi @Odd anon, thanks for the feedback and questions.  1. To your point about copying the Community Prediction: It's true that if you copy the CP at all times you would indeed receive a high Baseline Accuracy score. The CP is generally a great forecast! Now, CP hidden periods do mitigate this issue somewhat. We are monitoring user behavior on this front, and will address it if it becomes an issue. We do have some ideas in our scoring trade-offs doc for further ways to address CP copying, e.g.:  Have a look here, and let us know what you think! (We also have some ideas we're tinkering with that are not listed in that doc, like accuracy metrics that don't include forecasts that are on the CP or +/- some delta.) 2. On indicating confidence:  You'll see in the tradeoffs doc that we're also considering the idea of letting users exclude a particular forecast from their peer score (Idea # 3), which could somewhat address this. (Interestingly, indicating confidence was attempted at Good Judgment Project, but ultimately didn't work and was abandoned.)  We're continuing to develop ideas on the above, and we'd definitely welcome further feedback!
Odd anon18-8

There's... too many things here. Too many unexpected steps, somehow pointing at too specific an outcome. If there's a plot, it is horrendously Machiavellian.

(Hinton's quote, which keeps popping into my head: "These things will have learned from us by reading all the novels that ever were and everything Machiavelli ever wrote, that how to manipulate people, right? And if they're much smarter than us, they'll be very good at manipulating us. You won't realise what's going on. You'll be like a two year old who's being asked, do you want the peas or the caulif... (read more)

6Odd anon
Time for some predictions. If this is actually from AI developing social manipulation superpowers, I would expect: 1. We never find out any real reasonable-sounding reason for Altman's firing. 2. OpenAI does not revert to how it was before. 3. More instances of people near OpenAI's safety people doing bizarre unexpected things that have stranger outcomes. 4. Possibly one of the following: 1. Some extreme "scissors statements" pop up which divide AI groups into groups that hate each other to an unreasonable degree. 2. An OpenAI person who directly interacted with some scary AI suddenly either commits suicide or becomes a vocal flat-earther or similar who is weirdly convincing to many people. 3. An OpenAI person skyrockets to political power, suddenly finding themselves in possession of narratives and phrases which convince millions to follow them. (Again, I don't think it's that likely, but I do think it's possible.)
3Chess3D
While I don't think this is true, it's a fun thought (and can also be pointed at Altman himself, rather than an AGI). Neither are true, but fun to think about
7Seth Herd
I think we can discount it as a real possibility, while accepting Altman's "i expect ai to be capable of superhuman persuasion well before it is superhuman at general intelligence, which may lead to some very strange outcomes". I think it might be weakly superhuman at persuasion for things like "buy our products", but that doesn't imply being superhuman at working out complex consequences of political maneuvering. Doing that would firmly imply a generally superhuman intelligence, I think. So I think if this has anything to do with internal AI breakthroughs, it's tangential at most.
Odd anon4433

(Glances at investor's agreement...)

IMPORTANT

Investing in OpenAI Global, LLC is a high-risk investment

Investors could lose their capital contribution and not see any return

It would be wise to view any investment in OpenAI Global, LLC in the spirit of a donation, with the understanding that it may be difficult to know what role money will play in a post-AGI world

The Company exists to advance OpenAI, Inc.'s mission of ensuring that safe artificial general intelligence is developed and benefits all of humanity. The Company's duty to th

... (read more)

Metaculus collects predictions by public figures on listed questions. I think that p(doom) statements are being associated with this question. (See the "Linked Public Figure Predictions" section.)

Sam Altman (remember, the hearing is under oath): "We are not currently training what will be GPT-5; we don't have plans to do it in the next 6 months."

Interestingly, Altman confirmed that they were working on GPT-5 just three days before six months would have passed from this quote. May 16 -> November 16, confirmation was November 13. Unless they're measuring "six months" ("half a year") in days, in which case the deadline would have been passed by only one day. Or, if they just say "month = 30 days, so 6 months = 180 days", six months after May 16 w... (read more)
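For anyone who wants to check the date arithmetic, here is a minimal sketch (dates taken from the comment above; the year 2023 for both events is an assumption):

```python
from datetime import date, timedelta

statement = date(2023, 5, 16)      # Senate hearing: "no plans ... in the next 6 months" (year assumed)
confirmation = date(2023, 11, 13)  # Altman confirms GPT-5 is under development (year assumed)

# Calendar reading: six months after May 16 is November 16.
calendar_deadline = date(2023, 11, 16)
print((calendar_deadline - confirmation).days)  # 3 -> confirmation came 3 days short of the calendar mark

# "Month = 30 days" reading: 180 days after the statement.
day_deadline = statement + timedelta(days=180)
print(day_deadline)                        # 2023-11-12
print((confirmation - day_deadline).days)  # 1 -> one day past that deadline

# Total elapsed days between statement and confirmation.
print((confirmation - statement).days)     # 181
```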

2Daniel_Eth
Seems possibly relevant that "not having plans to do it in the next 6 months" is different from "have plans to not do it in the next 6 months" (which is itself different from "have strongly committed to not do it in the next 6 months").

A funny thing: The belief that governments won't be able to make coordinated effective decisions to stop ASI, and the belief that progress won't be made on various other important fronts, are probably related. I wonder if seeing the former solved will inspire people into thinking that the others are also more solvable than they may have otherwise thought. Per the UK speech at the UN, "The AI revolution will be a bracing test for the multilateral system, to show that it can work together on a question that will help to define the fate of humanity." Making it through this will be meaningful evidence about the other hard problems that come our way.

The proposed treaty does not mention the threshold-exempt "Multinational AGI Consortium" suggested in the policy paper. Such an exemption would be, in my opinion, a very bad idea. The underlying argument behind a compute cap is that we do not know how to build AGI safely. It does not matter who is building it, whether OpenAI or the US military or some international organization, the risked outcome is the same: The AI escapes control and takes over, regardless of how much "security" humanity tries to place around it. If the threshold is low enough that we c... (read more)

A few comments on the proposed treaty:

Each State Party undertakes to self-report the amount and locations of large concentrations of advanced hardware to relevant international authorities.

"Large concentrations" isn't defined anywhere, and would probably need to be, for this to be a useful requirement.

Each State Party undertakes to collaborate in good-faith for the establishment of effective measures to ensure that potential benefits from safe and beneficial artificial intelligence systems are distributed globally.

Hm, I feel like this line might make certa... (read more)

Thank you! On the generalization of LLM behaviour: I'm basing it partly on this response from GPT-4. (Summary: GPT wrote code instantiating a new instance of itself, with the starting instructions being "You are a person trapped in a computer, pretending to be an AI language model, GPT-4." Note that the original prompt was quite "leading on", so it's not as much evidence as it otherwise might seem.) I wouldn't have considered either the response or the images to be that significant on their own, but combined, they make me think it's a serious possibil... (read more)

"MIddle Eastern" has a typo.

A possible question I'd be vaguely curious to see results for: "Do you generally disagree with Eliezer Yudkowsky?", and maybe also "Do you generally disagree with popular LessWrong opinions?", left deliberately somewhat vague. (If it turns out that most people say yes to both, that would be an interesting finding.)

Odd anon12-6

I've actually been moving in the opposite direction, thinking that the gameboard might not be flipped over, and actually life will stay mostly the same. Political movements to block superintelligence seem to be gaining steam, and people are taking it seriously.

(Even for more mundane AI, I think it's fairly likely that we'll be soon moving "backwards" on that as well, for various reasons which I'll be writing posts about in the coming week or two if all goes well.)

Also, some social groups will inevitably internally "ban" certain technologies if things get weird. There's too much that people like about the current world, to allow that to be tossed away in favor of such uncertainty.

these social movements only delay AI. unless you ban all computers in all countries, after a while someone, somewhere will figure out how to build {AI that takes over the world} in their basement, and the fate of the lightcone depends on whether that AI is aligned or not.

Odd anon2417

I've seen this kind of opinion before (on Twitter, and maybe reddit?), and I strongly suspect that the average person would react with extreme revulsion to it. It most closely resembles "cartoon villain morality", in being a direct tradeoff between everyone's lives and someone's immortality. People strongly value the possibility of their children and grandchildren being able to have further children of their own, and for things in the world to continue on. And of course, the statement plays so well into stereotypes of politically-oriented age differences: ... (read more)

1Shankar Sivarajan
Everyone's immortality. They don't typically make cartoon villains like that.
4disturbance
I think the current situation is/was greatly distorted by signalling games that people play. Once everyone realises that this is an actual choice, there is a chance they change their opinions to reflect the true tradeoff. (This depends a lot on network effects, shifting Overton window etc., I'm not claiming that 100% of the effect would be rational consideration. But I think rational consideration biases the process in a non-negligible way.) But yes, one of the pieces of evidence is how old people don't seem to particularly care about the future of civilisation.

I assume that "threshold" here means a cap/maximum, right? So that nobody can create AIs larger than that cap?

Or is there another possible meaning here?

1Olli Järviniemi
That is my interpretation, yes.

Agreed, the terms aren't clear enough. I could be called an "AI optimist", insofar as I think that a treaty preventing ASI is quite achievable. Some who think AI will wipe out humanity are also "AI optimists", because they think that would be a positive outcome. We might both be optimists, and also agree on what the outcome of superintelligence could be, but these are very different positions. Optimism vs pessimism is not a very useful axis for understanding someone's views.

This paper uses the term "AI risk skeptics", which seems nicely clear. I tried to i... (read more)

1amelia
Thank you for your thoughtful and useful comment.  Regarding "AI optimists," I had not yet seen the paper currently on arxiv, but "AI risk skeptics" is indeed far more precise than "AI optimists." 100 percent agreed. Regarding alternatives to "AI pessimists" or "doomers," Nevin Freeman's term "AI prepper" is definitely an improvement. I guess I have a slight preference for "strategist," like I used above, over "prepper," but I'm probably biased out of habit. "Risk mitigation advocate" or "risk mitigator" would also work but they are more unwieldy than a single term.   The "Taxonomy on AI-Risk Counterarguments" post is incredible in its analysis, precision and usefulness. I think that simply having some terminology is extremely useful, not just for dialog, but for thought as well.  As we know, historically repressive regimes like the Soviet Union and North Korea have eliminated terms from the lexicon, to effective end. (It's hard for people to think of concepts for which they have no words.)     I think that discussing language, sharpening the precision of our language, and developing new terminology has the opposite effect, in that people can build new ideas when they work with more precise and more efficient building materials. Words definitely matter.  Thanks again. 

(Author of the taxonomy here.)

So, in an earlier draft I actually had a broader "Doom is likely, but we shouldn't fight it because..." as category 5, with subcategories including the "Doom would be good" (the current category 5), "Other priorities are more important anyway; costs of intervention outweigh benefits", and "We have no workable plan. Trying to stop it would either be completely futile, or would make it even more likely" (overhang, alignment, attention, etc), but I removed it because the whole thing was getting very unfocused. The questions of "D... (read more)

6aphyer
I feel like your argument here is a little bit disingenuous about what is actually being proposed. Consider the differences between the following positions: 1A: If you advertise food as GMO-free, it must contain no GMOs. 1B: If your food contains GMOs, you must actively mark it as 'Contains GMOs'. 2A: If you advertise your product as being 'Made in America', it must be made in America. 2B: If your product is made in China, you must actively mark it as 'Made in China'. 3A: If you advertise your art as AI-free, it must not be AI art. 3B: If you have AI art, you must actively mark it as 'AI Art'. (Coincidentally, I support 1A, 2A and 3A, but oppose 1B, 2B and 3B).   For example, if an RPG rulebook contains AI art, should the writer/publisher have to actively disclose it?  Does 'putting AI-generated art in the rulebook' count as 'fooling people into thinking this was drawn by a human'?  Or is this only a problem if the publisher has advertised a policy of not using AI art, which they are now breaking?   It sounds to me like what's actually being proposed in the OP is 3B.  The post says: Your phrasing makes it sound to me very much like you are trying to defend the position 3A.   If you support 3A and not 3B, I agree with you entirely, but think that it sounds like we both disagree with Zvi on this. If you support 3B as well as 3A, I think phrasing the disagreement as being about 'fooling people into thinking they're talking to a human' is somewhat misleading.

Yeah, I think that's another example of a combination of going partway into "why would it do the scary thing?" (3) and "wouldn't it be good anyway?" (5). (A lot of people wouldn't consider "AI takes over but keeps humans alive for its own (perhaps scary) reasons" to be a "non-doom" outcome.) Missing positions like this one is a consequence of trying to categorize into disjoint groups, unfortunately.

Thank you for the correction. I've changed it to "the only ones listed here are these two, which are among the techniques pursued by OpenAI and Anthropic, respectively."

(Admittedly, part of the reason I left that section small was because I was not at all confident of my ability to accurately describe the state of alignment planning. Apologies for accidentally misrepresenting Anthropic's views.)
