Replying to Alignment will happen by default. What’s next?

The models don't even consider taking over in their 'private thoughts'

It would be strange to seriously consider taking over the world (as opposed to joking about it), if you lack anywhere near the power to do so.

Chris_Leong 6d

Don't focus on maximizing impact on participants; this is less important than reducing the mentorship bottleneck, which is best served by boosting the most advanced participants.

Could you clarify? Do you mean that if you have the chance to support either someone new, who would gain a lot since they haven't participated in many AI safety programs, or someone more advanced, you'd suggest picking the latter? The reasoning being that the former might look like a better bet because there's more room to make a difference, but boosting the latter increases the supply of mentors and therefore ends up benefiting beginners at least as much.

Replying to A Proposal for a Better ARENA: Shifting from Teaching to Research Sprints

Chris_Leong 1mo

"The content of the existing ARENA notebooks could be a prerequisite for the new program"

I don't think this would work very well. If you were super disciplined and you took one day every two weeks to work through one notebook, you'd spend most of a year just to qualify for the program.

Also, shifting ARENA to focus on research sprints would, in a sense, reduce the diversity of the field, in that most other programs focus more on research than on developing ML skills. If one program were to shift to doing research sprints, I suspect it'd actually be better for a program that already focuses on research to do that.

Replying to Toss a bitcoin to your Lightcone – LW + Lighthaven's 2026 fundraiser

Chris_Leong 2mo

"If we fundraise less than $1.4M (or at least fail to get reasonably high confidence commitments that we will get more money before we run out of funds), I expect we will shut down"

It seems really strange to have the first goal at $1 million and the second at $2 million when these don't directly correspond to milestones.

Then again, maybe you've got strategic reasons for this.

Replying to Alignment Fellowship

Chris_Leong 2mo

200k is pretty high. A higher salary can increase the number of applications, but it also increases the number of applications you'd need to filter through.

Replying to Notice When People Are Directionally Correct

Chris_Leong 2mo

Thanks!

"AI Explained has also felt like he lost depth of insight"

What makes you say that?

Replying to Notice When People Are Directionally Correct

Chris_Leong 2mo

I've been encouraged to write a self-review: I don't have much to say here, except that if I had known this article would be this popular (over 100 upvotes), I would have written it a bit more carefully the first time. I just spent 10 minutes rewriting some awkward phrasings.

Replying to Notice When People Are Directionally Correct

Chris_Leong 2mo

Out of these, who is your top pick?

Replying to Toss a bitcoin to your Lightcone – LW + Lighthaven's 2026 fundraiser

Chris_Leong 2mo

Is there a way to insert diagrams like that into Less Wrong posts in general or is this a feature you added just for this specific post?

Sydney AI Safety Fellowship 2026 (Priority deadline this Sunday)

Chris_Leong

2mo

Application deadline:

Main deadline: Midnight, 7th December, Sydney time
If we have unfilled slots, we may still accept applications until the 14th of December

Location: Sydney (definite); Melbourne (likely; contingent on sufficient high-quality applications)

When: January/February 2026 with remote activities pre and post the main fellowship (detail further down)

Apply now

We are looking for a small group of strongly motivated, highly agentic individuals with good strategic judgement (from technical researchers to governance thinkers to entrepreneurs) who want to spend the summer developing situational awareness, figuring out where they can best contribute, and working on a project that demonstrates their potential.

Our core promise: Other programs in AI safety are primarily designed to accelerate your career as fast as... (read 626 more words →)

Quotes on AI and wisdom

Chris_Leong

3mo

Excerpted from an upcoming post on AI and wisdom

But the moral considerations, Doctor...

Did you and the other scientists not stop to consider the implications of what you were creating? — Roger Robb

When you see something that is technically sweet, you go ahead and do it and you argue about what to do about it only after you have had your technical success. That is the way it was with the atomic bomb— Oppenheimer

❦

There are moments in the history of science, where you have a group of scientists look at their creation and just say, you know: ‘What have we done?... Maybe it's great, maybe it's bad, but what have we done? —

... (read 401 more words →)

Towards Humanist Superintelligence

Chris_Leong

3mo

I'm sharing this post from Mustafa Suleyman, CEO of Microsoft AI, because it's honestly quite shocking to see a post like this coming out of Microsoft. I know some people might accuse him of safety washing, but to me it comes off as an honest attempt to grapple with the future.

Selected Graphics Showing Progress towards AGI

Chris_Leong

4mo

Excerpt from the upcoming post: Beyond Human Wisdom: Can We Survive the Rise of AGI?

Coming soon 😊 (PM me if you'd be keen to provide feedback on the draft).

Motivation

Motivating image — `View on Life Itself`

The 🅂🅄🅅 Triad (the challenge)

🅂peed - in absolute terms and relative to the speed of governance
🅄ncertainty - regarding the situation and strategy
🅅ulnerability - many catastrophic threats that are hard or costly to defend against

Graphs on Progress

`View on Metaculus`
`View Goodheart Lab's forecast aggregator`

Placeholder for an experimental art project — Under construction 🚧^[1]

Anything can be art, it might just be bad art — Millie Florence

Art in the Age of the Internet

The medium is the message — Marshall McLuhan, Media Theorist

Hypertext is not a technology, it is a way of thinking — ChatGPT 5^[2]

Writing is the process of reducing a tapestry of interconnections to a narrow sequence. This is, in a sense, illicit. This is a wrongful compression of what should spread out, and today’s computers, they’ve betrayed that — Ted Nelson, founder of Project Xanadu^[3]^[4]

𝕯𝖔𝖔𝖒 $^{؟}$

𝒽𝑜𝓌 𝓉𝑜 𝒷𝑒𝑔𝒾𝓃? 𝓌𝒽𝒶𝓉 𝒶𝒷𝑜𝓊𝓉 𝒶𝓉 𝕿𝖍𝖊 𝕰𝖓𝖉?^[5]

𝕿𝖍𝖊 𝕰𝖓𝖉? 𝕚𝕤 𝕚𝕥 𝕣𝕖𝕒𝕝𝕝𝕪 𝕿𝖍𝖊 𝕰𝖓𝖉?

𝓎𝑒𝓈. 𝒾𝓉 𝒾𝓈 𝕿𝖍𝖊 𝕰𝖓𝖉. 𝑜𝓇 𝓂𝒶𝓎𝒷𝑒 𝒯𝒽ℯ

... (read 1549 more words →)

Four Quotes on Transformative Technology

Chris_Leong

6mo

Urgent: get collectively wiser - Yoshua Bengio, AI "Godfather", On the Wisdom Race

Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct. — Nick Bostrom, Superintelligence

❦

Did you and the other scientists not stop to consider the implications of what you were creating? — Roger Robb

When you see something that is technically sweet, you go ahead and do it and you argue about what to do about it only after you have had your technical success. That is the way it was with the atomic bomb— Oppenheimer

❦

😱✂️💣💣💣💣💣 𝙳 𝙸 𝚂 𝙰 𝚂 𝚃 𝙴 𝚁 - 𝙱 𝚈 - 𝙳 𝙴 𝙵 𝙰 𝚄 𝙻 𝚃 ? - Public Draft

This is a draft post to hold my thoughts on Disaster-By-Default.

I have an intuition that the SUV Triad can be turned into an argument for Disaster-By-Default, so I created this post to explore this possibility.

However, I consider this post experimental in that it may not pan out.

☞ The　𝙳 𝙸 𝚂 𝙰 𝚂 𝚃 𝙴 𝚁 - 𝙱 𝚈 - 𝙳 𝙴 𝙵 𝙰 𝚄 𝙻 𝚃　hypothesis:

AGI leads to some kind of societal scale catastrophe by default

Clarification: This isn't a claim that it wouldn't be possible to avoid

... (read 1955 more words →)

Parent comment for: Why the focus on wise AI advisors?

On actually taking expressions literally: tension as the key to meditation?

Chris_Leong

7mo

Some speculations based upon the Vasocomputational Theory of Meditation, meditation and some poorly understood Lakoff. Even though reading about meditation is low risk, I wouldn't necessarily assume that it is risk-free.

A summer's night. Two friends have been sitting around a fire, discussing life and meaning late into the evening...

Riven: So in short, I feel like I'm being torn in two.

Rafael: Part of you is being pulled one direction and another part of you is being pulled another way.

Riven: That’s exactly what I’m feeling.

Rafael: Figuratively or literally?

Riven: What? No… what?!

Rafael: I’m serious

Riven: Literally??

Rafael: Yes.

Riven: Come now, can't you be serious for once? Whilst the bit may have worked for Socrates, I have to admit that... (read 1482 more words →)

An Easily Overlooked Post on the Automation of Wisdom and Philosophy

Chris_Leong

8mo

This week for Wise AI Wednesdays, I'll be sharing something a bit different: the announcement post of a competition that is already over (the AI Impacts Essay competition on the Automation of Wisdom and Philosophy). If you're wondering why I'm sharing it, even though some of the specific discussion of the competition is no longer relevant, I still believe this post contains a lot of great content and I think it would be a shame if everyone forgot about it just because it happened to be in the announcement post.

This post explains why they think this might be important, lists some potentially interesting research directions, and then finishes with an FAQ.... (read 213 more words →)

Potentially Useful Projects in Wise AI

Chris_Leong

8mo

This is a list of projects^[1] to consider for folks who want to use Wise AI to steer the world towards positive outcomes.

Some of these projects are listed because they're impactful. Others are listed because I believe they would be good projects for someone to get started.

Please note that this post is titled "potentially useful projects" for a reason. Some of these projects are likely to have much higher impact than others. Whilst I've really tried to avoid projects with net-negative EV, it's entirely possible that a few would be anyway^[2]. Please don't ignore your own judgment just because I've listed a project!

I'm sure my views about what projects are valuable will change... (read 1486 more words →)

QR Code for: Why the focus on wise AI advisors? (plus FAQ)

Short Link: https://shorturl.at/idQt9

Reflections on AI Wisdom, plus announcing Wise AI Wednesdays

Chris_Leong

9mo

I recently finished leading an AI Safety Camp project on Wise AI Advisors^[1] (my team included Chris Cooper, Matt Hampton, and Richard Kroon). Since we want to share our work in an orderly fashion, I’m launching Wise AI Wednesdays. Each Wednesday^[2], I (or one of my teammates) will be sharing a post, initially drawn from our AI Safety Camp outputs, but later shifting to include future work and summaries of or commentary on related research. I’m hoping that a regular posting schedule will help cultivate Wise AI/Wise AI Advisors as a subfield of AI Safety.

This inaugural post provides an update on how my views on AI and wisdom have changed since I won... (read 642 more words →)

There's been a lot of discussion about how Less Wrong is mostly just AI these days.

If that's something that folk want to address, I suspect that the best way to do this would be to run something like the Roots of Progress Blog-Building Intensive. My admittedly vague impression is that it seems to have been fairly successful.

Between Less Wrong for distribution, Lighthaven for a writing retreat and Less Online for networking, a lot of the key infrastructure is already there to run a really strong program if the Lightcone team ever decided to pursue this.

There was discussion about an FHI of the West before, but that seems hard given the current funding situation. I suspect that a program like this would be much more viable.

AI Safety & Entrepreneurship v1.0

Chris_Leong

10mo

Why did I create both a post and a wiki article?

Posts are best for making sure people see the initial version of the post, whilst Wiki articles are best for long-term maintenance. Posting an article that links to a Wiki page provides the best of both worlds.

For the most up-to-date version, see the Wiki page.

Articles:

There should be more AI safety organisations

Why does the AI Safety Community need help founding projects?

AI Assurance Tech Report

AI Safety as a YC Startup

Alignment can be the ‘clean energy’ of AI

AI Tools for Existential Security

Incubation Programs:

Def/acc at Entrepreneur First - this is a new program, which focuses on "defensive" tech, which will produce a wide variety of startups,... (read 301 more words →)

I guess orgs need to be more careful about who they hire as forecasting/evals researchers.

Sometimes things happen, but three people at the same org...

This is also a massive burning of the commons. It is valuable for forecasting/evals orgs to be able to hire people with a diversity of viewpoints in order to counter bias. It is valuable for folks to be able to share information freely with folks at such orgs without having to worry about them going off and doing something like this.

But this only works if those less worried about AI risks who join such a collaboration don't use the knowledge they gain to cash in on the AI boom... (read more)

Random thought: We should expect LLMs trained on user responses to have much more situational knowledge than early LLMs trained on the pre-chatbot internet, because users will occasionally make reference to the meta-context.

It may be possible to get some of this information from pre-training on chatlogs/excerpts that make their way onto the internet, but the information won't be quite as accessible because of differences in the context.

If this were a story, there'd be some kind of academy taking in humanity's top talent and skilling them up in alignment.

Most of the summer fellowships seem focused on finding talent that is immediately useful. And I can see how this is tempting given the vast numbers of experienced and talented folks seeking to enter the space. I'd even go so far as to suggest that the majority of our efforts should probably be focused on finding people who will be useful fairly quickly.

Nonetheless, it does seem as though there should be at least one program that aims to find the best talent (even if they aren't immediately useful) and which provides... (read more)

Collapsible boxes are amazing. You should consider using them in your posts.

They are a particularly nice way of providing a skippable aside. For example, filling in background information, answering an FAQ or including evidence to support an assertion.

Compared to footnotes, collapsible boxes are more prominent and are better suited to containing paragraphs or formatted text.

Chris_Leong 1y* Quick Take

Why the focus on wise AI advisors? (Metapost^[1] with FAQ 📚🙋🏻‍♂️, Informal^[2] Edition 🕺, Working Draft 🛠️)

About this Post - 🔗: 🕺wiseaiadvisors.com , 🕴️Formal Edition (coming soon)

This post is still in draft, so any feedback would be greatly appreciated 🙏. It'll be posted as a full, proper Less Wrong/EA Forum/Alignment forum post, as opposed to just a short-form, when it's ready 🌱🌿🌳.

✉️ PM via LW profile

📲 QR Code

This post is a collaboration between Chris Leong (primary author) and Christopher Clay (editor), written in the voice of Chris Leong.

We have worked very hard on this^[3] and we hope you find it to be of some use. Despite this work, it will most likely contain enough

... (read 8448 more words →)

Chris_Leong

Decoupling vs Contextualizing Norms

Notice When People Are Directionally Correct

Don't Dismiss Simple Alignment Approaches

On Destroying the World

Chris_Leong

Sydney AI Safety Fellowship 2026 (Priority deadline this Sunday)

Quotes on AI and wisdom

Towards Humanist Superintelligence

Selected Graphics Showing Progress towards AGI

Four Quotes on Transformative Technology

On actually taking expressions literally: tension as the key to meditation?

An Easily Overlooked Post on the Automation of Wisdom and Philosophy

Wise AI Wednesdays

Linguistic Freedom: Map and Territory Revisted

Investigations Into Infinity
