All of Andy E Williams's Comments + Replies

Thanks for the comment. Your response highlights a key issue in epistemology—how humans (and AI) can drift in their understanding of intelligence without realizing it. Any prescribed answer to a question can fail at the level of assumptions or anywhere along the reasoning chain. The only way to reliably ground reasoning in truth is to go beyond a single framework and examine all other relevant perspectives to confirm convergence on truth.

The real challenge is not just optimizing within a framework but ensuring that the framework itself is recursively exami... (read more)

The Real AI Alignment Failure Will Happen Before AGI—And We’re Ignoring It

Amazing writing in the story! Very captivating and engaging. This post raises an important concern—that AI misalignment might not happen all at once, but through a process of deception, power-seeking, and gradual loss of control. I agree that alignment is not a solved problem, and that scenarios like this deserve serious consideration.

But there is a deeper structural failure mode that may make AI takeover inevitable before AGI even emerges—one that I believe deserves equal attention:... (read more)

You're welcome. But which part are you thanking me for and hoping that I keep doing?

1Milan W
All of it. Thinking critically about AI outputs (and also human outputs), and taking mitigating measures to reduce the bullshit in both.

Thanks for your interest. Let me look it over and make whatever changes are required for it to be ready to go out. As for ChatGPT being agreeable, ChatGPT’s tendency toward coherence with existing knowledge (its prioritization of agreeableness) can be leveraged advantageously, as the conclusions it generates—when asked for an answer rather than being explicitly guided toward one—are derived from recombinations of information present in the literature. These conclusions are typically aligned with consensus-backed expert perspectives, reflecting what might be i... (read more)

Yes, I tried asking multiple times in different context windows, in different models, and with and without memory. And yes, I'm aware that ChatGPT prioritizes agreeableness in order to encourage user engagement. That's why I attempt to prove all of its claims wrong, even when they support my arguments.

1Milan W
Thank you for doing that, and please keep doing it. Maybe also run a post draft through another human before posting, though.

Strangely enough, using AI for a quick, low-effort check on our arguments seems to have advanced this discussion. I asked ChatGPT o1 Pro to assess whether our points cohere logically and are presented self-consistently. It concluded that persuading someone who insists on in-comment, fully testable proofs still hinges on their willingness to accept the format constraints of LessWrong and to consult external materials. Even with a more logically coherent, self-consistent presentation, we cannot guarantee a change of mind if the individual remains strict... (read more)

Thanks again for your interest. If there is a private messaging feature on this platform, please send your email so I might forward the “semantic backpropagation” algorithm I’ve developed, along with some case studies assessing its impact on collective outcomes. I do my best not to be attached to any idea, or to being right or wrong, so I welcome any criticism. My goal is simply to try to help solve the underlying problems of AI safety and alignment, particularly where the solutions can be generalized to apply to other existential challenges su... (read more)

2the gears to ascension
Your original sentence was better. I'll just ask Claude to respond to everything you've said so far: I generally find AIs are much more helpful for critiquing ideas than for generating them. Even here, you can see Claude was pretty wordy and significantly repeated what I'd already said.

Thanks very much for your engagement! I did use ChatGPT to help with readability, though I realize it can sometimes oversimplify or pare down novel reasoning in the process. There’s always a tradeoff between clarity and depth when conveying new or complex ideas. There’s a limit to how long a reader will persist without being convinced something is important, and that limit in turn constrains how much complexity we can reliably communicate. Beyond that threshold, the best way to convey a novel concept is to provide enough motivation for people to investigat... (read more)

3the gears to ascension
Would love to see a version of this post which does not involve ChatGPT whatsoever, only involves Claude to the degree necessary and never to choose a sequence of words that is included in the resulting text, is optimized to be specific and mathematical, and makes its points without hesitating to use LaTeX to actually get into the math. And expect the math to be scrutinized closely - I'm asking for math so that I and others here can learn from it to the degree it's valid, and pull on it to the degree it isn't. I'm interested in these topics and your post hasn't changed that interest, but it's a lot of words and I can't figure out if there's anything novel underneath the pile of marketing stuff. How would you make your entire point in 10 words? 50? 200?