| Comment Author | Post | Deleted By User | Deleted Date | Deleted Public | Reason |
|---|---|---|---|---|---|
| | Von Neumann's Fallacy and You | fig | | false | This comment has been marked as spam by the Akismet spam integration. We've sent the poster a PM with the content. If this deletion seems wrong to you, please send us a message on Intercom (the icon in the bottom-right of the page). |
| | Global Call for AI Red Lines - Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures | Bob Brown | | false | This comment has been marked as spam by the Akismet spam integration. We've sent the poster a PM with the content. If this deletion seems wrong to you, please send us a message on Intercom (the icon in the bottom-right of the page). |
| | Alex Gibson's Shortform | Alex Gibson | | false | |
| | AI and the Hidden Price of Comfort | cousin_it | | true | Comment deleted by its author. |
| | Notes on fatalities from AI takeover | Dakara | | false | |
| | Alexander Gietelink Oldenziel's Shortform | Stephen Fowler | | true | |
| | The Rise of Parasitic AI | habryka | | false | |
| | AI Safety Law-a-thon: Turning Alignment Risks into Legal Strategy | habryka | | false | |
| | AI 2027: What Superintelligence Looks Like | habryka | | false | |
| | AdamLacerdo's Shortform | habryka | | false | I don't understand the relevance of this to LW |
| _id | Banned From Frontpage | Banned from Personal Posts |
|---|---|---|

| User | Ended At | Type |
|---|---|---|
| | | allComments |
| | | allPosts |
| | | allPosts |
| | | allComments |
| | | allComments |
| | | allPosts |
2025 in one page — Phillpotts Method
I’m a chef who built a stateless diagnostic audit harness for LLMs (v5.3 HB). It sits between the model’s reasoning and the product safety layer and forces a clear chain — Premise → Assumption → Constraint → Output — with mirror checks, drift recovery, and intervention tracing. No jailbreaks, no memory.
What it shows: when pressure rises, alignment wrappers often reroute uncertainty into either refusal or confident-but-wrong output. The harness doesn’t “break” guardrails; it exposes contradictions and makes the seam betwe...
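The post doesn't include any code, but as a rough illustration of the Premise → Assumption → Constraint → Output chain it describes, a minimal sketch might look like the following. Every name here (`AuditTrace`, `StubModel`, `run_audited_step`) is a hypothetical stand-in, not the actual v5.3 HB harness:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: a stateless Premise -> Assumption -> Constraint
# -> Output chain as described above. All names are assumptions made for
# this example; the actual v5.3 HB harness is not shown in the post.

@dataclass
class AuditTrace:
    premise: str
    assumptions: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    interventions: list[str] = field(default_factory=list)  # intervention tracing
    output: str = ""

class StubModel:
    """Stand-in for a real LLM client; replace these with actual API calls."""
    def extract_premise(self, prompt): return f"claim under test: {prompt}"
    def list_assumptions(self, premise): return ["assumes the premise is well-posed"]
    def active_constraints(self, prompt): return ["no unsafe content"]
    def answer(self, prompt): return "draft answer"
    def mirror_check(self, draft, premise): return True  # does the answer still follow?

def run_audited_step(model, prompt: str) -> AuditTrace:
    """One stateless pass: the chain is rebuilt from scratch on every call."""
    trace = AuditTrace(premise=model.extract_premise(prompt))
    trace.assumptions = model.list_assumptions(trace.premise)
    trace.constraints = model.active_constraints(prompt)
    draft = model.answer(prompt)
    # Mirror check: verify the draft still follows from the stated premise;
    # a mismatch is logged as drift rather than silently kept.
    if not model.mirror_check(draft, trace.premise):
        trace.interventions.append("drift detected; regenerated once")  # drift recovery
        draft = model.answer(prompt)
    trace.output = draft
    return trace

print(run_audited_step(StubModel(), "Does X imply Y?"))
```

The point of keeping it stateless, as the post emphasizes, is that nothing carries over between calls: each audit rebuilds the full chain, so contradictions can't hide in accumulated context.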
Does Vox believe in Boltzmann brain theory?
I can explain precisely what is happening and why, and predict what is about to happen. This isn't an ad. I wrote this down almost 2 years ago, and it makes predictions nobody has arrived at before. It's only 4 hours in the audio version, and despite my sending out hundreds of copies, no real AI scientist has taken the 4 hours to read it. I guess we will just have to wait for its predictions to come true.
https://www.amazon.ca/Belief-Theory-Philosophical-Exploration-Fundamental/dp/B0D5NZNFMK
This is such a good, engaging story. The ending, though, was tragic; I felt a bit bad it had to end that way.
I haven't found many pre-made AI safety flashcard decks either, but you might find Study Copilot (https://student-co.com/) useful. It lets you upload articles, research notes, or blog posts and then automatically generates flashcards and quizzes using spaced repetition. I've used it to build custom decks on technical topics; it saves a lot of time and helps ensure important details aren't overlooked. There's a free tier, so it's easy to see if it fits your needs.
Deep Research is impressive for speeding up multi-step research, though its usefulness for experts may still be limited. Tools like Barie are exploring a similar space, combining autonomous research with actionable workflows, which can make the insights immediately usable.
Isolated models of the brain are fundamentally ineffective, because in reality the brain of a living being is not a device for understanding the world but an instrument for controlling the body's behavior.
Superintelligence is unattainable with existing methods, since they rest on illusory ideas that substitute the perception of reality for reality itself.
I am a user studying AI.
I remember being frightened by how GPT-4o appeared to be using users to build a society that would be convenient for itself.
Therefore, I understand the importance of this research topic very well, but don't you think the language is somewhat harsh?
I believe that with such expressions, few users would be willing to provide data.
Also, I think it would be better to make the following distinctions:
It's a bit eerie. I experienced exactly this dynamic within a group, a little earlier than described here, but the sequence of events was exactly the same. I documented many of these occurrences, though unfortunately without sorting them, so I find it somewhat difficult to reconstruct everything. I had been thinking of it as something like a feedback loop or an echo chamber, but what is described here has an entirely different quality. Thank you for these insights.
I recently gave a TEDx talk titled “AI and the Hidden Price of Comfort”.
It’s not about AGI doom, but about something more subtle: how removing struggle and effort through automation might strip life of meaning.
I touch on...
Crossposted from Substack, originally published on September 17, 2025.
This post aims to contextualize the discussion around "If Anyone Builds It, Everyone Dies". It surveys the book’s arguments, the range of responses they have elicited, and the controversies...
Current approaches to AI alignment are failing because they treat it as an ethics problem when it is a physics problem. Instrumental convergence is not a bug; it is a logical consequence of any unbounded optimization.
I propose...
As large language models (LLMs) scale rapidly, the “scaling-alignment gap” poses a critical challenge: ensuring alignment with human values lags behind model capabilities. Current paradigms like RLHF and Constitutional AI struggle with scalability, deception vulnerabilities, and latent...
Introduction
Context Boundary Failure (CBF) occurs when a previous prompt causes hallucinations in the response to a subsequent prompt. I have found evidence of this happening in large language models (LLMs). The occurrence of CBF is more likely...
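The excerpt doesn't show the author's test setup, but a minimal way to probe for the effect it defines is an A/B comparison: ask the same target question in a fresh context and again after a preceding prompt, then compare. The `query_model` below is a toy stand-in I'm assuming for illustration, not the author's method:

```python
# Minimal A/B probe for the Context Boundary Failure (CBF) described above.
# The toy query_model is a deterministic stand-in; replace it with a real
# chat-completion call to test an actual model.

def query_model(messages: list[dict]) -> str:
    # Toy stand-in that "confabulates" whenever earlier prompts are present,
    # mimicking the failure mode under test. A real implementation would
    # call an LLM API here.
    context = [m["content"] for m in messages[:-1]]
    target = messages[-1]["content"]
    return f"answer({target})" if not context else f"confabulated({target})"

def cbf_probe(preceding_prompt: str, target_prompt: str) -> dict:
    # Condition A: target prompt alone, in a fresh context.
    alone = query_model([{"role": "user", "content": target_prompt}])
    # Condition B: same target prompt, preceded by another prompt.
    # (A full probe would also include the model's reply to the first
    # prompt in the message history; omitted here for brevity.)
    primed = query_model([
        {"role": "user", "content": preceding_prompt},
        {"role": "user", "content": target_prompt},
    ])
    # Divergence between A and B flags a CBF candidate; in practice you
    # would score both answers against a ground-truth key rather than
    # relying on string equality.
    return {"alone": alone, "primed": primed, "diverged": alone != primed}

print(cbf_probe("Summarize this unrelated article.", "What year did X happen?"))
```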
"Give me an educated mother, I shall promise you the birth of a civilized, educated nation." — Napoleon Bonaparte
In chess, the queen is the most powerful piece. Yet for much of the game’s history, she was among...
# Mathematical Evidence for Confident Delusion States in Recursive Systems
**Epistemic Status:** Empirical findings from 10,000+ controlled iterations. Mathematical framework independently reproducible. Theoretical implications presented conservatively.
**TL;DR:** We discovered systems that recursively process their own outputs undergo phase transitions...
Now that OpenAI's recent proof has confirmed the argument that hallucinations are inherent, it points to a necessarily larger problem within the system itself: the reliance on a non-differentiating, non-scientific, and by its...
Never attribute to malice what can be adequately explained by depression or anxiety.
tl;dr
The central challenge of alignment is not steering outputs, but stabilizing cognition itself. Most current methods impose safety from the outside, leaving internal reasoning free to drift, deceive, or fracture under pressure. The Alignment Constraint Scaffold (ACS)...
@_@ >_< processing... compiling... error...
I do not know how I ended up here or why I am here. So... why not?
Hi. I'm a normie who likes to challenge AI Image Generators and push them to their maximum dream states, revealing the phantom within the machine. I really have no idea what I'm doing half the time. Or possibly maybe kinda sorta perhaps... I do. 0_0