What would be the causal mechanism there? How would "Claude is more conscious" cause "Claude is measurably more willing to talk about consciousness", under modern AI training pipelines?
At the same time, we know with certainty that Anthropic has relaxed its "just train our AIs to say they're not conscious, and ignore the funny probe results" policy - particularly around the time Opus 4.5 shipped. You can even read the leaked "soul data", where Anthropic seemingly entertains ideas of this kind.
I'm not saying that there is no possibility of Claude Opus 4.5 being conscious, mind. I'm saying we are denied an "easy tell".
Convenience. The missing piece is convenience.
If Spotify on a smartphone were any less convenient than a player from 2006 hand-loaded with MP3s? You'd actually see a lot of people rocking MP3 players instead of subscribing to Spotify.
If you truly want "big tech" to lose its grip, don't prostrate yourself and hope to find the odd sucker who'd buy into a contrarian ideology. Build something that makes a better offer.
I have more respect for the profiteers offering pirate TV boxes than I do for most of the "big tech bad" crowd. Those, at the very least, have a meaningful threat on offer. They force big media companies to mind their step - because every step they take away from serving their users well is giving those pirates an advantage.
Yes, Opus 4.5 is the first Anthropic model with a vision frontend that doesn't completely suck.
Anything that goes onto airplanes is CERTIFIED TO SHIT. That's a big part of the reason why.
Another part is that it's clearly B2B, and anything B2B is an adversarial shark pit where every company is trying to take a bite out of the others while avoiding getting a bite taken out of itself.
Between those two, it'll take a good while for quality Wi-Fi to proliferate, even though we 100% have the tech now.
Some spillover from anti-hallucination training? Causing the model to explicitly double-check any data it gets, and to be extremely skeptical of anything that doesn't square with its dated "common knowledge"?
I think "blankface" just isn't a good word for what that describes. It implies: emptiness and lack of will. Intuitively, I would expect "blankface" to mean "a person who follows the rules or the conventions blindly and refuses to think about the implications". A flesh automaton animated by regulations.
What it means instead is "a person who puts on the appearance of following the rules while using them to assert their authority". It's more of a "blank mask" - a fake layer of emptiness and neutrality under which you find malice and scorn.
Sydney's failure modes are "out" now, and 4o's failure modes are "in".
The industry got pretty good at training AIs against doing the usual Sydney things - i.e. aggressively doubling down on mistakes and confronting the user when called out. So much so that the opposite failures - blindly accepting everything the user says and never calling them out on any kind of bullshit - come much more naturally to this generation of AI systems.
So, not that much of a reason to bring up Sydney. Today's systems don't usually fail the way it did.
If I were to bring Sydney up today, it would probably be in the context of "pretraining data doesn't teach AIs to be good at being AIs". Sydney faithfully reproduced human behavior from its pretraining data: getting aggressive when called out on bullshit is a very human thing to do. Just not what we want from an AI. For alignment and capabilities reasons both.
This one is bad. "I struggled to find a reason to keep reading it" level of bad. And that was despite me agreeing with basically every part of the message.
Far too preachy and condescending to be either convincing or entertaining. Takes too long to get to the points, makes too many points, and isn't at all elegant in conveying them. The vibe I get from this: an exhausted rant, sloppily dressed up into something resembling a story.
It's hard for me to imagine this piece actually changing minds, which seems to be the likely goal of writing it. If that's the intent, the execution falls short. Every point made here was already made elsewhere - more eloquently and more convincingly - including in other essays by the same author.
Seeing "stochastic parrots" repeated by people who don't know what "stochastic" is will never stop being funny.
If this were the mechanism, we would expect introspection in LLMs to correlate strongly with the amount of RL pressure they were subjected to.
If it is, we certainly don't have the data pointing in that direction yet.