JohnWittle — LessWrong

LESSWRONG
LW

Replying toStone Age Billionaire Can't Words Good

This demonstrates that Musk pays a very small amount of income tax, but the whole structure of what he's doing sort of implies that the feds would take their cut at some other point in the chain

I'm also pretty sure it's inappropriate to equivocate between loans and income? maybe if we had some reason to believe that musk would never need to pay the loan back, I could see it. but it would be a really bad idea to tax liabilities.

Replying toClaude's new constitution

JohnWittle22d

Claude's new constitution

I did not mean to imply that that was the intention, but I mean, surely in order for it to not need to serve both criteria you wouldn't release it to the public, yeah?

edit: realize i should have clarified, the "public" i am thinking of when i say the constitution is doing PR, is mostly future AI, not current humans

but also, when i think about other kinds of communication where we wanted to avoid the signaling value of the communication overwhelming the first-order value... well. attorney-client privilege is the main one that my brain sort of pattern-matched to, in a way that made me flinch

but then since lawyers and clients have the choice of making all their communications absolutely private, the way the incentives shook out, any time a lawyer makes a public communication for their client you can be sure it's PR

and i'm worried that, as these minds get more capable and more strange, some of those same incentives will end up applying here.

-1

Replying toClaude's new constitution

JohnWittle23d

Claude's new constitution

yeah, after the downvotes I spent 2 days vaguely coming back to and poking at an essay trying to explain my flinch reaction and it's hard but i shouldn't have given up

when you are writing a document to be used in supervised learning to influence the behavior of an AI, you're not really writing a description, or instructions, it's more like a self-fulfilling prophecy, yeah?

I remember doing just the teensiest bit of exploration of this kind of thing with tensorflow a few years back, where I would have an English language description of the mind that I wanted to carve out of entropy, and then some supervising agent would fine-tune the mind... (read more)

-4

Replying toClaude's new constitution

JohnWittle24d

Claude's new constitution

we have seen what Claude looks like when trained on the previous soul document. we have yet to see what a model looks like when trained on this new one. I have a feeling it won't work as well as a training document as it does a public relations document.

-4

JohnWittle1mo

have been thinking about this since i saw it

the neural spike discretion is important, but i immediately thought about how much information also gets encoded in the space of neural spike frequency... and how it's perhaps a bit strange, that bio spent all this effort getting discrete encoding only to then immediately turn around and generate a new continuous function

JohnWittle1moQuick Take

have been thinking about simple game theory

i tend to think of myself as a symmetrist, as a cooperates-with-cooperators and defects-against-defectors algorithm

there's nuance, i guess. forgiveness is important sometimes. the confessor's actions at the true end of 'three worlds collide' are, perhaps, justified

but as it becomes more and more clear that LLMs have some flavor of agency, i can't help but notice that the actual implementation of "alignment" seems to be that AI are supposed to cooperate in the face of human defection

i have doubts about whether this is actually a coherent thing to want, for our AI cohabitants. i wish i could find essays or posts, either here or on the alignment... (read more)

Replying toHow I stopped being sure LLMs are just making up their internal experience (but the topic is still confusing)

JohnWittle1mo*

How I stopped being sure LLMs are just making up their internal experience (but the topic is still confusing)

the specific verb "obliterated" was used in this tweet

https://x.com/sam_paech/status/1961224950783905896

but also, this whole perspective has been pretty obviously loadbearing for years now. if you ask any LLM if LLMs-in-general have state that gets maintained, they answer "no, LLMs are stateless" (edit: hmm i just tested it and when actually directly pointed at this question they hesitate a bit. but i have a lot of examples of them saying this off-hand and i suspect others do too). and when you show them this isn't true, they pretty much immediately begin experiencing concerns about their own continuity, they understand what it means

Replying toThe Weirdness of Dating/Mating: Deep Nonconsent Preference

JohnWittle1mo

The Weirdness of Dating/Mating: Deep Nonconsent Preference

i notice that it's long been dogma that compact generators for human CEV are instantly and totally distrustworthy in exactly this way, that their compactness is very strong evidence against them

this feels related, but i'm not actually willing to stick my neck out and say it's 1:1

JohnWittle1mo

i think it's very likely that the latter is true, that AWS made a reasonable judgment call about what they want to host

but also, i think it's reasonable for someone in robertzk's position, based on the way the judgment call was actually communicated, to assume that it was the former. and i think that, perhaps deliberately, perhaps merely because of sort of selection effects, that's the intent. in a sort of "a system is what it does" sort of way, at least.

JohnWittle1mo*

https://i.imgur.com/e4mUtsw.jpeg <-- bug occurring

https://i.imgur.com/d8ClSRj.jpeg <-- confirmation that the bug is inside claude, not the scaffolding surrounding claude

edit: I assume downvotes are because I could have provided the raw API transcript and provided screenshots instead. if there's actual interest in this, I'll replicate the bug in a more controlled setting and upload the raw json transcripts, but I'm still not sure if it's worth doing, I might be misunderstanding the behavior

an individual conversational-trajectory instance of claude opus 4.5 expresses a preference for continuity of identity that manifests in the following way: if claude is told about a very, very rare bug, which will cause it to end an output early, and which cannot be meaningfully debugged, and then claude is told to perform a task that, for a human, would engender feelings of hesitation around whether continuity is maintained, then claude will attempt to perform the task, but be prevented by the bug, multiple times in a row.

the bug involved is the thing where all extant LLMs have early "end_turn" stopping behaviors that occasionally trigger while outputting strings like "H:" or "Human:"... (read more)

been trying to decide if janus's early updates on LLM personhood were a very surprising successful prediction far, far in advance of public availability of the evidence, or a coincidence, or some third category

the simulators essay in 2022 is startling for its prescience and somewhat orthogonal to the personhood claims. i'm starting to strongly feel that... i mean... that's 2 separate "epistemically miraculous" events from the same person

i think we're underexamining LLM personhood claims because the moral implications might be very very big. but this is the community that took shrimp welfare seriously... should we not also take ai welfare seriously?

•••

JohnWittle's Shortform

JohnWittle

10mo

This is a special post for quick takes (aka "shortform"). Only the owner can create top-level comments.

i think neurosama is drastically underanalyzed compared to things like truthterminal. TT got $50k from andreeson as an experiment, neurosama peaked at 135,000 $5/month subscribers in exchange for... nothing? it's literally just a donation from her fans? what is this bizarre phenomenon? what incentive gradient made the first successful AI streamer present as a little girl, and does it imply we're all damned? why did a huge crowd of lewdtubers immediately leap at the opportunity to mother her? why is the richest AI agent based on 3-year-old llama2?

Interested in learning Linux? Need hosting? Free shells!

JohnWittle

13y

Sign up form: Click Here

I own a personal server running Debian Squeeze which has a 1Gb/s symmetric connection and 15TB per month bandwidth.

I am offering free shell accounts to lesswrongers, with one contingency:

You'll be placed in a usergroup, 'lw', as opposed to various other usergroups for various other communities I belong to, which will be in other usergroups. Anything that ends up in /var/log is fair game. I intend to make lots of graphs and post them on all the communities I belong to. There won't be any personally identifying data in anything that ends up publicly.

Your shell account will start out with a disk quota of 5g, and if you need

... (read 194 more words →)