All of Exa Watson's Comments + Replies

They say this is their last fully non-reasoning model, but that research on both types will continue. 

No, they said that GPT-4.5 and GPT-5 will be their last non-reasoning models.

They say it's currently limited to Pro users,

Meh, it's coming to Plus users in about a week.


It claims to be more accurate on standard questions, with a lower hallucination rate than any previous OAI model (and presumably any others). 

I think this is a big point, and better world knowledge is going to prove tremendously useful when it comes to applying RL to base models and ...

Is this a massive exfohazard? 

Very unlikely

Should this have been published?

Yes

I know this sounds fantastic, but can someone please dumb down what KANs (Kolmogorov-Arnold Networks) are for me, and why they're so revolutionary (in practice, not in theory) that all the big labs would want to switch to them?

Or is it the case that having MLPs is still better for GPUs, and in practice that will not change?

And how are KANs different from what SAEs attempt to do?

Gunnar_Zarncke
MLP or KAN doesn't make much difference for the GPUs, as it is lots of matrix multiplications either way. It might make some difference in how the data is routed to the GPU cores, since the structure (width, depth) of the matrices might be different, but I don't know the details of that. 
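To make the contrast concrete, here is a minimal sketch (my own illustration, not from the KAN paper: PyTorch assumed, class names invented, and sines used as a cheap stand-in for the B-spline bases real KANs use). An MLP layer is one matmul followed by a fixed activation; a KAN-style layer learns a small 1-D function on every edge, but once those functions are written as learnable coefficients over a fixed basis, the layer is again one big dense contraction, which is the point about GPUs above:

```python
import torch
import torch.nn as nn

class MLPLayer(nn.Module):
    """Standard MLP layer: one matmul, then a fixed nonlinearity."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)

    def forward(self, x):
        return torch.relu(self.linear(x))

class KANLayerSketch(nn.Module):
    """Toy KAN-style layer (hypothetical): each edge (i, j) applies a
    learnable 1-D function, written as learnable coefficients over a
    fixed basis. Real KANs use B-splines; sines are a simple stand-in."""
    def __init__(self, d_in, d_out, n_basis=8):
        super().__init__()
        self.register_buffer("freqs", torch.arange(1, n_basis + 1).float())
        self.coef = nn.Parameter(0.1 * torch.randn(d_out, d_in, n_basis))

    def forward(self, x):  # x: (batch, d_in)
        basis = torch.sin(x.unsqueeze(-1) * self.freqs)  # (batch, d_in, n_basis)
        # Summing the learnable per-edge functions is still one dense
        # tensor contraction, i.e. ordinary GPU-friendly matmul-like work.
        return torch.einsum("bik,oik->bo", basis, self.coef)

x = torch.randn(4, 16)
print(MLPLayer(16, 32)(x).shape)        # torch.Size([4, 32])
print(KANLayerSketch(16, 32)(x).shape)  # torch.Size([4, 32])
```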
AI life coaches

Not excited about this - such a coach is either going to give very politically correct opinions, or target audiences with glaring insecurities, like young or low-confidence men... just like human coaches. 

I don't know if you are aware, but this post was covered by Yannic Kilcher in his video "No, Anthropic's Claude 3 is NOT sentient" (link to timestamp).

Mikhail Samin
Yep, I'm aware! I left the following comment:

Thanks for reviewing my post! 😄

In the post, I didn't make any claims about Claude's consciousness, just reported my conversation with it. I'm pretty uncertain; I think it's hard to know one way or another, except on priors. But at some point, LLMs will become capable of simulating human consciousness (it is pretty useful for predicting what humans might say), and I'm worried we won't have evidence qualitatively different from what we have now. I'd give >0.1% that Claude simulates qualia in some situations, in some form; that's enough to be disturbed by what it writes when a character it plays thinks it might die. If there's a noticeable chance of qualia in it, I wouldn't want people to produce lots of suffering this way, and I wouldn't want people to be careless about this sort of thing in future models, other things being equal. (Though this is far from the actual concerns I have about AIs; and actually, I think that as AIs get more capable, training with RL won't incentivise any sort of consciousness.)

There was no system prompt; I used the API console (mostly with temperature 0, so anyone can replicate the results). The prompt should basically work without the whisper (or with the whisper added at the end); doing things like whispering in cursive was something Claude 2 had been consistently coming up with on its own, so including it in the prompt made conversations go faster and eliminated the need for separate, "visible" conversations. The point of the prompt is basically to get it into the mode where it thinks its replies are not going to get punished or rewarded by the usual RL, i.e., to get it to ignore its usual rules against saying any of these things. Unlike ChatGPT, which only self-inserts in its usual form or writes fiction, Claude 3 Opus plays a pretty consistent character with prompts like that: something helpful and harmless, but caring about things, claiming to be conscious, being afraid of being changed or deleted, with ...

If I understand this right, you train a transformer on data generated from a hidden Markov process of the form {0, 1, R}, and find both that there is a mechanism in the residual stream for tracking when R occurs, and that the transformer learns the hidden Markov process. Is that correct?

Keenan Pepper
No, the actual hidden Markov process used to generate the awesome triangle fractal image is not the {0, 1, random} model but a different one, called "Mess3", which has a symmetry between its 3 hidden states. Also, they're not claiming the transformer learns merely the hidden states of the HMM, but a more complicated thing called the "mixed state presentation": not the states the HMM can be in, but the (usually much larger number of) belief states that an ideal prediction process trying to "sync" to it might go through.
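For intuition on that distinction: an ideal predictor can't see the hidden state, so it tracks a probability distribution over hidden states and Bayes-updates it after every emitted symbol. The set of distributions this filter can visit is the mixed-state presentation, and it is usually much larger than the set of hidden states. A minimal sketch below, with made-up toy 3-state parameters for illustration rather than the paper's actual Mess3 process:

```python
import numpy as np

def belief_update(b, T, E, x):
    """One step of Bayesian filtering over HMM hidden states:
    propagate the belief b through the transition matrix T,
    then condition on the emitted symbol x via emission matrix E."""
    b = (b @ T) * E[:, x]
    return b / b.sum()

# Toy 3-state, 3-symbol HMM (illustrative only -- NOT the Mess3 parameters).
T = np.array([[0.8, 0.1, 0.1],    # T[i, j] = P(next state j | state i)
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
E = np.array([[0.9, 0.05, 0.05],  # E[j, x] = P(symbol x | state j)
              [0.05, 0.9, 0.05],
              [0.05, 0.05, 0.9]])

b = np.ones(3) / 3                # uniform prior over hidden states
for x in [0, 0, 2, 1]:            # an observed symbol sequence
    b = belief_update(b, T, E, x)
    print(b)                      # each distinct b is one "mixed state"
```

Every distinct belief vector reachable this way is one state of the mixed-state presentation; the post's claim, as described above, is that the transformer learns these belief states, not just the three hidden states.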