No.
The argument is that feelings, or valence more broadly, in humans require additional machinery (amygdala, hypothalamus, etc.). If that machinery is missing, the pain/fear/.../valence is missing, even though the sequence learning works just fine.
AI is missing this machinery, therefore it is extremely unlikely to experience pain/fear/.../valence.
It's probably just a difference in the tokenizer. Tokenizers often produce tokens with trailing whitespace. I actually once wrote a tokenizer and trained a model to predict a "negative whitespace" token for the rare cases where a token shouldn't get its trailing whitespace. But I don't know how current tokenizers handle this; probably in different ways.
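For illustration, a toy version of such a scheme (purely hypothetical, not how any particular production tokenizer works): tokens carry a trailing space by default, and a special <nospace> marker suppresses it where two tokens should be glued together.

```python
# Toy detokenizer: every token is followed by a space unless a <nospace>
# marker ("negative whitespace") says otherwise. Illustrative only.
NOSPACE = "<nospace>"

def detokenize(tokens: list[str]) -> str:
    out = []
    for i, tok in enumerate(tokens):
        if tok == NOSPACE:
            continue
        out.append(tok)
        # default: add a trailing space, unless the next token is the marker
        if i + 1 < len(tokens) and tokens[i + 1] != NOSPACE:
            out.append(" ")
    return "".join(out).rstrip()

print(detokenize(["Hello", NOSPACE, ",", "world", NOSPACE, "!"]))  # Hello, world!
```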
I originally thought that the METR results meant that this or next year might be the year when AI coding agents have their breakthrough moment. The reasoning was that if the trend holds, AI coding agents will be able to do tasks that take several hours with a reasonable probability of success, which would suddenly make the overhead and cost of using an agent economically worthwhile.
I now realise that this argument has a big hole: all the METR tasks are timed for unaided humans, i.e. humans without the help of LLMs. This means that especially fo...
I meant chess-specific reasoning.
I occasionally test LLMs by giving them a chess diagram and let them answer questions about the position ranging from very simple to requiring some calculation or insight.
Gemini 2.5 Pro also impressed me as the first LLM that could at least perceive the position correctly even if it quickly went off the rails as soon as some reasoning was required.
In contrast to manufacturing, I expect this to get a lot better as soon as any of the labs makes an effort.
Let's instead assume a top engineer has a really consequential idea every couple of months. Now what?
Speeding up implementation just means that you test more of the less promising ideas.
Speeding up feedback might mean that you can home in on the really good ideas faster, but does this actually happen if you don't do the coding and don't do the math?
Do you plan to evaluate new models in the same way and regularly update the graph?
Yes, you are right. I overstated my case somewhat for these simple scenarios. There were also earlier results in that direction.
But in your work there probably already is an "unsafe code" activation, and the fine-tuning only sets it to a permanent "on". The model already had the ability to state "the unsafe code activation is on" before the fine-tuning, so maybe that result isn't very surprising?
There probably isn't an equally simple "discriminate in favour of Canadians" activation, though I could imagine more powerful models also getting that right.
My examples are orders of magnitude harder and, I think, point to a fundamental limitation of transformers as they are currently trained.
I find this possible though it's not my median scenario to say the least. But I am also not sure I can put the probability of such a fast development below 10%.
Main cruxes:
I am not so sure that "automating AI research" is going to speed up development by orders of magnitude.
My experience is that cracked AI engineers can implement any new paper or well-specified research idea in a matter of hours. So speeding up the coding can't be the huge speedup to R&D.
The bottleneck seems to be:
A.) Coming up with good research ideas.
B.) Finding the ...
This but unironically.
To answer my own question: They usually don't. Models don't have "conscious access" to the skills and knowledge implicit in their sequence prediction abilities.
If you train a model on text and on videos, it lacks any ability to talk sensibly about the videos. To gain that ability, it also needs to be trained on data that bridges these modalities.
If things were otherwise, we would be a lot closer to AGI. Gemini would have been a step change. We would be able to gain significant insights into all kinds of data by training an LLM on it.
Therefore it is not surprising that models don't say what they learn. They don't know what they learn.
I was pretty impressed with o1-preview's ability to do mathematical derivations. That was definitely a step change, the reasoning models can do things earlier models just couldn't do. I don't think the AI labs are cheating for any reasonable definition of cheating.
Do models know what they learn?
A few years ago I had a similar idea, which I called Rawlsian Reinforcement Learning: the idea was to provide scenarios similar to those in this post and evaluate the model's actions in terms of how much each person benefits from them, then reinforce based on the mean benefit of all characters in the scenario, or a variation thereof; i.e. the reinforcement signal does not use the information about which character in the scenario is the model.
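A minimal sketch of that reward signal (names and structure are illustrative, not from any existing codebase): the scenario lists how much each character benefits from the model's action, and the reinforcement signal is the mean benefit, deliberately ignoring which character is the model.

```python
# Illustrative "Rawlsian RL" reward: average benefit over all characters,
# without using the model's own identity in the scenario.
from statistics import mean

def rawlsian_reward(benefits_per_character: dict[str, float]) -> float:
    """Mean benefit over all characters; which character is the model is not used."""
    # A more literally Rawlsian variation would use min(...) instead of mean(...).
    return mean(benefits_per_character.values())

print(rawlsian_reward({"alice": 1.0, "bob": -0.5, "model": 0.2}))  # ≈ 0.233
```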
Maybe I misunderstand your method, but it seems to me that you untrain the self-other distinction, which is, in the end, a capability. So the model might not become more moral; instead it just loses the capacity to benefit itself because it cannot distinguish between itself and others.
I kinda agree with this as well. Except that it seems completely unclear to me whether recreating the missing human capabilities/brain systems takes two years or two decades or even longer.
It doesn't seem to me to be a single missing thing, and for each separate step the same holds: that it hasn't been done yet is evidence that it's not that easy.
I think that is exactly right.
I also wouldn't be too surprised if in some domains RL leads to useful agents if all the individual actions are known to and doable by the model and RL teaches it how to sensibly string these actions together. This doesn't seem too different from mathematical derivations.
If you think generalization is limited in the current regime, try to create AGI benchmarks that the AIs won't saturate until we reach some crucial innovation. People keep trying this and they keep saturating every year.
Because these benchmarks are all in the LLM paradigm: Single input, single output from a single distribution. Or they are multi-step problems on rails. Easy verification makes for benchmarks that can quickly be cracked by LLMs. Hard verification makes for benchmarks that aren't used.
One could let models play new board/computer games against ...
If you only execute repeat offenders, the fraction of "completely" innocent people executed goes way down (rough numbers sketched below).
The idea of being in the wrong place at the wrong time and then being executed gives me pause.
The idea of being framed for shoplifting, framed for shoplifting again, wrongfully convicted of a violent crime and then being in the wrong place at the wrong time is ridiculous.
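A back-of-envelope version of that argument, under a strong (and debatable) independence assumption and a purely hypothetical wrongful-conviction rate:

```python
# Assume each conviction on a record is wrongful with probability p,
# independently of the others. p is a made-up placeholder, not a real statistic.
p = 0.04

for n_convictions in (1, 2, 3, 4):
    completely_innocent = p ** n_convictions
    print(f"{n_convictions} conviction(s): P(all of them wrongful) ≈ {completely_innocent:.2e}")
```

Requiring several prior convictions drives the probability of "completely innocent" down geometrically, which is all the claim above needs.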
Do you have a reference for the personality trait gene-gene interaction thing? Or maybe an explanation how that was determined?
I think this inability to "learn while thinking" might be the key missing thing in LLMs, and I am not sure "thought assessment" or "sequential reasoning" are not red herrings compared to this. What good is assessment of thoughts if you are fundamentally limited in changing them? Also, reasoning models seem to do sequential reasoning just fine as long as they have already learned all the necessary concepts.
But the historical difficulty of RL is based on models starting from scratch. It is unclear whether moulding a model that already knows how to do all the steps into actually doing all the steps is anywhere near as difficult as using RL to also learn how to do all the steps.
10% seems like a lot.
Also, I worry a bit about being too variable in the number of reps and in how to add weight. I found that I easily fall into doing the minimal version - "just getting it done for today". Then improvement stalls and motivation drops.
I think part of the appeal of "Starting Strength" (which I started recently) is that it's very strict. Unfortunately, if adding 15 kilos a week for three weeks to my squat is not going to kill me, drinking a gallon of milk a day will.
Which is to say, I appreciate your post for giving me more building blocks for a workout that works out for me.
I think AlexNet wasn't even the first to win computer vision competitions based on GPU acceleration, but that was definitely the step that jump-started Deep Learning around 2011/2012.
To me it rather seems like agency and intelligence are not very intertwined. Intelligence is the ability to create precise models - this does not imply that you use these models well, or in a goal-directed fashion at all.
That we have now started down the path of RLing the models to make them pursue the goal of solving math and coding problems in a more directed and effec...
Apparently[1] enthusiasm didn't really ramp up again until 2012, when AlexNet proved shockingly effective at image classification.
I think after the backpropagation paper was published in the eighties, enthusiasm did ramp up a lot, which led to a lot of important work in the nineties, like (mature) CNNs, LSTMs, etc.
Could you say a bit about progression?
ELO is the Electric Light Orchestra. The Elo rating is named after Prof. Arpad Elo.
I considered the idea of representing players via vectors in different contexts (chess, soccer, MMA) and also worked a bit on splitting the evaluation of moves into "quality" and "risk taking", with the idea of quantifying aggression in chess.
My impression is that the single scalar rating works really well in chess, so I'm not sure how much there is beyond that. However, some simple experiments in that direction wouldn't be too difficult to set up (the scalar baseline is sketched below for reference).
Also, I th...
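A minimal sketch of the standard scalar Elo update that a vector representation would generalize (the K-factor is chosen arbitrarily):

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 20.0):
    """Standard Elo: score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    r_a_new = r_a + k * (score_a - expected_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a_new, r_b_new

print(elo_update(1700.0, 1500.0, 1.0))  # favourite wins: small rating transfer
print(elo_update(1700.0, 1500.0, 0.0))  # upset: much larger transfer
```

A vector version would replace the single rating difference with something like a dot product between player vectors, which is where extra dimensions such as "quality" and "risk taking" could live.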
My bear case for Nvidia goes like this:
I see three non-exclusive scenarios in which Nvidia stops playing the important role in AI training and inference that it has played for the past 10 years:
A very detailed and technical analysis of the bear case for Nvidia by Jeffrey Emanuel, that Matt Levine claims may have been responsible for the Nvidia price decline.
I read that last week. It was an interesting case of experiencing Gell-Mann Amnesia several times within the same article.
All the parts where I have some expertise were vague, used terminology incorrectly and were often just wrong. All the rest was very interesting!
If this article crashed the market: EMH RIP.
I would hesitate to buy a build based on R1. R1 is special in the sense that its MoE architecture trades off compute requirements against RAM requirements, which is why these CPU builds now start to make some sense - you get a lot less compute, but much more RAM (rough numbers below).
As soon as the next dense model drops, which could have 5 times fewer parameters for the same performance, the build will stop making any sense. And of course until then you are also handicapped when it comes to running smaller models fast.
The sweet spot is integrated RAM/VRAM like in a Mac or in the upcoming NVIDIA DIGITS. But buying a handful of used 3090s probably also makes more sense to me than the CPU-only builds.
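The rough numbers behind the trade-off, assuming the publicly reported figures for R1 of roughly 671B total and 37B active parameters per token (treat these as illustrative placeholders):

```python
# MoE back-of-envelope: RAM scales with total parameters, per-token compute
# with active parameters. Numbers are approximate/illustrative.
total_params    = 671e9   # ~671B parameters held in memory
active_params   = 37e9    # ~37B parameters actually used per token
bytes_per_param = 1       # assuming 8-bit weights

ram_needed_gb   = total_params * bytes_per_param / 1e9
flops_per_token = 2 * active_params  # ~2 FLOPs per active parameter per token

print(f"weights in memory : ~{ram_needed_gb:.0f} GB")
print(f"FLOPs per token   : ~{flops_per_token:.1e}")
```

So you need a huge amount of (cheap) memory but comparatively little compute per token, which is exactly the profile a big-RAM CPU box serves; a dense model with similar per-token compute would need only a small fraction of that memory.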
So how could I have thought that faster might actually be a sensible training trick for reasoning models.
You are skipping over a very important component: Evaluation.
Which is exactly what we don't know how to do well enough outside of formally verifiable domains like math and code, which is exactly where o1 shows big performance jumps.
There was one comment on Twitter that the RLHF-finetuned models also still have the ability to play chess pretty well; it's just that their input/output formatting made it impossible for them to access this ability (or something along these lines). But apparently it can be recovered with a little finetuning.
The paper seems to be about scaling laws for a static dataset as well?
Similar to the initial study of scale in LLMs, we focus on the effect of scaling on a generative pre-training loss (rather than on downstream agent performance, or reward- or representation-centric objectives), in the infinite data regime, on a fixed offline dataset.
To learn to act you'd need to do reinforcement learning, which is massively less data-efficient than the current self-supervised training.
More generally: I think almost everyone thinks that you'd need to scale the right...
Related: https://en.wikipedia.org/wiki/Secretary_problem
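The optimal rule there - observe the first ~n/e candidates without choosing, then take the first one better than everything seen so far - succeeds with probability about 1/e. A quick simulation (illustrative only):

```python
import random

def trial(n: int) -> bool:
    """One run of the 1/e stopping rule on a random permutation; rank 0 is the best candidate."""
    ranks = list(range(n))
    random.shuffle(ranks)
    k = round(n / 2.718281828)           # observation phase: look, don't choose
    best_seen = min(ranks[:k])
    for r in ranks[k:]:
        if r < best_seen:                # first candidate better than all observed
            return r == 0
    return ranks[-1] == 0                # forced to take the last one

n, trials = 100, 20_000
wins = sum(trial(n) for _ in range(trials))
print(f"success rate ≈ {wins / trials:.3f} (theory: 1/e ≈ 0.368)")
```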
The interesting thing is that scaling parameters (next big frontier models) and scaling data (small very good models) seem to be hitting a wall simultaneously. Small models now seem to get so much data crammed into them that quantisation becomes more and more lossy. So we seem to be reaching a frontier of performance per parameter-bit as well.
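For readers unfamiliar with what "lossy" means here, a minimal illustration of the round-trip error that simple 8-bit quantisation introduces (a Gaussian stand-in for a weight tensor; this shows the mechanism, not the data-dependence claimed above):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)   # stand-in for a weight tensor

scale = np.abs(w).max() / 127.0                  # symmetric int8 scale
w_q   = np.round(w / scale).astype(np.int8)      # quantize
w_hat = w_q.astype(np.float32) * scale           # dequantize

rmse = float(np.sqrt(np.mean((w - w_hat) ** 2)))
print(f"RMS reconstruction error: {rmse:.5f} (scale = {scale:.5f})")
```

The claim above is that as small models are trained harder, their weights carry more information per bit, so this kind of reconstruction error starts to hurt performance noticeably.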
I think the evidence mostly points towards 3+4.
But if 3 is due to 1, it would have bigger implications for 6 and probably also 5.
And there must be a whole bunch of people out there who know whether the curves bend.
It's funny how in the OP I agree with master morality and in your take I agree with slave morality. Maybe I value kindness because I don't think anybody is obligated to be kind?
Anyways, good job confusing the matter further, you two.
I actually originally thought about filtering with a weaker model, but that would run into the argument: "So you adversarially filtered the puzzles for those that transformers are bad at, and now you've shown that bigger transformers are also bad at them."
I think we don't disagree too much, because you are too damn careful ... ;-)
You only talk about "look-ahead" and you see this as being on a spectrum from algorithm to pattern recognition.
I intentionally talked about "search" because it implies more deliberate "going through possible outcomes". I mostly argue about t...
I know, but I think Ia3orn said that the reasoning traces are hidden and only a summary is shown. And I haven't seen any information on a "thought-trace-condenser" anywhere.
There is a thought-trace-condenser?
Ok, then the high-level nature of some of these entries makes more sense.
Edit: Do you have a source for that?
No, I don't - but the thoughts are not hidden. You can expand them under "Gedanken zu 6 Sekunden" ("Thoughts for 6 seconds").
Which then looks like this:
I played a game of chess against o1-preview.
It seems to have a bug where it uses German (possibly because of my payment details) for its hidden thoughts without really knowing the language very well.
The hidden thoughts contain a ton of nonsense, typos and ungrammatical phrases. A bit of English and even French is mixed in. They read like the output of a pretty small open source model that has not seen much German or chess.
Playing badly too.
Because I just stumbled upon this article, here is Melanie Mitchell's version of this point:
To me, this is reminiscent of the comparison between computer and human chess players. Computer players get a lot of their ability from the amount of look-ahead search they can do, applying their brute-force computational powers, whereas good human chess players actually don’t do that much search, but rather use their capacity for abstraction to understand the kind of board position they’re faced with and to plan what move to make.
The better one is at abstraction, the less search one has to do.
The point I was trying to make certainly wasn't that current search implementations necessarily look at every possibility. I am aware that they are heavily optimised; I have implemented alpha-beta pruning myself (a minimal sketch is below).
My point is that humans use structure that is specific to a problem and potentially new and unique to narrow down the search space. None of what currently exists in search pruning compares even remotely.
Which is why all these systems use orders of magnitude more search than humans (even the ones with alpha-beta pruning). And this is also why all these systems are narrow enough that you can exploit the structure that is always there to optimise the search.
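For concreteness, a minimal negamax search with alpha-beta pruning; `children(pos)` and `evaluate(pos)` are placeholders for a real move generator and evaluation function:

```python
# Negamax with alpha-beta pruning: generic pruning that cuts branches which
# provably cannot change the result, but knows nothing problem-specific.
def alphabeta(pos, depth, alpha, beta, children, evaluate):
    kids = children(pos)
    if depth == 0 or not kids:
        return evaluate(pos)
    value = float("-inf")
    for child in kids:
        value = max(value, -alphabeta(child, depth - 1, -beta, -alpha, children, evaluate))
        alpha = max(alpha, value)
        if alpha >= beta:   # remaining siblings cannot affect the result: prune
            break
    return value
```

Even with this kind of pruning, the cut-offs come from bounds on the value, not from the kind of problem-specific structure the comment above is about.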
...No one really knew why tokamaks were able to achieve such impressive results. The Soviets didn’t progress by building out detailed theory, but by simply following what seemed to work without understanding why. Rather than a detailed model of the underlying behavior of the plasma, progress on fusion began to take place by the application of “scaling laws,” empirical relationships between the size and shape of a tokamak and various measures of performance. Larger tokamaks performed better: the larger the tokamak, the larger the cloud of plasma, and the longe...
Case in point: this is a five-year-old t-SNE plot of word vectors on my laptop.
I don't get what role the "gaps" are playing in this.
Why is it important for what a tool is that it is for a gap and not just any subproblem? Isn't a subproblem for which we have a tool never a gap?
Or maybe the other way around: Aren't subproblem classes that we are not willing to leave as gaps those we create tools for?
If I didn't know about screwdrivers I probably wouldn't say "well, I'll just figure out how to remove this very securely fastened metal thing from the other metal thing when I come to it".
Which is exactly what I am doing in the post? By saying that the question of consciousness is a red herring aka not that relevant to the question of personhood?