Leon Lang

I'm a final-year PhD student at the University of Amsterdam working on AI Safety and Alignment, specifically on safety risks of Reinforcement Learning from Human Feedback (RLHF). Previously, I also worked on abstract multivariate information theory and equivariant deep learning. https://langleon.github.io/


Only the output! I thought Mikhail was referring to the output here, as this is what we see for the IMO problems.

But as I see it now, the consensus seems to be something like "The chain of thought of new models does look like the IMO problem solutions, and if you don't train the model to produce final answers that look nice to humans, then they will look like the chain of thought. Probably the experimental model's answers were not yet trained to look nice".

Is this your position? I think that's pretty plausible. 

Here is a screenshot of a chain of thought from the blog post you link:

This looks different from the IMO solutions to me and doesn't have the patterns I mentioned. E.g., the sentences are grammatically complete.

Fwiw, I've recently used o3 a lot for requesting proofs, and it writes very differently.

Could you give an example of an RLed LLM that writes like these examples?

Though I agree with Rauno's comment that it does look like the chain of thought examples from the Baker et al. paper. 

As I understand it, we don’t actually see the chain of thought here but only the final submitted solution. And I don’t think that a pressure to save tokens would apply to that.

The proofs look very different from how LLMs typically write, and I wonder how that emerged. Much more concise. Most sentences are not fully grammatically complete. A bit like how a human would write if they don't care about form and only care about content and being logically persuasive. 

Thanks for the comment Stepan!

I think it's right that the distinction between "lots of data" and "less data" doesn't really carve reality at its natural joints. I feel like your distinction between "discrete" and "continuous" also doesn't fully do this, since you could imagine a discrete case where we have only one $y$ for each $x$ in the dataset, and thus need regression, too (at least in principle).

I think the real distinction is probably whether we have several $y$'s for each $x$ in the dataset, or not. The twin dataset case has that, and so even though it's not a lot of data (only 32 pairs, or 64 total samples), we can essentially apply what I called the "lots of data" case.
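To illustrate the "several $y$'s per $x$" point with code (my own sketch with simulated numbers, not taken from the post; the ANOVA-style variance decomposition and all variable names are my assumptions): with repeated measurements per group, one can estimate the fraction of variance explained by group membership by comparing between-group and within-group variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 32 groups (e.g. pairs sharing the same x), 2 samples of y each.
n_groups, n_per_group = 32, 2
group_signal = rng.normal(size=n_groups)  # part of y determined by x
y = group_signal[:, None] + rng.normal(scale=0.5, size=(n_groups, n_per_group))

# Within-group variance estimates the noise; the variance of the group means
# overestimates the signal by (noise variance / n_per_group), so we subtract that.
within_var = y.var(axis=1, ddof=1).mean()
between_var = y.mean(axis=1).var(ddof=0) - within_var / n_per_group
explained = between_var / (between_var + within_var)
print(round(explained, 2))
```

With the simulated signal-to-noise ratio above (signal variance 1, noise variance 0.25), the estimate should come out somewhere around 0.8.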

Now, I have to admit that by this point I'm somewhat attached to the imperfect state of this post and won't edit it anymore. But I've strongly upvoted your comment and weakly agreed with it, and I hope some confused readers will find it. 

Thanks, I've replaced the word "likelihood" with "probability" in the comment above and in the post itself!

Thanks, I think this is an excellent comment that gives lots of useful context.

To summarize briefly what foorforthought has already expressed: what I meant by platonic variance explained is the explained variance independent of a specific sample or statistical model. But as you rightly point out, this still depends on lots of context, such as crucial details of the study design or the population one studies.

> what is a measurable space?

I'm not sure if clarifying this is most useful for the purpose of understanding this post specifically, but for what it's worth: A measurable space is a set together with a set of subsets that are called "measurable". Those measurable sets are the sets to which we can then assign probabilities once we have a probability measure $\mathbb{P}$ (which in the post we assume to be derived from a density $p$, see my other comment under your original comment).
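For intuition, here is a toy sketch of my own (not from the post): on a finite set, the power set is always a valid collection of measurable sets, and the defining closure properties of a sigma-algebra can be checked exhaustively:

```python
from itertools import chain, combinations

def power_set(s):
    """All subsets of s, each as a frozenset."""
    s = list(s)
    return {frozenset(c)
            for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))}

omega = frozenset({1, 2, 3})
sigma = power_set(omega)  # the largest sigma-algebra on omega

# Defining properties: contains the whole set, closed under complement and union
# (on a finite set, countable unions reduce to finite unions).
assert omega in sigma
assert all(omega - a in sigma for a in sigma)
assert all(a | b in sigma for a in sigma for b in sigma)
print(len(sigma))  # 8 measurable sets, i.e. 2^3
```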

"the function  is constant," you mean its just one outcome like a die that always lands on one side?

I think that's what the commenter you replied to means, yes. (They don't seem to be active anymore.)

> what makes a function measurable?

This is another technicality that might not be too useful to think about for the purpose of this post. A function is measurable if the preimages of all measurable sets are measurable. I.e., $f\colon X \to Y$, for two measurable spaces $X$ and $Y$, is measurable if $f^{-1}(A)$ is measurable for all measurable $A \subseteq Y$. For practical purposes, you can think of continuous functions or, in the discrete case, just any functions.
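In a finite toy setting, measurability can be checked exhaustively. This is my own illustration, not from the post; the particular sets and the function are made up:

```python
# Toy measurable spaces: X carries a small (coarse) sigma-algebra,
# Y carries its full power set.
X = frozenset({1, 2, 3, 4})
sigma_X = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), X}
Y = frozenset({0, 1})
sigma_Y = {frozenset(), frozenset({0}), frozenset({1}), Y}

def f(x):
    return 0 if x <= 2 else 1  # maps {1,2} -> 0 and {3,4} -> 1

def preimage(b):
    """Preimage of the set b under f."""
    return frozenset(x for x in X if f(x) in b)

# f is measurable iff every preimage of a measurable set lands in sigma_X.
is_measurable = all(preimage(b) in sigma_X for b in sigma_Y)
print(is_measurable)  # True
```

A function that split the block {1, 2} (say, mapping 1 and 2 to different values) would fail this check, since its preimages are not in the coarse sigma-algebra on X.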

I'm sorry that the terminology of random variables caused confusion!
If it helps, you can basically ignore the formalism of random variables and instead simply talk about the probability of certain events. For a random variable $X$ with values in $\mathcal{X}$ and density $p$, an event is (up to technicalities that you shouldn't care about) any subset $E \subseteq \mathcal{X}$. Its probability is given by the integral
$$P(X \in E) = \int_E p(x) \, dx.$$

In the case that $\mathcal{X}$ is discrete and not continuous (e.g., in the case that it is the set of all possible human DNA sequences), one would take a sum instead of an integral:
$$P(X \in E) = \sum_{x \in E} p(x).$$

The connection to reality is that if we sample $x$ from the random variable $X$, then the probability of it being in the event $E$ is modeled as being precisely $P(X \in E)$. I think with these definitions, it should be possible to read the post again without getting into the technicalities of what a random variable is.
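The discrete case is easy to play with in code. Here is a sketch with made-up numbers (the outcome labels and probabilities are purely illustrative, not from the post): the probability of an event is just the sum of the density over its elements:

```python
# Hypothetical discrete density over a small outcome space; values sum to 1.
p = {"AA": 0.2, "AG": 0.5, "GG": 0.3}

def prob(event):
    """P(X in event) = sum of p(x) over x in the event."""
    return sum(p[x] for x in event)

print(round(prob({"AA", "AG"}), 10))  # 0.2 + 0.5 = 0.7
```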

> I think this post would be much easier to learn from if it was a jupyter notebook with python code intermixed or R markdown.

At the end of the article I link to this piece of code showing how to do the twin study analysis. I hope that's somewhat helpful.
