Daniel Murfet

Is there a reason why the Pearson correlation coefficient of the data in Figure 14 is not reported? This correlation is referred to numerous times throughout the paper.

There's no general theoretical reason that I am aware of to expect a relation between the L2 norm and the LLC. The LLC is the coefficient of the $\log n$ term in the asymptotic expansion of the free energy (the negative logarithm of the integral of the posterior over a local region, as a function of sample size $n$), while the L2 norm of the parameter shows up in the constant-order term of that same expansion if you're taking a Gaussian prior.
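To spell out the expansion I mean (notation mine, just a sketch: $W$ is a local region around a parameter $w_0$, $L_n$ the empirical negative log likelihood, $\varphi$ the prior):

$$F_n(W) \;=\; -\log \int_W e^{-n L_n(w)}\,\varphi(w)\,dw \;=\; n L_n(w_0) \;+\; \lambda \log n \;+\; O_p(\log\log n) \;+\; \text{const},$$

and with an isotropic Gaussian prior $\varphi(w) \propto \exp\left(-\lVert w\rVert^2 / 2\sigma^2\right)$ the contribution $\lVert w_0\rVert^2 / 2\sigma^2$ lands in the constant-order part, not in the coefficient $\lambda$ of $\log n$.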

It might be that in particular classes of neural networks there is some architecture-specific correlation between the L2 norm and the LLC, but I am not aware of any experimental or theoretical evidence for that.

For example, in the figure below from Hoogland et al. 2024 we see that there are later stages of training in a transformer trained to do linear regression in context (blue shaded regions) where the LLC $\hat\lambda$ is decreasing but the L2 norm is increasing. So the model is moving towards a "simpler" parameter with larger weight norm.

My best current guess is that, in the grokking example, the simpler solution happens to have the smaller weight norm. This could be true in many synthetic settings for all I know; in general, however, it is not the case that complexity (at least as far as SLT is concerned) and weight norm are correlated.

 

That simulation sounds cool. The talk certainly doesn't contain any details and I don't have a mathematical model to share at this point. One way to make this more concrete is to think through Maxwell's demon as an LLM, for example in the context of Feynman's lectures on computation. The literature on thermodynamics of computation (various experts, like Adam Shai and Paul Riechers, are around here and know more than me) implicitly or explicitly touches on relevant issues.

The analogous laws are just information theory. 

Re: a model trained on random labels. This seems somewhat analogous to building a power plant out of dark matter; to derive physical work it isn't enough to have some degrees of freedom somewhere that have a lot of energy, one also needs a chain of couplings between those degrees of freedom and the degrees of freedom you want to act on. Similarly, if I want to use a model to reduce my uncertainty about something, I need to construct a chain of random variables with nonzero mutual information linking the question in my head to the predictive distribution of the model.
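One way to make the "chain of couplings" picture precise (my notation, only a sketch): if $Q$ is the question in my head, $X$ the prompt I write, and $Y$ the model's answer, then $Q \to X \to Y$ is a Markov chain, and the data processing inequality bounds how much the answer can do for me:

$$I(Q;Y) \;\le\; I(Q;X), \qquad\text{so}\qquad H(Q \mid Y) \;\ge\; H(Q) - I(Q;X).$$

If any link in the chain carries zero mutual information, no reduction in my uncertainty is possible.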

To take a concrete example: suppose I am thinking about a chemistry question with four choices A, B, C, D. Without any information other than these letters, the model cannot reduce my uncertainty (say I begin with equal belief in all four options). However, if I provide a prompt describing the question, and the model has been trained on chemistry, then this information sets up a correspondence between this distribution over four letters and something the model knows about; its answer may then leave me equally uncertain between A and B but knowing C and D are wrong (a change of 1 bit in my entropy).
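A minimal sketch of that arithmetic (the numbers are just the uniform-prior case from the example, nothing measured):

```python
import math

def entropy_bits(p):
    """Shannon entropy (in bits) of a discrete distribution given as probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

prior = [0.25, 0.25, 0.25, 0.25]   # equal belief in A, B, C, D
posterior = [0.5, 0.5, 0.0, 0.0]   # the model's answer rules out C and D

print(entropy_bits(prior))                             # 2.0 bits
print(entropy_bits(posterior))                         # 1.0 bits
print(entropy_bits(prior) - entropy_bits(posterior))   # 1.0 bit of "work" on my beliefs
```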

Since language models are good general compressors, this seems to work in reasonable generality.

Ideally we would like the model to push our distribution towards true answers, but it doesn't necessarily know true answers, only some approximation; thus the work being done is nontrivially directed, and has a systematic overall effect due to the nature of the model's biases.

I don't know about evolution. I think it's right that the perspective has limits and can just become some empty slogans outside of some careful usage. I don't know how useful it is in actually technically reasoning about AI safety at scale, but it's a fun idea to play around with.

Marcus Hutter on AIXI and ASI safety 

Yes, this seems like an important question, but I admit I don't have anything coherent to say yet. A basic intuition from thermodynamics is that if you can measure the change in internal energy between two states, and the heat transfer, then you can infer how much work was done even if you're not sure how it was done. So maybe the problem is better thought of as learning to measure enough other quantities that one can infer how much cognitive work is being done.
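In the usual sign convention (work done by the system), the first law makes this bookkeeping explicit:

$$\Delta U = Q - W \quad\Longrightarrow\quad W = Q - \Delta U,$$

so measuring the change in internal energy and the heat transferred in pins down the work without ever observing the mechanism that performed it. The open question is what should play the roles of $U$ and $Q$ for cognitive work.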

For all I know there is a developed thermodynamic theory of learning agents out there which already does this, but I haven't found it yet...

The description of love at the conclusion of Gene Wolfe's The Wizard gets at something important, if you read it as something that both parties are simultaneously doing.

The work of Ashby I'm familiar with is "An Introduction to Cybernetics", and I'm referring to the discussion in Chapter 11 there. The references you're giving seem to be invoking the "Law" of requisite variety in the context of arguing that an AGI has to be relatively complex in order to maintain homeostasis in a complex environment, but this isn't the application of the law I have in mind.

From the book:

The law of Requisite Variety says that R's capacity as a regulator cannot exceed R's capacity as a channel of communication.

In the form just given, the law of Requisite Variety can be shown in exact relation to Shannon's Theorem 10, which says that if noise appears in a message, the amount of noise that can be removed by a correction channel is limited to the amount of information that can be carried by that channel.

Thus, his "noise" corresponds to our "disturbance", his "correction channel" to our "regulator R", and his "message of entropy H" becomes, in our case, a message of entropy zero, for it is constancy that is to be "transmitted": Thus the use of a regulator to achieve homeostasis and the use of a correction channel to suppress noise are homologous.

and

A species continues to exist primarily because its members can block the flow of variety (thought of as disturbance) to the gene-pattern, and this blockage is the species’ most fundamental need. Natural selection has shown the advantage to be gained by taking a large amount of variety (as information) partly into the system (so that it does not reach the gene-pattern) and then using this information so that the flow via R blocks the flow through the environment T.

This last quote makes clear I think what I have in mind: the environment is full of advanced AIs, they provide disturbances D, and in order to regulate the effects of those disturbances on our "cognitive genetic material" there is some requirement on the "correction channel". Maybe this seems a bit alien to the concept of control. There's a broader set of ideas I'm toying with, which could be summarised as something like "reciprocal control" where you have these channels of communication / regulation going in both directions (from human to machine, and vice versa).
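In entropy terms (my notation, paraphrasing Chapter 11 rather than quoting it): if $D$ is the disturbance, $R$ the regulator's response, and $E$ the essential variable to be held constant, and each unblocked disturbance shows up distinctly in $E$, then roughly

$$H(E) \;\ge\; H(D) - H(R),$$

so whatever entropy of the AI-generated disturbances is not absorbed by our "correction channel" ends up in the variables we wanted to hold fixed.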

The Queen's Dilemma was a little piece of that picture: it attempts to illustrate this bi-directional control flow by having the human control the machine (by setting its policy, say) and the machine control the human (in an emergent fashion, that being the dilemma).

Is restricting human agency fine if humans have little control over where it is restricted and to what degree?
