I agree that there are counterfactual frameworks which require more complication to describe reality than Newton's second law does. There are also counterfactual realities for which Newton's second law would require more complications to work than other frameworks would. Are you trying to say anything else?
One way to restate my point is that Newton's second law working well rules out many fundamental rules which might have described our reality, but it doesn't directly map to any fundamental rules of reality. The larger point which I am trying to communicate is that physical models have a lot of structure which metaphorically defines terms which you can use to describe reality without actually mapping to anything in reality[1]. The theory as a whole doesn't describe reality without those parts, but those parts don't necessarily directly correspond to something in reality.

A map describes reality, but latitude and longitude lines do not directly correspond to anything in reality, even if you can stand at a place in reality and unambiguously use latitude and longitude lines to describe your location using the map. I can use an English sentence to describe the fundamental rules of reality, but the linguistic syntax of that sentence doesn't correspond to anything fundamental in reality, even if it is fundamental to mapping the sentence as a whole to fundamental rules of reality.

My physics education presented physical models as package deals with every component corresponding to some intuition about reality, and that led me to confuse map and territory in ways that I wish I had been warned about. I am trying to warn others.
I don't know whether I am successfully communicating the thing which I am trying to communicate, and I am open to being told that I am wrong. ↩︎
That's why I have to get ahead of it by explaining why physics models work in many universes and how that means we should be unsurprised when we can make them work on practically anything.
I'm not optimistic that it will help stop the war, but it might save some people.
I'm pointing out that Newton's second law is tautologically correct as a formal theory. It's true that it aligns particularly well with human conceptions of manipulating objects in space. It's true that it works particularly well in our universe. (We only had to define one omnipresent force with a simple inverse-square form, depending on a single free-but-set-by-experiment parameter G, to make momentum conservation hold for most objects that don't look like they're interacting with anything.)
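For reference, here is that bookkeeping in the usual notation (nothing specific to this thread, just the textbook formulas):

$$\vec{F} = \frac{d\vec{p}}{dt} = m\vec{a} \quad (\text{constant } m), \qquad |\vec{F}_{\text{grav}}| = \frac{G\, m_1 m_2}{r^2}.$$

If $\vec{F} = 0$, then $\vec{p}$ is constant, which is the conservation statement doing the work above, and $G$ is the one parameter you have to go out and measure.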
My point is that the mass is constant no matter what you do. Granted, sometimes that's not true: cars lose mass over time because the gasoline is burned and lost via the tailpipe.
In hindsight, I should have just said that we're pretty sure that there are no macroscopic objects with zero mass. What's an object that you can push on but has no stuff in it?
I consider the lattice to be a regulator as well, but, semantics aside, thank you for the example.
Field theorist here. You talk about renormalization as a thing which can smooth over unimportant noise, which basically matches my understanding, but you haven't explicitly named your regulator. A regulator may be a useful concept to have in interpretability, but I have no idea if it is common in the literature.
In QFT, our issue is that we go to calculate things that are measurable and finite, but we calculate horrible infinities. Obviously those horrible infinities don't match reality, and they often seem to be coming from some particular thing we don't care about that much in our theory, so we find a way to poke it out of the theory. (To be clear, this means that our theories are wrong, and we're going to modify them until they work.)

The tool by which you remove irrelevant things which cause divergences is called a regulator. A typical regulator is a momentum cutoff. You go to do the integral over all real momenta which your Feynman diagram demands, and you find that it's infinite, but if you only integrate the momenta up to a certain value, the integral is finite.

Of course, now you have a bunch of weird constants sitting around which depend on the value of the cutoff. This is where renormalization comes in. You notice that there are a bunch of parameters, which are generally coupling constants, and these parameters have unknown values which you have to go out into the universe and measure. If you cleverly redefine those constants to be some "bare constant" added to a "correction" which depends on the cutoff, you can do your cutoff integral and set the "correction" to be equal to whatever it needs to be to get rid of all the terms which depend on your cutoff. (Edit for clarity: this is the thing that I refer to when I say "renormalization": cleverly redefining bare parameters to get rid of unphysical effects of a regulator.)

By this two-step dance, you have taken your theoretical uncertainty about what happens at high momenta and found a way to wrap it up in the values of your coupling constants, which are the free parameters which you go and measure in the universe anyway. Of course, now your coupling constants are different if you choose a different regulator or a different renormalization scheme to remove it, but physicists have gotten used to that.
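To make the two-step dance concrete, here is the standard schematic example (the divergent loop integral behind a one-loop mass correction in a φ⁴-type theory, Euclidean signature, hard cutoff Λ, with couplings and symmetry factors suppressed):

$$\int^{\Lambda} \frac{d^4 k}{(2\pi)^4}\, \frac{1}{k^2 + m^2} \;=\; \frac{1}{16\pi^2}\left[\Lambda^2 - m^2 \ln\!\left(1 + \frac{\Lambda^2}{m^2}\right)\right].$$

The integral is finite only because of the cutoff, and the cutoff now contaminates your predictions. Renormalization splits the bare mass as $m_0^2 = m^2 + \delta m^2(\Lambda)$ and chooses the counterterm $\delta m^2(\Lambda)$ to cancel the $\Lambda$-dependent pieces, so everything measurable ends up expressed in terms of the renormalized $m^2$ that you fix by experiment.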
So you can't just renormalize; you need to define a regulator first. You can even justify your regulator. A typical justification for a momentum cutoff is that you're using a perturbative theory which is only valid at low energy scales. So what's the regulator for AI interpretability? Why are you justified in regulating in this way? It seems like you might be pointing at regulators when you talk about 1/w and d/w, but you might also be talking about orders in a perturbation expansion, which is a different thing entirely.
A decision-theoretic case for a land value tax.
You can basically only take income tax by threatening people. "Give me 40% of your earnings or I put you in prison." It is the nicest type of threatening! Stable governments have a stellar reputation for only doing it once per year and otherwise not escalating the extortion. You benefit from the stable civilization supported by such governments because they use your taxes to pay for it. But there's no reason for the government to put you in prison except that they expect you to give them money not to. By participating, you are showing that you will respond to threats, which is an incentive to extract more wealth from you. If enough people understood decision theory and were dissatisfied with the uses to which the government put their money, they could refuse to pay, and the prison system wouldn't be big enough to deal with it. Oops, it's time to overthrow the government.
A land value tax is better. The consequence for not paying it is that the government takes the land away and gives it to someone else. They aren't threatening you; they're just reassigning their commitment to protect the interests of the person who uses the land over to a user who will pay them for the service. Of course, people can still all refuse to pay if they don't like the uses to which the government puts their money, and from the point of view of the person paying taxes, it's still pretty much a case of "pay up or something bad will happen to you," so some would argue that the difference is mostly academic. That said, I really prefer to have a government which does not have "devise ways to make people miserable for the purpose of making them miserable" (you know, prison as a threat) as a load-bearing element of its mechanisms of perpetuating itself.
This argument is flagrantly stolen from planecrash: https://www.projectlawful.com/replies/1721794#reply-1721794

Of course, planecrash also offers an argument for what gives a hypothetical government the right to claim ownership of the land: https://www.projectlawful.com/replies/1773744#reply-1773744

I was inspired to write this by Richard Ngo's definition of unconditional love at https://x.com/richardmcngo/status/1872107000479568321 and the context of that post.
I think your point has some merit in the world where AI is useful and intelligent enough to overcome the sticky social pressure to employ humans but hasn't killed us all yet. That said, I think AI will most likely kill us all in that 1-5 year window after becoming cheaper, faster, and more reliable than humans at most economic activity, and I think you would have to convince me that I'm wrong about that before I start worrying about humans not hiring me because AI is smarter than I am. However, I want to complain about this particular point you made, because I don't think it's literally true:
> Powerful actors don’t care about you out of the goodness of their heart.
One of the reasons why AI alignment is harder than people think is that they say stuff like this and assume AI doesn't care about people in the same way that powerful actors don't care about people. That is generally not true of powerful actors. You cannot in general pay a legislator $400 to kill a person who pays no taxes and doesn't vote. That is impressive when you think about it. You can argue that they fear reputational damage or going to prison, but I truly think that if you took away the consequences, $400 would not be enough money to make most legislators overcome their distaste for killing another human being with their bare hands. Some of them really, truly want to make society better, even if they aren't very effective at it. Call it noblesse oblige if you want, but it's in their utility function to do things other than give the state more money or gain more personal power. The people who steer large organizations have goodness in their hearts, however little, and thus the organizations they steer do too, even if only a little. Moloch hasn't won yet.

America the state is willing to let a lot of elderly people rot, but America wasn't in fact willing to let Covid rip, even though that might have stopped the collapse of many tax-generating businesses, and most people who generate taxes would have survived. I don't think that's because the elderly people who would overwhelmingly have been killed are an important voting constituency for the party which pushed hardest for lockdowns.
An AI which knows it won't get caught and literally only cares about tax revenue and power will absolutely kill anyone who isn't useful to it for $400. That's $399 worth of power it didn't have before, if killing someone costs $1 of attention. I don't particularly want to live in a world where 1% of people are very wealthy and everyone else is dying of poverty because they've been replaced by AI, but that's a better world than the one I expect, where literally every human is killed because, for example, the so-called "reliable" AIs that as of yesterday were doing all of the work humans used to do turn out to like paperclips more than we thought and start making them today.
Thank you. As a physicist, I wish I had an easy way to find papers which say "I tried this kind of obvious thing you might be considering and nothing interesting happened."
That's a good point.
The compiler ignores comments, so saying that they are program information is like saying that a sticky note stuck on a book is book information. The addition may or may not be relevant to the thing, but it's not the thing.
Variable names, on the other hand, are extremely the thing. You are absolutely correct that variable names contain information that the machine code does not. They are also a functional part of the code, in that changing one instance of a variable name somewhere will usually change the behavior of the code, whereas changing the sticky note on a book will not change the contents of the book at all (although it might change the meaning of the book to a person who reads the note). That said, you could exchange every instance of each variable name one-for-one for a randomly chosen Latin word and the program would act exactly the same, even though it would probably make much less sense to a human reading the source code. The variable names are explanations of the intent of the program which are also themselves part of the program.

However, they are not logically bound to the program in the way that the Gödel string is logically bound to the natural numbers. You can, in fact, change all of the variable names of a program and remove their explanatory power without changing the function of the program. You cannot change the Gödel numbering for a typographical number theory and lose the explanatory power of Gödel's theorem; you just end up making the Gödel string out of different symbols. The explanatory power of the variable names is largely contained in the associations that a human reader has with the strings which make up the variable names.
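To make the renaming point concrete, here's a toy sketch (an example I made up, not anything from your post): the second function is the first with every name swapped one-for-one for a Latin word, and the interpreter treats them identically.

```python
def kinetic_energy(mass, velocity):
    # The names and this comment exist for the human reader;
    # the interpreter would behave the same without them.
    half_mv_squared = 0.5 * mass * velocity ** 2
    return half_mv_squared


def aquila(lupus, corvus):
    # Same structure, names replaced one-for-one with Latin words.
    vulpes = 0.5 * lupus * corvus ** 2
    return vulpes


# Identical behavior, very different explanatory power.
assert kinetic_energy(2.0, 3.0) == aquila(2.0, 3.0)
```

Change one instance of `lupus` to some other word without changing the rest, though, and the function breaks: the names mean nothing to the machine, but their consistency is load-bearing.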