Yes. My bad, I shouldn’t have implied all hidden-variables interpretations.
Every non-deterministic interpretation has a virtually infinite Kolmogorov complexity because it has to hardcode the outcome of each random event.
Hidden-variables interpretations are uncomputable because they are incomplete.
It’s the simplest explanation (in terms of Kolmogorov complexity).
It’s also the interpretation that has by far the most elegant explanation for the apparent randomness of reality. Most interpretations provide no mechanism for the selection of a specific outcome, which is absurd. Under the MWI, randomness emerges from determinism through indexical uncertainty, i.e., not knowing which branch you’re in. Some people, such as Sabine Hossenfelder, get confused by this and ask, “then why am I this version of me?” That question implicitly assumes dualism, as if there were a free-floating consciousness which could in principle inhabit any branch. This is patently untrue, because you are by definition this “version” of you. If you were someone else (including someone in a different branch where one of your atoms is moved by one Planck length) then you wouldn’t be you; you would be literally someone else.
Note that the Copenhagen interpretation is also a many-worlds explanation, but with the added assumption that all but one randomly chosen world disappears when an “observation” is made, i.e., when entanglement with your branch takes place.
It’s just a matter of definition. We say that “you” and “I” are the things that are entangled with a specific observed state. Different versions of you are entangled with different observations. Nothing is stopping you from defining a new kind of person which is a superposition of different entanglements. It doesn’t “look” that way from your perspective because of entanglement and the law of the excluded middle. What would you expect to see if you were a superposition?
Have you read Joseph Henrich’s books The Secret of Our Success and its sequel The WEIRDest People in the World? If not, they provide a pretty comprehensive view of how humanity, and the Western world in particular, innovates, which is roughly in line with what you wrote here.
I kind of agree that most knowledge is useless, but the utility of knowledge and experience that people accrue is probably distributed like a bell curve, which means you can't just have more of the good knowledge without also accruing lots of useless knowledge. In addition, very often stuff that seems totally useless turns out to be very useful; you can't always tell which is which.
I completely agree. In Joseph Henrich’s book The Secret of Our Success, he shows that the amount of knowledge possessed by a society is proportional to the number of people in that society. Dwindling population leads to dwindling technology and dwindling quality of life.
Those who advocate for population decline are unwittingly advocating for the disappearance of the knowledge, experience and frankly wisdom that is required to keep the comfortable life that they take for granted going.
Keeping all that knowledge in books is not enough. Otherwise our long years in education would be unnecessary. Knowing how to apply knowledge is its own form of knowledge.
If causality is everywhere, it is nowhere; declaring “causality is involved” will have no meaning. It raises the question of whether an ontology containing the concept of causality is the best one to wield for what you’re trying to achieve. Consider that causality is not axiomatic, since the laws of physics are time-reversible.
I respect Sutskever a lot, but if he believed that he could get an equivalent world model by spending an equivalent amount of compute learning from next-token prediction using any other set of real-world data samples, why would they go to such lengths to specifically obtain human-generated text for training? They might as well just do lots of random recordings (e.g., video, audio, radio signals) and pump it all into the model. In principle that could probably work, but it’s very inefficient.
Human language is a very high-density encoding of world models, so by training on human language, models get much of their world model “for free”, because humanity has already done a lot of pre-work by sampling reality in a wide variety of ways and compressing it into the structure of language. However, our use of language still doesn’t capture all of reality exactly and I would argue it’s not even close. (Saying otherwise is equivalent to saying we’ve already discovered almost all possible capabilities, which would entail that AI actually has a hard cap at roughly human ability.)
In order to expand its world model beyond human ability, AI has to sample reality itself, which is much less sample-efficient than sampling human behavior, hence the “soft cap”.
My introduction to Dennett, half a lifetime ago, was this talk:
That was the start of his profound influence on my thinking. I especially appreciated his continuous and unapologetic defense of the meme as a useful concept, despite the many detractors of memetics.
Sad to know that we won't be hearing from him anymore.