LESSWRONG
johnswentworth
johnswentworth's Shortform
johnswentworth · 2h

Notably, that post has a section arguing against roughly the sort of thing I'm arguing for:

Making the definition of what constitutes a low level language dependent on laws of physics is removing it from the realm of mathematics and philosophy. It is not a property of the language any more, but a property shared by the language and physical reality.

My response would be: yes, what-constitutes-a-low-level-language is obviously contingent on our physics and even on our engineering, not just on the language. I wouldn't even expect aliens in our own universe to have low-level programming languages very similar to our own. Our low-level languages today are extremely dependent on specific engineering choices made in the mid-20th century, choices which are now locked in by practice but do not seem particularly fundamental or overdetermined, and which would not be at all natural in universes with different physics or cultures with different hardware architectures. Aliens would look at our low-level languages and recognize them as low-level for our hardware, but not at all low-level for theirs.

Analogously: choice of a good computing machine depends on the physics of one's universe. 

I do like the guy's style of argumentation a lot, though.

johnswentworth's Shortform
johnswentworth · 7h

I think that's roughly correct, but it is useful...

'The best UTM is the one that figures out the right answer the fastest' is true, but not very useful.

Another way to frame it would be: after one has figured out the laws of physics, a good-for-these-laws-of-physics Turing machine is useful for various other things, including thermodynamics. 'The best UTM is the one that figures out the right answer the fastest' isn't very useful for figuring out physics in the first place, but most of the value of understanding physics comes after it's figured out (as we can see from regular practice today).

Also, we can make partial updates along the way. If e.g. we learn that physics is probably local but haven't understood all of it yet, then we know that we probably want a local machine for our theory. If we e.g. learn that physics is causally acyclic, then we probably don't want a machine with access to atomic unbounded fixed-point solvers. Etc.
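As a toy illustration of the locality point (my own sketch, with an arbitrarily chosen rule, not anything from the thread): a 1D cellular automaton is a "local machine" in the relevant sense, because each cell's next state depends only on its immediate neighbors, mirroring the locality we might have learned our physics has.

```python
# A toy "local machine": a 1D cellular automaton whose update rule reads
# only a cell's immediate neighbors. Rule 110 is chosen arbitrarily for
# illustration; any neighborhood-local rule makes the same point.

def step(cells, rule=110):
    n = len(cells)
    return [
        # Neighborhood value: left cell is the high bit, right cell the low bit.
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Evolve a single live cell on a ring of 16 cells for a few steps.
state = [0] * 16
state[8] = 1
for _ in range(4):
    state = step(state)
print(state)
```

Updating each cell touches only a bounded neighborhood, so limits on the physics (locality) show up directly as limits on the machine's update rule.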

Natural Latents: Latent Variables Stable Across Ontologies
johnswentworth · 8h

I think you might have misread something? The graphical statement of theorem 2 does not say that if ΛA is determined by ΛB, then ΛA is a mediator; that would indeed be false in general. It says that:

  • If ΛB is a mediator and we have agreement on observables, then naturality of ΛA implies that ΛA is determined by ΛB.

In particular, the theorem says that under some conditions ΛA is determined by ΛB. Determination is in the conclusion, not the premises. On the flip side, ΛA being a mediator is in the premises, not the conclusion.

johnswentworth's Shortform
johnswentworth · 8h

What I have in mind re: boundedness...

If we need to use a Turing machine which is roughly equivalent to physics, then a natural next step is to drop the assumption that the machine in question is Turing complete. Just pick some class of machines which can efficiently simulate our physics, and which can be efficiently implemented in our physics. And then, one might hope, the sort of algorithmic thermodynamic theory the paper presents can carry over to that class of machines.

Probably there are some additional requirements for the machines, like some kind of composability, but I don't know exactly what they are.

This would also likely result in a direct mapping between limits on the machines (like e.g. limited time or memory) and corresponding limits on the physical systems to which the theory applies for those machines.

The resulting theory would probably read more like classical thermo, where we're doing thought experiments involving fairly arbitrary machines subject to just a few constraints, and surprisingly general theorems pop out.

johnswentworth's Shortform
johnswentworth · 8h

Then you would have been wrong. No Free Lunch Theorems do not bind to reality.

johnswentworth's Shortform
johnswentworth · 19h

I haven't been using that one, but I expect it would give very different results from the dataset we are using. It would test very different things from what we're currently trying to get feedback on; there's a lot more near-deterministic known structure in that one, IIRC.

johnswentworth's Shortform
johnswentworth · 1d

Good question; it's the right sort of question to ask here, and I don't know the answer. That does get straight into some interesting follow-up questions about e.g. the ability to physically isolate the machine from noise, which might be conceptually load-bearing for things like working with arbitrary-precision quantities.

johnswentworth's Shortform
johnswentworth · 1d

One of the classic conceptual problems with a Solomonoff-style approach to probability, information, and stat mech is "Which Turing machine?". The choice of Turing machine is analogous to the choice of prior in Bayesian probability. While universality means that any two Turing machines give roughly the same answers in the limit of large data (unlike two priors in Bayesian probability, where there is no universality assumption/guarantee), they can be arbitrarily different before then.
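The analogy to priors can be made concrete with a toy sketch (my illustration, with made-up numbers, not anything from the post): two very different Beta priors over a coin's bias disagree sharply on little data but converge as data accumulates, which is exactly the "roughly the same answers in the limit, arbitrarily different before then" behavior.

```python
# Toy analogue of the choice-of-UTM problem: two very different priors
# over a coin's bias give very different posteriors on little data, but
# converge as data accumulates.

def posterior_mean(heads, tails, alpha, beta):
    # Beta(alpha, beta) prior; conjugate update gives a Beta posterior.
    return (alpha + heads) / (alpha + beta + heads + tails)

# Two "priors": one near-uniform, one strongly biased toward tails.
priors = [(1, 1), (1, 50)]
small_data = [posterior_mean(3, 2, a, b) for (a, b) in priors]
big_data = [posterior_mean(3000, 2000, a, b) for (a, b) in priors]

print(abs(small_data[0] - small_data[1]))  # large disagreement
print(abs(big_data[0] - big_data[1]))      # tiny disagreement
```

The catch, as with UTMs, is that "in the limit of large data" does nothing to tell you which prior to use right now.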

My usual answer to this problem is "well, ultimately this is all supposed to tell us things about real computational systems, so pick something which isn't too unreasonable or complex for a real system". 

But lately I've been looking at Aram Ebtekar and Marcus Hutter's Foundations of Algorithmic Thermodynamics. Based on both the paper and some discussion with Aram (along with Steve Petersen), I think there's maybe a more satisfying answer to the choice-of-Turing-machine issue in there.

Two key pieces:

  • The "Comparison against Gibbs-Shannon entropy" section of the paper argues that uncomputability is a necessary feature, in order to assign entropy to individual states and still get a Second Law. The argument says: if there exists a short program which can provably find and output a high-entropy string S, then we can physically instantiate a machine to run that short program. Then, when that physical machine spits out the high-entropy string S, S could be used to erase another copy of S. In other words, there is some high-entropy state (S) which this physical machine + program could steer into a low-entropy state.
  • As Aram pointed out, most of the bounds have a constant for the complexity of the laws of physics. If we choose a machine for which the laws of physics have high complexity, then the bounds are quantitatively trash.
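To illustrate the gap the first bullet turns on (my own sketch, not the paper's): a pseudorandom string looks high-entropy to any practical compressor, yet it is the output of a very short program, so its algorithmic description length is tiny.

```python
import random
import zlib

# A string generated by a short program (a seeded PRNG): a compressor
# sees no structure in it, but its true description length is roughly
# the length of this program plus the seed.
random.seed(0)
pseudo = bytes(random.randrange(256) for _ in range(10_000))

# Compression ratio near (or above) 1: zlib finds nothing to exploit.
ratio = len(zlib.compress(pseudo, 9)) / len(pseudo)
print(ratio)
```

The thermodynamic argument needs the converse to fail: there must be no short program that *provably* outputs a genuinely high-entropy string, which is where uncomputability enters.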

The first piece is a part of the theory which can only bind to reality insofar as our chosen Turing machine is tractable to physically implement. The second piece is a part of the theory which can only bind to reality insofar as our physics can be tractably implemented on our chosen Turing machine.

In other words: in order for this thermodynamic theory to work well, we need to choose a Turing machine which is "computationally equivalent to" physics, in the sense that our physics can run the machine without insane implementation size, and the machine can run our physics without insane implementation size.
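One way to make "computationally equivalent to" precise (my gloss, using the standard invariance-theorem constants, not notation from the paper) is to require both cross-simulation constants to be small:

```latex
% Invariance theorem: for universal machines A and B there exist constants
% c_{AB}, c_{BA} (the lengths of the cross-interpreters) such that, for all x,
K_A(x) \le K_B(x) + c_{BA}, \qquad K_B(x) \le K_A(x) + c_{AB}.
% "Equivalent to physics" would then ask that both constants relating the
% chosen machine to (an encoding of) our physical laws be small.
```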

I'm still wrapping my head around all the pieces here, so hopefully I (or, better yet, someone else) will write up a more clear explainer in the future. But this smells really promising to me. Not just for purposes of Solomonoff thermodynamics, but also as a more principled way to tackle bounded rationality of embedded systems.

johnswentworth's Shortform
johnswentworth · 1d

That would be pretty reasonable, but it would make the model comparison part even harder. I do need P[X] (and therefore Z) for model comparison; this is the challenge which always comes up for Bayesian model comparison.
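For concreteness (a toy of my own, not the post's actual models): model comparison needs the marginal likelihood Z = P[X], which for a one-parameter model can at least be brute-forced by quadrature.

```python
import math

# Toy Bayesian model comparison: compute the marginal likelihood
# Z = P[X] = integral of P[X | theta] * P[theta] d(theta) on a grid,
# for two coin models that differ only in their prior over the bias.

def log_marginal(heads, tails, prior):
    # prior: unnormalized density over theta in (0, 1).
    thetas = [(i + 0.5) / 1000 for i in range(1000)]
    weights = [prior(t) for t in thetas]
    total = sum(weights)
    z = sum(
        (w / total) * (t ** heads) * ((1 - t) ** tails)
        for t, w in zip(thetas, weights)
    )
    return math.log(z)

# Model 1: uniform prior. Model 2: prior concentrated near theta = 0.5.
log_z1 = log_marginal(7, 3, lambda t: 1.0)
log_z2 = log_marginal(7, 3, lambda t: math.exp(-((t - 0.5) ** 2) / 0.005))

bayes_factor = math.exp(log_z1 - log_z2)
print(bayes_factor)
```

With many parameters this integral is exactly the part that stops being brute-forceable, which is the usual difficulty with Z.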

Notes on fatalities from AI takeover
johnswentworth · 1d

It sounds like you are not-not claiming that superintelligence will have human-like scope insensitivity baked into its preferences, which seems like an absolutely bonkers thing to claim. "1 billionth of resources" does not at all seem like a natural way for "slight caring" to manifest in an actually-advanced mind; it seems like a thing which very arguably occurs in human minds but is particularly unlikely to generalize to superintelligence, precisely because the generalized version would kneecap many general capabilities quite badly.
