All of Alfred Harwood's Comments + Replies

It seems that the relevant thing is not so much how many values you have tested as the domain size of the function. A function with a large domain cannot be explicitly represented with a small lookup table. But this means you also have to consider how the black box behaves when you feed it something outside of its domain, right?

This sounds right. Implicitly, I was assuming that when the black box was fed an x outside of its domain it would return an error message or, at least, something which is not equal to f(x). I realise that I didn't make this clear in ... (read more)
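To make that assumption concrete, here is a minimal sketch (the names `f`, `TABLE` and `lookup_f` are made up for illustration, not from the original post) of a lookup-table black box that covers only part of f's domain and returns an error marker, rather than f(x), for anything outside it:

```python
# Toy computable function f and a finite lookup-table "black box" for it.
def f(x: int) -> int:
    return x * x + 1

# The table only covers a small slice of f's domain.
TABLE = {x: f(x) for x in range(1000)}

def lookup_f(x: int):
    # Outside the stored domain, return an error marker rather than f(x).
    if x not in TABLE:
        return "error: input outside the table's domain"
    return TABLE[x]

print(lookup_f(10))      # 101, same as f(10)
print(lookup_f(10**6))   # error marker, unlike f(10**6)
```

On the stored domain the two are behaviourally identical; outside it they visibly come apart.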

Cool, that all sounds fair to me. I don't think we have any substantive disagreements.

hmm, we seem to be talking past each other a bit. I think my main point in response is something like this:

In non-trivial settings, (some but not all) structural differences between programs lead to differences in input/output behaviour, even if there is a large domain for which they are behaviourally equivalent.

But that sentence lacks a lot of nuance! I'll try to break it down a bit more to find if/where we disagree (so apologies if a lot of this is re-hashing).

  • I agree that if two programs produce the same input/output behaviour for literally every concei
... (read more)
6Dagon
I think this is a crux (of why we're talking past each other; I don't actually know if we have a substantive disagreement).  The post was about detecting "smaller than a lookup table would support" implementations, which implied that the implementations which are functionally identical as tested were actually tested over the broadest possible domain.  I fully agree that "tested" and "potential" input/output pairs are not the same sets, but I assert that, in a black-box situation, the box CAN be tested on a very broad set of inputs, so the distinction usually won't matter.  That said, nobody has built a pure lookup table anywhere near as complete as it would take to matter (unless the universe or my experience is simulated that way, but I'll never know).  My narrower but stronger point is that "lookup table vs algorithm" is almost never as important as "what specific algorithm" for any question we want to predict about the black box.  Oh, and almost all real-world programs are a mix of algorithm and lookup.

For most practical purposes, "calculating a function" is only and exactly a very good compression algorithm for the lookup table.  

I think I disagree. Something like this might be true if you just care about input and output behaviour (it becomes true by definition if you consider that any functions with the same input/output behaviour are just different compressions of each other). But it seems to me that how outputs are generated is an important distinction to make. 

I think the difference goes beyond 'heat dissipation or imputed qualia'. As a c... (read more)

2Dagon
Yes, that is the assumption for "some computable function" or "black box which takes in strings and spits out other strings."   I'm not sure your example (of an AI with a much wider range of possible input/output pairs than the lookup table) fits this underlying distinction.  If the input/output sets are truly identical (or even identical for all tests you can think of), then we're back to the "why do we care" question.

I agree that there is a spectrum of ways to compute f(x) ranging from efficient to inefficient (in terms of program length). But I think that lookup tables are structurally different from direct ways of computing f because they explicitly contain the relationships between inputs and outputs. We can point to a 'row' of a lookup table and say 'this corresponds to the particular input x_1 and the output y_1' and do this for all inputs and outputs in a way that we can't do with a program which directly computes f(x). I think that allowing for compression prese... (read more)
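A rough sketch of the structural point (toy programs with invented names, used only as an illustration): two programs with identical input/output behaviour on the same domain, where one explicitly contains every (input, output) pair as a row and the other contains only a rule.

```python
# Two behaviourally equivalent programs on the domain 0..99.
def direct(x: int) -> int:
    # A rule; no (input, output) pair is stored anywhere in the program.
    return 3 * x + 2

table = {x: 3 * x + 2 for x in range(100)}   # every pair stored explicitly

def table_based(x: int) -> int:
    return table[x]

# We can point to the 'row' of the table pairing the input x_1 = 5 with y_1 = 17 ...
assert table[5] == 17
# ... but `direct` contains no such row; its relation to x = 5 only appears when it is run.
assert direct(5) == 17
```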

5Anon User
What I meant is that the program knows how to check the answer, but not how to compute/find one, other than by trying every answer and then checking it. (Think: you have a math equation, no idea how to solve for x, so you are just trying all possible x in a row).
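In code, that distinction might look something like this sketch (a toy equation and hypothetical names, purely for illustration): the program can cheaply check a candidate answer, but can only find one by trying candidates in a row.

```python
# The program knows how to *check* an answer, but can only *find* one by
# trying every candidate in a row and checking it (generate-and-test).
def check(x: int) -> bool:
    return x * x - 10 * x + 21 == 0   # does x solve x^2 - 10x + 21 = 0 ?

def solve_by_search(candidates=range(-1000, 1000)):
    for x in candidates:
        if check(x):
            return x
    return None

print(solve_by_search())   # 3, a root found purely by exhaustive checking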

Regarding your request for a practical example.

Short Answer: It's a toy model. I don't think I can come up with a practical example which would address all of your issues.

Long Answer, which I think gets at what we disagree about:

I think we are approaching this from different angles. I am interested in the GRT from an agent foundations point of view, not because I want to make better thermostats. I'm sure that GRT is pretty useless for most practical applications of control theory! I read John Wentworth's post where he suggested that the entropy-reduction p... (read more)

2Richard_Kennaway
An agent with a goal needs to use the means available to it in whatever way will achieve that goal. That is practically the definition of a control system. So you do actually want to build better thermostats, even if you haven't realised it. I'm sure that GRT is pretty useless, period. A worse thermostat will achieve an equally low entropy distribution around 40°C. Reaching the goal is what matters, not precisely hitting the wrong target.
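One way to see why the target temperature drops out of the entropy criterion (a standard fact, stated here for a Gaussian-shaped temperature distribution as an illustration): the differential entropy depends only on the spread σ, not on the mean μ, so a tight distribution around 40°C and an equally tight one around 20°C score identically.

```latex
% Differential entropy of a Gaussian temperature distribution N(mu, sigma^2):
% it depends only on the spread sigma, not on the mean mu, so a narrow
% distribution centred at 40 degrees scores the same as one centred at 20.
H\big(\mathcal{N}(\mu,\sigma^{2})\big) = \tfrac{1}{2}\log\!\big(2\pi e\,\sigma^{2}\big)
```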

I think we probably agree that the Good Regulator Theorem could have a better name (the 'Good Entropy-Reducer Theorem'?). But unfortunately, the result is most commonly known using the name 'Good Regulator Theorem'. It seems to me that 55 years after the original paper was published, it is too late to try to re-brand.

I decided to use that name (along with the word 'regulator') so that readers would know which theorem this post is about. To avoid confusion, I made sure to be clear (right in the first few paragraphs) about the specific way that I was using the word 'regulator'. This seems like a fine compromise to me.

Minimising the entropy of Z says that Z is to have a narrow distribution, but says nothing about where the mean of that distribution should be. This does not look like anything that would be called "regulation".

I wouldn't get too hung up on the word 'regulator'. It's used in a very loose way here, as is common in old cybernetics-flavoured papers. The regulator is regulating, in the sense that it is controlling and restricting the range of outcomes.

Time is absent from the system as described. Surely a "regulator" should keep the value of Z near constant

... (read more)
2Richard_Kennaway
Human slop (I'm referring to those old cybernetics papers rather than the present discussion) has no more to recommend it than AI slop. "Humans Who Are Not Concentrating Are Not General Intelligences", and that applies not just to how they read but also how they write. What I am thinking of (as always when this subject comes up) is control systems. A room thermostat actually regulates, not merely "regulates", the temperature of a room, at whatever value the user has set, without modelling or learning anything. It, and all of control theory (including control systems that do model or adapt), fall outside the scope of the supposed Good Regulator Theorem. Hence my asking for a practical example of something that it does apply to.

When you are considering finite tape length, how do you deal with the cell to the left when you are at the first cell, or the cell to the right when you are at the last cell?

2Optimization Process
I was imagining the tape wraps around! (And hoping that whatever results fell out would port straightforwardly to infinite tapes.)
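A minimal sketch of what the wrap-around convention might look like (the helper `step` is hypothetical, not from the original post): indices past either end are reduced modulo the tape length, so the cell to the left of position 0 is the last cell.

```python
def step(position: int, move: int, tape_length: int) -> int:
    # Move left (-1) or right (+1) on a finite tape that wraps around.
    return (position + move) % tape_length

# On a tape of length 8: left from cell 0 lands on cell 7, right from cell 7 on cell 0.
assert step(0, -1, 8) == 7
assert step(7, +1, 8) == 0
```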

Should there be a 'd' on the end of 'Debate' in the title or am I parsing it wrong? 

1omnizoid
Yes oops

It is meant to read 350°F. The point is that the temperature is too high for a useful domestic thermostat. I have changed the sentence to make this clear (and added a ° symbol). The passage now reads:

Scholten gives the evocative example of a thermostat which steers the temperature of a room to 350°F with a probability close to certainty. The entropy of the final distribution over room temperatures would be very low, so in this sense the regulator is still 'good', even though the temperature it achieves is too high for it to be useful as a domestic therm

... (read more)

Your reaction seems fair, thanks for your thoughts! It's a good suggestion to add an epistemic status - I'll be sure to add one next time I write something like this.

Got it, that makes sense. I think I was trying to get at something like this when I was talking about constraints/selection pressure (a system has less need to use abstractions if its compute is unconstrained or there is no selection pressure in the 'produce short/quick programs' direction) but your explanation makes this clearer. Thanks again for clearing this up!

Thanks for taking the time to explain this. This clears a lot of things up.

Let me see if I understand. So one reason that an agent might develop an abstraction is that it has a utility function that deals with that abstraction (if my utility function is ‘maximize the number of trees’, it's helpful to have an abstraction for ‘trees’). But the NAH goes further than this and says that, even if an agent had a very ‘unnatural’ utility function which didn’t deal with abstractions (e.g. it was something very fine-grained like ‘I value this atom being in this e... (read more)

6johnswentworth
All dead-on up until this: It's not quite that it's impossible to model the world without the use of natural abstractions. Rather, it's far instrumentally "cheaper" to use the natural abstractions (in some sense). Rather than routing through natural abstractions, a system with a highly capable world model could instead e.g. use exponentially large amounts of compute (e.g. doing full quantum-level simulation), or might need enormous amounts of data (e.g. exponentially many training cycles), or both. So we expect to see basically-all highly capable systems use natural abstractions in practice.

It's late where I am now so I'm going to read carefully and respond to comments tomorrow, but before I go to bed I want to quickly respond to your claim that you found the post hostile because I don't want to leave it hanging.

I wanted to express my disagreements/misunderstandings/whatever as clearly as I could but had no intention to express hostility. I bear no hostility towards anyone reading this, especially people who have worked hard thinking about important issues like AI alignment. Apologies to you and anyone else who found the post hostile.

2Raemon
Fwiw I didn't find the post hostile. 
1deepthoughtlife
When reading the piece, it seemed to assume far too much (and many of the assumptions are ones I obviously disagree with). I would call many of the assumptions made a relative of the false dichotomy (though I don't know what it is called when you present more than two possibilities as exhaustive but they really aren't). If you were more open in your writing to the idea that you don't necessarily know what the believers in natural abstractions mean, and that the possibilities mentioned were not exhaustive, I probably would have had a less negative reaction. When combined with a dismissive tone, many (me included) will read it as hostile, regardless of actual intent (though frustration is actually just as good a possibility for why someone would write in that manner, and genuine confusion over what people believe is also likely). People are always on the lookout for potential hostility it seems (probably a safety-related instinct) and usually err on the side of seeing it (though some overcorrect against the instinct instead). I'm sure I come across as hostile reasonably often when I write, though that is rarely my intent.
1Jonathan Claybrough
I don't actually think your post was hostile, but I think I get where deepthoughtlife is coming from. At the least, I can share how I felt reading this post and point out why, since you seem keen on avoiding the negative side. Btw, I don't think you can avoid causing any frustration in readers, they are too diverse, so don't worry too much about it either.

The title of the piece is strongly worded and there's no epistemic status disclaimer to state this is exploratory, so I actually came in expecting much stronger arguments. Your post is good as an exposition of your thoughts and a conversation starter, but it's not a good counterargument to NAH imo, so shouldn't be worded as such. Like deepthoughtlife, I feel your post is confused re NAH, which is totally fine when stated as such, but a bit grating when I came in expecting more rigor or knowledge of NAH.

Here's a reaction to the first part:
- In "Systems must have similar observational apparatus" you argue that different apparatus lead to different abstractions and claim a blind deaf person is such an example, yet in practice blind deaf people can manipulate all the abstractions others can (with perhaps a different inner representation); that's what general intelligence is about. You can check out this wiki page and video for some of how it's done: https://en.wikipedia.org/wiki/Tadoma . The point is that all the abstractions can be understood and must be understood by a general intelligence trying to act effectively, and in practice Helen Keller could learn to speak by using other senses than hearing, in the same way we learn all of physics despite limited native instruments.

I think I had similar reactions to other parts, feeling they were missing the point about NAH and some background assumptions. Thanks for posting!

Thanks for taking the time to explain this to me! I would like to read your links before responding to the meat of your comment, but I wanted to note something before going forward because there is a pattern I've noticed in both my verbal conversations on this subject and the comments so far.

I say something like 'lots of systems don't seem to converge on the same abstractions' and then someone else says 'yeah, I agree obviously' and then starts talking about another feature of the NAH while not taking this as evidence against the NAH.

But most posts on the ... (read more)

2Jonas Hallgren
Yes sir! So for me it is about looking at a specific type of system, or a specific type of system dynamics, that encodes the axioms required for the NAH to be true. So it is more the claim that "there is a specific set of mathematical axioms that can be used in order to get convergence towards similar ontologies, and these are applicable in AI systems."

For example, if one takes the Active Inference lens on looking at concepts in the world, we generally define the boundaries between concepts as Markov blankets. Surprisingly or not, Markov blankets are pretty great for describing not only biological systems but also AI and some economic systems. The key underlying invariant is that these are all optimisation systems: p(NAH|Optimisation System). So if, for example, through the perspective of Markov blankets or the "natural latents" (which are functionals that work like Markov blankets), we don't see convergence in how different AI systems represent reality, then I would say that the NAH has been disproven, or at least that it is evidence against it.

I do however think that this exists on a spectrum and that it isn't fully true or false; it is true for a restricted set of assumptions, the question being how restricted that is. I see it more as a useful frame for viewing agent cognition processes rather than something I'm willing to bet my life on. I do think it is pointing towards a core problem similar to what ARC Theory are working on, but in a different way: understanding the cognition of AI systems.

Hello! My name is Alfred. I recently took part in AI Safety Camp 2024 and have been thinking about the Agent-like structure problem. Hopefully I will have some posts to share on the subject soon.