Nicolas Villarreal

Thank you for the substantive response. I do think there are a few misunderstandings here of what I'm saying.

 There need not be one best world state, and a world state need not be distinguishable from all others - merely from some of them. (In fact, utility function yielding a real value compresses the world into a characteristic of things we care about in such a way.)

I'm not talking about world states that exist "out there" in the territory (whether those exist at all is debatable anyway); I'm talking about world states that exist within the agent, compressed however the agent likes. Within the agent, each world state the agent considers as a possible goal is distinguished from the others in order for it to be meaningful in some way. The distinguishing characteristics can be decided by the agent in an arbitrary way.

Your series of posts also assume that signs have a fixed order. This is false. For instance, different fields of mathematics treat real number as either first order signs (atomic objects) or higher-order ones, defined as relations on rational numbers.

It is no coincidence that those definitions are identical; you cannot assume that if something is expressible using higher order signs, is not also expressible in lower order.

So when I'm talking about signs, I'm talking about signifier/signified pairs. When we're talking about real numbers, for example, we're talking about one signifier with two different signifieds, and therefore two different signs. I talk about exactly this issue in my last post:

When your goal is to create something new, something novel, your goal is necessarily a higher order sign. Things which do not yet exist cannot be directly represented as a first order sign. And how do we know that this thing which doesn't yet exist is the thing we seek? The only way is through reference to other signs, hence making it a higher order sign. For example, when we speak of a theory of quantum gravity, we are not speaking the name of an actual theory, but the theory which fulfills a role within the existing scientific framework of physics. This is different from known signs that are the output of an operation, for example a specific number that is the answer to a math question; in these cases sign function collapse is possible (we can think of 4 as either a proper name of a concept, or merely as the consequence of a certain logical rule).

As I say, most signifiers do have both associated first order and higher order signs! But these are /not/ the same thing; they are not equivalent from an information perspective, as you say they are. If you know the first order sign, there's no reason you would automatically know the corresponding higher order sign, and vice versa, as I show in the excerpt from my most recent blog post above.

My argument specifically hinges on whether it's possible for an agent to have final goals without higher order signs: it's not, precisely because first order and higher order signs do not contain the same information. 

Engaging with the perspective of orthogonality thesis itself: rejecting it means that a change in intelligence will lead, in expectation, to change in final goals. Could you name the expected direction of such a change, like "more intelligent agents will act with less kindness"?

I couldn't name a specific direction, but what I would say is that agents of similar intelligence in similar environments will tend towards similar final goals. Otherwise, I generally agree with this post on the topic: https://unstableontology.com/2024/09/19/the-obliqueness-thesis/

I don't think your dialectical reversion back to randomista logic makes much sense, considering we can't exactly run randomized controlled trials to settle any of the major questions of the social sciences. If you want to promote social science research, I think the best thing you could do is collect consistent statistics over long periods of time. You can learn a lot about modern societies just by learning how national accounts work and looking back at them in many different ways. Alternatively, building agent-based simulations lets you test, in flexible ways, how different types of behavior, both heterogeneous and homogeneous, might affect macroscopic social outcomes. These are the techniques I use, and they've proven very helpful.
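To make the agent-based point concrete, here is a minimal sketch of the kind of simulation I have in mind: a toy exchange economy where agents differ only in their propensity to spend, and we look at a macroscopic outcome (wealth inequality). The model and every name in it are illustrative assumptions, not taken from any particular study.

```python
# Toy agent-based model: how individual spending behavior shapes a
# macroscopic outcome (wealth inequality). Purely illustrative.
import random

def gini(values):
    """Gini coefficient of a list of non-negative wealths."""
    xs = sorted(values)
    n = len(xs)
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * sum(xs)) - (n + 1) / n

def simulate(propensities, periods=2000, seed=0):
    """Each period every agent spends a fraction of its wealth,
    paid to a randomly chosen agent (a crude exchange economy)."""
    rng = random.Random(seed)
    wealth = [100.0] * len(propensities)
    n = len(wealth)
    for _ in range(periods):
        for i in range(n):
            spend = wealth[i] * propensities[i]
            wealth[i] -= spend
            wealth[rng.randrange(n)] += spend
    return wealth

n = 200
homogeneous = [0.5] * n                    # everyone spends half each period
rng = random.Random(1)
heterogeneous = [rng.uniform(0.1, 0.9) for _ in range(n)]  # mixed behavior

print("Gini, homogeneous spending:  ", round(gini(simulate(homogeneous)), 3))
print("Gini, heterogeneous spending:", round(gini(simulate(heterogeneous)), 3))
```

The point isn't the particular numbers; it's that you can vary the micro-level behavioral assumptions and directly observe what happens to the macro-level statistic.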

If there's one other thing you're missing, it's this: epistemology isn't something you can rely on others for, even when trying to triangulate between different viewpoints. You always have to do your own epistemology, because every way of knowing you encounter in society is part of someone's ideological framework trying to adversarially draw you into it.

(The specifics of your postulated definition, especially that recursion->intelligence, seems like a not-very-useful way to define things, since Turing completeness probably means that once you clear a fairly low bar, your amount of possible recursion is just a measure of your hardware, when we usually want 'intelligence' to also capture something about your software. But the more standard information-theoretic notion of coding for a goal within a world-model would also say that bigger world models need (on average) bigger codes.)

So it might be a bit confusing, but by recursion here I did not mean how many loops you run in a program; I meant what order of signs you can create and store, which is a statement about software. Basically, how many signs you can meaningfully connect to one another. Not all hardware can represent higher order signs; an easy example is a single-layer versus a multilayer perceptron. Perhaps recursion was the wrong word, but at the time I was thinking about how a sign can refer to another sign that refers to another sign and so on, creating a chain of signifiers which is still meaningful so long as the higher order signs refer to more than one lower order sign.
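To make the perceptron example concrete (my own toy illustration, not something from the original posts): XOR is a relation over two inputs that no single-layer perceptron can represent, while a perceptron with one hidden layer can, even with fixed, hand-chosen weights.

```python
# XOR: no single linear threshold unit computes it, but a tiny
# two-layer network with fixed, hand-chosen weights does.
import itertools
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR of the two inputs

# Sanity check of the classical result: brute-force a grid of
# single-layer weights and biases; none reproduce XOR on all four inputs.
grid = np.linspace(-2.0, 2.0, 21)
single_layer_solves_xor = any(
    np.array_equal(((X @ np.array([w1, w2]) + b) > 0).astype(int), y)
    for w1, w2, b in itertools.product(grid, grid, grid)
)
print("single-layer solution found:", single_layer_solves_xor)  # False

# Two hidden units compute OR and AND; the output unit computes
# "OR and not AND", which is exactly XOR.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])  # weights into the two hidden units
b1 = np.array([-0.5, -1.5])              # thresholds: OR unit, AND unit
W2 = np.array([1.0, -1.0])               # output weights: OR minus AND
b2 = -0.5

hidden = ((X @ W1 + b1) > 0).astype(int)
output = ((hidden @ W2 + b2) > 0).astype(int)
print("two-layer output:", output.tolist(), "target:", y.tolist())
```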

When we're taking the human perspective, it's fine to say "the smarter agent has such a richer and more complex conception of its goal," without that implying that the smarter agent's goal has to be different than the dumber agent's goal.

The point of bringing semiotics into the mix here is to show that the meaning of a sign, such as a goal, is dependent on the things we associate with it. The human perspective is just a way of expressing that goal at one moment in time, with our specific associations attached to it.

a) Actions like exploration or "play" could be derived (instrumental) behaviors, rather than final goals. The fact that exploration is given as a final goal in many present-day AI systems is certainly interesting, but isn't very relevant to the abstract theoretical argument.

In my follow-up post I actually show the way in which it is relevant.

b) Even if you assume that every smart AI has "and also, explore and play" as part of its goals, doesn't mean the other stuff can't be alien.

The argument about alien values isn't the logical one but the statistical one: any AI situated in human culture will have values that are likely to be related to the signs created and used by that culture, although we can expect outliers.

As for the orthogonality thesis, my first goal was to dispute its logic, but I think there are also some very practical lessons here. From what I can tell, the limit on intelligence created by an inability to create higher order values kicks in at a pretty basic level, and relates to the limits we see emerge in all current machine learning and LLM-based AI on out-of-distribution tasks. Up till now, we've just found ways to procure more data to train on, but if machine agents can never be arbitrarily curious the way humans are, by making higher order signs their goals, then they'll never be more generally intelligent than us.

I think we're presently seeing where that relationship, specifically between compute and intelligence, is breaking down: while it's difficult to see what's happening inside the top AI companies, it seems like they're developing new systems and techniques, not just scaling up the same stuff anymore. In principle, though, I'm not sure it's possible to know in advance when such a correlation will break down, unless you have a deeper model of the relationship between those correlations (first order signs) and the higher level concept in question, which in this case we do not have.

Yes. And in one sense that is trivial: there are plenty of algorithms you can run on extremely large compute that do not lead to intelligent behavior. But in another sense it is non-trivial, because all the algorithms we have that essentially "create maps", that is, representations of some reality, need to have the domain they're supposed to represent or learn specified for them. To create arbitrary domains, an agent needs to make second order signs its goal (see my last post).
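As a toy illustration of what I mean by the domain being specified from outside (my own example, with made-up names, not anything from the posts): even the most generic supervised learner is handed its input and output spaces by whoever frames the problem; nothing in the algorithm chooses what domain it maps.

```python
# Even a maximally generic learner only ever "creates a map" over a
# domain someone else specified: the feature schema of X and the label
# set of y are fixed before learning starts, and the learner cannot
# step outside them.
import numpy as np

def fit_nearest_neighbor(X, y):
    """Return a 1-nearest-neighbor classifier over the domain implicitly
    defined by X's feature schema and y's label set."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    def predict(x):
        x = np.asarray(x, dtype=float)
        assert x.shape == X[0].shape, "query must live in the specified domain"
        return y[np.argmin(np.linalg.norm(X - x, axis=1))]

    return predict

# The "domain" here, two numeric features and labels in {"cat", "dog"},
# was chosen by the human framing the problem, not by the algorithm.
clf = fit_nearest_neighbor([[1.0, 0.2], [0.1, 0.9]], ["cat", "dog"])
print(clf([0.9, 0.3]))  # -> cat
```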

You can reason about a topic without "the math"; if anyone can poke actual holes in my logic, I would love to hear a substantial objection. It also wasn't my intention to write nothing but an introduction to semiotics in terms of math.

Also, semiotics was deeply connected to information theory and cybernetics; the fundamental characteristics of signs I bring up here as being distinct come from the cybernetic construct of variety.
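For readers who don't know the term, a minimal sketch of variety in Ashby's sense: the number of distinguishable states of a set, often expressed as its base-2 logarithm, which a deterministic mapping can preserve or reduce but never increase. The weather example below is mine and purely illustrative.

```python
# Ashby's "variety": the number of distinguishable states of a set,
# often expressed as log2 of that count. Mapping states through a
# coarser code can only preserve or reduce variety, never increase it.
from math import log2

def variety(states):
    """Variety in bits: log2 of the number of distinguishable states."""
    return log2(len(set(states)))

states = ["rain", "sun", "snow", "fog"]                  # 4 distinguishable states
code = {"rain": "bad", "sun": "good", "snow": "bad", "fog": "bad"}
coded = [code[s] for s in states]                        # deterministic re-description

print(variety(states))  # 2.0 bits
print(variety(coded))   # 1.0 bit: the coarser code collapsed distinctions
```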

Seems that Claude, besides not really understanding the arguments I'm making, argues from assertion much more than I do.