faul_sname - LessWrong

Can you give a concrete example of a situation where you'd expect this sort of agreed-upon-by-multiple-parties code to be run, and what that code would be responsible for doing? I'm imagining something along the lines of "given a geographic boundary, determine which jurisdictions that boundary intersects for the purposes of various types of tax (sales, property, etc)". But I don't know if that's wildly off from what you're imagining.

Planning in a Lattice Graph

faul_sname5h20

Fun side note: in this particular example, it doesn't actually matter how you pick your direction. "Choose the axis closest to the target direction" performs exactly as well as "choose any edge which does not make the target node unreachable when traversed at random, and then traverse that edge" or "choose the first edge where traversing that edge does not make the target node unreachable, and traverse that edge".

faul_sname's Shortform

faul_sname14h42

So I keep seeing takes about how to tell if LLMs are "really exhibiting goal-directed behavior" like a human or whether they are instead "just predicting the next token". And, to me at least, this feels like a confused sort of question that misunderstands what humans are doing when they exhibit goal-directed behavior.

Concrete example. Let's say we notice that Jim has just pushed the turn signal lever on the side of his steering wheel. Why did Jim do this?

The goal-directed-behavior story is as follows:

Jim pushed the turn signal lever because he wanted to alert surrounding drivers that he was moving right by one lane
Jim wanted to alert drivers that he was moving one lane right because he wanted to move his car one lane to the right.
Jim wanted to move his car one lane to the right in order to accomplish the goal of taking the next freeway offramp
Jim wanted to take the next freeway offramp because that was part of the most efficient route from his home to his workplace
Jim wanted to go to his workplace because his workplace pays him money
Jim wants money because money can be exchanged for goods and services
Jim wants goods and services because they get him things he terminally values like mates and food

But there's an alternative story:

When in the context of "I am a middle-class adult", the thing to do is "have a job". Years ago, this context triggered Bob to perform the action "get a job", and now he's in the context of "having a job".
When in the context of "having a job", "showing up for work" is the expected behavior.
Earlier this morning, Bob had the context "it is a workday" and "I have a job", which triggered Bob to begin the sequence of actions associated with the behavior "commuting to work"
Bob is currently approaching the exit for his work - with the context of "commuting to work", this means the expected behavior is "get in the exit lane", and now he's in the context "switching one lane to the right"
In the context of "switching one lane to the right", one of the early actions is "turn on the right turn signal by pushing the turn signal lever". And that is what Bob is doing right now.

I think this latter framework captures some parts of human behavior that the goal-directed-behavior framework misses out on. For example, let's say the following happens

Jim is going to see his good friend Bob on a Saturday morning
Jim gets on the freeway - the same freeway, in fact, that he takes to work every weekday morning
Jim gets into the exit lane for his work, even though Bob's house is still many exits away
Jim finds himself pulling onto the street his workplace is on
Jim mutters "whoops, autopilot" under his breath, pulls a u turn at the next light, and gets back on the freeway towards Bob's house

This sequence of actions is pretty nonsensical from a goal-directed-behavior perspective, but is perfectly sensible if Jim's behavior here is driven by contextual heuristics like "when it's morning and I'm next to my work's freeway offramp, I get off the freeway".

Note that I'm not saying "humans never exhibit goal-directed behavior".

Instead, I'm saying that "take a goal, and come up with a plan to achieve that goal, and execute that plan" is, itself, just one of the many contextually-activated behaviors humans exhibit.

I see no particular reason that an LLM couldn't learn to figure out when it's in a context like "the current context appears to be in the execute-the-next-step-of-the-plan stage of such-and-such goal-directed-behavior task", and produce the appropriate output token for that context.

Planning in a Lattice Graph

faul_sname1d60

Easier question: Let's say you have a single node in this graph of nodes. You want to figure out where that single node should be embedded in your 100-dimensional space, but you only care about its embedding location relative to a few specific other nodes.

You have the following affordances:

List the edges that originate at a node. The edges will always be returned in the same order for the same node, but the order is not necessarily the same for different nodes (i.e. the first edge may point along the 57th axis for one node and in the 22nd axis for a different node)
Given an edge, retrieve the node at the far end of that edge
Compare two nodes to see if they are the same node as each other

That is to say, if you have the following problem definition

import random

class Node:
    key = None
    edges = None

    def __init__(self):
        self.edges = []

class Edge:
    _src = None
    _get_dst = None
    _dst = None

    def __init__(self, src, get_dst):
        self._src = src
        self._get_dst = get_dst

    def get_dst(self):
        if self._dst is None:
            self._dst = self._get_dst()
        return self._dst

class Graph:
    def __init__(self, axis_length, n_dims):
        self.axis_length = axis_length
        self.n_dims = n_dims
        self._nodes = {}
        self._next_node_id = 1

    def get_node_at(self, coords):
        axis_order = list(range(self.n_dims))
        random.shuffle(axis_order)
        if coords not in self._nodes:
            node = Node()
            node.key = self._next_node_id
            self._next_node_id += 1
            for axis in axis_order:
                if coords[axis] == 0:
                    continue
                dst_coords = list(coords)
                dst_coords[axis] -= 1
                dst_coords = tuple(dst_coords)
                def make_edge(dst_coords):
                    def get_dst():
                        return self.get_node_at(list(coords))
                    return Edge(node, lambda: self.get_node_at(dst_coords))
                edge = make_edge(dst_coords)
                node.edges.append(edge)
            self._nodes[coords] = node
        return self._nodes[coords]

    def get_random_node(self):
        return self.get_node_at(tuple([random.randint(0, self.axis_length-1) for _ in range(self.n_dims)]))

and you want a function which will take an arbitrary node and give you the coordinates of that node in a consistent basis in finite time with arbitrarily high probability of correctness

class ComputedBasis:
	def __init__(self):
		self.node_positions_by_key = {}

	def get_coords(node):
    	# Given a node, give the coordinates of that node in some
    	# consistent basis 
    	pass

I claim that this is indeed possible to do, and the steps to do it look nothing like "compute $10^{100}$ things".

Edit: To be explicit about the motivation, once we define this function, we can find a path from our position to the sandwich using something like

def path_to_sandwich(my_node, sandwich_node):
    basis = ComputedBasis()
    my_coords = basis.get_coords(my_node)
    sandwich_coords = basis.get_coords(sandwich_node)
    for axis, (my_pos, sandwich_pos) in zip(my_coords, sandwich_coords):
        if my_pos < sandwich_pos:
           raise(f"""
               Can't get to sandwich from here!
               I can only travel towards the origin on each axis.
                   axis: {axis}
                   my_pos: {my_pos}
                   sandwich_pos: {sandwich_pos}
           """)
     return get_path(basis, my_node, sandwich_node)

def get_path(basis, start_node, goal_node):
    curr_node = start_node
    path = [curr_node]
    goal_coords = basis.get_coords(goal_node)
    while curr_node != goal_node:
        curr_coords = basis.get_coords(curr_node)
    	# Find the first axis where we need to move towards the goal along that axis.
    	for axis, (curr_pos, goal_pos) in zip(cur_coords, goal_coords):
    	    if curr_pos > goal_pos:
    	         step_coords = [p for p in curr_pos]
    	      	 step_coords[axis] -= 1
    	      	 step_coords = tuple(step_coords)
    	      	 break
    	for edge in curr_node.edges:
    	    dst_node = edge.get_dst()
    	    dst_coords = basis.get_coords(dst_node)
    	    if dst_coords == step_coords:
    	    	step_node = dst_node
    	    	break
    	curr = step_node
    	path.append(curr)
    return path

Note that my framing of the problem is slightly different, in that (0, 0, 0, ..., 0, 0, 0) is the point from which there are no outbound edges, rather than (10, 10, 10, ..., 10, 10, 10) in your version. Doesn't really make a difference logically, just makes the code more readable.

Johannes C. Mayer's Shortform

faul_sname1d20

In that post, you say that you have a graph of vertices with a particular structure. In that scenario, where is that structured graph of $10^{100}$ vertices coming from? Presumably there's some way you know the graph looks like this

rather than looking like this

If you know that your graph is a nice sparse graph that has lots of symmetries, you can take advantage of those properties to skip redundant parts of the computation (and when each of your $10^{100}$ nodes has at most 100 inbound edges and 100 outbound edges, then you only have on the order of a trillion distinct nodes (if we consider e.g. (0, 0, 0, ..., 0, 0, 1) to be identical to (0, 0, 0, ..., 0, 1, 0, ..., 0, 0, 0)).

It's probably worth looking at the process which is generating this graph, and figuring out if we can translate the output of that process directly to a coordinate in our 100-dimensional space without going through the "translate the output to a graph, and then embed that graph" intermediate step.

Goal oriented cognition in "a single forward pass"

faul_sname2d42

I think the probability of getting the exact continuation "a a a a a ..." is genuinely higher than the probability of getting the exact continuation "little girl who was born with a very special gift...", though getting a continuation in the class of "a a a a a..." is much lower-probability than getting a continuation in the class of "little girl who was born with a very special gift..", because the latter class has a much larger possibility space than the former. So there might be 1e4 different low-entropy length-32 completions with an average probability of 1e-10 each, and 9.999999e15 different high-entropy length-32 completions with an average probability of 1e-16. This adds up to normality in that if you were to randomly sample this distribution, you'd get a weird low-entropy output one time in a million, and a normal high-entropy output the other 999999 times in a million. But if you try to do something along the lines of "take the best K outputs and train the model on those", you'll end up with almost entirely weird low-entropy outputs.

But yeah, I think I misunderstood your proposal as something along the lines of "take the k most probable n-token outputs" rather than "take the k% most probable n-token outputs" or "randomly sample a bunch of n-token outputs".

Goal oriented cognition in "a single forward pass"

faul_sname2d20

And I think one way to create a 2-token reasoner is to generate all plausible completions of 2 tokens, and then propagate the joint loss of the log-probs of those two tokens.

I think this just doesn't work very well, because it incentivizes the model to output a token which makes subsequent tokens easier to predict, as long as the benefit in predictability of the subsequent token(s) outweighs the cost of the first token. Concretely, let's say you have the input "Once upon a time, there was a" and you want 32 tokens. Right now, davinci-002 will spit out something like [" little"," girl"," who"," was"," born"," with"," a"," very"," special"," gift","."," She"," could"," see"," things"," that"," others"," could"," not","."," She"," could"," see"," the"," future",","," and"," she"," could"," see"," the"," past"], with logprobs of [-2.44, -0.96, -0.90, ..., -0.28, -0.66, 0.26], summing to -35.3. But if instead, it returned [" a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"," a"], it would have logprobs like [-9.32, -7.77, -1.51, ..., -0.06, -0.05, -0.05], summing to -23.5. And indeed, if you could somehow ask a couple quadrillion people "please write a story starting with Once upon a time, there was a", I suspect that at least 1 in a million people would answer with low-entropy completions along the lines of a a a a ... (and there just aren't that many low-entropy completions). But "Once upon a time there was a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a" is not a very good completion, despite being a much higher-probability completion.

You could use a more sophisticated loss function that "sum of individual-token logprob", but I think that road leads towards PPO (nothing says that your criterion has to be "helpful/harmful/honest as judged by a human rater" though).

Johannes C. Mayer's Shortform

faul_sname2d30

Do you need to store information about each object? If so, do you need to do so before or after the operation.

If you need to store information about each object before processing (let's say 1 bit per object, for simplicity), the Landauer limit says you need of mass to store that information (at the current cosmic microwave background temperature of 2.7ºK). That's about a factor of $10^{7}$ more than the mass of the observable universe, so the current universe could not even store $10^{100}$ bits of information for you to perform your operation on in the first place.

I think if you're willing to use all the mass in the universe and wait a trillion billion years or so for the universe to cool off, you might be able store one bit of output per operation for $10^{100}$ operations, assuming you can do some sort of clever reversible computing thing to make the operations themselves approximately free.

Is there some specific computation you are thinking of that is useful if you can do it $10^{100}$ times but not useful if you can only do it $10^{50}$ times?

niplav's Shortform

faul_sname4d40

Ideally one would want to be able to compute the logical correlation without having to run the program.

I think this isn't possible in the general case. Consider two programs, one of which is "compute the sha256 digest of all 30 byte sequences and halt if the result is 9a56f6b41455314ff1973c72046b0821a56ca879e9d95628d390f8b560a4d803" and the other of which is "compute the md5 digest of all 30 byte sequences and halt if the result is 055a787f2fb4d00c9faf4dd34a233320".

Any method that was able to compute the logical correlation between those would also be a program which at a minimum reverses all cryptograhic hash functions.

Cohesion and business problems

faul_sname6d40

Your POS system exports data that your inventory software imports and uses. But I strongly suspect that this is often not possible in practice.

This sounds like exactly the sort of problem that a business might pay for a solution to, particularly if there is one particular pair of POS system / inventory software that is widely used in the industry in question, where those pieces of software don't natively play well together.

LESSWRONG
LW

Posts

Wiki Contributions

Comments