eli_sennesh comments on MIRI's Approach - Less Wrong
Thanks for the reply, Jacob! You make some good points.
I endorse eli_sennesh's response to this part :-)
I am not under the impression that there are "exact solutions" available, here. For example, in the case of "building world-models," you can't even get "exact" solutions using AIXI (which does Bayesian inference using a simplicity prior in order to guess what the environment looks like, and can never figure it out exactly). And this is in the simplified setting where AIXI is large enough to contain all possible environments! We, by contrast, need to understand algorithms which allow you to build a model of the world that you're inside of; exact solutions are clearly off the table (and, as eli_sennesh notes, huge amounts of statistical modeling are on it instead).
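To make the "never exactly" point concrete, here is a minimal sketch of Bayesian inference under a simplicity prior. It is not AIXI: the three-hypothesis class, the description lengths, and the function names are all assumptions made up for illustration, and real AIXI mixes over all computable environments. The point is only that the posterior on the best hypothesis climbs toward 1 without ever reaching it.

```python
from fractions import Fraction

# Hypothetical toy hypothesis class (an assumption, not anything from AIXI itself):
# name -> (description length in bits, probability that the next observed bit is 1).
HYPOTHESES = {
    "fair": (1, Fraction(1, 2)),
    "mostly_ones": (2, Fraction(9, 10)),
    "mostly_zeros": (3, Fraction(1, 10)),
}


def simplicity_prior(hypotheses):
    """Weight each hypothesis by 2^(-description length), then normalize."""
    weights = {name: Fraction(1, 2 ** length) for name, (length, _) in hypotheses.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def update(posterior, hypotheses, bit):
    """One step of Bayes' rule on a single observed bit."""
    unnormalized = {}
    for name, prob in posterior.items():
        p_one = hypotheses[name][1]
        likelihood = p_one if bit == 1 else 1 - p_one
        unnormalized[name] = prob * likelihood
    total = sum(unnormalized.values())
    return {name: w / total for name, w in unnormalized.items()}


def posterior_after(observations, hypotheses=HYPOTHESES):
    """Start from the simplicity prior and condition on each observation in turn."""
    posterior = simplicity_prior(hypotheses)
    for bit in observations:
        posterior = update(posterior, hypotheses, bit)
    return posterior


if __name__ == "__main__":
    for name, prob in posterior_after([1] * 50).items():
        print(name, float(prob))
```

Exact rational arithmetic is used so that "strictly less than 1" is visible rather than hidden by floating-point rounding. No hypothesis assigns probability zero to any observation, so no finite amount of data can drive the alternatives to exactly zero.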
I would readily accept a statistical-modeling-heavy answer to the question of "but how do you build multi-level world-models from percepts, in principle?"; and indeed, I'd be astonished if you avoided it.
Perhaps you read "we need to know how to do X in principle before we do it in practice" as "we need a perfect algorithm that gives you bit-exact solutions to X"? That's an understandable reading; my apologies. Let me assure you again that we're not under the illusion you can get bit-exact solutions to most of the problems we're working on.
Hmm. If you have lots and lots of computing power, you can always just... not use it. It's not clear to me how additional computing power can make the problem harder -- at worst, it can make the problem no easier. I agree, though, that algorithms for modeling the world from the inside can't just extrapolate arbitrarily, on pain of exponential complexity; so whatever it takes to build and use multi-level world-models, it can't be that.
Perhaps the point where we disagree is that you think these hurdles suggest that figuring out how to do things we can't yet do in principle is hopeless, whereas I'm under the impression that these shortcomings highlight places where we're still confused?
Ohoho! Well, actually, Nate, I personally subscribe to the bounded-rationality school of thinking, and I do actually think this has implications for AI safety. Specifically: as the agent acquires more resources (speed and memory), it can handle larger problems and enlarge its impact on the world, so to make a bounded-rational agent safe, we should, hypothetically, be able to state safety properties explicitly in terms of how much cognitive stuff (philosophically, it all adds up to different ingredients to that magic word "intelligence") the agent has.
With some kind of framework like that, we'd be able to state and prove safety theorems in the form of, "This design will grow increasingly uncertain about its value function as it grows its cognitive resources, and act more cautiously until receiving more training, and we have some analytic bound telling us exactly how fast this fall-off will happen." I can even imagine it running along the simple lines of, "As the agent's model of the world grows more complicated, the entropy/Kolmogorov complexity of that model penalizes hypotheses about the learned value function, thus causing the agent to grow increasingly passive and wait for value training as it learns and grows."
This requires a framework for normative uncertainty that formalizes acting cautiously when under value-uncertainty, but didn't someone publish a thesis on that at Oxford a year or two ago?
Can I laugh maniacally at least a little bit now?
Well, as jacob_cannell pointed out, feeding more compute-power to a bounded-rational agent ought to make it enlarge its models in terms of theory-depth, theory-preorder-connectedness, variance-explanation, and time-horizon. In short: the branching factors and the hypothesis class get larger, making it harder to learn (if we're thinking about statistical learning theory).
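One hedged way to put a number on "harder to learn" is the textbook PAC bound for a finite hypothesis class in the realizable case: roughly (ln|H| + ln(1/δ)) / ε samples suffice to get error at most ε with probability at least 1 − δ. The sketch below assumes that framing, and assumes the hypothesis class is a tree with a given branching factor and depth; `pac_sample_bound` and `tree_hypothesis_count` are names invented here for illustration.

```python
import math


def pac_sample_bound(num_hypotheses, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite class (realizable case).

    Standard bound: m >= (ln|H| + ln(1/delta)) / epsilon.
    """
    if num_hypotheses < 1:
        raise ValueError("need at least one hypothesis")
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(num_hypotheses) + math.log(1 / delta)) / epsilon)


def tree_hypothesis_count(branching_factor, depth):
    """Assumed toy model: one hypothesis per leaf of a uniform search tree."""
    return branching_factor ** depth


if __name__ == "__main__":
    for depth in (10, 20, 40):
        size = tree_hypothesis_count(2, depth)
        print(depth, pac_sample_bound(size, epsilon=0.1, delta=0.05))
```

Under these assumptions the data requirement grows with the logarithm of the class size, so deeper or bushier theories need more samples for the same guarantee. This is only an upper bound on what suffices, not a claim about what any particular agent design would need.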
There's also the specific issue of assuming Turing-machine-level compute power, i.e. assuming that "available compute steps" and "available memory" are each an unbounded but finite natural number. Since you've not bounded the number, it's effectively infinite, which of course means that two agents, each of which is "programmed" as a Turing machine with Turing-machine resources rather than strictly finite resources, can't reason about each other: either one would need ordinal numbers to think about what the other (or itself) can do, but actually using ordinal numbers in that analysis would be necessarily wrong (in that neither actually possesses a Turing oracle, which is equivalent to having ω, the first infinite ordinal, steps of computation).
So you get a bunch of paradox theorems making your job a lot harder.
In contrast, starting from the assumption of having strictly finite computing power is like when E.T. Jaynes starts from the assumption of having finite sample data, finite log-odds, countable hypotheses, etc.: we assume what must necessarily be true in reality to start with, and then analyze the infinite case as passing to the limit of some finite number. Pascal's Mugging is solvable this way using normal computational Bayesian statistical techniques, for instance, if we assume that we can sample outcomes from our hypothesis distribution.
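As a rough sketch of one possible reading of the sampling idea (the probability, the payoff, the cost, and the sample budget below are all made-up illustrative numbers, and `sampled_expected_utility` is a name invented here): a bounded agent that estimates expected utility from a finite number of sampled outcomes behaves very differently from one that sums the exact expectation, because the astronomically unlikely branch almost never shows up in the sample.

```python
import random

# Illustrative assumptions only: the mugger is honest with probability 1e-12,
# an honest mugger delivers a payoff of 1e100, and paying costs 5 either way.
P_HONEST = 1e-12
PAY_OUTCOMES = [
    (P_HONEST, 1e100 - 5.0),   # mugger was telling the truth
    (1.0 - P_HONEST, -5.0),    # mugger was lying; we just lose the money
]


def exact_expected_utility(outcomes):
    """Probability-weighted sum over every outcome, however unlikely."""
    return sum(prob * utility for prob, utility in outcomes)


def sampled_expected_utility(outcomes, num_samples, rng):
    """Monte Carlo estimate: draw outcomes by probability and average their utilities."""
    utilities = [utility for _, utility in outcomes]
    weights = [prob for prob, _ in outcomes]
    draws = rng.choices(utilities, weights=weights, k=num_samples)
    return sum(draws) / num_samples


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print("exact:  ", exact_expected_utility(PAY_OUTCOMES))
    print("sampled:", sampled_expected_utility(PAY_OUTCOMES, 10_000, rng))
```

With a budget of 10,000 samples, the chance of drawing the honest-mugger branch even once is about one in a hundred million, so the sampled estimate of paying comes out negative and the agent declines, while the exact sum is dominated by the huge payoff. Whether that counts as solving the problem or merely sidestepping it is exactly the sort of thing the finite-resources framing would have to argue for.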