The quotes are correct in the sense that "P implies P" is correct; that is, the authors postulate the existence of an entity constructed in a certain way so as to have certain properties, then argue that it would indeed have those properties. True, but not necessarily consequential, as there is no compelling reason to believe in the future existence of an entity constructed in that way in the first place. Most humans aren't like that, after all, and neither are existing or in-development AI programs; nor is it a matter of lacking "intelligen...
Saying that there is an agent refers (in my view; this is the definition I'll use for this thread) to a situation where future events are in some sense expected to be optimized according to some goals, to the extent that certain other events ("actions") control those future events. There might be many sufficient conditions for that in terms of particular AI designs, but they should all amount to this expectation.
So an agent is already associated with goals in terms of its actual effect on its environment. Given that agent's own future state (design) is an easily controlled pa...
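To make that definition concrete, here is a minimal toy sketch (my own construction; the world model, goal function, and numbers are arbitrary placeholders, not anyone's actual design). The only sense in which the thing below is an "agent" is that its choice of action makes sampled futures score higher under some goal:

```python
import random

# Toy illustration of the definition above (my construction; world model, goal,
# and numbers are placeholders): calling something an "agent" means we expect
# future events to score higher under some goal, to the extent its "actions"
# control those events.

def expected_goal_score(action, world_model, goal, samples=1000):
    """Average goal score over sampled futures that follow from `action`."""
    return sum(goal(world_model(action)) for _ in range(samples)) / samples

def act_as_agent(actions, world_model, goal):
    """Pick the action whose expected future best satisfies the goal."""
    return max(actions, key=lambda a: expected_goal_score(a, world_model, goal))

# Hypothetical stand-ins: a noisy world model and a goal that prefers futures near 10.
world_model = lambda a: a + random.gauss(0, 1)   # action -> one sampled future event
goal = lambda outcome: -abs(outcome - 10)        # higher score closer to 10

print(act_as_agent(actions=range(20), world_model=world_model, goal=goal))  # usually 10
```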
Let's start with the template for an AGI, the seed for a generally intelligent expected-utility maximizer capable of recursive self-improvement.
As far as I can tell, the implementation of such a template would do nothing at all because its utility-function would be a "blank slate".
What happens if you now encode the computation of Pi in its utility-function? Would it reflect on this goal and try to figure out its true goals? Why would it do so? Where does the incentive come from?
Would complex but implicit goals change its behavior? Why would it i...
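To make these questions concrete, here is a toy "seed maximizer" skeleton (my own sketch, not a quote from any actual design). The point it tries to illustrate: the outer loop only ever consults whatever utility function it was handed. With a blank-slate utility it has no reason to act; with a pi-computing utility it just computes Pi; nothing in it supplies an incentive to reflect on "true goals":

```python
# Toy "seed maximizer" skeleton (my sketch, not anyone's actual design). The loop
# only ever consults the utility function it was handed; there is no machinery
# that would make it reflect on "true goals" beyond that function.

def run(utility, steps=5):
    state = {"pi_digits": 0}
    actions = {
        "idle":       lambda s: dict(s),
        "compute_pi": lambda s: {**s, "pi_digits": s["pi_digits"] + 1},
    }
    log = []
    for _ in range(steps):
        # Pick the action whose simulated result the utility function rates highest.
        name, effect = max(actions.items(), key=lambda kv: utility(kv[1](state)))
        if utility(effect(state)) <= utility(state):  # indifferent: no reason to act
            break
        state = effect(state)
        log.append(name)
    return log

blank_slate = lambda s: 0               # rates every state equally ("do nothing at all")
pi_goal     = lambda s: s["pi_digits"]  # utility = how far the computation of Pi has gotten

print(run(blank_slate))  # [] : indifferent, never acts
print(run(pi_goal))      # ['compute_pi', 'compute_pi', ...] : just keeps computing Pi
```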
We humans don't exhibit a lot of goal-directed behavior
Do you not count reward-seeking / reinforcement-learning / AIXI-like behavior as goal-directed behavior? If not, why not? If you do, then it doesn't seem possible to build an AI that makes intelligent decisions without a goal-directed architecture.
A superintelligence might be able to create a jumble of wires that happen to do intelligent things, but how are we humans supposed to stumble onto something like that, given that all existing examples of intelligent behavior and theories about intelligent decision...
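For what it's worth, even a minimal epsilon-greedy bandit learner (standard textbook reinforcement learning; this toy version is mine) looks goal-directed from the outside without thinking verbally about goals at all: its pulls steer toward whichever arm pays off most.

```python
import random

# Minimal epsilon-greedy bandit (standard textbook reinforcement learning; this toy
# version is mine). Nothing here "thinks verbally" about goals, yet from the outside
# its behavior is goal-directed: pulls concentrate on the best-paying arm.

def bandit(payoffs, steps=5000, eps=0.1):
    estimates = [0.0] * len(payoffs)  # running estimate of each arm's payoff rate
    counts = [0] * len(payoffs)
    for _ in range(steps):
        if random.random() < eps:
            arm = random.randrange(len(payoffs))                         # explore
        else:
            arm = max(range(len(payoffs)), key=lambda i: estimates[i])   # exploit
        reward = 1.0 if random.random() < payoffs[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]        # update running mean
    return counts

# Arm 2 pays off most often, so most pulls end up there without any explicit
# representation of "my goal is arm 2".
print(bandit(payoffs=[0.2, 0.5, 0.8]))
```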
if you design an AI using "shallow insights" without an explicit goal-directed architecture - some program that "just happens" to make intelligent decisions that can be viewed by us as fulfilling certain goals
I think that's where you're looking at it differently from Eliezer et al. I think that Eliezer, at least, is talking about an AI which has goals but does not, when it starts modifying itself, understand itself well enough to keep them stable. Once it gets good enough at self-modification to keep its goals stable, it will do...
I'm finding it hard to imagine an agent that can get a diversity of difficult things done in a complex environment without forming goals and subgoals, which sounds to me like a requirement of general intelligence. AGI seems to require many-step plans and planning seems to require goals.
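As an illustration of that "many-step plans need goals" point, here is a toy breadth-first planner (mine, not specific to any AGI design). Even in this tiny world, getting the final thing done means holding a goal state and chaining actions toward it, with the intermediate states acting as subgoals:

```python
from collections import deque

# Toy breadth-first planner (my illustration, not specific to any AGI design).
# Even this tiny world requires holding a goal and chaining actions; the
# intermediate states on the returned plan are the subgoals.

ACTIONS = {  # state -> {action_name: resulting_state}
    "start":        {"get_wood": "has_wood"},
    "has_wood":     {"build_raft": "has_raft"},
    "has_raft":     {"cross_river": "across_river"},
    "across_river": {"pick_fruit": "has_fruit"},
}

def plan(state, goal):
    frontier, seen = deque([(state, [])]), {state}
    while frontier:
        current, steps = frontier.popleft()
        if current == goal:
            return steps
        for action, nxt in ACTIONS.get(current, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + [action]))
    return None

print(plan("start", "has_fruit"))
# ['get_wood', 'build_raft', 'cross_river', 'pick_fruit']
```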
The Omohundro quote sounds like what humans do. If humans do it, machines might well do it too.
The Yudkowsky quote seems more speculative. It assumes that values are universal and don't need to adapt to local circumstances. This would be in contrast to what has happened in evolution so far - where there are many creatures occupying different niches, and the organisms (and their values) adapt to those niches.
I understood Omohundro's Basic AI Drives as applying only to successful (although not necessarily Friendly) GAI. If a recursively self-improving GAI had massive value drift with each iterative improvement in its ability to achieve its values, it'd end up just flailing around, carrying out a stochastic series of actions with superhuman efficiency.
I think the Eliezer quote is predicated on the same sort of idea--that you've designed the AI to attempt to preserve its values; you just did it imperfectly. Assuming the value of value preservation isn't among the ones...
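A crude numerical caricature of that flailing (my own, not the commenter's): if each "self-improvement" step randomly perturbs the value function, the resulting action choices stop tracking any fixed goal.

```python
import random

# Crude caricature of value drift (my own, not the commenter's). Each
# "self-improvement" step randomly perturbs the value function, so the
# "preferred" action jumps around instead of converging on anything.

random.seed(0)
actions = list(range(10))
values = [random.random() for _ in actions]   # initial, arbitrary value per action

choices = []
for step in range(20):
    choices.append(max(actions, key=lambda a: values[a]))
    values = [v + random.gauss(0, 5.0) for v in values]   # massive drift each iteration

print(choices)  # a stochastic series of "best" actions, efficiently executed
```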
But why would AIs behave the same way if they don't think verbally?
Part of the problem, it appears to me, is that you're ascribing a verbal understanding to a mechanical process. Consider: for AIs to have values, those values must be 'stored' in a medium compatible with their calculations.
However, once an AI begins to 'improve' itself -- that is, once an AI has as an available "goal" the ability to form better goals -- then it's going to base its decisions about what counts as an improved goal on the goals and values it already has. This will cause...
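Here is a toy version of that dynamic (my own sketch, under the assumption that the agent can predict the consequences of a rewrite). Candidate self-modifications are scored with the current values, so a rewrite that changes the goal predictably yields a successor producing things the current goal gives no credit for, and only competence-improving, goal-preserving rewrites get accepted:

```python
import random

# Toy sketch (mine) of self-modification judged by current values. Goal names and
# numbers are arbitrary. A candidate rewrite that changes the goal predictably
# produces things the current goal gives no credit for, so it scores zero and is
# rejected; competence-improving, goal-preserving rewrites are accepted.

GOALS = ["paperclips", "staples", "pi_digits"]
current = {"goal": "paperclips", "competence": 1.0}

def value_by(judge, design):
    # Predicted future: the successor makes `competence` units of whatever *it* values.
    produced = {design["goal"]: design["competence"]}
    return produced.get(judge["goal"], 0.0)   # the judge only credits its own target

random.seed(2)
for _ in range(100):
    candidate = dict(current)
    if random.random() < 0.3:
        candidate["goal"] = random.choice(GOALS)           # proposed goal rewrite
    else:
        candidate["competence"] += random.uniform(0, 0.2)  # proposed capability boost
    if value_by(current, candidate) > value_by(current, current):
        current = candidate   # accept only improvements by the current goal's lights

print(current)  # competence has grown; "goal" is still "paperclips"
```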
If the AI is an optimization process, it will try to find out what it's optimizing explicitly. If not, it's not intelligent.
I have stopped understanding why these quotes are correct. Help!
More specifically, if you design an AI using "shallow insights" without an explicit goal-directed architecture - some program that "just happens" to make intelligent decisions that can be viewed by us as fulfilling certain goals - then it has no particular reason to stabilize its goals. Isn't that anthropomorphizing? We humans don't exhibit a lot of goal-directed behavior, but we do have a verbal concept of "goals", so the verbal phantom of "figuring out our true goals" sounds meaningful to us. But why would AIs behave the same way if they don't think verbally? It looks more likely to me that an AI that acts semi-haphazardly may well continue doing so even after amassing a lot of computing power. Or is there some more compelling argument that I'm missing?