V_V comments on Versions of AIXI can be arbitrarily stupid - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
But it seems like such an agent could only survive in an environment where it literally can't die, i.e., where nothing it can do could possibly cause its death. In order to converge on the right environment, independent of the choice of language, it has to try every possible course of action as time goes to infinity, so eventually it will do something that kills itself.
What value (either practical or philosophical, as opposed to purely mathematical), if any, do you see in this result, or in the result about episodic environments?
There are plenty of applications of reinforcement learning where it is plausible to assume that the environment is ergodic (that is, the agent can't "die" or fall into traps that permanently result in low rewards) or episodic. The Google DQN Atari game agent, for instance, operates in an episodic environment, so stochastic action selection is acceptable.
Of course, this is not suitable for an AGI operating in an unconstrained physical environment.
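To illustrate the point about episodic environments, here is a minimal, hypothetical sketch (not the DQN implementation) of why stochastic action selection such as ε-greedy is safe when episodes reset: a bad exploratory action costs at most one episode's reward, since there is no irreversible "death" state. The toy two-armed bandit and all names here are assumptions for illustration.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Stochastic action selection: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def run(episodes=5000, epsilon=0.1, alpha=0.1, seed=0):
    # Toy episodic environment: a 2-armed Bernoulli bandit where each
    # "episode" is a single pull. Because every episode resets, random
    # exploration can never trap the agent in a permanently bad state.
    rng = random.Random(seed)
    true_means = [0.2, 0.8]   # arm 1 pays off more often
    q = [0.0, 0.0]            # estimated action values
    for _ in range(episodes):
        a = epsilon_greedy(q, epsilon, rng)
        reward = 1.0 if rng.random() < true_means[a] else 0.0
        q[a] += alpha * (reward - q[a])   # incremental value update
    return q
```

In an ergodic or episodic setting like this, the occasional random action only loses a bounded amount of reward, which is exactly why the convergence results discussed above can assume stochastic exploration is harmless. In an unconstrained physical environment, one exploratory action could be fatal, and the argument breaks down.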
Yes, I agree there can be applications for narrow AI or even limited forms of AGI. I was assuming that Stuart was thinking in terms of FAI, so my question was in that context.