My new paper: Concept learning for safe autonomous AI

Kaj_Sotala

Abstract: Sophisticated autonomous AI may need to base its behavior on fuzzy concepts that cannot be rigorously defined, such as well-being or rights. Obtaining desired AI behavior requires a way to accurately specify these concepts. We review some evidence suggesting that the human brain generates its concepts using a relatively limited set of rules and mechanisms. This suggests that it might be feasible to build AI systems that use similar criteria and mechanisms for generating their own concepts, and could thus learn similar concepts as humans do. We discuss this possibility, and also consider possible complications arising from the embodied nature of human thought, possible evolutionary vestiges in cognition, the social nature of concepts, and the need to compare conceptual representations between humans and AI systems.

I just got word that this paper was accepted for the AAAI-15 Workshop on AI and Ethics: I've uploaded a preprint here. I'm hoping that this could help seed a possibly valuable new subfield of FAI research. Thanks to Steve Rayhawk for invaluable assistance while I was writing this paper: it probably wouldn't have gotten done without his feedback motivating me to work on this.

Comments welcome.

Thanks for pointing that out, I didn't realize that the intended meaning was non-obvious! Toggle's interpretation is basically right: "rigorously defined" is referring to something like giving the system a set of necessary and sufficient criteria for when something should qualify as an instance of the concept. And "specifying" is intended to refer to something more general, such as building the system in such a way that it's capable of learning the concepts on its own, without needing an exhaustive (and impossible-to-produce) external definition of them. But now that you've pointed it out, it's quite true that the current choice of words doesn't really make that obvious: I'll clarify that for the final version of the paper.

Obtaining desired AI behavior

Looks like you're making distinctions between how you're going to build something that has the desired behavior. That how would be the specification.

These concepts could be explicitly specified set theoretically as concepts, or specified by defining boundaries in some conceptual space, or more generally, specified algorithmically as the product of information processing system with learning behavior and learning environment, without initially explicitly creating a conceptual representation.

It's not that one way is rigoro... (read more)

26

My new paper: Concept learning for safe autonomous AI

26

26

26

My new paper: Concept learning for safe autonomous AI

26

26