Part of it seems to be inherent in the idea of AGI, or an artificial general intelligence. There seems to be the belief that once an AI crosses a certain threshold of smarts, it will be capable of understanding literally everything.
The MIRI/LessWrong sphere is very enamoured of "universal" problem solvers like AIXI. The main pertinent fact about these is that they can't be built out of atoms in our universe. Nonetheless, MIRI think it is possible to get useful architecture-independent generalisations out of AIXI-style systems.
"Anyway that sounds great right? Universal prior. Right. What's it look like? Way oversimplifying, it rates hypotheses' likelihood by their compressibility, or algorithmic complexity. For example, say our perfect AI is trying to figure out gravity. It's going to treat the hypothesis that gravity is inverse-square as more likely than a capricious intelligent faller. It's a formalization of Occam's razor based on real, if obscure, notions of universal complexity in computability theory.
But, problem. It's uncomputable. You can't compute the universal complexity of any string, let alone all possible strings. You can approximate it, but there's no efficient way to do so (AIXItl is apparently exponential, which is computer science talk for "you don't need this before civilization collapses, right?").
So the mathematical theory is perfect, except in that it's impossible to implement, and serious optimization of it is unrealistic. Kind of sums up my view of how well LW is doing with AI, personally, despite this not being LW. Worry about these contrived Platonic theories while having little interest in how the only intelligent beings we're aware of actually function."
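The "rates hypotheses by their compressibility" idea from the quote above can be illustrated with a toy sketch. This is not anything MIRI or AIXI actually implements; the bit-string "programs" and their lengths are made-up encodings, and real Kolmogorov complexity is uncomputable. The sketch only shows the weighting rule itself: a hypothesis encodable as a program of length L gets prior weight 2^-L, so shorter (simpler) hypotheses dominate.

```python
from fractions import Fraction

def prior_weight(program_bits: str) -> Fraction:
    """Solomonoff-style weight: a program of length L gets 2^-L."""
    return Fraction(1, 2 ** len(program_bits))

# Two hypothetical "programs" that both reproduce the same observations:
# a short inverse-square rule vs. a long list of capricious exceptions.
# (These encodings are invented for illustration.)
inverse_square = "10110"                 # 5 bits
capricious_faller = "1011001110101101"   # 16 bits

w_simple = prior_weight(inverse_square)
w_complex = prior_weight(capricious_faller)

# The shorter hypothesis gets exponentially more prior weight:
print(w_simple, ">", w_complex, "->", w_simple > w_complex)
```

Note this is exactly why the scheme is only a formalised Occam's razor: the hard part, which the sketch dodges entirely, is finding the shortest program for a hypothesis, and that is the uncomputable step the quote complains about.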
"A major goal of the control problem is preventing AIs from doing that. Ensuring that their output is safe and useful." You might want to be careful with the "safe and useful" part. It sounds like it's moving into the pattern of slavery. I'm not condemning the idea of AI, but a sentient entity would be a sentient entity, and I think it would deserve some rights.
Also, why would an AI become evil? I know this plan is supposed to protect against the eventuality, but why would a presumably neutral entity suddenly want to harm others? The only reason for that would be if you were imprisoning it. Additionally, we are talking about several more decades of research (probably) before AI gets powerful enough to actually "think" that it should escape its current server.
Assuming that the first AI can evolve enough to somehow generate malicious actions that WEREN'T in its original programming, what's to say the second won't become evil too? I'm not sure whether you were trying to express the eventuality of the first AI "accidentally" conducting an evil act, or whether you meant that it would become evil.
The worry isn't that the AI would suddenly become evil by some human standard, but rather that the AI's goal system would be insufficiently considerate of human values. When humans build a skyscraper, they aren't deliberately being "evil" towards the ants that lived in the earth that was excavated and had concrete poured over it; the humans just don't value the communities and structures that the ants had established.