I don't know how to unpack "general intelligence" or "competence in arbitrary domains", and I don't think people have any reason to believe they possess something so awesome. When people talk about AGI, I just assume they mean AI that's at least as general as a human. A lobotomized human is one example of a "jumble of wires" that has human-level IQ but scores pretty low on goal-directedness.
The first general-enough AI we build will likely be goal-directed if it's simple and built from first principles. But if it's complex and cobbled together from "shallow insights", its goal-directedness and goal-stabilization tendencies are anyone's guess.
Wei and I took this discussion offline and came to the conclusion that "narrow AIs" without the urge to stabilize their values can also end up destroying humanity just fine. So this loose end is tidied up: contra Eliezer, a self-improving world-eating AI developed by stupid researchers using shallow insights won't necessarily go through a value freeze. Of course that doesn't diminish the danger and is probably just a minor point.
I have stopped understanding why these quotes are correct. Help!
More specifically, if you design an AI using "shallow insights" without an explicit goal-directed architecture - some program that "just happens" to make intelligent decisions that we can view as fulfilling certain goals - then it has no particular reason to stabilize its goals. Isn't expecting it to do so anthropomorphizing? We humans don't exhibit a lot of goal-directed behavior, but we do have a verbal concept of "goals", so the verbal phantom of "figuring out our true goals" sounds meaningful to us. But why would AIs behave the same way if they don't think verbally? It seems more likely to me that an AI that acts semi-haphazardly will continue doing so even after amassing a lot of computing power. Or is there some more compelling argument that I'm missing?