I think those are perfectly good concerns. But they don't seem so likely that they make me want to exterminate humanity to avoid them.
I think you're describing a failure of corrigibility. Which could certainly happen, for the reason you give. But it does seem quite possible (and perhaps likely) that an agentic system will be designed primarily for corrigibility, or alternatively, alignment by obedience.
The second seems like a failure of morality. Which could certainly happen. But I see very few people who both enjoy inflicting suffering, and who would continue to enjoy that even given unlimited time and resources to become happy themselves.
You are probably guessing correctly. I'm hoping that whoever gets ahold of aligned AGI will also make it corrigible, and that over time they'll trend toward a similar moral view to that generally held in this community. It doesn't have to be fast.
To be fair, I'm probably pretty biased against the idea that all we can realistically hope for is extinction. The recent [case against AGI alignment](https://www.lesswrong.com/posts/CtXaFo3hikGMWW4C9/the-case-against-ai-alignment) post was the first time I'd seen arguments that strong in that direction. I haven't ...
Yes. But that seems awfully unlikely to me. What would it need to be, two years from now? AI hype is going to keep ramping up as ChatGPT and its successors are more widely used and improved.
If the odds of slipping it by governments and militaries are slight, wouldn't the conclusion be the opposite: we should spread understanding of AGI alignment issues so that those in power have thought about them by the time they appropriate the leading projects?
This strikes me as a really practically important question. I personally may be rearranging my future based on ...
I think there's a possibility that their lives, or some of them, are vastly worse than death. See the recent post the case against value alignment for some pretty convincing concerns.
I totally agree with the core logic. I've been refraining from spreading these ideas, as much as I want to.
Here's the problem: Do you really think the whole government and military complex is dumb enough to miss this logic, right up to successful AGI? You don't think they'll roll in and nationalize the efforts when the power of AI keeps on progressively freaking people out more and more?
I think a lot of folks in the military are a lot smarter than you give them credit for. Or the issue will become much more obvious than you assume, as we get closer to gene...
Really? Can you say a little more about why you think you have that value? I guess I'm not convinced that it's really a terminal value if it varies so widely across people of otherwise similar beliefs. Presumably that's what lalartu meant as well, but I just don't get it. I like myself, so I'd like more of myself in the world!
Perhaps you're thinking of the dopamine spike when reward is actually given? I had thought the predictive spike was purely proportional to the odds of success and the amount of reward, which would indeed change with boring tasks, but not in any linear way. If you're right about that basic structure of the predictive spike, I should know about it for my research; can you give a reference?
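For concreteness, here is the distinction I have in mind, in the standard reward-prediction-error framing (a sketch of the textbook picture, not a claim about any particular dataset; the numbers are made up):

```python
# Anticipatory signal vs. spike at delivery, in the usual prediction-error framing.
p_success = 0.3          # odds the boring task pays off (made-up number)
reward_magnitude = 1.0   # size of the reward if it does

expected_value = p_success * reward_magnitude    # what I take the predictive spike to track
received = 1.0                                   # reward actually delivered
prediction_error = received - expected_value     # what the spike at delivery tracks
print(expected_value, prediction_error)
```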
Less Wrong seems like the ideal community to think up better reputation systems. Doctorow's Whuffie is reasonably well thought out, though intended for a post-scarcity economy; its idea of distinguishing right-handed reputation (from people who generally agree with you) from left-handed reputation (from people who generally don't) seems like one useful ingredient. Reducing the influence of those who tend to vote together seems like another potential win.
I like to imagine a face-based system; snap an image from a smartphone, and access reputation.
I hope to see more discussion, in particular of VAuroch's suggestion.
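To make the vote-correlation idea concrete, here's a toy sketch (entirely my own construction, not a worked-out proposal): each voter's influence is divided by the number of voters whose voting history is nearly identical to theirs.

```python
import numpy as np

# votes[v, u]: voter v's vote on user u's content (+1, -1, or 0 for no vote); toy data.
votes = np.array([
    [ 1,  1,  0, -1],
    [ 1,  1,  0, -1],   # voter 1 votes identically to voter 0
    [-1,  0,  1,  1],
])

def voter_weights(votes, threshold=0.9):
    # Down-weight voters who tend to vote together: divide each voter's weight
    # by the size of their "clique" of near-duplicate voting patterns.
    norms = np.linalg.norm(votes, axis=1, keepdims=True)
    sims = (votes @ votes.T) / (norms @ norms.T + 1e-9)   # cosine similarity between voters
    clique_sizes = (sims > threshold).sum(axis=1)          # counts itself, so always >= 1
    return 1.0 / clique_sizes

weights = voter_weights(votes)
reputation = weights @ votes    # weighted vote total per user
print(reputation)
```

Splitting that total into right-handed and left-handed components, based on whether a voter usually agrees with you, would be a second pass over the same data.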
I think the example is weak; the software was not that dangerous, and the researchers were idiots who broke a vial they knew was insanely dangerous.
I think it dilutes the argument to broaden it to software in general; it could be very dangerous under exactly those circumstances (with terrible physical safety measures), but the dangers of superhuman AGI are vastly larger IMHO and deserve to remain the focus, particularly of the ultra-reduced bullet points.
I think this is as crisp and convincing a summary as I've ever seen; nice work! I also liked the book, but condensing it even further is a great idea.
"Pleased to meet you! Soooo... how is YOUR originating species doing?..."
That actually seems like an extremely reasonable question for the first interstellar meeting of superhuman AIs.
I disagree with EY on this one (I rarely do). I don't think it's so likely that it would ensure a rational actor behaves Friendly, but I do think the possibility of encountering an equally powerful AI, perhaps one with a head start on resource acquisition, shouldn't be dismissed by a rational actor.
I'm game. These are some of my favorite topics. I do computational cognitive neuroscience, and my principal concern with it is how it can/will be used to build minds.
I may be confused, but it seems to me that the issue in generalizing from decision utility to utilitarian utility simply comes down to making an assumption that allows utilities of different people to be compared, that is, to put them on the same scale. I think there's a pretty strong argument that we can do so, springing from the fact that we are all running essentially the same neural hardware. Whatever experiential value is, it's made of patterns of neural firing, and we all have basically the same patterns. While we don't run our brains exactly the same, the ...
I'm out of town or I'd be there. Hope to catch the next one.
Wow, I feel for you. I wish you good luck and good analysis.
Ha, I was there the week prior. I hope this is going to happen again. Note also that I'm re-launching a defunct Singularity meetup group for Boulder/Broomfield if anyone is interested.
Sorry I missed it. I hope there will be more Boulder LW meetups?
Given how many underpaid science writers are out there, I'd have to say that ~$50k/year would probably do it for a pretty good one, especially given the 'good cause' bonus to happiness that any qualified individual would understand and value. But is even $1k/week in donations realistic? What are the page view numbers? I'd pay $5 for a good article on a valuable topic; how many others would as well? I suspect the numbers don't add up, but I don't even have an order-of-magnitude estimate on current or potential readers, so I can't say myself.
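For the record, the back-of-envelope I'm doing with those (guessed) numbers:

```python
salary_per_year = 50_000                              # rough figure for a good science writer
weeks_per_year = 52
needed_per_week = salary_per_year / weeks_per_year    # ~$960/week
price_per_reader = 5                                  # what I'd pay for a good article
readers_needed = needed_per_week / price_per_reader   # ~200 paying readers per week
print(round(needed_per_week), round(readers_needed))
```

So the question is whether there are ~200 people a week willing to pay, or many more willing to donate smaller amounts.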
Upvoted; the issue of FAI itself is more interesting than whether Eliezer is making an ass of himself, and thereby of the SIAI message (probably a bit; claiming you're smart isn't really smart, but then he's also doing a pretty good job as publicist).
One form of productive self-doubt is to have the LW community critically examine Eliezer's central claims. Two of my attempted simplifications of those claims are posted here and here on related threads.
Those posts don't really address whether strong AI is feasible; I think most AI researchers agree that it will bec...
Not sure what you mean by 1), but certainly, recurrent neural nets are more powerful. 2) is no longer true; see for example the GeneRec algorithm. It does something much like backpropagation, but with no derivatives explicitly calculated, so there's no concern with recurrent loops.
On the whole, neural net research has slowed dramatically based on the common view you've expressed; but progress continues apace, and neural nets are not far behind cutting-edge vision and speech processing algorithms, while working much more like the brain does.
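Since I brought up GeneRec: here's a heavily stripped-down sketch of the flavor of its update rule, from memory (the settling dynamics and the real network structure are omitted, and the phase activities below are made up):

```python
import numpy as np

# The network is run in two phases and unit activities are recorded:
#   minus phase: inputs only, the network settles to its own prediction
#   plus phase:  inputs plus the target clamped on the outputs
# The weight change then needs only those activities, no explicit derivatives:
#   dW[i, j] = lr * x_minus[i] * (y_plus[j] - y_minus[j])

def generec_update(W, x_minus, y_minus, y_plus, lr=0.1):
    return W + lr * np.outer(x_minus, y_plus - y_minus)

# Toy usage for a 3-sender, 2-receiver connection, with made-up phase activities
W = np.zeros((3, 2))
x_minus = np.array([0.9, 0.1, 0.5])   # sending units, minus phase
y_minus = np.array([0.2, 0.7])        # receiving units, minus phase
y_plus  = np.array([0.8, 0.3])        # receiving units, plus phase (target-driven)
W = generec_update(W, x_minus, y_minus, y_plus)
print(W)
```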
I think this is an excellent question. I'm hoping it leads to more actual discussion of the possible timeline of GAI.
Here's my answer, important points first, and not quite as briefly as I'd hoped.
1) Even if uFAI isn't the biggest existential risk, the very low investment and interest in it might make it the best marginal value for investment of time or money. As someone noted, having at least a few people thinking about the risk far in advance seems like a great strategy if the risk is unknown.
2) No one but SIAI is taking donations to mitigate the risk ...
I work in this field, and was under approximately the opposite impression; that voice and visual recognition are rapidly approaching human levels. If I'm wrong and there are sharp limits, I'd like to know. Thanks!
Now this is an interesting thought. Even a satisficer with several goals but no upper bound on each will use all available matter on the mix of goals it's working towards. But a limited goal (make money for GiantCo, unless you reach one trillion, then stop) seems as though it would be less dangerous. I can't remember this coming up in Eliezer's CFAI document, but suspect it's in there with holes poked in its reliability.
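To make the contrast concrete, a toy illustration (my own, not something from the CFAI document):

```python
TARGET = 10 ** 12   # "make money for GiantCo, unless you reach one trillion, then stop"

def unbounded_utility(money: float) -> float:
    # the classic maximizer: always prefers more, no matter how much it already has
    return money

def capped_utility(money: float) -> float:
    # the limited goal: indifferent between hitting the target and anything beyond it
    return min(money, TARGET)

assert unbounded_utility(2 * TARGET) > unbounded_utility(TARGET)
assert capped_utility(2 * TARGET) == capped_utility(TARGET)
```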
I think the concern stands even without a FOOM; if AI gets a good bit smarter than us, however that happens (design plus learning, or self-improvement), it's going to do whatever it wants.
As for your "ideal Bayesian" intuition, I think the challenge is deciding WHAT to apply it to. The amount of computational power needed to apply it to every thing and every concept on earth is truly staggering. There is plenty of room for algorithmic improvement, and it doesn't need to get that good to outwit (and out-engineer) us.
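As a back-of-envelope for "truly staggering": a fully general joint distribution over just a few hundred binary features is already beyond any physically possible computer.

```python
# A full joint distribution over n binary propositions has 2**n entries.
n = 300
entries = 2 ** n                          # ~2e90
atoms_in_observable_universe = 10 ** 80   # common rough estimate
print(entries > atoms_in_observable_universe)   # True
```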
I think there are very good questions in here. Let me try to simplify the logic:
First, the sociological logic: if this is so obviously serious, why is no one else proclaiming it? I think the simple answer is that a) most people haven't considered it deeply and b) someone has to be first in making a fuss. Kurzweil, Stross, and Vinge (to name a few that have thought about it at least a little) seem to acknowledge a real possibility of AI disaster (they don't make probability estimates).
Now to the logical argument itself:
a) We are probably at risk from the...
I think the point is that not valuing non-interacting copies of oneself might be inconsistent. I suspect it's true: consistency requires valuing parallel copies of ourselves just as we value future variants of ourselves and so preserve our lives. Our future selves also can't "interact" with our current self.
Quality matters if you have a community that's interested in your work; you'll get more "nice job" comments if it IS a nice job.
I don't think the lack of an earth-shattering ka-FOOM changes much of the logic of FAI. Smart enough to take over the world is enough to make human existence way better, or end it entirely.
It's quite tricky to ensure that your superintelligent AI does anything like what you wanted it to. I don't share the intuition that creating a "homeostasis" AI is any easier than an FAI. I think one move Eliezer is making in his "Creating Friendly AI" strategy is to minimize the goals you're trying to give the machine; just CEV.
I think this makes...
Well, yes; it's not straightforward to go from brains to preferences. But for any particular definition of preference, a given brain's "preference" is just a fact about that brain. If this is true, it's important to understanding morality/ethics/volition.
I think the main concern is that feed-forward nets are used as components in systems that achieve full AGI. For instance, DeepMind's agent systems include a few networks and run them a few times before selecting an action. Current networks are more like individual pieces of the human brain, like a visual system and a language system. Putting them together and getting them to choose and pursue goals and subgoals appropriately seems all too plausible.
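A deliberately crude sketch of the "components" point (not any real lab's architecture; sizes and names are made up): two small feed-forward pieces, one for perception and one for scoring actions, already give you a loop that perceives, evaluates candidates, and acts.

```python
import numpy as np

rng = np.random.default_rng(0)
W_percept = rng.normal(size=(8, 4))   # "visual system": observation -> features
W_value = rng.normal(size=(4 + 2,))   # scores a (features, action) pair

def features(obs):
    return np.tanh(obs @ W_percept)

def choose_action(obs, candidate_actions):
    f = features(obs)
    # run the scoring net once per candidate and keep the best,
    # i.e. a few forward passes before committing to an action
    scores = [float(np.concatenate([f, a]) @ W_value) for a in candidate_actions]
    return candidate_actions[int(np.argmax(scores))]

obs = rng.normal(size=8)
actions = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(choose_action(obs, actions))
```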
Now, some people also think that just increasing the size of nets and training data sets will produce AGI, because...