Let's agree that the first step towards AI alignment is to refrain from building intelligent machines that are designed to kill people.
I don't get it; why would 'refraining from designing intelligent machines to kill people' help prevent AI from killing everyone? That's a really bold and outlandish claim, and I think you have to actually defend it rather than just telling people to agree with it. Like, from my perspective, you're just assuming the hard parts of the problem don't exist and replacing them all with an easier problem ('avoid designing AIs to kill people'). It's the hard parts of the problem that seem on track to kill us; solving the easier problem doesn't seem to help.
I don't understand; why would changing its mind about what it wants help it make more paperclips?
Whoa, this sounds incredibly fascinating; I've never heard of this. Do you have a link to more info, or terminology to google?