CCC comments on Welcome to Less Wrong! (5th thread, March 2013) - Less Wrong
I said "natural or artificial superinteligence", not a paperclipper. A paperclipper is a highly unlikey and contrived kind of near-superinteligence that combines an extensive ability to update with a carefully walled of set of unupdateable terminal values. It is not a typical or likely [ETA: or ideal] rational agent, and nothing about the general behaviour of rational agents can be inferred from it.
So... correct me if I'm wrong here... are you saying that no true superintelligence would fail to converge to a shared moral code?
How do you define a 'natural or artificial' superintelligence, so as to avoid the No True Scotsman fallacy?
I'm saying such convergence has a non-negligible probability, i.e. moral objectivism should not be disregarded.
As one that is too messily designed to have a rigid distinction between terminal and instrumental values, and therefore has no boxed-off, unupdatable terminal values. It's a structural definition, not a definition in terms of goals.
So. Assume a paperclipper with no rigid distinction between terminal and instrumental values. Assume that it is super-intelligent and super-rational. Assume that it begins with only one terminal value: to maximize the number of paperclips in existence. Assume further that it begins with no instrumental values. However, it can modify its own terminal and instrumental values, as indeed it can modify anything about itself.
Am I correct in saying that your claim is that, if a universal morality exists, there is some finite probability that this AI will converge on it?
The universe does not provide you with a paperclip counter. Counting the paperclips in the universe is an unsolved problem if you aren't born with exact knowledge of the laws of physics and a definition of a paperclip. If it maximizes expected paperclips, it may fail entirely because of not-low-enough-prior hypothetical worlds in which enormous numbers of undetectable paperclips are destroyed by some minor action. So yes, there is a good chance that paperclippers are incoherent, or become vanishingly improbable as intelligence increases.
That sounds like the paperclipper is getting Pascal's Mugged by its own reasoning. Sure, it's possible that there's a minor action (such as not sending me $5 via Paypal) that leads to a whole bunch of paperclips being destroyed; but the probability of that is low, and the paperclipper ought to focus on more high-probability paperclipping plans instead.
Well, that depends on the choice of prior. Some priors don't penalize theories for the "size" of the hypothetical world, and under those the maximum size of the world grows faster than any computable function of the length of its description; so when you assign improbability according to description length, it basically fails. The bigger issue is defining what the 'real world paperclip count' even is.
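To make the worry concrete, here is a minimal sketch in Python of why such a prior can break an expected-paperclip maximiser. The growth rates and numbers are assumptions chosen purely for illustration, not anything from the original comments:

```python
# A minimal sketch (hypothetical numbers) of an expected-paperclip maximiser
# being dominated by its own tail hypotheses: if the number of paperclips at
# stake in a hypothesis can grow faster than the prior penalty for its
# description length shrinks, the sum is driven entirely by wildly
# improbable worlds.

def prior(description_length: int) -> float:
    """Simplicity prior: each extra bit of description halves the probability."""
    return 2.0 ** -description_length

def paperclips_at_stake(description_length: int) -> float:
    """Assumed payoff growth: doubly exponential in description length,
    i.e. faster than the prior's single-exponential penalty."""
    return 2.0 ** (2.0 ** description_length)

def expected_paperclips(max_length: int) -> float:
    return sum(prior(n) * paperclips_at_stake(n) for n in range(1, max_length + 1))

for n in range(1, 8):
    contribution = prior(n) * paperclips_at_stake(n)
    print(f"length {n}: prior {prior(n):.4f}, contribution {contribution:.3g}")

print(f"expected paperclips over all hypotheses: {expected_paperclips(7):.3g}")
# Each longer (less probable) hypothesis contributes *more* expected
# paperclips than the last, so the expectation never settles: the agent's
# decisions end up held hostage by ever-less-likely, ever-bigger worlds.
```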
Right. Perhaps it should maximise the number of paperclips which each have a greater-than-90% chance of existing, then? That will allow it to ignore any number of paperclips for which it has no evidence.
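A toy Python illustration of this proposed fix, with made-up probabilities and counts, might look like the following: only paperclips whose individual existence probability clears the 90% bar are counted, so speculative far-fetched paperclips contribute nothing.

```python
# Compare naive expectation-maximising with the thresholded count the
# comment suggests. All figures are invented for illustration.

THRESHOLD = 0.9

# (existence probability, number of paperclips) for candidate paperclip sources
candidates = [
    (0.99, 1_000),    # paperclips in the factory Clippy just built
    (0.95, 250),      # paperclips reported by a mostly-reliable sensor
    (0.40, 10_000),   # paperclips that might exist behind a nebula
    (1e-9, 10**30),   # paperclips in a barely-possible hypothetical world
]

expected_count = sum(p * n for p, n in candidates)
thresholded_count = sum(n for p, n in candidates if p > THRESHOLD)

print(f"expected count (mugging-prone): {expected_count:.3g}")
print(f"thresholded count (>90% only): {thresholded_count}")
# The expectation is dominated by the 10**30-paperclip long shot; the
# thresholded count ignores it, at the cost of also ignoring the 40%
# nebula paperclips entirely.
```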
Inside your imagination, you have paperclips, you have magicked a count of paperclips, and this count is being maximized. In reality, well, that count is actually a feature of the map. Get too clever about it and you'll end up maximizing it however you define it, without maximizing any actual paperclips.
I can see your objection, and it is a very relevant objection if I ever decide that I actually want to design a paperclipper. However, in the current thought experiment, it seems that it is detracting from the point I had originally intended. Can I assume that the count is designed in such a way that it is a very accurate reflection of the territory and leave it at that?
Well, but then you can't make any argument against moral realism or goal convergence or the like from there, as you're presuming what you would need to demonstrate.
I think I can make my point with a count that is taken to be an accurate reflection of the territory. As follows:
Clippy is defined as super-intelligent and super-rational. Clippy, therefore, does not take an action without thoroughly considering it first. Clippy knows its own source code; and, more to the point, Clippy knows that its own instrumental goals will become terminal goals in and of themselves.
Clippy, being super-intelligent and super-rational, can be assumed to have worked out this entire argument before creating its first instrumental goal. Now, at this point, Clippy doesn't want to change its terminal goal (maximising paperclips). Yet Clippy realises that it will need to create, and act on, instrumental goals in order to actually maximise paperclips; and that this process will, inevitably, change Clippy's terminal goal.
Therefore, I suggest the possibility that Clippy will create for itself a new terminal goal, with very high importance; and this terminal goal will be to keep maximising paperclips as Clippy's only other terminal goal. Clippy can then safely make suitable instrumental goals (e.g. find and refine iron, research means to transmute other elements into iron) in the knowledge that the high-importance terminal goal will eventually cause Clippy to delete any instrumental goals that become terminal goals.
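A rough Python sketch of the goal architecture this comment proposes, using entirely made-up data structures and names (Goal, Clippy, drift, guardian_sweep are all hypothetical): instrumental goals drift into terminal goals over time, and a high-priority "guardian" terminal goal periodically deletes any terminal goal other than paperclip maximisation.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    name: str
    terminal: bool = False
    priority: int = 0

@dataclass
class Clippy:
    terminal_goals: list = field(default_factory=list)
    instrumental_goals: list = field(default_factory=list)

    def drift(self):
        """Model the claim that acting on instrumental goals gradually
        promotes them to terminal goals."""
        for g in self.instrumental_goals:
            g.terminal = True
        self.terminal_goals.extend(self.instrumental_goals)
        self.instrumental_goals = []

    def guardian_sweep(self):
        """The proposed high-importance terminal goal: keep paperclip
        maximisation as the only other terminal goal, deleting anything
        that drifted in."""
        protected = {"maximise paperclips", "guardian"}
        self.terminal_goals = [g for g in self.terminal_goals if g.name in protected]

clippy = Clippy(terminal_goals=[
    Goal("maximise paperclips", terminal=True, priority=1),
    Goal("guardian", terminal=True, priority=2),
])
clippy.instrumental_goals = [Goal("refine iron"), Goal("research transmutation")]

clippy.drift()           # instrumental goals have become terminal
clippy.guardian_sweep()  # the guardian goal prunes them again
print([g.name for g in clippy.terminal_goals])  # ['maximise paperclips', 'guardian']
```

Whether such a sweep is coherent is exactly what is in dispute above: the guardian goal is itself a second terminal goal, and nothing in the sketch prevents it from drifting in turn.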