I have written a paper on ethics, with a particular focus on machine ethics and formalization. Here is the abstract:
Most ethical systems are formulated in a very intuitive, imprecise manner. Therefore, they cannot be studied mathematically. In particular, they are not applicable to make machines behave ethically. In this paper we make use of this perspective of machine ethics to identify preference utilitarianism as the most promising approach to formal ethics. We then go on to propose a simple, mathematically precise formalization of preference utilitarianism in very general cellular automata. Even though our formalization is incomputable, we argue that it can function as a basis for discussing practical ethical questions using knowledge gained from different scientific areas.
Here are some further elements of the paper (things the paper uses or the paper is about):
- (machine) ethics
- (in)computability
- artificial life in cellular automata
- Bayesian statistics
- Solomonoff's a priori probability
Since I propose a formal ethical system, things get mathy at some point, but the first and by far most important formula is relatively simple. The rest can be skipped, so this should be no problem for the average LWer.
I have already discussed the paper with a few fellow students, as well as with Brian Tomasik and a (computer science) professor of mine. Both of the latter recommended that I try to publish the paper, and I also received some very helpful feedback. But because this would be my first attempt to publish something, I could still use more help before I submit the paper, both with the content itself and with scientific writing in English (which, as you may have guessed, is not my first language). Brian recommended using LW's discussion board for this. I would also be thankful for recommendations on which journal would be appropriate for the paper.
I would like to send a draft via PM to those who are interested. This way I can also make sure that I don't use up all potential reviewers on the current version.
DISCLAIMER: I am not a moral realist. Also, as mentioned in the abstract, the proposed ethical system is incomputable and can therefore be argued to have infinite Kolmogorov complexity. So it does not really conflict with LW consensus (including Complexity of value).
Humans are definitely a result of natural selection, but it does not seem difficult at all to find goals of ours that do not serve the goal of survival or reproduction. Evolution seems to produce these other preferences accidentally. One way this can happen may be exemplified by the following: our ability to contemplate our thinking from an almost external perspective (sometimes referred to as self-consciousness) is definitely helpful for learning and improving our thinking, and could therefore prevail in evolution. However, it may also be the cause of altruism, because it makes every single one of us realize that they are not very special. (This is by no means an attempt to explain altruism scientifically or something...) More generally, it would be a really strange coincidence if all cognitive features of an organism in our physical world that serve the goal of surviving and reproducing did not serve any other goal. In conclusion, even evolution can (probably) produce (by coincidence) organisms with goals that are not subgoals of the goal to survive and reproduce.
Now, imagine the paper clip maximizer to be more than a robot arm; imagine it to be a well-programmed Seed AI (or the like). As pointed out in ViliamBur's and cousinit's comments, its goal will probably not be easily changed (by coincidence or by evolution of several such AIs). For example, it could save its source code on several hard drives that are synchronized by a hard-wired mechanism or something... Now this paper clip maximizer would start turning all matter into paper clips. To achieve its goal, it would certainly remain in existence (and thereby give you the illusion of having existence as its supergoal in the first place) and protect its values (which is not extremely difficult). Assuming it is successful (and we can expect this from a seed AI/superintelligence), the only matter left (in reach) would at some point be the hardware of the paper clip maximizer itself. What would the paper clip maximizer do then? In conclusion, self-preservation and maybe propagation of values may be important subgoals, but they are certainly not the supergoal.