Oracle AI: Human beliefs vs human values
It seems that if we can ever define the difference between human beliefs and values, we could program a safe Oracle by requiring it to maximise the accuracy of human beliefs on a question while keeping human values fixed (or changing them very little). We would need a whole load of other constraints, as usual, but that might work for a boxed Oracle answering a single question.
This is a reason to suspect it will not be easy to distinguish human beliefs and values ^_^
Design-space traps: mapping the utility-design trajectory space
This is a small section of a paper I'm writing on moral enhancement. I'm trying to briefly summarize some of the points that have already been made about local optima in evolutionary processes and about the safety of taking humanity out of those local optima. You might find the text helpful in that it summarizes a very important concept. I don't think there's anything new here, but I hope the way I tried to phrase the topology of utility-design trajectory space more properly at the end can be fruitful. I would appreciate any insights you might have about that formulation, how to develop it more rigorously, and some of its consequences. I do have some ideas, but I would want to hear what you have to say first. Any other kind of general feedback on the text is also welcome. But keep in mind this is just a section of a larger paper, and I'm mainly interested in how to develop the framework at the end and what its consequences are, rather than in properly developing any points in the middle.
Local optima are points where every nearby reachable position is worse, yet at least one faraway position is vastly better. A strong case has been made that evolution often gets stuck on such local optima. In evolutionary processes, fitness behaves monotonically, i.e., it will necessarily increase or be maintained, because any decrease in fitness will be selected against. If there are vastly better solutions (e.g., for solving cooperation problems) but organisms would have to pass through a less fit step to reach them, evolution will never reach that vastly better configuration. Evolutionary processes are limited by the topology of the fitness-design trajectory space: they can only go from design x to design y if there is at least one trajectory from x to y that is flat or ascending, and any trajectory that descends, even momentarily, cannot be taken. Say one is on the cyan ring ridge of the colored graphic.
Although there is a vastly better configuration on the red peak, one would have to travel through the blue moat in order to get there. Unless one is a process that can pass through a sharp decrease in fitness, there is no way of improving towards the red peak. Evolution is particularly prone to local optima due to fitness monotonicity. Enhancing human beings with technology does not fall prey to fitness monotonicity, or to any sort of utility monotonicity in general: we could initially make changes that are harmful in order to later achieve a vastly better configuration. Therefore, it seems plausible that there is a technological path out of evolution’s local optima whereby we could rescue our species from these evolutionary imprisonments. Moreover, it has been argued that evolutionary local optima can be easily identified, provided a careful, evolutionarily and technically informed analysis is made. Hence these would be low-hanging fruit in the task of improving evolutionary products such as humans: easily accessible and able to produce great advances for humanity with little effort.
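To make the ring-ridge picture concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the one-dimensional `LANDSCAPE` (index 0 plays the red peak, indices 1-3 the blue moat, index 4 the cyan ridge), the function names, and the `max_dip` parameter. It is a toy, not a model of real evolution. For simplicity the monotonic climber only moves strictly uphill and ignores flat moves.

```python
# Hypothetical 1-D slice through the landscape, reading outward from the centre.
# index 0 = red peak, indices 1-3 = blue moat, index 4 = cyan ridge.
LANDSCAPE = [10, 2, 1, 2, 5, 4, 3]


def monotonic_climb(heights, start):
    """Evolution-like process: step to the best neighbour only if it is strictly higher."""
    pos = start
    while True:
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(heights)]
        best = max(neighbours, key=lambda i: heights[i])
        if heights[best] <= heights[pos]:
            return pos  # no neighbour improves on the current design: stuck
        pos = best


def climb_with_dip(heights, start, max_dip):
    """Process that tolerates falling up to max_dip below its starting height.

    Returns the highest position reachable without ever going below that floor.
    """
    floor = heights[start] - max_dip
    reachable = {start}
    frontier = [start]
    while frontier:
        pos = frontier.pop()
        for nxt in (pos - 1, pos + 1):
            in_range = 0 <= nxt < len(heights)
            if in_range and nxt not in reachable and heights[nxt] >= floor:
                reachable.add(nxt)
                frontier.append(nxt)
    return max(reachable, key=lambda i: heights[i])


if __name__ == "__main__":
    print(monotonic_climb(LANDSCAPE, 4))    # stays on the ridge
    print(climb_with_dip(LANDSCAPE, 4, 3))  # dip budget too small to cross the moat
    print(climb_with_dip(LANDSCAPE, 4, 4))  # dip budget large enough to reach the peak
```

In this toy, whether the peak is reachable depends entirely on how deep a dip the process is willing to accept, which is the point the paragraph above makes about technology versus evolution.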
Nevertheless, it should be noted that getting out of evolutionary local optima might not always be easy or even possible. Fitness does have a relatively strong correlation with overall human utility. And although human intelligence is not as dull as the evolutionary process, and does accept a decrease in utility in order to achieve a better design in the end, if the downward moat is deep enough, the risk of catastrophe (or, much worse, extinction) might not be worth taking. At least by being monotonic on a dimension correlated with utility, evolution was able to avoid extreme losses. Perform widespread, willy-nilly human enhancement, and we might fall into the moat guarding the delicious low-hanging fruit of the utility-design space garden and not come back up. This is particularly so in the case of moral enhancement, because changing morality, motivations, values and desires has a self-reinforcing aspect. It might be the case that tampering with deep and fundamental human morality is irreversible: once we fundamentally value something else, we would not have any compelling reason to want to come back to our old values, desires or aspirations. Thus, it seems there are indeed cases where a small step past the edge of the moat will lead us down an irreversible path. Correctly mapping how each technology shapes the topology of utility-design trajectory space is a deeply needed task if we are to avoid falling into moats, or into even more dangerous existential holes, while reaching for the low-hanging fruit of escaping local optima. We would do better to get stuck at local optima than at absolute minima.
Utility-design trajectory space could be more properly defined as a space in R^(n+u). A point there uses n coordinates to locate a physically possible design along all n relevant dimensions; the space is defined by the laws of physics and by a utility function on u. A point will correspond to a design a iff all its neighbouring points x correspond to designs one physical step away from design a. Emergent designer processes such as evolution, human enhancement and AIs draw shapes on R^(n+u) by connecting points that are linked by one possible step under that process. Evolution’s hand is monotonic on dimension f, fitness, which makes for a pretty clumsy drawing. Biochemical human enhancement can vary more freely on f, but might be subject to other constraints elsewhere that, e.g., uploaded minds would not be. Extinctions correspond to singularities on u: once one is reached, no other point is reachable, and it designates a lack of design. These points that can be reached but cannot reach need to be correctly mapped. It would also be relevant to investigate how each technology draws its specific shape on design space. Using u as some height analogue, some technologies might be inherently prone to shape moats with peaks in the middle, extinction holes, effortless utility-maximizing curves and so on. I believe moral enhancement draws a particularly bumpy, hole-prone shape. FAI would draw an ever utility-ascending shape, with all mishaps being existential holes.
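One way to start playing with this formulation is to treat designs as nodes of a directed graph and each designer process as the set of edges it is able to draw. The sketch below, in Python, is only that: the design names, the fitness numbers and the function names are all hypothetical, and a four-node graph obviously says nothing about the real space. It shows two things from the paragraph above: a process monotonic on f draws fewer edges, and "points that can be reached but cannot reach" can be found mechanically once the edges are known.

```python
from collections import deque

# Hypothetical one-step moves that are physically possible between designs.
PHYSICAL_STEPS = {
    "human": {"tweaked"},
    "tweaked": {"human", "enhanced", "extinct"},
    "enhanced": {"tweaked"},
    "extinct": set(),  # lack of design: no outgoing steps at all
}

# Hypothetical fitness values f for each design.
FITNESS = {"human": 5, "tweaked": 3, "enhanced": 9, "extinct": 0}


def monotonic_steps(steps, fitness):
    """Keep only the steps a process monotonic on fitness could take (flat or ascending)."""
    return {
        design: {nxt for nxt in successors if fitness[nxt] >= fitness[design]}
        for design, successors in steps.items()
    }


def reachable_from(steps, start):
    """Breadth-first search: every design the process can get to from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        design = queue.popleft()
        for nxt in steps.get(design, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def absorbing_points(steps, start):
    """Designs that can be reached from start but have no outgoing step."""
    return {d for d in reachable_from(steps, start) if not steps.get(d)}


if __name__ == "__main__":
    evolution = monotonic_steps(PHYSICAL_STEPS, FITNESS)
    print(reachable_from(evolution, "human"))         # evolution's drawing from "human"
    print(reachable_from(PHYSICAL_STEPS, "human"))    # a process free to vary on f
    print(absorbing_points(PHYSICAL_STEPS, "human"))  # the holes that need mapping
```

In this toy, the monotonic process never leaves "human", so it reaches neither "enhanced" nor "extinct", while the unconstrained process can reach both. A more serious version would need real coordinates for the n design dimensions and some account of step probabilities, neither of which is attempted here.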
Using People's Irrationality To Do Good by Leslie John
http://www.youtube.com/watch?v=MyRPL-QoZG8
Official description:
Identifying effective obesity treatment is both a clinical challenge and a public health priority. Can monetary incentives stimulate weight loss? Leslie John presents a study that examines different economic incentives for weight loss during a 16 week intervention.
Leslie John presented at the "The Science of Getting People to Do Good" research briefing at the Stanford Graduate School of Business, co-sponsored by the Center for Social Innovation.
Related Links:
http://csi.gsb.stanford.edu/special-event-science-getting-people-do-good
http://drfd.hbs.edu/fit/public/facultyInfo.do?facInfo=ovr&facId=589473
I'm simply sharing this resource here, as it could start interesting discussions on morality and rationality.
Want Free Kindle Books?
I just came across this article that points out a Kindle Fire glitch: one that apparently lets you download books gratis if you cancel your purchase while the book is still downloading, and then quickly open it. The tablet downloads the book fully once it's been opened, letting you read it to your heart's content without actually having bought it.
This should be a great way for anyone to get good (yet insanely expensive) Kindle books, provided it doesn't conflict with your moral scruples. Personally, I neither own a Kindle Fire nor have a Kindle account, but I just wanted to point this out for anyone interested here.
Of course, Amazon might have already fixed the glitch with a software update, but this has been strangely under-reported, so maybe not.
Seeking advice on a moral dilemma
I just found 120 Euro (about $172) on the floor in the hallway in a hostel in Berlin. What should I do, and why?
- It's not inconceivable that the hostel might just take the money if I turn it in.
- I'll be at this hostel for about two more days.