Extending the stated objectives
A putative new idea for AI control; index here.
A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable.
Stuart Russell
Think of an AI directing a car, given the instructions to get someone to the airport as fast as possible (optimised variables include "negative of time taken to airport") with some key variables left out - such as a maximum speed, maximum acceleration, respect for traffic rules, and survival of the passengers and other humans.
Call these other variables "unstated objectives" (UO), as contrasted with the "stated objectives" (SO) such as the time to the airport. In the normal environments in which we operate and design our AIs, the UOs are either correlated with the SOs (consider the SO "their heart is beating" and the UO "they're alive and healthy") or don't change much at all (the car-directing AI could have been trained on many examples of driving-to-the-airport, none of which included the driver killing their passengers).
Typically, SOs are easy to define, and the UOs are the more important objectives, left undefined either because they are complex, or because they didn't occur to us in this context (just as we don't often say "driver, get me to the airport as fast as possible, but alive and not permanently harmed, if you please. Also, please obey the following regulations and restrictions: 1.a.i.α: Non-destruction of the Earth....").
The control problem, in a nutshell, is that optimising SOs will typically set other variables to extreme values, including the UOs. The more extreme the optimisation, and the further from the typical environment, the more likely this is to happen.
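The failure mode above can be sketched in a few lines of code. This is a toy model with hypothetical cost and risk functions of my own choosing (not from the post): the optimizer scores candidate speeds only by the stated objective (time to the airport), and the unstated objective (passenger risk) is silently pushed to its extreme.

```python
def time_to_airport(speed_kmh, distance_km=30.0):
    """Stated objective (SO): minutes to cover the distance at a given speed."""
    return 60.0 * distance_km / speed_kmh

def passenger_risk(speed_kmh):
    """Unstated objective (UO): an assumed risk curve that grows steeply with speed."""
    return (speed_kmh / 100.0) ** 3

# Naive optimisation over the stated objective alone: nothing in the
# objective mentions risk, so the optimizer never considers it.
candidate_speeds = range(10, 301, 10)  # km/h
best_speed = min(candidate_speeds, key=time_to_airport)

print(best_speed)                  # the fastest speed available is chosen
print(passenger_risk(best_speed))  # the unconstrained variable ends up extreme
```

Because time decreases monotonically with speed, the optimizer always selects the largest speed in its search space, whatever the risk curve looks like; adding the UO as a constraint or penalty term is exactly the part the stated objective leaves out.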
The two insights of materialism
Preceded by: There just has to be something more, you know? Followed by: Physicalism: consciousness as the last sense.
Contents:
1. An epistemic difficulty
2. How and why to be a materialist
An epistemic difficulty
Like many readers of this blog, I am a materialist. Like many others, I was not always. Long ago, the now-rhetorical ponderings in the preceding post in fact delivered the fatal blow to my nagging suspicion that somehow, materialism just isn't enough.
By materialism, I mean the belief that the world and people are composed entirely of something called matter (a.k.a. energy), which physics currently best understands as consisting of particles (a.k.a. waves). If physics reformulates these notions, materialism can adjust with it, leading some to prefer the term "physicalism".
Now, I encounter people all the time who, because of education or disillusionment, have abandoned most aspects of religion, yet still believe in more than one kind of reality. It's often called "being spiritual". People often think it feels better than the alternative (see Joy in the merely real), but it also persists because of what people experience as an epistemic concern:
The inability to reconcile the "experiencing self" concept with one's notion of physical reality.
There just has to be something more, you know?
A non-materialist thought experiment.
Okay, so you don't exactly believe in the God of the Abrahamic scriptures verbatim who punishes and sets things on fire and lives in the sky. But still, there just has to be something more than just matter and energy, doesn't there? You just feel it. If you don't, try to remember when you did, or at least empathize with someone you know who does. After all, you have a mind, you think, you feel — you feel for crying out loud — and you must realize that can't be made entirely of things like carbon and hydrogen atoms, which are basically just dots with other dots swirling around them. Okay, maybe they're waves, but at least sometimes they act like dots. Start with a few swirling dots… now add more… keep going, until it equals love. It just doesn't seem to capture it.
In fact, now that you think about it, you know your mind exists. It's right there: it's you. Your "experiencing self". Maybe you call it a spirit or soul; I don't want to fix too rigid a description in case it wouldn't quite match your own. But cogito-ergo-sum, it's definitely there! By contrast, this particle business is just a mathematical concept — a very smart one, of course — thought of by scientists to explain and predict a bunch of carefully designed and important measurements. Yes, it does that extremely well, and you're not downplaying that. But that doesn't explain how you see blue, or taste strawberry — something you have direct access to. Particles might not even exist, if that means anything to say. It might just be that observation itself follows a mathematical pattern that we can understand better by visualizing dots and waves. They might not be real.
So actually, your mind or spirit — that thing you feel, that you — is much more certainly existent than scientific "matter". That must be something very important to understand! Certainly you can tell your mind has different parts to it: hearing, seeing, reasoning, moving, remembering, empathizing, picturing, yearning… When you think of all the things you can remember alone — or could remember — the complexity of all that data is mind-bogglingly vast. Imagine the task of actually having to take it all apart and describe it completely… it could take aeons…
Consciousness
(ETA: I've created three threads - color, computation, meaning - for the discussion of three questions posed in this article. If you are answering one of those specific questions, please answer there.)
I don't know how to make this about rationality. It's an attack on something which is a standard view, not only here, but throughout scientific culture. Someone else can do the metalevel analysis and extract the rationality lessons.
The local worldview reduces everything to some combination of physics, mathematics, and computer science, with the exact combination depending on the person. I think it is manifestly the case that this does not work for consciousness. I took this line before, but people struggled to understand my own speculations and this complicated the discussion. So the focus is going to be much more on what other people think - like you, dear reader. If you think consciousness can be reduced to some combination of the above, here's your chance to make your case.
The main exhibits will be color and computation. Then we'll talk about reference; then time; and finally the "unity of consciousness".
How to think like a quantum monadologist
Half the responses to my last article focused on the subject of consciousness, understandably so. Back when LW was still part of OB, I stated my views in more detail (e.g. here, here, here, and here); and I also think it's just obvious, once you allow yourself to notice, that the physics we have does not even contain the everyday phenomenon of color, so something has to change. However, it also seems that people won't change their minds until a concrete alternative to physics-as-usual and de facto property dualism actually comes along. Therefore, I have set out to explain how to think like a quantum monadologist, which is what I will call myself.
How to get that Friendly Singularity: a minority view
Note: I know this is a rationality site, not a Singularity Studies site. But the Singularity issue is ever in the background here, and the local focus on decision theory fits right into the larger scheme - see below.
There is a worldview which I have put together over the years, which is basically my approximation to Eliezer's master plan. It's not an attempt to reconstruct every last detail of Eliezer's actual strategy for achieving a Friendly Singularity, though I think it must have considerable resemblance to the real thing. It might be best regarded as Eliezer-inspired, or as "what my Inner Eliezer thinks". What I propose to do is to outline this quasi-mythical orthodoxy, this tenuous implicit consensus (tenuous consensus because there is in fact a great diversity of views in the world of thought about the Singularity, but implicit consensus because no-one else has a plan), and then state how I think it should be amended. The amended plan is the "minority view" promised in my title.
Causation as Bias (sort of)
David Hume called causation the “cement of the universe”, and he was convinced that psychologically and in our practices, we can’t do without it.
Yet he was famously sceptical of any attempt to analyze causation in terms of necessary connections. For him, causation can only be defined in terms of a constant conjunction in space and time, and that is, I would add, no causation at all, but correlation. For every two events that seem causally connected can also, and without loss of the phenomenon, be described as just the first event, followed by the second. It’s really “just one damn thing after another”. It seems to me we still cannot, will not and need not make sense of the notion of causation (virtually no progress has been made since Hume's time).
There seems no need for another sort of connection besides the spatio-temporal one, nor do we perceive any. In philosophy, a Hume world is a possible world defined in this way: all the phenomena are the same, but no necessary connections hold between the supposed relata. Maybe one should best imagine such a world as a Game of Life world, but without a fundamental level governed by laws and forces; or as a movie, made of frames that are not intrinsically connected to each other. So, however strong the psychological forces that drive humans to accept further mysterious connections: shouldn't we just stop worrying and accept living in a Hume world? Or are there actual arguments in favour of "real" causation?