
High impact from low impact

2 Stuart_Armstrong 17 April 2015 04:01PM

Part of the problem with a reduced impact AI is that it will, by definition, only have a reduced impact.

Some of the designs try to get around the problem by allowing a special "output channel" on which impact can be large. But that feels like cheating. Here is a design that accomplishes the same thing without that kind of hack.

Imagine there is an asteroid that will hit the Earth, and we have a laser that could destroy it. But we need to aim the laser properly, so we need the right coordinates. There is a reduced impact AI that is motivated to give the coordinates correctly, but also motivated to have reduced impact - and saving the planet from an asteroid with certainty is not reduced impact.

Now imagine that instead there are two AIs, X and Y. By abuse of notation, let ¬X refer to the event that the output signal from X is scrambled away from the original output.

Then we ask X to give us the x-coordinates for the laser, under the assumption of ¬Y (that AI Y's signal will be scrambled). Similarly, we ask Y to give us the y-coordinates of the laser, under the assumption of ¬X.

Then X will reason "since ¬Y, the laser will certainly miss its target, as the y-coordinates will be wrong. Therefore it is reduced impact to output the correct x-coordinates, so I shall." Similarly, Y will output the right y-coordinates, the laser will fire and destroy the asteroid, having a huge impact, hooray!
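Here is a toy simulation of that counterfactual reasoning (my own sketch, not part of the post; the coordinate values and the uniform scrambling model are invented purely for illustration):

```python
# Toy model of the X/Y trick: each AI evaluates the impact of telling the
# truth under the counterfactual that the *other* AI's output is scrambled.
import random

TRUE_X, TRUE_Y = 42.0, 17.0   # hypothetical correct laser coordinates

def asteroid_destroyed(x, y):
    """The laser only works if both coordinates are right."""
    return x == TRUE_X and y == TRUE_Y

def impact_of_truth(other_scrambled, own_axis, trials=1000):
    """Estimated probability that outputting the correct coordinate leads to
    the asteroid being destroyed, given an assumption about the other AI."""
    hits = 0
    for _ in range(trials):
        if own_axis == "x":
            x = TRUE_X
            y = random.uniform(0, 100) if other_scrambled else TRUE_Y
        else:
            y = TRUE_Y
            x = random.uniform(0, 100) if other_scrambled else TRUE_X
        hits += asteroid_destroyed(x, y)
    return hits / trials

# Each AI reasons under the counterfactual it was given (the other's signal
# is scrambled), so outputting the true coordinate looks like a low-impact act:
print(impact_of_truth(other_scrambled=True, own_axis="x"))   # ~0.0
print(impact_of_truth(other_scrambled=True, own_axis="y"))   # ~0.0

# What actually happens: neither signal is scrambled, both coordinates are
# correct, and the asteroid is destroyed - a large impact from two AIs that
# each believed their action was low impact.
print(asteroid_destroyed(TRUE_X, TRUE_Y))                    # True
```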

The approach is not fully general yet, because we can have "subagent problems". X could create an agent that behaves nicely given ¬Y (the assumption it was given), but completely crazily given Y (the reality). But it shows how we could get high impact from slight tweaks to reduced impact.

Comment author: artemium 15 April 2015 07:06:32AM *  0 points [-]

One possibility is to implement a design which makes the agent strongly sensitive to negative utility when it invests more time and resources in unnecessary actions after it has, with high enough probability, achieved its original goal.

In the paperclip example: wasting time and resources in order to build more paperclips, or building more sensors/cameras for analyzing the result, should create enough negative utility for the agent compared to alternative actions.

Comment author: Stuart_Armstrong 17 April 2015 03:13:39PM 0 points [-]

This has problems with the creation of subagents: http://lesswrong.com/lw/lur/detecting_agents_and_subagents/

You can use a few resources to create subagents without that restriction.

Comment author: djm 17 April 2015 04:55:54AM 0 points [-]

Would minimising the number of CPU cycles work as a lazy incentive?

This assumes that fewer CPU cycles will produce an outcome that is satisficed rather than optimised, though in our current state of understanding optimisation routines take a lot more computing effort than 'rough enough' solutions.

Perhaps getting the AGIs to go green will kill two birds with one stone.

Comment author: Stuart_Armstrong 17 April 2015 03:12:21PM 0 points [-]

This has problems with the creation of subagents: http://lesswrong.com/lw/lur/detecting_agents_and_subagents/

You can use a few CPU cycles to create subagents without that restriction.

Comment author: solipsist 15 April 2015 02:10:47AM 2 points [-]

So it has a current utility of (1-ε)10, and can increase this by reducing ε - hence by building even more paperclips.

I take ε to be the probability that something weird is happening, like you're hallucinating your paperclips. Why would building more paperclips reduce ε? If you are dreaming, you're just making more dream paperclips.

I'm sure you'd spend your time trying to find increasingly elaborate ways to probe for bugs in Descartes' demon's simulation. It is not clear to me why your increasingly paranoid bug probes would involve making paperclips.

Comment author: Stuart_Armstrong 17 April 2015 03:10:27PM 0 points [-]

It is not clear to me why your increasingly paranoid bug probes would involve making paperclips.

It need not. The problem occurs for any measure that burns resources (and probing the universe for bugs in the Descartes demon would be spectacular at burning resources).

Comment author: Luke_A_Somers 14 April 2015 10:39:32PM 0 points [-]

Hasn't this been mentioned before, as satisficing on probability of object-level satisfaction?

Comment author: Stuart_Armstrong 17 April 2015 03:08:58PM 0 points [-]

Well, the agent design hasn't been mentioned before, but the idea seems plausible enough that an equivalent one could have been found.

Comment author: TheAncientGeek 14 April 2015 09:05:07PM 0 points [-]

Exactly the same... that is the point of predictive accuracy being orthogonal to ontological accuracy: you can vary the latter without affecting the former.

Comment author: Stuart_Armstrong 17 April 2015 03:08:10PM 0 points [-]

"just social constructs" is (almost always) not a purely ontological statement, though. And those who think that it's a social construct, but that the predictions of germ theories are still accurate... well, it doesn't really matter what they think, they just seem to have different labels to the rest of us for the same things.

Comment author: TheAncientGeek 14 April 2015 07:34:17PM *  -1 points [-]

Being relatively liberal about symbol grounding makes it easier to answer Searle, but harder to answer other people, such as people who think germs or atoms are just social constructs.

Comment author: Stuart_Armstrong 14 April 2015 07:45:05PM 1 point [-]

but harder to answer other people, such as people who think germs or atoms are just social constructs.

What predictions do they make when looking into microscopes or treating infectious diseases?

Comment author: TheAncientGeek 14 April 2015 07:00:59PM -3 points [-]

By hypothesis, it isnt the real reality. Effectively, you are defending physical realism by abandoning realism.

Comment author: Stuart_Armstrong 14 April 2015 07:15:15PM 2 points [-]

Pretty much, yes.

Comment author: TheAncientGeek 14 April 2015 06:56:20PM -1 points [-]

The more complex your model, and the more complex reality is, the closer the correspondence between them, and the more your internal model acts as if it is "learning something" (making incorrect predictions, processing the data, then making better ones), the less scope there is for your symbols to be ungrounded.

That seems to merely assert what I was arguing against... I was arguing that predictive accuracy is orthogonal to ontological correctness, and that grounding is to do with ontological correctness.

It's always possible, but the level of coincidence needed to have a wrong model that behaves exactly the same as the right one is huge.

Right and wrong don't have a univocal meaning here. A random model will have poor predictive accuracy, but you can still have two models of equivalent predictive accuracy with different ontological implications.

And, I'd say, having the wrong model that gives the right predictions is just the same as having the randomized labels.

You seem to be picturing a model as a graph with labelled vertices, and assuming that two equally good models must have the same structure. That is not so.

For instance, the Ptolemaic system can be made as accurate as you want for generating predictions, by adding extra epicycles ... although it is false, in the sense of lacking ontological accuracy, since epicycles don't exist.

Another way is to notice that ontological revolutions can make merely modest changes to predictive abilities. Relativity inverted the absolute space and time of Newtonian physics, but its predictions were so close that subtle experiments were required to distinguish the two.

In that case, there is still a difference in empirical predictiveness. In the extreme case there is not: you can have two ontologies that always make the same predictions, the one being dual to the other. An example is wave-particle duality in quantum mechanics.

The fourth way is based on sceptical hypotheses, such as Brain in a Vat and Simulated Reality. Sceptical hypotheses can be rejected, for instance by appeals to Occam's Razor, but they cannot be refuted empirically, since any piece of empirical evidence is subject to sceptical interpretation. Occam's Razor is not empirical.

Science conceives of perception as based in causation, and causation as being comprised of chains of causes and effects, with only the ultimate effect, the sensation evoked in the observer, being directly accessible to the observer. The cause of the sensation, the other end of the causal chain, the thing observed, has to be inferred from the sensation, the ultimate effect -- and it cannot be inferred uniquely, since, in general, more than one cause can produce the same effect. A further proxy can always be inserted into a series of proxies. All illusions, from holograms to stage conjuring, work by producing the effect, the percept, in an unexpected way. A BIV or Matrix observer would assume that the percept of a horse is caused by a horse, but it would actually be caused by a mad scientist pressing buttons.

A BIV or Matrix inhabitant could come up with science that works, that is useful, for many purposes, so long as their virtual reality had some stable rules. They could infer that dropping an (apparent) brick onto their (apparent) foot would cause pain, and so on. It would be like the player of a computer game being skilled in the game, knowing its internal physics. The science of the Matrix inhabitants would work, in a sense, but the workability of their science would be limited to relating apparent causes to apparent effects, not to grounding causes and effects in ultimate reality. But empiricism cannot tell us that we are not in the same situation.

In the words of Werner Heisenberg (Physics and Philosophy, 1958): "We have to remember that what we observe is not nature herself, but nature exposed to our method of questioning."

Comment author: Stuart_Armstrong 14 April 2015 07:03:01PM 1 point [-]

We don't seem to be disagreeing about anything factual. You just want grounding to be in "the fundamental ontology", while I'm content with symbols being grounded in the set of everything we could observe. If you like, I'm using Occam or simplicity priors on ontologies; if there are real objects behind the ones we can observe but we never know about them, I'd still count our symbols as grounded. (That's why I'd count virtual Napoleon's symbols as being grounded in virtual Waterloo, incidentally.)

Anti-Pascaline satisficer

3 Stuart_Armstrong 14 April 2015 06:49PM

It occurred to me that the anti-Pascaline agent design could be used as part of a satisficer approach.

The obvious way to reduce dangerous optimisation pressure is to use a bounded utility function with an easily achievable bound - such as giving the agent a utility linear in paperclips that maxes out at 10.

The problem with this is that, if the entity is a maximiser (which it might become), it can never be sure that it's achieved its goals. Even after building 10 paperclips, and an extra 2 to be sure, and an extra 20 to be really sure, and an extra 3^^^3 to be really really sure, and extra cameras to count them, with redundant robots patrolling the cameras to make sure that they're all behaving well, etc... There's still an ε chance that it might have just dreamed this, say, or that its memory is faulty. So it has a current utility of (1-ε)10, and can increase this by reducing ε - hence by building even more paperclips.
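As a rough numerical illustration of why the maximiser never stops (my own sketch; the specific ε values and the toy delusion model are made up):

```python
# Toy model: utility is 10 if the paperclips are real, 0 if it's all a
# delusion with probability eps. Expected utility is (1 - eps) * 10, so every
# extra verification step that shrinks eps still buys a sliver of utility.

def expected_utility(eps, bound=10.0):
    return (1 - eps) * bound

eps = 1e-3                        # residual "maybe I dreamed it" probability
for extra_checks in range(5):
    print(extra_checks, expected_utility(eps))
    eps /= 10                     # each extra round of paperclips/cameras shrinks eps
# Expected utility creeps toward 10 but never reaches it, so a maximiser
# always has a (tiny) incentive to keep burning resources.
```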

Hum... ε, you say? This seems a place where the anti-Pascaline design could help. Here we would use it at the lower bound of utility. The agent currently has probability ε of having utility < 10 (i.e. it has not built 10 paperclips) and (1-ε) of having utility = 10. Therefore an anti-Pascaline agent with an ε lower bound would round this off to 10, discounting the unlikely event that it has been deluded, and thus it has no need to build more paperclips or paperclip counting devices.
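A minimal sketch of how that rounding-off might work (my own construction, not a specification of the anti-Pascaline design; the threshold and probabilities are illustrative):

```python
# Outcomes whose probability falls below the threshold are simply dropped
# before computing expected utility, so the tiny "it was all a dream" branch
# no longer creates pressure to build more paperclips or counting devices.

def anti_pascaline_utility(outcomes, threshold):
    """outcomes: list of (probability, utility) pairs."""
    kept = [(p, u) for p, u in outcomes if p >= threshold]
    total_p = sum(p for p, _ in kept)
    return sum(p * u for p, u in kept) / total_p   # renormalise over kept outcomes

outcomes = [(1 - 1e-6, 10.0),   # built the 10 paperclips
            (1e-6,      0.0)]   # it was all a delusion
print(anti_pascaline_utility(outcomes, threshold=1e-3))   # -> 10.0: nothing left to gain
```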

Note that this is an un-optimising approach, not an anti-optimising one, so the agent may still build more paperclips anyway - it just has no pressure to do so.
