
Chatbots or set answers, not WBEs

5 Stuart_Armstrong 08 September 2015 05:17PM

A putative new idea for AI control; index here.

In a previous post, I talked about using a WBE to define a safe output for a reduced impact AI.

I've realised that the WBE isn't needed. Its only role was to ensure that the AI's output could have been credibly produced by something other than the AI - "I'm sorry, Dave. I'm afraid I can't do that." is unlikely to be the output of a random letter generator.

If the output is short, a chatbot with access to a huge corpus of human responses could fill the WBE's role well. We can specialise it in the direction we need - if we are asking for financial advice, we can mandate a specialised vocabulary or train it on financial news sources.

So instead of training the reduced impact AI to behave as the 'best human advisor', we are training it to behave as the 'luckiest chatbot'. This allows us to calculate odds with greater precision, and has the advantage of not needing to wait for a WBE.

For some questions, we can do even better. Suppose we have a thousand different stocks, and are asking which one would increase in value the most during the coming year. The 'chatbot' here is simply an algorithm that picks a stock at random. So we now have an exact base rate - 1/1000 - and predetermined answers from the AI.
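
As a minimal sketch of this base rate (the ticker names and code are illustrative, not part of the proposal):

```python
import random

STOCKS = ["stock_%04d" % i for i in range(1000)]  # hypothetical thousand stocks

def random_picker():
    """The 'chatbot' for this question: name one stock uniformly at random."""
    return random.choice(STOCKS)

# Exact base rate for any predetermined answer:
base_rate = 1.0 / len(STOCKS)  # 1/1000
print(random_picker(), base_rate)
```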

[EDIT:] Another alternative is to get online users to submit answers to the question. Then the AI selects the best answer from the choices. And if the AI is not turned on, a random answer is selected.
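
A sketch of this selection protocol, with a hypothetical scoring function standing in for the AI's judgement:

```python
import random

def select_answer(submissions, ai_on, ai_score=None):
    """Pick one user-submitted answer to the question.

    If the AI is on, it returns the submission it scores highest
    (`ai_score` is a hypothetical scoring function the AI provides).
    If the AI was never turned on, a uniformly random submission is
    returned, giving each answer a 1/len(submissions) base rate.
    """
    if ai_on:
        return max(submissions, key=ai_score)
    return random.choice(submissions)
```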

AI, cure this fake person's fake cancer!

10 Stuart_Armstrong 24 August 2015 04:42PM

A putative new idea for AI control; index here.

An idea for how we might successfully get useful work out of a powerful AI.

 

The ultimate box

Assume that we have an extremely detailed model of a sealed room, with a human in it and enough food, drink, air, entertainment, energy, etc. for the human to survive for a month. We have some medical equipment in the room - maybe a programmable set of surgical tools, some equipment for mixing chemicals, a loudspeaker for communication, and anything else we think might be necessary. All these objects are specified within the model.

We also have some defined input channels into this abstract room, and output channels from this room.

The AI's preferences will be defined entirely with respect to what happens in this abstract room. In a sense, this is the ultimate AI box: instead of taking a physical box and attempting to cut it out from the rest of the universe via hardware or motivational restrictions, we define an abstract box where there is no "rest of the universe" at all.

 

Cure cancer! Now! And again!

What can we do with such a setup? Well, one thing we could do is to define the human in such a way that they have some form of advanced cancer. We define what "alive and not having cancer" counts as, as well as we can (the definition need not be fully rigorous). Then the AI is motivated to output some series of commands to the abstract room that results in the abstract human inside not having cancer. And, as a secondary part of its goal, it outputs the results of its process.
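
As a structural sketch (all names hypothetical, and the real model would be extremely detailed), the key point is that the goal predicate is defined only over the state of the abstract room, never over the real world:

```python
from dataclasses import dataclass

@dataclass
class RoomState:
    """State of the sealed abstract room (vastly simplified here)."""
    patient_alive: bool
    has_cancer: bool
    day: int  # supplies last a month

def step(state: RoomState, command: str) -> RoomState:
    """Advance the room model by one command from the input channel.
    Stands in for the extremely detailed physical model."""
    raise NotImplementedError

def goal_satisfied(state: RoomState) -> bool:
    """Our (deliberately non-rigorous) 'alive and not having cancer' test."""
    return state.patient_alive and not state.has_cancer and state.day <= 30
```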


The AI, the best human advisor

7 Stuart_Armstrong 13 July 2015 03:33PM

A putative new idea for AI control; index here.

All my ideas on reduced impact AI follow the same format: some restriction to ensure a reduced impact (I generally use something like "could a future AI detect this one existed?") and some "trick" to allow a way of still achieving a larger impact. Furthermore, there is always a "nice" way of achieving this larger impact (do the thing we really want you to do, and only that!), so that we hope the AI can be forced into that behaviour. Examples of "tricks" are:

In this post, I'll present another variant - one that I think is the most interesting so far. I won't change the reduced impact condition, but will present a different "trick".

 

AI pretends to be human

As in this post, imagine that we have a reduced-impact AI in a box that conceals it from us. The AI has some goal, and will output some data to further that goal, then turn itself off. And then we'll shake the box around or something, so we can't be sure that the AI was turned on.

There are two issues to deal with, here: a) The message itself is strong evidence that the AI was turned on, and b) if we exclude the message from the reduced impact, the message becomes dangerous.

To deal with a) first. This approach gets much easier if we have a whole brain emulation (WBE). Then we can house the WBE in the box with the AI. If the AI doesn't send out a message, then the WBE sends out its own message instead. There are variants we can try here if we don't have WBE - maybe some human volunteer willing to either die or take amnesia pills at the end of the experiment.

So, receiving a coherent message is not a surprising thing to come out of the output channel. However, an AI motivated strongly by reduced impact will simply seek to imitate as precisely as possible what the WBE would send. We still need a trick to get something useful.
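
A minimal sketch of the channel (hypothetical names; the coin flip stands in for our shaking the box around):

```python
import random

def output_channel(ai_message, wbe_message, ai_was_on):
    """A coherent, human-like message comes out either way.

    A strongly reduced-impact AI will make ai_message imitate, as
    precisely as possible, what the WBE would have sent - hence the
    need for a further trick to get anything useful out.
    """
    return ai_message if ai_was_on else wbe_message

# Shaking the box around: outsiders can only treat ai_was_on as a coin flip.
msg = output_channel("advice...", "advice...", ai_was_on=random.random() < 0.5)
```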

 

AI pretends to be BEST human

Imagine that we wanted the AI to give us investment advice to make us rich. Say, a binary task: get us ten million dollars over the course of the year.

Now, this is the kind of thing that the WBE could conceivably achieve. We could run the WBE fast, for three subjective weeks, and it has a chance of giving us advice that would get us that rich. Assume that it only has 0.1% chance of doing so. The AI, on the other hand, could produce advice that makes us that rich with almost certainty.

Therefore if we get advice, follow it, and end up at least ten million dollars richer, the odds update in favour of the AI having given us the advice (rather than the WBE) is about 1000:1.
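
The Bayes computation behind that figure, as an illustrative sketch:

```python
def posterior_odds(prior_odds, p_rich_given_ai, p_rich_given_wbe):
    """Odds that the AI (rather than the WBE) wrote the advice,
    given that we did end up at least ten million dollars richer."""
    return prior_odds * (p_rich_given_ai / p_rich_given_wbe)

# Even prior odds (the box was shaken); AI succeeds ~certainly, WBE with 0.1%:
print(posterior_odds(1.0, 1.0, 0.001))  # -> 1000.0, i.e. a 1000:1 update
```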

And this gives us the solution to b). The "trick", the escape clause, is to condition on the outcome of our investments. So we're asking: given that we have become rich (and given the other facts about the universe in 50 years' time), what additional evidence is there that the AI was turned on?

Now, a sensible WBE is almost certain not to come up with a scheme that destroys the universe with self-tiling agents - or, indeed, to do anything too crazy. We could instruct the WBE to this effect, and we're only running it for three subjective weeks, which prevents the kind of drift that you see in more extreme situations. Thus a message from the WBE, no matter how likely it is to make us rich, is almost certain not to lead to tiling the universe with self-replicating nano-bots.

So here, the AI merely has to come up with a message that could have come from the WBE, with at least 0.01% probability, and that would indeed make us rich. And, therefore, the message cannot contain secret code that would optimise the universe.

Note that the goal we are aiming for must not be too extreme in impact (the higher the impact of the goal, the more latitude the AI has to have impact itself, "hiding" inside the goal's impact), and we should restrict to events of probability, say, greater than 1/10,000 (when the probability gets too small, odd events start happening).
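
The escape clause can be summarised as a small constrained-selection sketch (the probability functions are hypothetical stand-ins for the relevant estimates):

```python
def best_plausible_message(messages, p_wbe, p_rich_given, threshold=1e-4):
    """Restrict to messages the WBE could have produced with probability
    at least `threshold` (1/10,000 here), then pick the one most likely
    to make us rich. `p_wbe` and `p_rich_given` are hypothetical
    stand-ins for the relevant probability estimates."""
    plausible = [m for m in messages if p_wbe(m) >= threshold]
    return max(plausible, key=p_rich_given)
```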

It seems that this scheme could result in AIs that always give the very best possible advice a human (or WBE) could ever give, which would be a very useful result.

 

Whose probability?

We have a probability distribution over events. The AI itself will have a probability distribution over events. The future hypothetical super-super-AI it is using to compute reduced impact has a probability distribution over events - and the AI has a probability distribution over that probability distribution. If all of them agree on the probability of us getting richer (given WBE advice and given not), then this scheme should work.

If they disagree, there might be problems. A more complex approach could directly take into account the divergent probability estimates; I'll think of that and return to the issue later.

Bostrom versus Transcendence

11 Stuart_Armstrong 18 April 2014 08:31AM

[LINK] AmA by computational neuroscientists behind 'the world's largest functional brain model'

7 michaelcurzi 03 December 2012 07:35PM

Not sure if this has been covered on LW, but it seems highly relevant to WBE development. Link here:

http://www.reddit.com/r/IAmA/comments/147gqm/we_are_the_computational_neuroscientists_behind/

A few questioners mention the Singularity and make Skynet jokes.

The abstract from their paper in Science:

A central challenge for cognitive and systems neuroscience is to relate the incredibly complex behavior of animals to the equally complex activity of their brains. Recently described, large-scale neural models have not bridged this gap between neural activity and biological function. In this work, we present a 2.5-million-neuron model of the brain (called “Spaun”) that bridges this gap by exhibiting many different behaviors. The model is presented only with visual image sequences, and it draws all of its responses with a physically modeled arm. Although simplified, the model captures many aspects of neuroanatomy, neurophysiology, and psychological behavior, which we demonstrate via eight diverse tasks.

I'm curious to see LWers' perspectives on the project.

New WBE implementation

16 Louie 30 November 2012 11:16AM

It usually isn't profitable to pay attention to science news, since science journalists largely misinterpret new "breakthroughs". But I am somewhat interested in this story about "artificial brains" coming out of Canada.

Most large neuron simulations I've read about before don't actually do anything. But apparently there's a somewhat large new WBE implementation at the University of Waterloo that performs sub-humanly on several tasks while having similar weaknesses to human brains.

Curious what others think of this recent development.

 

Singularity Summit 2011 Workshop Report

7 lukeprog 01 March 2012 06:12AM

Here is a short new publication from the Singularity Institute, on the 2-day workshop that followed Singularity Summit 2011.

Note the new publication design. We are currently porting our earlier publications to this template, too.

[LINK] Brain region changes shape with learning the layout of London

1 whpearson 12 December 2011 05:01PM

General Intro:

http://www.bbc.co.uk/news/health-16086233

Older research:

http://www.fil.ion.ucl.ac.uk/Maguire/Maguire2006.pdf

Abstract of latest research

Possible implications for WBE (Is it possible to get short term function correct without having the ability to do long term structural changes?)

Also possible implications for learning lots of information: cabbies with "the Knowledge" had worse visual information recall.

I haven't gone through it all myself yet.

Against WBE (Whole Brain Emulation)

0 Curiouskid 27 November 2011 02:42PM

Problem: I've read arguments for WBE, but I can't find any against.

Most people agree that WBE is the first step to FAI (EDIT: I mean to say that if we were going to try to build AGI in the safest way possible, WBE would be the first step. I did not mean to imply that I thought WBE would come before AGI). I've read a significant portion of Bostrom's WBE roadmap. My question is: are there any good arguments against the feasibility of WBE? A quick Google search did not turn up anything other than this video. Given that many people consider the scenario in which WBE comes before AGI to be safer than the converse, shouldn't we be talking about this more? What probability do you guys assign to the likelihood that WBE comes before AGI?

Bostrom's WBE roadmap details what technological advancement is needed to get towards WBE:

Different required technologies have different support and drivers for development. Computers are developed independently of any emulation goal, driven by mass market forces and the need for special high performance hardware. Moore's law and related exponential trends appear likely to continue some distance into the future, and the feedback loops powering them are unlikely to rapidly disappear (see further discussion in Appendix B: Computer Performance Development). There is independent (and often sizeable) investment into computer games, virtual reality, physics simulation and medical simulations. Like computers, these fields produce their own revenue streams and do not require WBE-specific or scientific encouragement.

A large number of the other technologies, such as microscopy, image processing, and computational neuroscience are driven by research and niche applications. This means less funding, more variability of the funding, and dependence on smaller groups developing them. Scanning technologies are tied to how much money there is in research (including brain emulation research) unless medical or other applications can be found. Validation techniques are not widely used in neuroscience yet, but could (and should) become standard as systems biology becomes more common and widely applied.  

 

Finally there are a few areas relatively specific to WBE: large-scale neuroscience, physical handling of large amounts of tissue blocks, achieving high scanning volumes, measuring functional information from the images, automated identification of cell types, synapses, connectivity and parameters. These areas are the ones that need most support in order to enable WBE. The latter group is also the hardest to forecast, since it has weak drivers and a small number of researchers. The first group is easier to extrapolate by using current trends, with the assumption that they remain unbroken sufficiently far into the future.

 

Implications for those trying to accelerate the future:

Because many of the technological requirements are going to be driven by business-as-usual funding and standard applications, anybody who wants to help bring about WBE faster (and hence FAI) should focus on either donating towards the niche applications that won't receive a lot of funding otherwise, or trying to become a researcher in those areas (but what good would becoming a researcher be if there's no funding?). Also, how probable is it that once the business-as-usual technologies become more advanced, more government/corporate funding will go towards the niche applications?

Will the ems save us from the robots?

8 Stuart_Armstrong 24 November 2011 07:23PM

At the FHI, we are currently working on a project around whole brain emulations (WBE), or uploads. One important question is whether getting to whole brain emulations first would make subsequent AGI creation

  1. more or less likely to happen,
  2. more or less likely to be survivable.

If you have any opinions or ideas on this, please submit them here. No need to present an organised overall argument; we'll be doing that. What would help most is any unusual suggestion that we might not have thought of for how WBE would affect AGI.

EDIT: Many thanks to everyone who suggested ideas here, they've been taken under consideration.