Comment author: turchin 28 July 2015 11:42:54PM *  2 points [-]

A dumb agent could also cause human extinction. "To kill all humans" is a computationally simpler task than creating a superintelligence, and it may be simpler by many orders of magnitude.

Comment author: AndreInfante 29 July 2015 12:35:23AM 2 points [-]

I seriously doubt that. Plenty of humans want to kill everyone (or, at least, large groups of people). Very few succeed. These agents would be a good deal less capable.

Comment author: jacob_cannell 28 July 2015 04:14:23AM *  0 points [-]

Okay, so we just have to determine human terminal values in detail, and plug them into a powerful maximizer.

No - not at all. Perhaps you have read too much MIRI material, and not enough of the neuroscience and machine learning I referenced. An infant is not born with human 'terminal values'. It is born with some minimal initial reward learning circuitry to bootstrap learning of complex values from adults.

Stop thinking of AGI as some weird mathy program. Instead think of brain emulations - and then you have obvious answers to all of these questions.

Saying the phrase "safe sandbox sim" is much easier than making a virtual machine that can withstand a superhuman intelligence trying to get out of it.

You apparently didn't read my article or the links to earlier discussion? We can easily limit the capability of minds by controlling knowledge. A million smart, evil humans are dangerous - but only if they have modern knowledge. If they have only, say, medieval knowledge, they are hardly dangerous. Also - they don't realize they are in a sim. Also - the point of the sandbox sims is to test architectures, reward learning systems, and most importantly - altruism. Designs that work well in these safe sims are then copied into less safe sims and finally the real world.

Consider the orthogonality thesis - AI of any intelligence level can be combined with any values. Thus we can test values on young/limited AI before scaling up their power.

Sandbox sims can be arbitrarily safe. This is the only truly practical, workable proposal to date. It is also the closest to what is already used in industry. Thus it is the solution by default.

Even if your software is perfect, it can still figure out that its world is artificial and figure out ways of blackmailing its captors

Ridiculous nonsense. Many humans today are aware of the sim argument. The Gnostics were aware, in some sense, 2,000 years ago. Do you think any of them broke out? Are you trying to break out? How?

If it's maximizing its own utility, which is necessary if you want it to behave anything like a child, what's to stop it from learning human greed and cruelty, and becoming an eternal tyrant?

Again, stop thinking we create a single AI program and then we are done. It will be a large-scale evolutionary process, with endless selection, testing, and refinement. We can select for super-altruistic moral beings - like Buddha/Gandhi/Jesus level. We can take the human capability for altruism, refine it, and expand on it vastly.

For starters, you want to be able to prove formally that its goals will remain stable as it self-modifies,

Quixotic waste of time.

Comment author: AndreInfante 28 July 2015 05:31:28AM *  -1 points [-]

So, to sum up, your plan is to create an arbitrarily safe VM, and use it to run brain-emulation-style de novo AIs patterned on human babies (presumably with additional infrastructure to emulate the hard-coded changes that occur in the brain during development to adulthood: adult humans are not babies + education). You then want to raise many, many iterations of these things under different conditions to try to produce morally superior specimens, then turn those AIs loose and let them self-modify to godhood.

Is that accurate? (Seriously, let me know if I'm misrepresenting your position).


A few problems immediately come to mind. We'll set aside the moral horror of what you just described as a necessary evil to avert the apocalypse, for the time being.

More practically, I think you're being racist against weird mathy programs.

For starters, I think weird mathy programs will be a good deal easier to develop than digital people. Human beings are not just general optimizers. We have modules that function roughly like one, which we use under some limited circumstances, but anyone who's ever struggled with procrastination or put their keys in the refrigerator knows that their goal-oriented systems are entangled with a huge number of cheap heuristics at various levels, many of which are not remotely goal-oriented.

All of this stuff is deeply tangled up with what we think of as the human 'utility function,' because evolution has no incentive to design a clean separation between planning and values. Replicating all of that accurately enough to get something that thinks and behaves like a person is likely much harder than making a weird mathy program that's good at modelling the world and coming up with plans.

There's also the point that there really isn't a good way to make a brain emulation smarter. Weird, mathy programs - even ones that use neural networks as subroutines - often have obvious avenues to making them smarter, and many can scale smoothly with processing resources. Brain emulations are much harder to bootstrap, and it'd be very difficult to preserve their behavior through the transition.

My best guess is, they'd probably go nuts and end up as an eldritch horror. And if not, they're still going to get curb stomped by the first weird mathy program to come along, because they're saddled with all of our human imperfections and unnecessary complexity. The upshot of all of this is that they don't serve the purpose of protecting us from future UFAIs.

Finally, the process you described isn't really something you can start on (aside from the VM angle) until you already have human-level AGIs, and a deep and total understanding of all of the operation of the human brain. Then, while you're setting up your crazy AI concentration camp and burning tens of thousands of man-years of compute time searching for AI Buddha, some bright spark in a basement with a GPU cluster has the much easier task of just kludging together something smart enough to recursively self-improve. You're in a race with a bunch of people trying to solve a much easier problem, and (unlike MIRI) you don't have decades of lead time to get a head start on the problem. Your large-scale evolutionary process would take much, much too much time and money to actually save the world.

In short, I think it's a really bad idea. Although now that I understand what you're getting at, it's less obviously silly than what I originally thought you were proposing. I apologize.

Comment author: jacob_cannell 28 July 2015 01:31:59AM *  1 point [-]

The point is that it's ridiculous to say that human beings are 'universal learning machines'

No - it is not. See the article for the in-depth argument and citations backing up this statement.

you can just raise any learning algorithm as a human child and it'll turn out fine.

Well, almost - a ULM also requires a utility function or reward circuitry with some initial complexity, but we can also use the same universal learning algorithms to learn that component. It is just another circuit, and we can learn any circuit that evolution learned.
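
(To make the "just another circuit" point concrete: the following is a minimal, hypothetical sketch, not anything from the article. It assumes the reward component can be approximated by a small network fitted by regression to reward labels supplied from outside - roughly analogous to reward models trained from human feedback in current machine learning. All names and shapes are invented for illustration.)

    # Hypothetical sketch: treat the reward circuit as a learnable module,
    # fitted by regression to externally supplied reward labels.
    import torch
    import torch.nn as nn

    class RewardCircuit(nn.Module):
        def __init__(self, obs_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, 64), nn.ReLU(),
                nn.Linear(64, 1))

        def forward(self, obs):
            return self.net(obs)

    def fit_reward(model, observations, reward_labels, steps=1000, lr=1e-3):
        """Supervised fit to (observation, reward) pairs from caretakers."""
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(steps):
            opt.zero_grad()
            loss = loss_fn(model(observations).squeeze(-1), reward_labels)
            loss.backward()
            opt.step()
        return model

(The sketch only shows a reward signal treated as a trainable component; it says nothing about where the labels come from or whether such a fit captures what evolution's circuitry computes.)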

And that's all it takes to make them consistently UnFriendly, regardless of how well they're raised.

Sure - which is why I discussed sim sandbox testing. Did you read about my sim sandbox idea? We test designs in a safe sandbox sim, and we don't copy sociopaths.

Obviously, AIs are going to be more different from us than that

No, this isn't obvious at all. AGI is going to be built from the same principles as the brain - because the brain is a universal learning machine. The AGI's mind structure will be learned from training and experiential data such that the AI learns how to think like humans and learns how to be human - just like humans do. Human minds are software constructs - without that software we would just be animals (feral humans). An artificial brain is just another computer that can run the human mind software.

That hard coding is not going to be present in an arbitrary AI, which means we have to go and duplicate it out of a human brain. Which is HARD.

Yes, but it's only a part of the brain and a fraction of the brain's complexity, so obviously it can't be harder than reverse engineering the whole brain.

Comment author: AndreInfante 28 July 2015 03:40:37AM *  0 points [-]

A ULM also requires a utility function or reward circuitry with some initial complexity, but we can also use the same universal learning algorithms to learn that component. It is just another circuit, and we can learn any circuit that evolution learned.

Okay, so we just have to determine human terminal values in detail, and plug them into a powerful maximizer. I'm not sure I see how that's different from the standard problem statement for friendly AI. Learning values by observing people is exactly what MIRI is working on, and it's not a trivial problem.

For example: say your universal learning algorithm observes a human being fail a math test. How does it determine that the human being didn't want to fail the math test? How does it cleanly separate values from their (flawed) implementation? What does it do when people's values differ? These are hard questions, and precisely the ones that are being worked on by the AI risk people.
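
(To make the ambiguity concrete, here is a toy, hypothetical illustration - not MIRI's formalism - using a simple Boltzmann-rational observation model: a single observed failure barely distinguishes "wanted to pass but erred" from "wanted to fail". Everything here is invented for illustration.)

    # Toy example: one noisy observation underdetermines the underlying values.
    import math

    def boltzmann_prob(action, rewards, beta=1.0):
        """P(action | rewards), assuming actions are chosen ~ exp(beta * reward)."""
        z = sum(math.exp(beta * r) for r in rewards.values())
        return math.exp(beta * rewards[action]) / z

    hypotheses = {
        "wants_to_pass": {"right_answer": 1.0, "wrong_answer": 0.0},
        "wants_to_fail": {"right_answer": 0.0, "wrong_answer": 1.0},
    }

    observed = "wrong_answer"
    for name, rewards in hypotheses.items():
        print(name, round(boltzmann_prob(observed, rewards), 3))
    # wants_to_pass 0.269, wants_to_fail 0.731: the observation shifts belief
    # only modestly, and says nothing about whether the failure reflects flawed
    # execution or a genuine preference to fail.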

Other points of critique:

Saying the phrase "safe sandbox sim" is much easier than making a virtual machine that can withstand a superhuman intelligence trying to get out of it. Even if your software is perfect, it can still figure out that its world is artificial and find ways of blackmailing its captors. What MIRI is looking into - designing agents that won't resist attempts to modify them (corrigibility) - is probably a more robust solution.

You want to be careful about just plugging in a learned human utility function into a powerful maximizer, and then raising it. If it's maximizing its own utility, which is necessary if you want it to behave anything like a child, what's to stop it from learning human greed and cruelty, and becoming an eternal tyrant? I don't trust a typical human to be god.

And even if you give up on that idea, and have to maximize a utility function defined in terms of humanity's values, you still have problems. For starters, you want to be able to prove formally that its goals will remain stable as it self-modifies, and it won't create powerful sub-agents who don't share those goals. Which is the other class of problems that MIRI works on.

Comment author: AndreInfante 27 July 2015 09:59:37PM *  7 points [-]

Here's one from a friend of mine. It's not exactly an argument against AI risk, but it is an argument that the problem may be less urgent than it's traditionally presented.

  1. There's plenty of reason to believe that Moore's Law will slow down in the near future.

  2. Progress on AI algorithms has historically been rather slow.

  3. AI programming is an extremely high-level cognitive task, and will likely be among the hardest things to get an AI to do.

  4. These three things together suggest that there will be a 'grace period' between the development of general agents, and the creation of a FOOM-capable AI.

  5. Our best guess for the duration of this grace period is on the order of multiple decades.

  6. During this time, general-but-dumb agents will be widely used for economic purposes.

  7. These agents will have exactly the same perverse instantiation problems as a FOOM-capable AI, but on a much smaller scale. When they start trying to turn people into paperclips, the fallout will be limited by their intelligence.

  8. This will ensure that the problem is taken seriously, and these dumb agents will make it much easier to solve FAI-related problems, by giving us an actual test bed for our ideas where they can't go too badly wrong.


This is a plausible-but-not-guaranteed scenario for the future, which feels much less grim than the standard AI-risk narrative. You might be able to extend it into something more robust.

Comment author: [deleted] 23 July 2015 09:25:41PM *  6 points [-]

http://kruel.co/2012/07/17/ai-risk-critiques-index/

Kruel's critique sounded very convincing when I first read it.

In response to comment by [deleted] on Steelmaning AI risk critiques
Comment author: AndreInfante 27 July 2015 08:50:01PM *  1 point [-]

(1) Intelligence is an extendible method that enables software to satisfy human preferences. (2) If human preferences can be satisfied by an extendible method, humans have the capacity to extend the method. (3) Extending the method that satisfies human preferences will yield software that is better at satisfying human preferences. (4) Magic happens. (5) There will be software that can satisfy all human preferences perfectly but which will instead satisfy orthogonal preferences, causing human extinction.

This is deeply silly. The thing about arguing from definitions is that you can prove anything you want if you just pick a sufficiently bad definition. That definition of intelligence is a sufficiently bad definition.

EDIT:

To extend this rebuttal in more detail:

I'm going to accept the definition of 'intelligence' given above. Now, here's a parallel argument of my own:

  1. Entelligence is an extendible method for satisfying an arbitrary set of preferences that are not human preferences.

  2. If these preferences can be satisfied by an extendible method, then the entelligent agent has the capacity to extend the method.

  3. Extending the method that satisfies these non-human preferences will yield software that's better at satisfying non-human preferences.

  4. The inevitable happens.

  5. There will be software that will satisfy non-human preferences, causing human extinction.


Now, I pose to you: how do we make sure that we're making intelligent software, and not "entelligent" software, under the above definitions? Obviously, this puts us back to the original problem of how to make a safe AI.

The original argument is rhetorical sleight of hand. The given definition of intelligence implicitly assumes that the problem doesn't exist and that all AIs will be safe, and then goes on to prove that all AIs will be safe.

It's really, fundamentally silly.

Comment author: jacob_cannell 27 July 2015 06:26:14AM *  2 points [-]

Super obvious re-rebut: sociopaths exist, and yet civilization endures.

Also, we can rather obviously test in safe simulation sandboxes and avoid copying sociopaths. The argument that sociopaths are a fundamental showstopper must then be based on some magical view of the brain (because evolution obviously succeeds in producing non-sociopaths, so we can copy its techniques if they are nonmagical).

Remember, the argument is against existential-threat-level UFAI, not some fraction of evil AIs in a large population.

Comment author: AndreInfante 27 July 2015 08:43:52PM *  0 points [-]

I think you misunderstand my argument. The point is that it's ridiculous to say that human beings are 'universal learning machines' and you can just raise any learning algorithm as a human child and it'll turn out fine. We can't even raise 2-5% of HUMAN CHILDREN as human children and have them reliably turn out okay.

Sociopaths are different from baseline humans by a tiny degree. It's got to be a small number of single-gene mutations. A tiny shift in information. And that's all it takes to make them consistently UnFriendly, regardless of how well they're raised. Obviously, AIs are going to be more different from us than that. And that's a pretty good reason to think that we can't just blithely assume that putting Skynet through preschool is going to keep us safe.

Human values are obviously hard coded in large part, and the hard coded portions seem to be crucial. That hard coding is not going to be present in an arbitrary AI, which means we have to go and duplicate it out of a human brain. Which is HARD. Which is why we're having this discussion in the first place.

Comment author: jacob_cannell 24 July 2015 04:42:05AM *  9 points [-]

Here is a novel argument you may or may not have heard: We live in the best of all probable worlds due to simulation anthropics. Future FAI civs spend a significant amount of their resources to resimulate and resurrect past humanity - winning the sim race by a landslide (as UFAI is not strongly motivated to sim us in large numbers). As a result of this anthropic selection force, we find ourselves in a universe that is very lucky - it is far more likely to lead to FAI than you would otherwise think.

The best standard argument is this: the brain is a universal learning machine - the same general architecture that will necessarily form the basis for any practical AGI. In addition, the brain is already near-optimal in terms of what can be done for 10 watts with any irreversible learning machine (this is relatively easy to show from a wiring energy analysis). Thus any practical AGI is going to be roughly brain-like, similar to baby emulations. All of the techniques used to raise humans safely can thus be used to raise AGI safely. LW/MIRI historically reject this argument based - as far as I can tell - on a hand-wavy notion of 'anthropomorphic bias', which has no technical foundation.

I presented the above argument about four years ago, but I never bothered to spend the time backing it up in excruciating formal detail until more recently. The last 5 years of progress in AI strongly supports this anthropomorphic AGI viewpoint.

Comment author: AndreInfante 24 July 2015 08:14:10PM 3 points [-]

To rebut: sociopaths exist.

Comment author: Andy_McKenzie 26 May 2015 02:32:01AM 22 points [-]

Sure. Basically, there are two groups, each of which has made a major contribution:

1) Shawn Mikula and his group. They have made substantial progress on (some would say have nearly solved) the problem of making the neuronal connections and other brain structures, such as white matter tracts, in a full mouse brain traceable using electron microscopy. Electron microscopy is the lowest level of imaging currently feasible, and can clearly resolve structures that are thought to be key to memory, such as synapses.

2) The 21CM group, including Robert McIntyre. They have developed a totally new method of preserving a brain that should yield preservation that is both highly practical and technically sound. In a sense it combines the methods discussed by Gwern in his article Plastination vs Cryonics, because it first uses a method traditionally associated with "plastination" (glutaraldehyde perfusion), and then uses a method traditionally associated with cryonics, i.e. perfusion with a cryoprotective agent followed by low-temperature storage and, presumably, vitrification, which means that damage from ice crystal formation should be avoided and the brain should enter a glassy state.

Apologies if this is still too technical; I'm happy to answer any follow-up questions. Many key steps remain, but this is progress worthy of celebration and, in my view, of support.

Comment author: AndreInfante 03 June 2015 06:34:32AM 1 point [-]

What are the advantages to the hybrid approach as compared to traditional cryonics? Histological preservation? Thermal cracking? Toxicity?

Comment author: lukeprog 02 November 2014 11:09:21PM 3 points [-]
Comment author: AndreInfante 29 January 2015 12:24:49PM 0 points [-]

Thank you!

Comment author: lukeprog 31 October 2014 11:08:51PM 6 points [-]

The press typically describes DeepMind as "highly secretive," but actually they publish a ton of their research — including this paper — in all the usual venues: NIPS, arxiv, etc.

Comment author: AndreInfante 02 November 2014 08:48:47PM 2 points [-]

That sounds fascinating. Could you link to some non-paywalled examples?
