abramdemski comments on Thoughts on the Singularity Institute (SI) - Less Wrong

256 Post author: HoldenKarnofsky 11 May 2012 04:31AM


Comments (1270)

You are viewing a single comment's thread.

Comment author: dspeyer 11 May 2012 02:47:26AM 6 points [-]

Any sufficiently advanced tool is indistinguishable from [an] agent.

Let's see if we can use concreteness to reason about this a little more thoroughly...

As I understand it, the nightmare looks something like this. I ask Google SuperMaps for the fastest route from NYC to Albany. It recognizes that computing this requires traffic information, so it diverts several self-driving cars to collect real-time data. Those cars run over pedestrians who were irrelevant to my query.

The obvious fix: forbid SuperMaps to alter anything outside of its own scratch data. It works with the data already gathered. Later a Google engineer might ask it what data would be more useful, or what courses of action might cheaply gather that data, but the engineer decides what if anything to actually do.

This superficially resembles a box, but there's no actual box involved. The AI's own code forbids plans like that.
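A minimal sketch of what that restriction might look like in code (all names here are hypothetical, invented for illustration): the planner may read data that was already gathered and may write only to its own scratch space, and it returns a plan rather than executing one.

```python
# Hypothetical "tool-mode" planner: its only side effect is writing to
# its own scratch dict. It reads pre-gathered data and returns a plan;
# a human decides whether to act on it.

class RouteTool:
    def __init__(self, traffic_data):
        self.traffic_data = traffic_data  # data already gathered
        self.scratch = {}                 # the only state it may mutate

    def fastest_route(self, origin, destination):
        # Work entirely from existing data; cache intermediate results
        # in scratch, but never reach out into the world.
        key = (origin, destination)
        if key not in self.scratch:
            self.scratch[key] = self._plan(origin, destination)
        return self.scratch[key]  # a plan, not an action

    def _plan(self, origin, destination):
        # Stand-in for a real route search over self.traffic_data.
        return [origin, destination]

tool = RouteTool(traffic_data={})
plan = tool.fastest_route("NYC", "Albany")
# The caller inspects `plan`; nothing outside the tool has changed.
```

The point of the sketch is that "no box" is enforced by the shape of the code itself: there is simply no code path that touches anything but `self.scratch`.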

But that's for a question-answering tool. Let's take another scenario:

I tell my super-intelligent car to take me to Albany as fast as possible. It sends emotionally manipulative emails to anyone else who would otherwise be on the road encouraging them to stay home.

I don't see an obvious fix here.

So the short answer seems to be that it matters what the tool is for. A purely question-answering tool would be extremely useful, but not as useful as a general-purpose one.

Could humans with an oracular super-AI police the development and deployment of active super-AIs?

Comment author: abramdemski 11 May 2012 05:36:33AM *  1 point [-]

I tell my super-intelligent car to take me to Albany as fast as possible. It sends emotionally manipulative emails to anyone else who would otherwise be on the road encouraging them to stay home.

Then it's running in agent mode? My impression was that a tool-mode system presents you with a plan, but takes no actions. So all tool-mode systems are basically question-answering systems.

Perhaps we can meaningfully extend the distinction to some kinds of "semi-autonomous" tools, but that would be a different idea, wouldn't it?

(Edit) After reading more comments, here is "a different idea" which seems to match this kind of desire: http://lesswrong.com/lw/cbs/thoughts_on_the_singularity_institute_si/6jys

Comment author: David_Gerard 11 May 2012 01:57:05PM *  14 points [-]

Then it's running in agent mode? My impression was that a tool-mode system presents you with a plan, but takes no actions. So all tool-mode systems are basically question-answering systems.

I'm a sysadmin. When I want to get something done, I routinely come up with something that answers the question, and when it does that reliably I give it the power to do stuff with as little human input as possible. Often in daemon mode, to absolutely minimise how much it needs to bug me. Question-answerer->tool->agent is a natural progression just in process automation. (And this is why they're called "daemons".)

It's only long experience and many errors that have taught me how to do this such that the created agents won't crap all over everything. Even then I still get surprises.
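That progression can be sketched concretely. A toy example in Python, with hypothetical function names chosen for illustration: the same disk-usage check evolves from answering a question, to proposing an action, to acting unattended.

```python
import shutil

# Stage 1: question-answerer -- reports a fact, takes no action.
def disk_usage_fraction(path="/"):
    usage = shutil.disk_usage(path)
    return usage.used / usage.total

# Stage 2: tool -- proposes an action for a human to approve.
def suggest_cleanup(path="/", threshold=0.9):
    if disk_usage_fraction(path) > threshold:
        return f"Consider pruning old logs under {path}"
    return None

# Stage 3: agent/daemon -- acts on its own answer, human out of the loop.
# In daemon mode this would run on a timer; `act` is the side-effecting
# step (e.g. deleting old logs), which is exactly where the surprises live.
def cleanup_daemon_step(path="/", threshold=0.9, act=lambda p: None):
    if disk_usage_fraction(path) > threshold:
        act(path)
        return True
    return False
```

Nothing in the code marks the boundary between "tool" and "agent"; the only difference between stages 2 and 3 is whether a human sits between the answer and the action.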

Comment author: private_messaging 11 May 2012 03:21:42PM *  1 point [-]

Well, do your 'agents' build a model of the world whose fidelity they improve? I don't think those really are agents in the AI sense, and definitely not in the self-improvement sense.

Comment author: David_Gerard 11 May 2012 03:28:55PM *  10 points [-]

They may act according to various parameters they read in from the system environment. I expect they will be developed to a level of complication where they have something that could reasonably be termed a model of the world. The present approach is closer to perceptual control theory, where the sysadmin has the model and PCT is part of the implementation. 'Cos it's more predictable to the mere human designer.

Capacity for self-improvement is an entirely different thing, and I can't see a sysadmin wanting that - the sysadmin would run any such improvements themselves, one at a time. (Semi-automated code refactoring, for example.) The whole point is to automate processes the sysadmin already understands but doesn't want to do by hand - any sysadmin's job being to automate themselves out of the loop, because there's always more work to do. (Because even in the future, nothing works.)

I would be unsurprised if someone markets a self-improving system for this purpose. For it to go FOOM, it also needs to invent new optimisations, which is presently a bit difficult.

Edit: And even a mere daemon-like automated tool can do stuff a lot of people regard as unFriendly, e.g. high frequency trading algorithms.

Comment author: TheAncientGeek 05 July 2014 05:45:18PM *  0 points [-]

It's not a natural progression in the sense of occurring without human intervention. That is rather relevant if the idea of AI safety is going to be based on using tool AI strictly as tool AI.

Comment author: TheOtherDave 11 May 2012 02:12:03PM *  1 point [-]

Then it's running in agent mode? My impression was that a tool-mode system presents you with a plan, but takes no actions. So all tool-mode systems are basically question-answering systems.

My own impression differs.

It becomes increasingly clear that "tool" in this context is sufficiently subject to different definitions that it's not a particularly useful term.

Comment author: abramdemski 12 May 2012 07:00:27AM 2 points [-]

I've been assuming the definition from the article. I would agree that the term "tool AI" is unclear, but I would not agree that the definition in the article is unclear.