Practical tools and agents

private_messaging

Presently, 'utility maximizers' work as follows: given a mathematical function f(x), a solver finds the x that corresponds to a maximum (or, typically, a minimum) of f(x). The x is usually a vector describing the action of the agent; f is a mathematically defined function which may, for example, simulate some world evolution and compute the expected worth of the end state given action x, as in f(x)=h(g(x)), where g computes the world state at some future time assuming that action x was taken, and h computes the worth of that world state.
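To make the f(x)=h(g(x)) structure concrete, here is a minimal sketch in Python. The toy dynamics in g, the worth function h, and the brute-force solver are all made up for illustration; a real system would have a far richer simulator and a far better solver. The point is only that the solver sees nothing but a mathematical function.

```python
from typing import Callable, Iterable


def g(x: float, steps: int = 10, dt: float = 0.1) -> float:
    """Hypothetical world model: the state after `steps` time steps,
    assuming action x (here, a constant velocity) was taken."""
    state = 0.0
    for _ in range(steps):
        state += x * dt
    return state


def h(state: float, target: float = 1.5) -> float:
    """Hypothetical worth of an end state: highest when state == target."""
    return -(state - target) ** 2


def f(x: float) -> float:
    """The function actually handed to the solver: f(x) = h(g(x))."""
    return h(g(x))


def solve(func: Callable[[float], float], candidates: Iterable[float]) -> float:
    """Deliberately naive solver: return the candidate x that maximizes func."""
    return max(candidates, key=func)


# Candidate actions 0.00, 0.01, ..., 3.00
candidates = [i / 100 for i in range(301)]
best_action = solve(f, candidates)
```

Note that solve() only ever evaluates f on numbers; whatever "world" there is lives entirely inside g.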

For instance, f may represent some metric of risk, discomfort, and time over a path chosen by a self-driving car in a driving simulator (which is not reductionist). In this case the metric (which is always non-negative) is to be minimized.

In a very trivial case, such as finding the cannon elevation at which the cannonball will land closest to the target in a vacuum, the solution can be found analytically.
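As a sketch of that analytic case: assuming level ground, no air resistance, and a target at the same height as the cannon, the landing distance is v² sin(2θ)/g, which can be inverted directly. The function names and the value of G are my own choices for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def landing_distance(speed: float, elevation: float) -> float:
    """Range on level ground in a vacuum: v^2 * sin(2*theta) / g."""
    return speed ** 2 * math.sin(2.0 * elevation) / G


def best_elevation(speed: float, target_distance: float) -> float:
    """Elevation (radians) at which the ball lands closest to the target.

    If the target is beyond maximum range, 45 degrees gets closest.
    Otherwise the low-angle solution is returned; pi/2 minus it lands
    at the same spot on a higher arc.
    """
    max_range = speed ** 2 / G
    if target_distance >= max_range:
        return math.pi / 4.0
    return 0.5 * math.asin(target_distance / max_range)


theta = best_elevation(speed=100.0, target_distance=500.0)
```

No search is involved here at all: the "solver" is a closed-form expression.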

In more complex cases a multitude of methods are typically employed, combining iteration over potential solutions with analytical and iterative solving for a local maximum or minimum. If this is combined with sensors, a model updater, and actuators, an agent like a self-driving car can be made.
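One simple instance of "iterate over potential solutions, then solve locally" is multi-start local search. The sketch below is a toy: the cost function, the starting points, and the crude derivative-free descent are assumptions picked for the example, not what any particular self-driving system uses.

```python
from typing import Callable, Iterable, Tuple


def local_minimize(f: Callable[[float], float], x: float,
                   step: float = 0.5, tol: float = 1e-8) -> Tuple[float, float]:
    """Crude local descent: move to x - step or x + step if that lowers f,
    otherwise halve the step. Stops when the step drops below tol."""
    fx = f(x)
    while step > tol:
        for candidate in (x - step, x + step):
            fc = f(candidate)
            if fc < fx:
                x, fx = candidate, fc
                break
        else:
            step /= 2.0
    return x, fx


def multistart_minimize(f: Callable[[float], float],
                        starts: Iterable[float]) -> Tuple[float, float]:
    """Run the local solver from every start and keep the best result."""
    return min((local_minimize(f, s) for s in starts), key=lambda r: r[1])


def cost(x: float) -> float:
    """Toy cost with two local minima, near x = -2 and x = +2."""
    return (x * x - 4.0) ** 2 + x


best_x, best_cost = multistart_minimize(cost, starts=[-3.0, -1.0, 1.0, 3.0])
```

A single local search started at x = 3 would get stuck in the worse minimum near x = +2; trying several starts is what finds the better one near x = -2.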

Those are the utility functions as used in the field of artificial intelligence.

A system can be strongly superhuman at finding maxima of functions, and ultimately can be very general purpose, allowing its use to build models which are efficiently invertible into a solution. However, it must be understood that the intelligent component finds mathematical solutions to, ultimately, mathematical relations.

The utility functions as known and discussed on LW seem entirely different in nature. They are defined on the real world, using natural language that conveys intent, and seem to be a rather ill-defined concept for which a bottom-up formal definition may not even exist. The implementation of such a concept, if at all possible, would seem to require a major breakthrough in the philosophy of mind.

This is an explanation of an important technical distinction mentioned in Holden Karnofsky's post.

On the discussion in general: it may well be very difficult or impossible to define a system such as a self-driving car in terms of the concepts that are used on LW to talk about intelligences. In particular, LW's notion of "utility" does not seem to allow an accurate description of the kind of tool that Holden Karnofsky was speaking of.

This is an explanation of an important technical distinction mentioned in Holden Karnofsky's post.

This one. The argument on LW goes: "you can't define the distinction between a tool and an agent, so we're right".

Now, to those with knowledge of the field, this is akin to some supposed engineer claiming that you can't define the distinction between a bolt and a screw, as a way to dispute the statement "you can avoid splitting the brittle wood if you drill a hole and use a bolt rather than a screw", which was itself a rebuttal to "a threaded fastener would split the brittle wood piece". The only things it demonstrates are ignorance, incompetence, and a lack of work towards actually fulfilling the stated mission.

As for this 'oracle' link, it clearly illustrates the mechanism for generating the strings employed to talk about AI risks. You start with the scary idea, then you progress to the necessity of showing each type of AI to be scary, then you proceed to each subtype, then you make more and more detailed strings designed to approximate the strings that result from an entirely different process: starting from the basics (and study of the field) and proceeding upwards to a risk estimate.

That task is aided by the fact that it is impossible to define a provably safe AI in English (or in technobabble) due to vagueness and ambiguity, and by the fact that language predominantly attributes real-world desires when describing anything that seems animate. That is, when you have a system that takes in sequences and generates functions that approximate the sequences (thus allowing prediction of the next element in a sequence, without over-training on noise), you can describe it as a predictor in English, and now you've got an 'implicit' goal of changing the world to match the prediction. This is followed by "everyone give us money or it is going to kill us all, we're the only ones who understand this implied desire!" (that everyone else is more wrong because we call ourselves less wrong seems to be implied). Speaking of which, use of language is a powerful irrationality technique.

Meanwhile, in practice, such stuff is not only not implicit, it is incredibly difficult to implement even if you wanted to implement it. Ultimately, many of the 'implied' qualities that are very hard to avoid in English descriptions of AI are also incredibly difficult to introduce when programming. We have predictor-type algorithms, which can be strongly superhuman if given enough computing power, and none of them would exhibit a trace of an 'implicit' desire to change the world.
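For a sense of what a predictor-type algorithm looks like when written down, here is a deliberately tiny sketch: a least-squares line fitted to a sequence and evaluated one step ahead. The straight-line model is my own stand-in for "a function that approximates the sequence"; real predictors use far richer function classes, but the shape of the computation is the same.

```python
from typing import Callable, Sequence


def fit_line(seq: Sequence[float]) -> Callable[[float], float]:
    """Least-squares fit of y = intercept + slope * t to the points
    (0, seq[0]), (1, seq[1]), ... Needs at least two elements."""
    n = len(seq)
    mean_t = (n - 1) / 2.0
    mean_y = sum(seq) / n
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(seq))
    slope = cov / var_t
    intercept = mean_y - slope * mean_t
    return lambda t: intercept + slope * t


def predict_next(seq: Sequence[float]) -> float:
    """Predict the next element by evaluating the fitted function at t = len(seq)."""
    return fit_line(seq)(len(seq))


prediction = predict_next([1.0, 3.0, 5.0, 7.0])
```

Everything the predictor does is in those two functions: it maps a list of numbers to a number. Making it act on the world to bring the prediction about would require writing a great deal of additional code on purpose.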

There's the notion that anything which doesn't 'understand' what you implied is not powerful enough (not powerful enough for what?). That's just rationalization, and is not otherwise substantiated or even defined. Or even relevant. Let's take an example from another field. Clearly, any space propulsion we know to be possible is not powerful enough to get to faster than the speed of light. Very true. That shouldn't be used to imply that we'll have faster-than-light travel.
