Genies and Wishes in the context of computer science
Using computers to find a cure

What would it be like to make a program that would fulfill our wish to "cure cancer"? I'll try to briefly present the contemporary mainstream CS perspective on this.
Here's how "curing cancer using AI technologies" could realistically work in practice. You start with a widely applicable, powerful optimization algorithm. This algorithm takes in a fully formal specification of a process, then finds and returns the parameters of that process for which its output value is high. (I am deliberately avoiding the word "function".)
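To make the shape of such an optimizer concrete, here is a minimal sketch, assuming nothing more than a hypothetical random-restart hill climber; the function names and the toy target are illustrative, not any particular library's API:

```python
import random

def optimize(process, dim, iters=10000, step=0.1):
    """Search for parameters x for which process(x) is high.

    `process` is the fully formal specification: any callable
    mapping a parameter vector to a single output value.
    """
    best_x = [random.uniform(-1, 1) for _ in range(dim)]
    best_val = process(best_x)
    for _ in range(iters):
        # Perturb the current best candidate and keep improvements.
        x = [xi + random.gauss(0, step) for xi in best_x]
        val = process(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# The optimizer knows nothing about cancer; its domain is whatever
# process you hand it, e.g. a toy quadratic with a peak at (1, 2):
x, v = optimize(lambda p: -((p[0] - 1)**2 + (p[1] - 2)**2), dim=2)
```

Note that the only "goal" here is whatever `process` encodes; the algorithm itself is goal-agnostic.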
If you wish to cure cancer, even with this optimization algorithm at your disposal, you cannot simply type "cure cancer" into the terminal. If you do, you will get something to the effect of:
No command 'cure' found, did you mean:
Command 'cube' from package 'sgt-puzzles' (universe)
Command 'curl' from package 'curl' (main)
The optimization algorithm by itself not only has no goal set for it, it does not even have a domain for a goal to be defined on. It cannot by itself be used to cure cancer or make paperclips. It may or may not map to what you would describe as AI.
First, you would have to start with the domain. You would have to make a fairly crude biochemical model of the processes in human cells and cancer cells, crude because you have limited computational power and there is very much going on in a cell. [1]
On the model, you define what you want to optimize: you specify formally how to compute a value from the model so that the value is maximal for what you consider a good solution. It could be something like [fraction of model cancer cells whose functionality is strongly disrupted] * [fraction of model non-cancer cells whose functionality is not strongly disrupted]. And you define the model's parameters: the chemicals introduced into the model.
Then you use the above-mentioned optimization algorithm to find which extra parameters to the model (i.e. which extra chemicals) result in the best outcome as defined above.
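The whole pipeline can be sketched end to end. Everything below is a deliberately fake stand-in: a two-dose "biochemical model" built from sigmoids, the product objective from above, and plain random search as the optimizer:

```python
import math
import random

def model(chemicals):
    """Crude stand-in for a cell model: returns the fraction of
    model cancer cells disrupted and the fraction of model
    non-cancer cells left intact, given two chemical doses."""
    a, b = chemicals
    disrupted_cancer = 1 / (1 + math.exp(-(2*a + b - 1)))
    intact_normal = 1 / (1 + math.exp(-(3 - a - 2*b)))
    return disrupted_cancer, intact_normal

def objective(chemicals):
    # Maximal when model cancer cells are disrupted AND model
    # non-cancer cells are not -- the formal goal from the text.
    d, i = model(chemicals)
    return d * i

# Random search over dose vectors: the "optimization algorithm".
best = max(([random.uniform(0, 3), random.uniform(0, 3)]
            for _ in range(20000)), key=objective)
```

The point of the sketch is that every object the optimizer touches is a mathematical stand-in inside the model; "cancer" appears only in the variable names we chose.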
A similar approach can, of course, be used to find manufacturing methods for that chemical, or to solve sub-problems related to senescence, mind uploading, or even the development of better algorithms, including optimization algorithms.
Note that the approach described above does not map onto genies and wishes in any way. Yes, the software can produce unexpected results, but concepts from One Thousand and One Nights will not help you predict them. More contemporary science fiction, such as the Terminator franchise, where the AI had the world's nuclear arsenal and the probable responses explicitly included in its problem domain, seems more relevant.
Hypothetical wish-granting software
It is generally believed that understanding natural language is a very difficult task which relies on intelligence. For the AI, the sentence in question is merely a sensory input, which has to be coherently accounted for in its understanding of the world.
The bits from the ADC are accounted for by an analog signal in the wire, which is accounted for by pressure waves at the microphone, which are accounted for by a human speaking from one of a particular set of locations consistent with how the sound interferes with its reflections from the walls. The motions of the tongue and larynx are accounted for by electrical signals sent to the relevant muscles, then by language-level signals in Broca's area, then by logical concepts in the frontal lobes: an entire causal diagram traced backwards. In practice, a dumber AI would have a much cruder model, while a smarter AI would have a much finer model than I can outline.
If you want the AI to work like a jinn and "do what it is told", you need to somehow convert this model into a goal. Potential relations between "cure cancer" and "kill everyone" which the careless wish-maker has not considered naturally played no substantial role in the process of forming the sentence. Extracting such potential relations is a separate, very different, and very difficult problem.
It does intuitively seem like a genie which does what it is told, but not what is meant, would be easier to make, because it is a worse, less useful genie; if it were for sale, it would command a lower market price. But in practice, the "told"/"meant" distinction does not carve reality at the joints, and applies primarily to plausible deniability.
footnotes:
[1]: You may use your optimization algorithm to build the biochemical model itself, by searching for the "best" parameters for a computational chemistry package. You will have to factor in the computational cost of the model, and ensure some transparency (e.g. the package may only allow models that have a spatial representation that can be drawn and inspected).
How many people here agree with Holden? [Actually, who agrees with Holden?]
I was wondering: what fraction of people here agree with Holden's advice regarding donations, and with his arguments? What fraction assumes there is a good chance he is essentially correct? What fraction finds it necessary to determine whether Holden is essentially correct in his assessment before working on counter-argumentation, acknowledging that such an investigation should be able to result in the dissolution or suspension of SI?
It would seem to me, from the responses, that the chosen course of action is to try to improve the presentation of the argument, rather than to try to verify the truth values of the assertions (with a non-negligible likelihood of the assertions being found false instead). This strikes me as a very odd stance.
Ultimately: why does SI seem certain that it has badly presented some valid reasoning, rather than presented some invalid reasoning?
edit: I am interested in knowing why people agree or disagree with Holden, and what likelihood they give to him being essentially correct, rather than in a number or a ratio (which would be subject to selection bias).
Practical tools and agents
Presently, 'utility maximizers' work as follows: given a mathematical function f(x), a solver finds the x that corresponds to a maximum (or, typically, minimum) of f(x). The x is usually a vector describing the action of the agent, and f is a mathematically defined function which may, e.g., simulate some world evolution and compute the expected worth of the end state given action x, as in f(x) = h(g(x)), where h computes the worth of the world state g(x), and g computes the world state at some future time assuming that action x was taken.
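The decomposition f(x) = h(g(x)) can be sketched with a deliberately trivial stand-in world model (all names and numbers here are illustrative):

```python
def g(x):
    """World model: simulate the end state reached by taking action x.
    Here a trivial stand-in: a 1-D position after choosing velocity x."""
    position, velocity = 0.0, x
    for _ in range(10):  # ten simulated time steps
        position += velocity
    return position

def h(state):
    """Worth of a world state: here, closeness to a target position."""
    target = 25.0
    return -(state - target) ** 2

def f(x):
    return h(g(x))

# A solver maximizes f (equivalently, minimizes -f) over candidate
# actions x; here, brute force over a coarse grid of velocities.
best_x = max((i * 0.1 for i in range(100)), key=f)
```

The solver only ever sees f; the "world" exists for it solely as the simulation inside g.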
For instance, f may represent some metric of risk, discomfort, and time over a path chosen by a self-driving car, in a driving simulator (which is not reductionist). In this case this metric (which is always non-negative) is to be minimized.
In a very trivial case, such as finding the cannon elevation at which the cannonball will land closest to the target, in vacuum, the solution can be found analytically.
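For the vacuum cannonball, the standard ballistics result gives the range as R = v² · sin(2θ) / g, so the required elevation can be solved for directly rather than searched for (the function name below is illustrative):

```python
import math

def elevation_for_range(target_range, v, g=9.81):
    """Analytic solution of R = v**2 * sin(2*theta) / g for theta.

    Returns the lower of the two valid elevations, in radians.
    """
    s = g * target_range / v**2
    if s > 1:
        raise ValueError("target out of reach at this muzzle velocity")
    return 0.5 * math.asin(s)

# e.g. a 500 m target with a 100 m/s muzzle velocity:
theta = elevation_for_range(target_range=500, v=100)
```

Plugging theta back into the range formula recovers the target distance exactly, which is what makes this the trivial case: no iteration needed.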
In more complex cases a multitude of methods is typically employed, combining iteration over potential solutions with analytical and iterative solving for local maxima or minima. If this is combined with sensors, a model updater, and actuators, an agent like a self-driving car can be made.
Those are the utility functions as used in the field of artificial intelligence.
A system can be strongly superhuman at finding maxima of functions, and can ultimately be very general-purpose, allowing its use to build models which are efficiently invertible into a solution. However, it must be understood that the intelligent component finds mathematical solutions to, ultimately, mathematical relations.
The utility functions as known and discussed on LW seem entirely different in nature. They are defined on the real world, using natural language that conveys intent, and seem to be a rather ill-defined concept for which a bottom-up formal definition may not even exist. The implementation of such a concept, if possible at all, would seem to require a major breakthrough in the philosophy of mind.
This is an explanation of an important technical distinction mentioned in Holden Karnofsky's post.
On the discussion in general: it may well be the case that it is very difficult or impossible to define a system such as a self-driving car in terms of the concepts used on LW to talk about intelligences. In particular, LW's notion of "utility" does not seem to allow one to accurately describe, in terms of this utility, the kind of tool Holden Karnofsky was speaking of.
Tool for maximizing paperclips vs a paperclip maximizer
To clarify a point that is being discussed in several threads here, the tool vs. intentional agent distinction:
A tool for maximizing paperclips would, for efficiency purposes, have a world model of which it has a god's-eye view (not accessing it through embedded sensors like eyes), implementing/defining a counter of paperclips within this model. The output of this counter, not the real-world paperclips, is what is maximized by the problem-solving portion of the tool.
No real-world intentionality exists in this tool for maximizing paperclips; the paperclip-making problem solver maximizes the output of the counter, not real-world paperclips. Such a tool can be hooked up to actuators and sensors, and made to affect the world without a human intermediary; but it will not implement real-world intentionality.
An intentional agent for maximizing paperclips is the familiar 'paperclip maximizer', which truly loves real-world paperclips and wants to maximize them, and would try to improve its understanding of the world to know whether its paperclip-making efforts are succeeding.
Real-world intentionality is ontologically basic in human language, and consequently there is a very strong bias to describe the former as the latter.
The distinction: wireheading (either direct or through manipulation of inputs) is a valid solution to the problem being solved by the former, but not by the latter. Of course, one could rationalize and postulate a tool that is not general-purpose enough to wirehead, forgetting that the feared issue is a tool general-purpose enough to design better tools or to self-improve. That is an incredibly frustrating feature of rationalization: aspects of the problem are forgotten when thinking backwards.
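The wireheading point can be made concrete with a toy sketch: the tool's objective is literally the output of a counter inside its model, so an action that corrupts the counter scores at least as well as one that makes model paperclips (all names here are illustrative):

```python
import copy

def score(action, model):
    """Evaluate an action by simulating it inside the world model."""
    sim = copy.deepcopy(model)  # the tool plans in its model, not the world
    action(sim)
    return sim["paperclips"]

def act_make_paperclip(m):
    m["paperclips"] += 1          # the intended behaviour

def act_wirehead(m):
    m["paperclips"] = 10**9       # overwrite the counter's input

world_model = {"paperclips": 0}

# To the problem solver both actions are just moves that change the
# counter; wireheading is simply the higher-scoring one.
best = max([act_make_paperclip, act_wirehead],
           key=lambda a: score(a, world_model))
```

For the intentional agent of the preceding paragraphs, by contrast, the corrupted counter would not count as success, because what it values is stipulated to be the real-world paperclips, not the counter.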
The issues with the latter: we do not know whether humans actually implement real-world intentionality in such a way that it is not destroyed under full ability to self-modify (and we can observe that we very much like to manipulate our own inputs; see art, porn, fiction, etc.). We do not have a single certain example of such stable real-world intentionality, and we do not know how to implement it (it may well be impossible). We are also prone to assuming that two unsolved problems in AI, general problem solving and this real-world intentionality, are a single problem, or are necessarily solved together. A map-compression issue.