JGWeissman comments on The Curve of Capability - Less Wrong

Post author: rwallace 04 November 2010 08:22PM


Comment author: Perplexed 06 November 2010 04:22:37PM 6 points

I was curious whether there might exist something analogous to the halting problem regarding the provability of goal-stability and general friendliness.

There is a lot of confusion out there as to whether the logic of the halting problem means that the idea of proving program correctness is somehow seriously limited. It does not mean that. All it means is that there can be programs so badly written that it is impossible to say what they do.
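The canonical example of such a program is the diagonal construction from Turing's proof: a program built precisely so that no analyzer can say what it does. In pseudocode, where `halts` stands for the hypothetical perfect analyzer:

```
def paradox(p):
    if halts(p, p):       # oracle: does program p halt when run on input p?
        loop_forever()    # then defiantly run forever
    else:
        halt()            # otherwise halt immediately

paradox(paradox)          # halts iff it does not halt; no such oracle can exist
```

The undecidability result says only that programs like this one exist, not that ordinary purposeful programs resist analysis.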

However, if programs are written for some purpose, and their creator can say why he expects the programs to do what he meant them to do, then that understanding of the programs can be turned into a checkable proof of program correctness.

It is indeed impossible to decide whether an arbitrary program even halts. But we are not interested in arbitrary programs. We are interested in well-written programs accompanied by a proof of their correctness. Checking such a proof is not only a feasible task, it is an almost trivial task.

To say this again: there certainly exist programs for which it is impossible to find a proof that they work. But no one sane would run such a program with any particular expectation about its behavior. If, however, a program was constructed for some purpose, and its constructor can give reasons for believing that it does what it is supposed to do, then a machine can check whether those reasons are sufficient.
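As a toy illustration of how a constructor's reasons can be made machine-checkable, here is Euclid's algorithm with its correctness argument written out as a loop invariant and a termination variant. This sketch renders the proof obligations as runtime checks for brevity (a real proof checker such as Coq or Isabelle discharges them statically), and it assumes `math.gcd` as the reference specification:

```python
from math import gcd as spec  # reference specification, used only for checking

def verified_gcd(x, y):
    """Euclid's algorithm, with its correctness proof rendered as checks."""
    assert x > 0 and y > 0                # precondition
    a, b = x, y
    while b != 0:
        assert spec(a, b) == spec(x, y)   # loop invariant: the gcd is preserved
        old_b = b
        a, b = b, a % b
        assert b < old_b                  # variant: b strictly decreases, so we halt
    # invariant plus (b == 0) entails a == gcd(x, y): the postcondition
    assert a == spec(x, y)
    return a
```

Finding the invariant took insight; checking each assertion is mechanical. That asymmetry is the point.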

It is a reasonable conjecture that for every correct program there exists a provably correct program which does the same thing and requires only marginally more resources to do it.

In other words, self-improvement would demand resources; therefore the AGI could not profit from its ability to self-improve when acquiring the resources it needs in order to self-improve in the first place.

Huh?

If that is meant to be an argument against a singleton AI being able to FOOM without using external resources, then you may be right. But why would it not have access to external resources? And why limit consideration to singleton AIs?

Comment author: JGWeissman 06 November 2010 04:54:47PM 1 point

All it means is that there can be programs so badly written that it is impossible to say what they do.

I would like to see all introductions to the Halting Problem explain this point. Unfortunately, it seems that "computer scientists" have only limited interest in real physical computers and how people program them.

Comment author: fiddlemath 06 November 2010 10:27:43PM 2 points

I'm working on my Ph.D. in program verification. Every problem we're trying to solve is as hard as the halting problem, so we make the assumption, essentially, that we're operating over real programs: programs that humans are likely to write and actually want to run. It's the only way we can get any purchase on the problem.

Trouble is, the field has no recognized standard for what makes a program "human-writable", so we don't talk much about that assumption. We should really have a formal model, so we'd have some basis for expecting that a particular formal method will work well before we implement it... but that would be harder to publish, so no one in academia is likely to do it.

Comment author: andreas 07 November 2010 12:07:22AM 2 points

Similarly, inference (conditioning) is incomputable in general, even if your prior is computable. However, if you assume that observations are corrupted by independent, absolutely continuous noise, conditioning becomes computable.
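A sketch of why the noise helps, assuming a standard normal prior and Gaussian observation noise (the function name and setup here are illustrative, not from the original comment): because the noise is absolutely continuous, the observation has a density, so each prior draw can simply be weighted by that density at the observed value. No conditioning on a measure-zero event is ever needed, and the conditional expectation becomes an ordinary computable limit.

```python
import math
import random

def posterior_mean(prior_sample, obs, noise_sd, n=100_000):
    """Estimate E[X | X + noise = obs] by likelihood weighting.

    Each prior draw x gets a weight proportional to the noise density
    at (obs - x); absolutely continuous noise guarantees this density exists.
    """
    total_w = total_wx = 0.0
    for _ in range(n):
        x = prior_sample()
        w = math.exp(-((obs - x) ** 2) / (2 * noise_sd ** 2))
        total_w += w
        total_wx += w * x
    return total_wx / total_w

random.seed(0)
# Standard normal prior, unit-variance Gaussian noise, observed value 1.0;
# for this conjugate pair the exact posterior mean is 0.5.
est = posterior_mean(lambda: random.gauss(0.0, 1.0), obs=1.0, noise_sd=1.0)
```

Without the noise, conditioning on the exact value of a continuous quantity has no such finitely-approximable form in general, which is where the incomputability bites.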

Comment author: gwern 03 January 2011 06:48:39PM 0 points

Offhand, I don't see any good way of specifying it in general, either (even the weirdest program might be written as part of some sort of arcane security exploit). Why don't you guys limit yourselves to some empirical subset of programs that humans have written, like 'Debian 5.0'?