Comment author: crap 25 December 2012 06:22:58PM 2 points [-]

What about moral objections to the creation of a multitude of agents for the purposes of evaluation?

Comment author: bryjnar 26 December 2012 06:25:31AM 0 points [-]

They explicitly don't address that:

Second, it might seem that this approach to determining Personal CEV will require a reasonable level of accuracy in simulation. If so, there might be concerns about the creation of, and responsibility to, potential moral agents.

Comment author: shminux 26 December 2012 02:37:39AM 2 points [-]

Sorry if this came across as a status game. Let me give you one example.

experiencing one life can leave you incapable of experiencing another in an unbiased way.

This is a loop Sobel solves with the amnesia model. (A concurrent-clone model would be a better description, as it avoids any problems with influences between lives, such as physical changes.) There is still, however, the issue of giving advice to your past self after the amnesia is removed, even though you "might be incapable of adequately evaluating the lives they’ve experienced based on their current, more knowledgeable, evaluative perspective." This loses sight of the original purpose: the evaluating criteria should be acceptable to the original person, and no such criteria have been set in advance. Same with the parliament: the evaluation depends on the future experiences, feeding back into the loop. To remedy the issue, you can create and freeze the arbitration rules in advance. For example, you might choose as your utility function some weighted average of longevity, happiness, procreation, influence on the world around you, etc. Then score the utility of each simulated life and pick one of, say, the top 10 as your "initial dynamic". Or the top life you find acceptable. (Not automatically picking the highest-utility one, in order to avoid the "literal genie" pitfall.) You can repeat this as you go on, adjusting the criteria as you see fit (hence "dynamic").
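
The freeze-and-score procedure described above could be sketched like this (the weights, attribute names, and acceptability check are all illustrative placeholders, not anything prescribed in the comment):

```python
# Hypothetical criteria, frozen in advance. The weights and attributes
# below are placeholders chosen for illustration only.
WEIGHTS = {"longevity": 0.3, "happiness": 0.4, "procreation": 0.1, "influence": 0.2}

def utility(life):
    """Score one simulated life against the pre-frozen criteria."""
    return sum(w * life[attr] for attr, w in WEIGHTS.items())

def pick_initial_dynamic(lives, top_k=10, acceptable=lambda life: True):
    """Rank simulated lives by the frozen criteria, then choose an
    acceptable one from the top k rather than automatically taking the
    maximum (guarding against the 'literal genie' pitfall)."""
    ranked = sorted(lives, key=utility, reverse=True)[:top_k]
    for life in ranked:
        if acceptable(life):
            return life
    return None
```

The point of the `acceptable` hook is exactly the one made above: the final say rests with criteria the original person endorsed before any simulated experience fed back into them.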

While you are by no means guaranteed to end up with the "best life possible" life after breaking the reasoning loop, you at least are spared problems like "better off dead" and "insane parliament", both of which result from a preference feedback loop.

Comment author: bryjnar 26 December 2012 06:17:00AM 1 point [-]

Ooookay. The whole "loop" thing feels like a leaky abstraction to me. If you had to do that much work to explain the loopiness (which I'm still not sold on) and why it's a problem, perhaps saying it's "loopy" isn't adding much.

This loses sight of the original purpose: the evaluating criteria should be acceptable to the original person

I think I may still be misunderstanding you, but this seems wrong. The whole point is that even if you're on some kind of weird drugs that make you think that drinking bleach would be great, the idealised version of you would not be under such an influence, etc. Hence it might well be that the idealised advisors evaluate things in ways that you would find unacceptable. That's WAD.

Also, I find your other proposal hard to follow: surely if you've got a well-defined utility function already, then none of this is necessary?

Comment author: benelliott 26 December 2012 01:30:53AM *  2 points [-]

Nitpick: the Lowenheim-Skolem theorems are not quite that general. If we allow languages with uncountably many symbols and uncountable sets of axioms, then we can lower-bound the cardinality of the models (by bringing in uncountably many constants and, for each pair, adding the axiom that they are not equal). The technically correct claim would be that any set of axioms either has a finite upper bound on the size of its models, or has models of every infinite cardinality at least as large as that of the alphabet in which it is expressed.
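
The lower-bounding construction mentioned here can be written out explicitly; this is the standard trick, for an arbitrary infinite cardinal $\kappa$:

```latex
% Extend the language of a theory T with new constants (c_i)_{i \in I},
% where |I| = \kappa, and add one inequation per pair of indices:
T' \;=\; T \,\cup\, \{\, c_i \neq c_j \;\mid\; i, j \in I,\ i \neq j \,\}
% Any model of T' must interpret the c_i as \kappa distinct elements,
% so every model of T' (which is in particular a model of T in the
% original language) has cardinality at least \kappa.
```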

But at least it's (the Compactness Theorem) simpler than the Completeness Theorem

It is!? Does anyone know a proof of Compactness that doesn't use completeness as a lemma?

Comment author: bryjnar 26 December 2012 02:19:46AM 1 point [-]

It is!? Does anyone know a proof of Compactness that doesn't use completeness as a lemma?

Yes. Or, at least, I did once! That's the way we proved it in the logic course I did. The proof is a lot harder. But considering that the implication from Completeness is pretty trivial, that's not saying much.

Comment author: bryjnar 26 December 2012 02:13:43AM 4 points [-]

Great post! It's really nice to see some engagement with modern philosophy :)

I do wonder slightly how useful this particular topic is, though. CEV and Ideal Advisor theories are about quite different things. Furthermore, since Ideal Advisor theories are working very much with ideals, the "advisors" they consider are usually supposed to be very much like actual humans. CEV, on the other hand, is precisely supposed to be an effective approximation, and so it would seem surprising if it were to actually proceed by modelling a large number of instances of a person and then enhancing them cognitively. So if instead it proceeds by some more approximate (or alternatively, less brute-force) method, then it's not clear that we should be able to apply our usual reasoning about human beings to the "values advisor" that you'd get out of the end of CEV. That seems to undermine Sobel's arguments as applied to CEV.

Comment author: shminux 25 December 2012 09:55:49PM *  1 point [-]

Ever since I worked, in the course of my PhD, with the Godel metric, a solution of the equations of GR which contains closed timelike curves, I've been noticing how strange loops mess up arguments, calculations and intuition whenever they creep in. My approach has been to search for them and unwind them before proceeding any further. That's one way to resolve the grandfather paradox, for example.

The issue you are discussing is rife with loops. Notice them. Unwind them. Restate the problem without them. This is not always an easy task; some loops can be pretty insidious. Here is an example from your post:

an ideal version of that agent (fully informed, perfectly rational, etc.) would advise the non-ideal version

One of the ways of removing a potential loop is already suggested in your post:

"our volition be extrapolated once and acted on."

"Once" is what breaks the loop.

Now, to list several loops in Sobel's arguments. Some of these are not obvious, but they are there nonetheless, if you look carefully.

two of the idealized viewpoints disagree about what is to be preferred

experiencing one life can leave you incapable of experiencing another in an unbiased way.

the idealized agent, having experienced such a level of perfection, might come to the conclusion that their non-ideal counterpart is so limited as to be better off dead.

Some of these versions are then assigned as a parliament where they vote on various choices and make trades with one another.

Meditation. Find the loops in each of the above quotes and consider how they can be avoided.

Comment author: bryjnar 26 December 2012 02:01:22AM 8 points [-]

This comment reads to me like: "Haha, I think there are problems with your argument, but I'm not going to tell you what they are, I'm just going to hint obliquely in a way that makes me look clever."

If you actually do have issues with Sobel's arguments, do you think you could actually say what they are?

Comment author: Eliezer_Yudkowsky 25 December 2012 10:50:19PM 26 points [-]

Apparently this is being read by major philosophers now, which is good on the one hand, but on the other hand a really quick review of historical context:

The background problem here is that we want an effective decision procedure of bounded complexity which can actually be implemented in sufficiently advanced Artificial Intelligences.

The first difficulty is the "effective" part. Suppose you want to build a chess-playing program. A philosophy undergrad wisely informs you that you ought to instruct your chess-playing program to make "good moves". You reply that you need a more "effective" specification of what a good move is, so that you can get your program to do it. The undergrad tells you that a good move is one which is wise, highly informed, which will not later be revealed to be a bad move, and so on. What you actually need here is something along the lines of "A good move is one which, when combined with the other player's moves, results in a board state which the following computable predicate verifies as 'winning'". Once you realize the other player is trying to perform a symmetric but opposed procedure, you can model the chessboard's future using search trees. Pragmatically, you're still a long way off from beating Kasparov. But given unbounded finite computing power you could play perfect chess. In turn, this means you're able to get started on the problem of approximating good moves, now that you have an effectively specified definition of maximally good moves, even though you can't evaluate the latter definition using available computing power.
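
The "effectively specified definition of maximally good moves" here is just full-depth game-tree search. A minimal sketch, under the same idealisation (unbounded computing power; the move generator, transition function, and winning predicate are placeholder parameters, not a real chess engine):

```python
def best_move(state, player, moves, result, winner):
    """Naive full-depth minimax: a move is maximally good iff it leads,
    against an opponent running the symmetric opposed procedure, to a
    state the computable 'winning' predicate verifies. `moves(s, p)`,
    `result(s, p, m)`, and `winner(s)` are placeholder functions."""
    def value(s, p):
        w = winner(s)                 # the computable 'winning' predicate
        if w is not None:
            return 1 if w == player else -1
        ms = moves(s, p)
        if not ms:
            return 0                  # no legal moves: call it a draw
        vals = [value(result(s, p, m), -p) for m in ms]
        return max(vals) if p == player else min(vals)
    return max(moves(state, player),
               key=lambda m: value(result(state, player, m), -player))
```

Pragmatically unusable against Kasparov, but it is an effective specification: approximation work (pruning, heuristics) can now target a defined optimum.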

A lot of the motivation in CEV is that we're trying to describe a beneficial AI in terms that allow beneficial-ness to actually be computed or approximated. The AI observes a human and builds up an abstract predictive model of how that human makes decisions - this is an in-principle straightforward problem the way that playing perfect chess is straightforward; Solomonoff induction ideally says how to build good predictive models. What should the AI do with this predictive model, though? An accurate model will accurately predict that the human will choose to drink the glass of bleach, but in an intuitive sense, it seems like we'd want the AI to give the human water.

But suppose we can idealize this decision model in a way which separates terminal values from empirical beliefs. Then we can substitute the AI's world-model for the human's world-model and re-run the decision model. If the AI is much more intelligent than us, this takes care of the bleach-vs.-water case, since the AI knows that the glass contains bleach and that the human values water.
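
A toy version of the substitution just described, using the bleach-vs.-water case (the names and values are illustrative, not any claim about how an actual implementation would factor the model):

```python
def advised_action(human_values, ai_world_model, options):
    """Re-run the (idealized) human decision model, but with the AI's
    more accurate world-model substituted for the human's beliefs:
    evaluate each option's predicted outcome under the human's own
    terminal values, then pick the best."""
    return max(options, key=lambda option: human_values(ai_world_model(option)))
```

The human's decision model, run on the human's mistaken beliefs, would pick the bleach; the same values run on the AI's world-model pick the water.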

This is the basic paradigm of CEV - build up predictively accurate abstract models of a human decision process, then manipulate them in effectively specified ways to 'construe a volition'. (I would ordinarily say 'extrapolate', but the paper above gave a specific definition of 'extrapolate' that sounds more like surgery followed by prediction than a general, 'look over this accurate human decision model and do X with it').

The appeal of Rawls's reflective equilibrium / Ideal Advisor models is that they describe a construal procedure that sounds effectively computable and approximable: add more veridical knowledge to the decision process (the AI's knowledge, in the case where the AI is smarter than us), run the decision process for a longer time, and allow the model more veridical knowledge of itself and possibly even some set of choices for modifying itself. Similarly, the appeal of Bostrom's parliament is not so much that it sounds like a plausible ultimate metaethical theory but that it gives us an effective-sounding procedure for resolving multiple possible volitions (even within a single person) into a coherent output.

More generally, CEV is a case of what Bostrom termed an 'indirect normativity' strategy. If we think values are complex - see e.g. William Frankena's list of terminal values not obviously reducible to one another - a robust strategy would involve trying to teach the AI how to look at humans and absorb and idealize values from them, so as to avoid the problem of accidentally leaving out one value.

The motivation for indirect normativity - for delving into metaethics rather than giving a superintelligent AI a laundry list of cool-sounding wishes - is that we want to pick something close enough to a correct core metaethical structure that it will compactly cover everything human beings want, ought to want, or might later regret asking for, without relying on the ability of human programmers to visualize the outcome in advance. ("I wish you'd get me that glass!" cough cough dies)

Most of the empirical challenge in CEV would stem from the fact that a predictively accurate model of human decisions would be a highly messy structure, and 'construing a volition' suitable for coherent advice isn't a trivial problem. (It sounds to me on a first reading like neither 'idealization' nor 'extrapolation' as defined in the above document may be sufficient for this. Any rational agent needs a coherent utility function, but getting this out of a messy accurate predictive human model is not as simple as conducting a point surgery and extrapolating forward in time, nor as simple as supposing infinite knowledge.)

To compete with CEV in its intended ecological niche (useful advice to (designers of) sufficiently advanced AIs) means looking for alternate theories of how to produce reliable epistemic advice about what-to-do in the presence of messy human values, with sufficient indirection to automatically cover imaginable use-cases of things we didn't think to ask for or might later regret, which theories are close enough to being effectively specified that AI programmers can implement them (though perhaps as something requiring development work to imbue in an AI, rather than a direct computer program).

Comment author: bryjnar 26 December 2012 01:58:36AM *  2 points [-]

A lot of what you've said sounds like you're just reiterating what Luke says quite clearly near the beginning: Ideal Advisor theories are "metaphysical", and CEV is epistemic, i.e. Ideal Advisor theories are usually trying to give an account of what is good, whereas, as you say, CEV is just about trying to find a good effective approximation to the good. In that sense, this article is comparing apples to oranges. But the point is that some criticisms may carry over.

[EDIT: this comment is pretty off the mark, given that I appear to be unable to read the first sentence of comments I'm replying to. "historical context" facepalm]

Comment author: AlexMennen 25 December 2012 08:25:10PM 3 points [-]

b) I kind of feel like Godel's theorem could be dropped from this post. While it's nice to reiterate the general point that "If you're using Godel's theorem in an argument and you're not a professional logician, you should probably stop", I don't think it actually helps the thrust of this post much. I'd just use Compactness.

Disagree. I actually understand Godel's incompleteness theorem, and started out misunderstanding it until a discussion similar to the one presented in this post, so this may help clear up the incompleteness theorem for some people. And unlike the Compactness theorem, Godel's completeness theorem at least seems fairly intuitive. Proving the existence of nonstandard models from the Compactness theorem seems kind of like pulling a rabbit out of a hat if you can't show me why the Compactness theorem is true.

Your semantics is impoverished if you can prove everything with finite syntactical proofs.

Do you have any basis for this claim?

Comment author: bryjnar 26 December 2012 01:54:10AM 2 points [-]

I absolutely agree that this will help people stop being confused about Godel's theorem, I just don't know why EY does it in this particular post.

Do you have any basis for this claim?

Nope, it's pure polemic ;) Intuitively I feel like it's a realism/instrumentalism issue: claiming that the only things which are true are provable feels like collapsing the true and the knowable. In this case the decision is about which tool to use, but using a tool like first-order logic that has these weird properties seems suspicious.

Comment author: [deleted] 25 December 2012 04:21:06PM 2 points [-]

I'm assuming by subset you mean non-strict subset

I was, but that's not necessary -- a countably infinite set can be bijectively mapped onto {2, 3, 4, ...} which is a proper subset of N after all! ;-)
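
The bijection in question is as simple as it sounds; here N is taken to start at 1:

```python
def f(n):
    """Bijection from N = {1, 2, 3, ...} onto its proper subset {2, 3, 4, ...}."""
    return n + 1

def f_inverse(m):
    """Inverse map, witnessing that f really is a bijection."""
    return m - 1
```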

Comment author: bryjnar 25 December 2012 09:56:50PM 0 points [-]

Oh yeah - brain fail ;)

Comment author: Qiaochu_Yuan 25 December 2012 10:50:51AM *  21 points [-]

Mathematical comment that might amuse LWers: the compactness theorem is equivalent to the ultrafilter lemma, which in turn is essentially equivalent to the statement that Arrow's impossibility theorem is false if the number of voters is allowed to be infinite. More precisely, non-principal ultrafilters are the same as methods for determining elections based on votes from infinitely many voters in a way that satisfies all of the conditions in Arrow's theorem.

Mathematical comment that some LWers might find relevant: the compactness theorem is independent of ZF, which roughly speaking one should take as meaning that it is not possible to write down a non-principal ultrafilter explicitly. If you're sufficiently ultrafinitist, you might not trust a line of reasoning that involved the compactness theorem but purported to be related to a practical real-world problem (e.g. FAI).

Comment author: bryjnar 25 December 2012 02:20:46PM 3 points [-]

the compactness theorem is equivalent to the ultrafilter lemma, which in turn is essentially equivalent to the statement that Arrow's impossibility theorem is false if the number of voters is allowed to be infinite.

Well, I can confirm that I think that that's super cool!

the compactness theorem is independent of ZF

As wuncidunci says, that's only true if you allow uncountable languages. I can't think of many cases off the top of my head where you would really want that... countable is usually enough.

Also: more evidence that the higher model theory of first-order logic is highly dependent on set theory!

Comment author: Ezekiel 25 December 2012 11:27:08AM 3 points [-]

So everyone in the human-superiority crowd gloating about how they're superior to mere machines and formal systems, because they can see that Godel's Statement is true just by their sacred and mysterious mathematical intuition... "...Is actually committing a horrendous logical fallacy [...] though there's a less stupid version of the same argument which invokes second-order logic."

So... not everyone. In Godel, Escher, Bach, Hofstadter presents the second-order explanation of Godel's Incompleteness Theorem, and then goes on to discuss the "human-superiority" crowd. Granted, he doesn't give it much weight - but for reasons that have nothing to do with first- versus second-order logic.

Don't bash a camp just because some of their arguments are bad. Bash them because their strongest argument is bad, or shut up.

(To avoid misunderstanding: I think said camp is in fact wrong.)

Comment author: bryjnar 25 December 2012 02:14:51PM 2 points [-]

I think it's worth addressing that kind of argument because it is fairly well known. Penrose, for example, makes a huge deal over it. Although mostly I think of Penrose as a case study in how being a great mathematician doesn't make you a great philosopher, he's still fairly visible.
