Comments
A.H. · 100

HCH is not defined in this post, nor in the link about it.

For those reading who do not know what HCH means (like me!), HCH is a recursive acronym which stands for 'Humans Consulting HCH', an idea which I think originated with Paul Christiano, related to iterated amplification. It involves humans being able to recursively consult copies/simulations of themselves to solve a problem. It is discussed and explained in more detail in these two posts:

Humans Consulting HCH

Strong HCH

A.H. · 20

I Googled things like 'Aubrey de Grey on Demis Hassabis' for 5 minutes and couldn't find anything matching this description. The closest I could find was this interview with de Grey where he says:

I actually know a lot of people who are at the cutting edge of AI research. I actually know Demis Hassabis, the guy who runs DeepMind, from when he was an undergraduate at Cambridge several years after me. We’ve kept in touch and try to connect every so often.

He says they know each other and keep in touch, but it's not really a character reference.

(I'm not claiming that de Grey hasn't praised Hassabis' character, just that if he did, a brief search doesn't yield a publicly available record of this.)

A.H. · 10

Well, you can't have some states as "avoid at all costs" and others as "achieve at all costs", because having them in the same lottery leads to nonsense, no matter what averaging you use. And allowing only one of the two seems arbitrary. So it seems cleanest to disallow both.

Fine. But the purpose of exploring different averaging methods is to see whether doing so expands the richness of the kinds of behaviour we can describe. The point is that using arithmetic averaging is a choice which limits the kinds of behaviour we can get. Maybe we want to describe behaviours which can't be described under expected utility. Having an 'avoid at all costs' state is one such behaviour: it finds a natural description under non-arithmetic averaging but can't be described in more typical VNM terms.

If your position is 'I would never want to describe normative ethics using anything other than expected utility' then that's fine, but some people (like me) are interested in looking at what alternatives to expected utility might be. That's why I wrote this post. As it stands, I didn't find geometric averaging very satisfactory (as I wrote in the post), but I think things like this are worth exploring.

But geometric averaging wouldn't let you do that either, or am I missing something?

You are right. Geometric averaging on its own doesn't allow violations of independence. But some other protocol for deciding over lotteries does. It's described in more detail in the Garrabrant post linked above.

A.H. · 10

(apologies for taking a couple of days to respond, work has been busy)

I think your robot example nicely demonstrates the difference between our intuitions. As cubefox pointed out in another comment, what representation you want to use depends on what you take as basic.

There are certain types of preferences/behaviours which cannot be expressed using arithmetic averaging. These are the ones which violate VNM, and I think violating VNM axioms isn't totally crazy. I think it's worth exploring these VNM-violating preferences and seeing what they look like when more fleshed out. That's what I tried to do in this post.

If I wanted a robot that violated one of the VNM axioms, then I wouldn't be able to describe it by 'nailing down the averaging method to use ordinary arithmetic averaging and assigning goodness values'. For example, if there were certain states of the world which I wanted to avoid at all costs (and thus violate the continuity axiom), I could assign zero utility to them and use geometric averaging. I couldn't do this with arithmetic averaging and any finite utilities [1].
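To make this concrete, here is a minimal sketch (with made-up utility numbers, not anything from the post): under geometric averaging, assigning utility zero to the state I want to avoid makes any lottery with a nonzero chance of reaching it score zero, whereas arithmetic averaging with finite utilities always ends up trading that state off against upside.

```python
# Sketch with made-up utilities: an 'avoid at all costs' state (utility 0)
# under arithmetic vs geometric averaging of a lottery.

def arithmetic_expectation(probs, utils):
    """Ordinary expected utility: sum of p_i * u_i."""
    return sum(p * u for p, u in zip(probs, utils))

def geometric_expectation(probs, utils):
    """Geometric expectation: product of u_i ** p_i."""
    result = 1.0
    for p, u in zip(probs, utils):
        result *= u ** p
    return result

# Hypothetical outcomes: 'catastrophe' (0), 'fine' (10), 'great' (100).
probs = [0.01, 0.49, 0.50]
utils = [0.0, 10.0, 100.0]

print(arithmetic_expectation(probs, utils))  # 54.9 -- the 1% chance of catastrophe gets traded off
print(geometric_expectation(probs, utils))   # 0.0  -- any chance of the zero-utility state zeroes the lottery
```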

A better example is Scott Garrabrant's argument regarding abandoning the VNM axiom of independence. If I wanted to program a robot which sometimes preferred lotteries to any definite outcome, I wouldn't be able to program the robot using arithmetic averaging over goodness values.
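As a rough illustration of why (again with hypothetical numbers): under arithmetic averaging the value of a lottery is a probability-weighted average of the outcome values, so it can never strictly exceed the value of the best definite outcome, whatever goodness values you assign.

```python
# Sketch with hypothetical goodness values: under arithmetic expected utility,
# a lottery's value is a weighted average of its outcomes' values, so it is
# bounded above by the best definite outcome.

outcome_utils = {"A": 1.0, "B": 5.0, "C": 9.0}

def lottery_value(lottery):
    """Arithmetic expected utility of a lottery given as {outcome: probability}."""
    return sum(p * outcome_utils[o] for o, p in lottery.items())

gamble = {"A": 0.2, "B": 0.3, "C": 0.5}

print(lottery_value(gamble))                                 # 6.2
print(lottery_value(gamble) <= max(outcome_utils.values()))  # True for any probabilities and finite utilities
```

So a robot that strictly prefers some lottery to every one of its definite outcomes can't be recovered just by picking different finite goodness values and keeping arithmetic averaging.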

I think that these examples show that there is at least some independence between averaging methods and utility/goodness. 

  1. (Ok, I guess you could assign 'negative infinity' utility to those states if you wanted. But once you're doing stuff like that, it seems to me that geometric averaging is a much more intuitive way to describe these preferences.)

A.H. · 10

Thanks for pointing this out, I missed a word. I have added it now.

A.H. · 40

Without wishing to be facetious: how much (if any) of the post did you read? If you disagree with me, that's fine, but I feel like I'm answering questions which I already addressed in the post!

Are you arguing that we ought to (1) assign some "goodness" values to outcomes, and then (2) maximize the geometric expectation of "goodness" resulting from our actions?

I'm not arguing that we ought to maximize the geometric expectation of "goodness" resulting from our actions. I'm exploring what it might look like if we did. In the conclusion (and indeed in many other parts of the post), I'm pretty ambivalent.

But then wouldn't any argument for (2) depend on the details of how (1) is done? For example, if "goodnesses" were logarithmic in the first place, then wouldn't you want to use arithmetic averaging?

I don't think so. I think you could have a preference ordering over 'certain' world states and then you are still left with choosing a method for deciding between lotteries where the outcome is uncertain. I describe that this is my position in the section titled 'Geometric Expectation ≠ Logarithmic Utility'.
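A small sketch of what I mean (hypothetical utilities, not from the post): two agents can share exactly the same utilities over certain outcomes and still choose differently between lotteries, depending on whether they average arithmetically or geometrically. The averaging method is a further choice on top of the utility assignment.

```python
# Sketch with hypothetical utilities: same ordering over certain outcomes,
# different choices over lotteries depending on the averaging method.

import math

utils = {"bad": 1.0, "safe": 12.0, "jackpot": 100.0}  # jackpot > safe > bad for both agents

gamble = {"bad": 0.5, "jackpot": 0.5}   # 50/50 between 'bad' and 'jackpot'
sure_thing = {"safe": 1.0}              # 'safe' for certain

def arithmetic(lottery):
    return sum(p * utils[o] for o, p in lottery.items())

def geometric(lottery):
    return math.prod(utils[o] ** p for o, p in lottery.items())

print(arithmetic(gamble), arithmetic(sure_thing))  # 50.5 vs 12.0 -- the arithmetic averager takes the gamble
print(geometric(gamble), geometric(sure_thing))    # 10.0 vs 12.0 -- the geometric averager takes the sure thing
```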

Is there some description of how we should assign goodnesses in (1) without a kind of firm ground that VNM gives?

This is what philosophers of normative ethics do! People disagree on how exactly to do it, but that doesn't stop them from trying! My post tries to be agnostic as to what exactly it is we care about and how we assign utility to different world states, since I'm focusing on the difference between averaging methods.

A.H. · 80

The word 'utility' can be used in two different ways: normative and descriptive. 

You are describing 'utility' in the descriptive sense. I am using it in the normative sense. These are explained in the first paragraph of the Wikipedia page for 'utility'.

As I explained in the opening paragraph, I'm using the word 'utility' to mean the goodness/desirability/value of an outcome. This is normative: if an outcome is 'good' then there is the implication that you ought to pursue it.

A.H. · 10

Thanks for the comment. Naively, I agree that this sounds like a good idea, but I need to know more about it.

Do you know if anyone has explicitly written down the value learning solution to the corrigibility problem and treated it a bit more rigorously?
