mattmacdermott


For some reason I've been muttering the phrase, "instrumental goals all the way up" to myself for about a year, so I'm glad somebody's come up with an idea to attach it to.

One time I was camping in the woods with some friends. We were sat around the fire in the middle of the night, listening to the sound of the woods, when one of my friends got out a Bluetooth speaker and started playing donk at full volume (donk is a kind of funny, somewhat obnoxious style of dance music).

I strongly felt that this was a bad bad bad thing to be doing, and was basically pleading with my friend to turn it off. Everyone else thought it was funny and that I was being a bit dramatic -- there was nobody around for hundreds of metres, so we weren't disturbing anyone.

I think my friends felt that because we were away from people, we weren't "stepping on the toes of any instrumentally convergent subgoals" with our noise pollution. Whereas I had the vague feeling that we were disturbing all these squirrels and pigeons or whatever that were probably sleeping in the trees, so we were "stepping on the toes of instrumentally convergent subgoals" to an awful degree.

Which is all to say, for happy instrumental convergence to be good news for other agents in your vicinity, it seems like you probably do still need to care about those agents for some reason?

I'd like to do some experiments using your loan application setting. Is it possible to share the dataset?

But - what might the model that the AGI uses to downweight visibility and serve up ideas look like?

What I was meaning to get at is that your brain is an AGI that does this for you automatically.

Fine, but it still seems like a reason one could give for death being net good (which is your chief criterion for being a deathist).

I do think it's a weaker reason than the second one. The following argument in defence of it is mainly for fun:

I slightly have the feeling that it's like that decision theory problem where the devil offers you pieces of a poisoned apple one by one. First half, then a quarter, then an eighth, then a sixteenth... You'll be fine unless you eat the whole apple, in which case you'll be poisoned. Each time you're offered a piece it's rational to take it, but following that policy means you get poisoned.
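Spelling out the arithmetic: after accepting $n$ offers you've eaten

$$\sum_{k=1}^{n} 2^{-k} = 1 - 2^{-n} < 1,$$

so no finite number of acceptances gets you poisoned, but the always-accept policy eats the full $\sum_{k=1}^{\infty} 2^{-k} = 1$ apple in the limit.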

The analogy is that I consider living for eternity to be scary, and you say, "well, you can stop any time". True, but it's always going to be rational for me to live for one more year, and that way lies eternity.

Other (more compelling to me) reasons for being a "deathist":

  • Eternity can seem kinda terrifying.
  • In particular, death is insurance against the worst outcomes lasting forever. Things will always return to neutral eventually and stay there.

Your brain is for holding ideas, not for having them

Notes systems are nice for storing ideas but they tend to get clogged up with stuff you don't need, and you might never see the stuff you do need again. Wouldn't it be better if

  • ideas got automatically downweighted in visibility over time according to their importance, as judged by an AGI who is intimately familiar with every aspect of your life
  • ideas got automatically served up to you at relevant moments, as judged by that AGI?

Your brain is that notes system. On the other hand, writing notes is a great way to come up with new ideas.
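For concreteness, here's a toy sketch of the kind of downweighting rule a dumb notes system could approximate (the function, numbers and decay schedule are all made up; the point is just the shape of the behaviour):

```python
import math

def visibility(importance: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Toy downweighting rule: a note's visibility decays exponentially with age,
    and more important notes decay more slowly.
    (Names and the specific rule are invented for illustration.)"""
    effective_half_life = half_life_days * importance  # importance in (0, 1]
    return math.exp(-math.log(2) * age_days / effective_half_life)

# A low-importance note fades quickly; a high-importance one lingers.
print(visibility(importance=0.2, age_days=30))  # ~0.03
print(visibility(importance=0.9, age_days=30))  # ~0.46
```

The hard part isn't the decay schedule, it's the importance judgement, which is the bit that needs something like an AGI (or your brain).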

and nobody else ever seems to do anything useful as a result of such fights

I would guess a large fraction of the potential value of debating these things comes from its impact on people who aren’t the main proponents of the research program, but are observers deciding on their own direction.

Is that priced in to the feeling that the debates don’t lead anywhere useful?

The notion of ‘fairness’ discussed in e.g. the FDT paper is something like: it’s fair to respond to your policy, i.e. what you would do in any counterfactual situation, but it’s not fair to respond to the way that policy is decided.

I think the hope is that you might get a result like “for all fair decision problems, decision-making procedure A is better than decision-making procedure B by some criterion to do with the outcomes it leads to”.

Without the fairness assumption you could create an instant counterexample to any such result by writing down a decision problem where decision-making procedure A is explicitly penalised, e.g. Omega checks if you use A and gives you minus a million points if so.
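As a toy sketch of that kind of counterexample (procedure names, actions and payoffs all made up for illustration), the problem pays out based on how the decision was made, not just on what was decided:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    """Stand-in for an agent running a particular decision-making procedure."""
    procedure_name: str

    def decide(self) -> str:
        # Both procedures output the same, sensible action.
        return "take_box"

def unfair_problem(agent: Agent) -> float:
    """An 'unfair' decision problem: the payoff depends not only on the action
    the agent outputs but on which procedure produced it."""
    payoff = 10.0 if agent.decide() == "take_box" else 0.0
    # Omega inspects how the decision was made and penalises procedure A directly.
    if agent.procedure_name == "A":
        payoff -= 1_000_000.0
    return payoff

print(unfair_problem(Agent("A")))  # -999990.0
print(unfair_problem(Agent("B")))  # 10.0
```

Since something like this can be written down against any procedure, comparisons between procedures only have a chance of holding over "fair" problems that respond to the policy alone.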

a Bayesian interpretation where you don't need to renormalize after every likelihood computation

How does this differ from using Bayes' rule in odds ratio form? In that case you only ever have to renormalise if at some point you want to convert to probabilities.
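For concreteness, a minimal numerical sketch of what I mean by the odds-ratio form (the likelihood ratios are made up): each piece of evidence is a single multiplication, and you only normalise if you eventually want a probability.

```python
# Odds-form Bayesian updating: multiply prior odds by likelihood ratios,
# with no renormalisation until (and unless) you want a probability.

prior_odds = 1.0                      # 1:1 odds on hypothesis H vs not-H
likelihood_ratios = [3.0, 0.5, 4.0]   # P(evidence_i | H) / P(evidence_i | not-H)

posterior_odds = prior_odds
for lr in likelihood_ratios:
    posterior_odds *= lr              # each update is one multiplication

# Convert to a probability only at the end, if needed.
posterior_prob = posterior_odds / (1.0 + posterior_odds)
print(posterior_odds)  # 6.0
print(posterior_prob)  # ~0.857
```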
