wedrifid comments on SotW: Be Specific - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (306)
What happens if you ask it to maximize your CEV, though?
Lemme remember, the idea with CEV was what you'd desire if you thought faster and more reliably. Okay, I ponder what would happen to you if your mind were BusyBeaver(10) times faster (a way scarier number than 3^^^^3), without your body working any faster. 1 second passes.
It'll fuck with you. Because that is what it does. It has plenty of scope to do so because CEV is not fully defined as of now. I'm not sure precisely how it would go about doing so. I just assume it does in some way I haven't thought of yet.
The meaning it attributes to CEV when it wants to exploit it to make things terrible is very different to the meaning it attributes to CEV when we try to use it to force it to understand us. It's almost as bad as some humans in that regard!
The understatement of the year. CEV is the vaguest crap ever, with the lowest hope of becoming less vague.
That's a rather significant claim.
It's very uncommon to see crap this vague in development for such a long time by such a clever person, without it becoming less vague.
As far as I am aware this crap isn't in development. It isn't the highest research priority, so the other SingInst researchers haven't been working on it much, and Eliezer himself is mostly focused on writing a rationality book. Other things like decision theory are being worked on, which has involved replacing vague-as-crap TDT with the less vague UDT and UDT2.
I would like to see more work published on CEV. The most recent I am familiar with is this.
As I've figured out while writing the last few posts, TDT hasn't been explained well, but it is a genuinely formalizable theory. (You'll have to trust me until Part III or check the decision-theory mailing list.) But it's a different theory from ADT and UDT, and the latter ones are preferable.
You mean you have something in mind about how to handle counterfactuals over logically impossible worlds, or simply “I'm not sure it can't be done”?
I mean, I've written an algorithm (in the context of the tournament) which does what TDT should do (just as the algorithm in my last post does what CDT should do). The nice part about specifying the context so precisely is that I can dodge many of the hairy issues which come up in practice, and just show the essence of the decision theories.