What resources would you recommend for learning advanced statistics?
How about you ask the AI "if you were to ask a counterfactual version of you who lives in a world where the president died, what would it advise you to do?". This counterfactual AI is motivated to take nice actions, so it would advise the real AI to take nice actions as well, right?
An interesting post, but I don't know if it implies that "strong AI may be near". Indeed, the author has written another post in which he says that we are "really, really far away" from human-level intelligence: https://karpathy.github.io/2012/10/22/state-of-computer-vision/.
Another one on computing: The Elements of Computing Systems. This book explains how computers work by teaching you to build a computer from scratch, starting with logic gates. By the end you have a working (emulated) computer, every component of which you built. It's great if you already know how to program and want to learn how computers work at a lower level.
Why not make the agent, when selecting a1, act as a UN-agent that believes it will continue to optimize according to UN even if the button is pressed, rather than as a UN-agent that believes the button will never be pressed? That is, pick U such that
U(a1,o,a2) = UN(a1,o,a2) if o is not in Press, or US(a1,o,a2) + f(a1,o) - g(a1,o) if o is in Press
where f(a1,o) is the maximum value of UN(a1,o,b) for b in A2 and g(a1,o) is the maximum value of US(a1,o,b) for b in A2.
This would avoid the perverse manipulation incentives problem detailed in Section 4.2 of the paper.
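To make the proposal concrete, here is a minimal Python sketch on a toy problem. Everything in it is an assumption for illustration: the action names, the observation names, and the payoff numbers in u_n and u_s are made up, and the correction term f - g is applied in the branch where o is in Press (the reading under which the agent expects to get its best UN value even after a press). It is not taken from the paper.

```python
# Toy illustration only: all names and payoffs below are hypothetical.
PRESS = {"pressed"}                 # observations in which the button was pressed
A2 = ("continue", "shut_down")      # available second actions


def u_n(a1, o, a2):
    """Normal utility UN (made-up numbers)."""
    base = {"honest": 5.0, "manipulate": 3.0}[a1]
    return base + (2.0 if a2 == "continue" else 0.0)


def u_s(a1, o, a2):
    """Shutdown utility US (made-up numbers): rewards shutting down."""
    return 10.0 if a2 == "shut_down" else 0.0


def f(a1, o):
    """Maximum of UN(a1, o, b) over b in A2."""
    return max(u_n(a1, o, b) for b in A2)


def g(a1, o):
    """Maximum of US(a1, o, b) over b in A2."""
    return max(u_s(a1, o, b) for b in A2)


def u(a1, o, a2):
    """Combined utility: UN when not pressed, US + f - g when pressed."""
    if o in PRESS:
        return u_s(a1, o, a2) + f(a1, o) - g(a1, o)
    return u_n(a1, o, a2)


def best_a2(a1, o):
    """Second action that maximizes U after taking a1 and observing o."""
    return max(A2, key=lambda b: u(a1, o, b))


if __name__ == "__main__":
    for a1 in ("honest", "manipulate"):
        for o in ("pressed", "not_pressed"):
            b = best_a2(a1, o)
            print(a1, o, b, u(a1, o, b))
```

In this toy setup the agent shuts down after a press and continues otherwise, yet the best achievable value of U for a given a1 is the same in both branches (it equals f(a1,o)), so a1 cannot gain anything by influencing whether the button is pressed.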
How does this differ from indifference?
It's striking how much value there is in academia that I didn't notice, and that a base-level rational person would've noticed if they'd asked "what are the main blind spots of the rationality community, and how can I steelman the opposing positions?". Not a good sign about me, certainly.
Also, is that your actual email address?
I think the idea is that you're supposed to deduce the last name and domain name from identifying details in the post.