- We might try to write a program that doesn’t pursue a goal, and fail. Issue [2] sounds pretty strange—it’s not the kind of bug most software has. But when you are programming with gradient descent, strange things can happen.
I found this part of "Optimization and goals" very helpful for thinking about Tool AI - thanks.
(Crossposted from Ordinary Ideas.)
I’ve recently been thinking about AI safety, and some of the writeups might be interesting to some LWers:
I’m excited about a few possible next steps: