Malignancy in the prior seems like a strong crux of the goal-design part of alignment to me. Whether your prior is going to be used to model: * processes in the multiverse containing the AI which does said modeling, * processes which would output all of some blog so we...
(Epistemic status: I think this is right?) Alice is the CEO of ArmaggAI, and Bob is the CEO of BigModelsAI, two major AI capabilities organizations. They're racing to be the first to build a superintelligence aligned to their respective CEVs, which would take over the universe and satisfy their values...
There's one thing history seems to have been trying to teach us: that the contents of the future are determined by power, economics, politics, and other conflict-theoretic matters. Turns out, nope! Almost all of what the future contains is determined by which of the following two engineering problems is solved...
This work was done by Tamsin Leake and Julia Persson at Orthogonal. Thanks to mesaoptimizer for his help putting together this post. What does the QACI plan for formal-goal alignment actually look like when formalized as math? In this post, we'll be presenting our current formalization, which we believe has...
We recently announced Orthogonal, an agent foundations alignment research organization. In this post, I give a thorough explanation of the formal-goal alignment framework, the motivation behind it, and the theory of change it fits in. The overall shape of what we're doing is: * Building a formal goal which would...
We are putting together Orthogonal, a non-profit alignment research organization focused on agent foundations, based in Europe. We are pursuing the formal alignment flavor of agent foundations in order to solve alignment in a manner which would scale to superintelligence, and thus robustly overcome AI risk. If we can...