Hi there, my background is in AI research, and I have recently discovered some of the AI Alignment communities centered around here. The more I read about AI Alignment, the more I get the feeling that the whole field is basically a fictional world-building exercise.
Some problems I have noticed:
- The basic concepts (e.g. what are the basic properties of the AI being discussed) are left undefined.
- The questions being answered are built on unrealistic premises about how AI systems might work.
- Mathiness: using vaguely defined mathematical terms to describe complex problems, then solving them with additional vaguely defined mathematical operations.
- A combination of mathematical thinking and hand-wavy reasoning that leads to preferred conclusions.
Maybe I am reading it wrong. How would you steelman the argument that AI Alignment is actually a rigorous field? Do you consider AI Alignment to be scientific? If so, how is it Popper-falsifiable?
I think your critique would be better understood if it were more concrete. For example, if you wrote something like
"In the paper X, authors claim that AI alignment requires the following set of assumptions {Y}, which they formalize using a set of axioms {Z}, used to prove a number of theorems {T}. However, the stated assumptions are not well motivated, because [...] Furthermore, the transition from Y to Z is not unique, because of [a counterexample]. Even if the axioms Z are granted, the theorems do not follow without [additional unstated restrictions]. Given the above caveats, the main results of the paper, while mathematically sound and potentially novel, are unlikely to contribute to the intended goal of AI Alignment because [...]."
then it would be easier for the MIRI-adjacent AI Alignment community to engage with your argument.
For the record: I feel that Embedded Agency is a horrible introduction to AI alignment. But mine is a minority opinion on this forum.