Hi there. My background is in AI research, and recently I have discovered some AI Alignment communities centered around here. The more I read about AI Alignment, the more I get the feeling that the whole field is basically a fictional-world-building exercise.
Some problems I have noticed:
- The basic concepts (e.g., the basic properties of the AI being discussed) are left undefined.
- The questions being answered are built on unrealistic premises about how AI systems might work.
- Mathiness: using vaguely defined mathematical terms to describe complex problems, and then solving them with additional vaguely defined mathematical operations.
- A combination of mathematical thinking and hand-wavy reasoning that leads to preferred conclusions.
Maybe I am reading it wrong. How would you steelman the argument that AI Alignment is actually a rigorous field? Do you consider AI Alignment to be scientific? If so, how is it Popper-falsifiable?
I think you get it mostly right, and then you simply draw a different conclusion.
The part where you agree is: we cannot currently define superintelligent AI precisely or test falsifiable hypotheses about it, because no such system exists yet.
And the conclusion that AI safety people make is: therefore we should do whatever preparatory work we can now, before such systems exist.
While your conclusion seems to be: therefore the field is not a real science, just a fictional-world-building exercise.
So, if you want to be a proper Popperian, you probably need to sit and wait until actual superintelligent machines are built and start doing horrible things, and then (assuming that you survive) you can collect and analyze examples of the horrible things happening, propose falsifiable hypotheses on how to avoid these specific horrible things happening again, do the proper experiments, compute the p-values, and publish in respected scientific journals. This is how respectable people would approach the problem.
The alternative is to do the parts that you can do now... and handwave the rest of it, hoping that later someone else will fill in the missing parts. For example, you can collect examples of surprising things that current (not superintelligent) machines do when solving problems. And the handwavy part is "...and now imagine this, but extrapolated to a superintelligence".
Or you can make a guess about which mathematical problems may turn out to be relevant for AI safety (although you cannot be sure you guessed right), and then work on those mathematical problems rigorously. In which case the situation is: "yes, this math problem is solved rigorously from the scientific perspective; it is just its relevance to AI safety that is dubious".
I am not familiar with AI safety research in depth, so I cannot provide more information about it. But my impression is that it is roughly a combination of what I just described: examples of potential problems (with non-superintelligent machines), and mathematical details which may or may not be relevant.
The problem with "pop Popperianism" is that it describes what to do when you already have a scientific hypothesis fully formed. It does not concern itself with how to get to that point. Yes, the field of AI safety is currently mostly trying to get to that point. That is the inevitable first step.