Linkpost for https://cims.nyu.edu/~sbowman/bowman2021hype.pdf. To appear on arXiv shortly.
I'm sharing a position paper I put together as an attempt to introduce general NLP researchers to AI risk concerns. From a few discussions at *ACL conferences, it seems like a pretty large majority of active researchers aren't aware of the arguments at all, or at least aren't aware that they have any connection to NLP and large language model work.
The paper makes a slightly odd multi-step argument to try to connect to active debates in the field:
- It's become extremely common in NLP papers/talks to claim or imply that NNs are too brittle to use, that they aren't doing anything that could plausibly resemble language understanding, and that this is a pretty deep feature of NNs that we don't know how to fix. These claims sometimes come with evidence, but it's often bad evidence, like citations to failures in old systems that we've since improved upon significantly. Weirdly, this even happens in papers that themselves show positive results involving NNs.
- This seems to be coming from concerns about real-world harms: Current systems are pretty biased, and we don't have great methods for dealing with that, so there's a widely shared feeling that we shouldn't be deploying big NNs nearly as often as we are. The reasoning seems to go: if we downplay the effectiveness of this technology, that'll discourage its deployment.
- But is that actually the right way to minimize the risk of harms? We should expect the impacts of these technologies to grow dramatically as they get better—the basic AI risk arguments go here—and we'll need to be prepared for those impacts. Downplaying the progress that we're making, both to each other and to outside stakeholders, limits our ability to foresee potentially-impactful progress or prepare for it.
I'll be submitting this to ACL in a month. Comments/criticism welcome, here or privately (bowman@nyu.edu).
I agree with the critiques you make of specific papers (in section 2), but I'm less convinced by your diagnosis that these papers are attempting to manage/combat hype in a misguided way.
IMO, "underclaiming" is ubiquitous in academic papers across many fields -- including fields unrelated to NLP or ML, and fields where there's little to no hype to manage. Why do academics underclaim? Common reasons include:
Anyone who's read papers in ML, numerical analysis, statistical inference, computer graphics, etc. is familiar with this phenomenon; there's a reason this tweet is funny.
I suspect that reasons 1–3 above, rather than hype management, explain the specific mistakes you discuss.
For example, Zhang et al. (2020) seems like a case of #2. They cite Jia and Liang (2017) as evidence about a problem with earlier models, a problem they are trying to solve with their new method. It would be strange to "manage hype" by saying NLP systems can't do X, and then in the same breath present a new system that you claim does X!
Jang and Lukasiewicz (2021) is also a case of #2, describing a flaw primarily in order to motivate their own proposed fix.
Meanwhile, Xu et al. (2020) seems like a case of #3: it's a broad review paper on "adversarial attacks" that gives a brief description of Jia and Liang (2017) alongside brief descriptions of many other results, many of them outside NLP. It's true that the authors should not have used the word "SOTA" here, but it seems more plausible that this is mere sloppiness (they copied other, years-old descriptions of the Jia and Liang result) than that it's an attempt to push a specific perspective about NLP.
I think a more useful framing might go something like:
Yeah, this all sounds right, and it's fairly close to the narrative I was using for my previous draft, which had a section on some of these motives.
The best defense I can give of the switch to the hype-centric framing, FWIW:
- The paper is inevitably going to have to do a lot of chastising of authors. Giving the most charitable possible framing of the motivations of the authors I'm chastising means that I'm less likely to lose the trust/readership of those authors and anyone who identifies with them.
- An increasingly large fraction of NLP work—possibly even a m
...