My main takeaway would be that this seems like fairly strong evidence for the view expressed in https://www.beren.io/2023-11-05-Open-source-AI-has-been-vital-for-alignment/, that most safety research doesn't come from the top labs.
Indirectly so, because 90 papers seems like a tiny number compared to what got published on arXiv during the same time interval. Depending on how one counts, I wouldn't be surprised if there were more than 90 papers from outside the labs in the unlearning category alone.
Counting the number of papers isn't going to be a good strategy.
I do think total research outside of labs looks competitive with research from labs, and research done outside of labs has probably produced more differential safety progress in total.
I also think open-weight models have probably been good so far for making AI more likely to go well (putting aside the norms and precedents their releases set), though this doesn't necessarily follow from "more research happens outside of labs".
Dataset of papers. Github.
One reason to be interested in this kind of work is as a precursor to measuring the value of different labs' safety publications. Right now the state-of-the-art technique for that is to count the papers (and maybe elicit vibes from friends). But counting papers is crude; it misses how good the papers are (and sometimes misses valuable non-paper research artifacts).
See more of IAPS's research. My favorite piece of IAPS research is still Deployment Corrections from a year ago.