This is a section of an interview I conducted with my brother, Joseph. We'll soon publish the interview sections about his research on Decision Transformers and his opinions on mech interp.
Preview Snippets
Interview
The Before-Life: Biology, Stats, Proteomics
RB: I think it’d be good to get some context on you and your work for people who don’t already know you or your history.
JB: I have a double major in computational biology and "statistics and stochastic processes". While getting those degrees, I worked at a protein engineering lab, using machine learning techniques to design and study proteins. I was quite disenchanted with science after I graduated, as my lab had lost funding, so I decided to pursue a business analytics degree (with a view to earning to give).
This was a bit of an unusual degree: a one-year MBA, but very technical, with lots of statistics and coding. I had only a few months left when I was offered a job at a proteomics start-up doing product and science work. I did this for two years, and during that time I wrote Python and R packages, did some algorithm development, and published a paper benchmarking our product. In the middle of last year, I read Dying with Dignity and a few months later quit, having received an FTX re-grant to help me upskill.
Don't go down without a fight
RB: Can you tell me a bit about your experience reading Dying with Dignity and how it affected you?
JB: Dying with Dignity had a huge impact on me, and it's hard to pinpoint exactly what about it made it so pivotal. I've had immense respect for Eliezer and rationality for a long time, so I took Eliezer quite seriously, and what he was saying scared me. It shocked me out of a world where smarter people were out there solving all the problems and placed me in a new reality where no one was coming to save us.
I also felt like he was saying something very final and very pessimistic, but while he may have reached that point of pessimism after a long time, I hadn't earned that right in some sense. It felt almost like a challenge. I said to myself, “well, I should also think really hard about this and be sure we can’t do better than just dignity points”. So in that sense, I think I was sold on the idea that I shouldn’t go down without a fight.
Decision Process
RB: Okay, and where did you go from there? You said you quit your job (that’s pretty drastic!) and mentioned upskilling. Did you immediately know what to upskill in? My recollection is that you actually spent a while thinking about what to work on.
JB: It took me quite a while to decide what precisely to do next. After reading Dying with Dignity, I identified two main challenges:
RB: Alright, and it’s the case that you ended up deciding to work on AI stuff, not biology, notwithstanding your background. I’m very interested in the details of that decision, as I think it’s one some people are still making (though the AI hype sure is real).
What was your process for the decision and what are the arguments that ultimately swayed you?
JB: I made a kanban board, which is a project management tool used to organise engineering tasks, and used it to manage my time investment across several activities:
It took about 2-3 months total before I was mostly sold on AI safety.
AI vs Bio?
RB: Okay, and at the end of the day what were the arguments in each direction?
JB: I eventually concluded that there were so many possible ways for me to contribute to AI safety that my prior about needing a different skillset or a higher g factor was probably off. That prior would have held if only super technical AI alignment work were needed, but things like governance or working as a machine learning engineer at an AI safety lab were plausibly things I could do.
On the biosecurity side, the x-risk scenarios didn’t have a high enough probability; once you actually tried to model them, they didn’t stack up against AI. This was particularly because the likelier scenarios didn’t result in extinction, and the extinction scenarios weren't very likely (in the near term). AI risk didn’t appear to have this property, or certainly not to the same extent. There are some very plausible and scary scenarios in AI safety.[1]
RB: Can you sketch out the likely and less likely scenarios here for bio x-risk?
JB: I think the scale of biological catastrophe varies from “COVID but worse” to “x-risk level pathogens”. In terms of likelihood, as the danger of the pathogen you're concerned about increases, the probability of that pathogen being generated by any given process decreases. Moreover, the scenarios that make very bad biological catastrophes plausible require hypothetical bad actors and possibly technological progress. There are two main scenarios that I think get discussed:
In practice, I think conversations on this topic can go around in circles a bit: you have to hypothesize bad actors to get to an x-risk pathogen, but the associated complexity penalty pushes you back to asking about base rates on lab accidents. Then, with lab accidents, you ask why, without intention, the pathogen would be x-risk level.
RB: And yet there are still people working on bio. Why? Are they making a mistake?
JB: I don't think biological risks are existential, and if you're working on them for that reason, you could collect enough evidence to change your mind fairly quickly. It seems fairly likely that we live in a world where GCBRs (“Global Catastrophic Biological Risks”) exist, but not “XBRs”, existential biological risks.
And if you aren’t a longtermist, or you have a much, much stronger comparative advantage in biology, I can see routes to that being the more reasonable choice. However, I’ll say that the arguments on the AI side are pretty convincing, so it would be exceptionally rare for anyone to be more effectively placed in biosecurity to minimize x-risk.
One caveat that could be important here is that if I'm wrong and there are existential biological risks, I'd be really interested in knowing why, and I think lots of people would be interested in that evidence too. So while I'm saying I wouldn't work in this cause area for x-risk motivated reasons, if there are people out there with a decent shot at reducing the uncertainty here (especially if you think there's evidence that would make that risk more plausible), then finding that evidence might be worthwhile.
RB: But as far as x-risk goes, you don’t think there’s much of a case for working on bio stuff?
JB: Not really. It comes down to how much optimization for “badness” (I won’t go into details here) you can get into your pathogen, and a bad actor would need to overcome some pretty hefty challenges there. I would be very surprised if anyone could do this in the near future, and once you start asking questions on a longer timeline, AI just wins because it's scary even in the very near term.
Mistakes among aspiring researchers
RB: What mistakes do you observe people making in AI Alignment? / What secrets do you have for your success?
JB: Disclaimer: I've been able to get funding, but I don't think that would be impossible in a world in which I had bad takes. So, without assenting strongly to the implication, here are some thoughts that come to mind.
I suspect there's a real class of problems where people entering alignment, such as SERI MATS or ARENA scholars, do either too much theoretical or too much empirical work.
Part of this is that people can struggle to work at both the level of a specific useful project and the level of thinking about alignment as a whole. If you do mostly theoretical work, it can be very hard to touch reality, get feedback, or even get grants.
The same is true at the other end of the spectrum, where I know people doing empirical work who don't think or read much about AI alignment more broadly. This can make it very hard to do self-directed work or reason well about WHY your hands-on work is valuable or interesting from an alignment perspective.
There’s also a variation on this back at the “more theoretical” end of the spectrum, where people will do a “one off” empirical project but lack insights or knowledge from the broader deep learning field, which can handicap them. There are just so many ways to not know relevant things.
All that said, you won't be surprised that I think my secret is that I can sit in the middle (and that I’ve had several years of start-up/engineering experience). Possibly it’s also that I’m ruthlessly impatient, and that makes me acquire technical skills and ways of working that help me do good engineering work without having to rely on others. I don’t think I’ve been super successful yet, and I think I’ve got a lot to work on, but relative to many people I know, I guess I’ve been able to build real stuff (such as training pipelines and implementations of interpretability methods, with some results) and I can make arguments for why that might be valuable. One other thing I've done is invest in people and relationships, such as helping Neel Nanda with TransformerLens or Callum McDougall with ARENA. Working with people doing good work just leads to so many conversations and opportunities that can be very valuable, though it can come at the cost of a certain amount of distraction.
RB: Ok, just making sure I’ve understood: you see that aspiring Alignment researchers often make the mistake of being too hands-on or too theoretical, as opposed to hitting both. You credit your start-up experience for teaching you to both do engineering work independently and justify its value.
Might you say a problem is that too many aspiring Alignment researchers haven’t done anything as holistic as working in a startup (particularly at the intersection of business and engineering, as you were)?
JB: I think that’s a good summary, and I agree that it would be better if alignment researchers had those kinds of experiences. Specifically, I predict that dependencies, where people can’t be productive unless another person comes along and provides the reasoning, the practical engineering work, or some other ingredient, are a ubiquitous issue. I think lots of people could be quite enabled by developing skills that help them test their ideas independently, or come up with and justify project ideas on their own.
[1] Joseph: A pretty relevant and impactful post I read at the time was Longtermists Should Work on AI - There is No "AI Neutral" Scenario.