For building the skills to make a transformer, I'd highly recommend Karpathy's YouTube channel. He hasn't gotten to transformers yet, as he's covering earlier models first, which is useful: knowing how to implement a neural network properly will affect your ability to implement a transformer. Yes, these are NLP models, but I think the soft rule of not looking at any NLP architectures is dumb. If the models don't contain the core insights of transformers/SOTA NLP architectures, then what's the issue?
To understand what a transformer is, I'd recommend this article. Also, I'd warn against writing large models in anything other than PyTorch: unless you know CUDA, that's a bad idea.
EDIT: This is a good post, and I'm glad you (and your girlfriend?) wrote it.
It is not always obvious whether your skills are sufficiently good to work for one of the various AI safety and alignment organizations. There are many options to calibrate and improve your skills including just applying to an org or talking with other people within the alignment community.
One additional option is to test your skills by working on projects that are closely related to or a building block of the work being done in alignment orgs. By now, there are multiple curricula out there, e.g. the one by Jacob Hilton or the one by Gabriel Mukobi.
One core building block of these curricula is to understand transformers in detail, and a common recommendation is to check whether you can build one from scratch. Thus, my girlfriend and I recently set ourselves the challenge of building various transformers from scratch in PyTorch. We think this was a useful exercise and want to present the challenge in more detail and share some tips and tricks. You can find our code here.
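To give a concrete sense of what "from scratch" means here, the heart of a transformer is scaled dot-product attention, which fits in a few lines of PyTorch. This is an illustrative sketch of ours (not code from the repo), and the tensor dimensions are arbitrary:

```python
import math
import torch

def attention(q, k, v, mask=None):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    # Similarity scores between every query and every key: (batch, seq, seq)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 get -inf, i.e. zero attention weight
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v  # weighted sum of values: (batch, seq, d_k)

# Toy example: batch of 2, sequence length 5, head dimension 64
q = k = v = torch.randn(2, 5, 64)
out = attention(q, k, v)
```

A multi-head layer is essentially several of these running in parallel on linear projections of the input, with the results concatenated and projected back.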
Building a transformer from scratch
The following is a suggestion on how to build a transformer from scratch and train it. There are, of course, many details we omit, but I think it covers the most important basics.
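For orientation, a transformer block is attention plus a feed-forward MLP, each wrapped in a residual connection and a layer norm. The following is a minimal pre-norm sketch of ours, using PyTorch's built-in attention module for brevity (in the actual challenge you would implement that part yourself); the hyperparameters are arbitrary:

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Pre-norm block: x + attn(norm(x)), then x + mlp(norm(x))."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),  # widen by the usual factor of 4
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h)[0]    # self-attention + residual
        x = x + self.mlp(self.norm2(x))  # feed-forward + residual
        return x

# Toy example: batch of 2, sequence length 10, model dimension 64
block = TransformerBlock()
y = block(torch.randn(2, 10, 64))
```

Stacking such blocks, adding token and positional embeddings at the bottom and an unembedding layer at the top, gives the full model.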
Goals
From the ground up we want to
Bonus goals
Soft rules
For this calibration challenge, we used the following rules. Note that these are “soft rules” and nobody is going to enforce them, but it’s in your interest to make some rules before you start.
We were
Things to look out for
Here are some suggestions on what to look out for during the project:
I think that the “does it feel right” indicators are more important than the exact timings. There can be lots of random sources of error during the coding or training of neural networks that can take some time to debug. If you felt very comfortable, this might be a sign that you should apply to a technical AI alignment job. If it felt pretty hard, this might be a sign that you should skill up for a bit and then apply.
The final product
In some cases, you might want to show the result of your work to someone else. I’d recommend creating a GitHub repository for the project and creating a Jupyter notebook or .py file for every major subpart. You can find our repo here. Don’t take our code as a benchmark to work towards; there might be errors, and we might have violated some basic guidelines of professional NLP coding due to our inexperience.
Problems we encountered
How to think about AI safety up-skilling projects
In my opinion, there are three important considerations.
Final words
I hope this is helpful. In case something is unclear, please let me know. In general, I’d be interested to see more “AI safety up-skilling challenges”, e.g. providing more detail to a subsection of Jacob’s or Gabriel’s post.