This is a cool idea, and I have no doubt it helped somewhat, but IMO it falls prey to the same mistake I see made by the makers of almost every video series/online course/list of resources for ML math: assuming that math is mostly about concepts and facts.
It's only about 5% that. Maybe less. I and many others in ML have seen the same videos and remembered the concepts for a while too. And forgotten them, in time. More than once! On the other hand, I've seen how persistently and operationally fluent people become (especially in ML and interpretability) when they actually learn math the way it must be learned: via hundreds of hours of laborious exercises, proofs, derivations, etc. Videos and lectures are a small fraction of what's ultimately needed.
For most of ML, it's probably fine--you'll never need to do a proof or do more than simple linear algebra operations by hand. But if you want to do the really hard stuff, especially in interpretability, I don't think there's any substitute for cranking through those hours.
To be clear, I think this weekend was a great start on that--if you continue immediately to taking full courses and doing the exercises. I'm a top-down learner, so it would certainly help me. But unless it's practiced in very short order, it will be forgotten, and just become a collection of terms you recognize when others talk about them.
Thank you! I really liked the idea of framing it as an experiment! I will try to apply the same approach to building a Unity idle game in 24 hours :)
TL;DR: I designed an experiment where I committed to spending two 12-hour days trying to learn as much deep-learning math as possible, basically from scratch.
Table of Contents
Origins and Motivations
For a long time, I’ve felt intimidated by the technical aspects of alignment research. I had never taken classes on linear algebra or multivariable calculus or deep learning, and when I cracked open many AI papers, I was terrified by symbols and words I didn’t understand.
7 months ago I wrote up a short doc about how I was going to remedy my lack of technical knowledge: I collected some textbooks and some online courses, and I decided to hire a tutor to meet with for a few hours a week. The first two weeks of meetings were awesome; then regular meetings got disrupted by travel, and I never came back to it.
When I thought about my accumulating debt of technical knowledge, my cached answer was “Oh, that might take six months to get up to speed. I don’t have the time.”
Then, watching my productivity on other projects over the intervening months, I noticed two things:
Also, when I asked myself what I thought the main bottlenecks were for addressing my technical debt problem, I identified two categories:
Then, as my mind wandered, I started to put two and two together: perhaps these new things I had noticed about my productivity could be used to address the bottlenecks in my technical debt? I decided to embark on an experiment: how much technical background on deep learning could I learn in a single weekend? My understanding of the benefits of this experiment was as follows:
Results
Takeaways
The Experiment Set-Up
The Curriculum that I used
Intro to deep learning (I kept returning to these videos throughout the experiment, rewatching and understanding slightly more)
Linear algebra (this took me 2hr 25 mins, and ~36 mins of breaks)
Calc 3 (this took me 2hr 57 mins and ~50 mins of breaks)
- ResNets: https://www.youtube.com/watch?v=ZILIbUvp5lk (took me 18 mins)
- RNNs (optional): https://www.youtube.com/watch?v=_aCuOwF1ZjU (took me 13 mins)
- Transformers: https://www.youtube.com/watch?v=4Bdc55j80l8&t=609s
(I spent about two hours on the above video, which in retrospect was not great. I would recommend that others choose a different explainer on Transformers.)
- RL basics: https://www.youtube.com/watch?v=JgvyzIkgxF0 (took me 25 mins)
- Policy gradients / PPO: https://www.youtube.com/watch?v=5P7I-xPq8u8&t=318s
I could not understand the above video even after rewatching it several times (I think the curriculum skipped some prerequisites for it), so I had to have Thomas Larsen walk me through it for around an hour. Thanks Thomas!
- RLHF: rob miles video: https://www.youtube.com/watch?v=PYylPRX6z4Q (took me 23 mins)
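The policy-gradient idea that gave me the most trouble is small enough to fit in a toy script. Here's a minimal sketch (my own toy example, not taken from any of the videos above) of a REINFORCE-style update on a two-armed bandit: arm 1 always pays +1, arm 0 pays 0, and the softmax policy's logits get nudged along the gradient of log π(action), scaled by the reward.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(2)  # policy parameters for a 2-action softmax policy
lr = 0.1              # learning rate

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(500):
    probs = softmax(logits)
    action = rng.choice(2, p=probs)
    reward = 1.0 if action == 1 else 0.0  # arm 1 is the good arm

    # Gradient of log pi(action) w.r.t. the logits: one-hot(action) - probs
    grad_log_pi = -probs
    grad_log_pi[action] += 1.0

    # REINFORCE update: step along grad log pi, scaled by the reward
    logits += lr * reward * grad_log_pi

# The policy should now strongly prefer the rewarding arm
print(softmax(logits)[1])
```

The key point (the one I kept missing in the video): only the chosen action's log-probability is reinforced, and the reward just scales the step. PPO adds importance-ratio clipping on top of this same core update.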
Documentation on Hours
(I used Toggl Track to record my time and was fairly happy with the software. However, I made many errors, didn't record breaks correctly, etc., so take these numbers with a grain of salt.)
Saturday
3b1b Video 1 (28 min)
Video 1 Summarizing (15 min)
??? (7 min)
3b1b Video 2 (29 min)
Break (4 min)
Video 2 Summarizing (20 min)
3b1b Video 3 (13 min)
Break (9 min)
Whiteboarding (4 min)
Linear Algebra - first four videos (58 min)
Cleaning up notes (7 min)
Chapter 4 Linear Algebra (16 min)
Whiteboarding (14 min)
Three-Dimensional Linear Transformations (14 min)
??? (12 min)
Chapter 9 (10 min)
Break (27 min)
Chapter 13 (14 min)
Calculus? (3 min)
Backpropagation, Chapter 4 (10 min)
Break (9 min)
Multivariable Calculus (1 hr 6 min)
Meditation (9 min)
Multivariable Calculus (10 min)
Meditation (7 min)
Multivariable Calculus (26 min)
Break (25 min)
Multivariable Calculus (47 min)
Multivariable Calculus (28 min)
Watching Neural Nets Ch. 3 again (30 min)
???? (45 min)
Trying to explain and failing (30 min)
Sunday
Rewatching Backpropagation (22 min)
ResNets Video (18 min)
RNNs Video (13 min)
Transformers (21 min)
Break (5 min)
Transformers (36 min)
RL Basics (25 min)
Break (23 min)
More RL (23 min)
Talking to Thomas about Transformers, Reinforcement Learning, and PPO (120 min)