There's a crux I seem to have with a lot of LWers that I've struggled to put my finger on for a long time but I think reduces to some combination of:
I tend to be more inclined towards the latter in each case, whereas I think a lot of LWers are inclined towards the former, with the potential exception of the author of realism about rationality, who seems to have opinions that overlap with many of my own. While I still feel uncomfortable with the above binaries, I've now gathered enough examples to at least list them as evidence for what I'm talking about.
A few LWers have positively reviewed Linear Algebra Done Right (LADR), in particular complimenting it for revealing the inner workings of Linear Algebra. I too recently read most of this book and did a lot of the exercises. And... I liked it but seemingly less than the other reviewers. In particular, I enjoyed getting a lot of practice reading definition-theorem-proof style math and doing lots of
...I have similar differences with many people on LW and agree there is something of an unacknowledged aesthetic here.
Attention conservation notice: if you've read Michael Nielsen's stuff about Anki, this probably won't be new for you. Also, this is all very personal and YMMV.
In a number of discussions of Anki here and elsewhere, I've seen Anki's value measured in terms of time saved by not having to look stuff up. For example, Gwern's spaced repetition post includes a calculation of the threshold at which it's worth it to Anki-ize something, although I would be surprised if Gwern hasn't already thought about the claim I'm going to make.
While I occasionally use Anki to remember things that I would otherwise have to Google, e.g. statistics, I almost never Anki-ize things so that I can avoid Googling them in the future. And I don't think in terms of time saved when deciding what to Anki-ize.
Instead, (as Michael Nielsen discusses in his posts) I almost always Anki-ize with the goal of building a connected graph of knowledge atoms about an area in which I'm interested. As a result, I tend to evaluate what to Anki-ize based on two criteria:
Weird thing I wish existed: I wish there were more videos of what I think of as 'math/programming speedruns'. For those familiar with speedrunning video games, this would be similar except the idea would be to do the same thing for a math proof or programming problem. While it might seem like this would be quite boring since the solution to the problem/proof is known, I still think there's an element of skill to it and I would enjoy watching someone do everything they can to get to a solution, proof, etc. as quickly as possible (in an editor, on paper, LaTeX, etc.).
This is kind of similar to streaming ACM/math olympiad competition solving, except I'm equally or more interested in people doing this for known problems/proofs than I am for tricky but obscure problems. E.g., speed-running the SVD derivation.
While I'm posting this in the hope that others are also really interested, my sense is that this would be incredibly niche even amongst people who like math so I'm not surprised it doesn't exist...
Watching my kitten learn/play has been interesting from a "how do animals compare to current AIs?" perspective. At a high level, I think I've updated slightly towards RL agents being further along the evolutionary progress ladder than I'd previously thought.
I've seen critiques of RL agents not being able to do long-term planning as evidence for them not being as smart as animals, and while I think that's probably accurate, I have noticed that my kitten takes a surprisingly long time to learn even 2-step plans. For example, when it plays with a toy on a string, I'll often try putting the toy on a chair that it only knows how to reach by jumping onto another chair first. It took many attempts before it learned to jump onto the other chair and then climb to where I'd put the toy, even though it had previously done that while exploring many times. And even then, it seems to be at risk of "catastrophic forgetting" where we'll be playing in the same way later and it won't remember to do the 2-step move. Related to this, its learning is fairly narrow even for basic skills, e.g. I have 4 identical chairs around a table but it will be afraid of jumping onto one even though it's very comforta
...I keep seeing rationalist-adjacent discussions on Twitter that seem to bottom out in arguments of the general (very caricatured, sorry) form: "stop forcing yourself and get unblocked and then X effortlessly" where X equals learn, socialize, etc. In particular, a lot of focus seems to be on how children and adults can just pursue what's fun or enjoyable if they get rid of their underlying trauma and they'll naturally learn fast and gravitate towards interesting (but also useful in the long term) topics, with some inspiration from David Deutsch.
On one hand, this sounds great, but it's so foreign to my experience of learning things and seems to lack the kind of evidence I'd expect before changing my cognitive strategies so dramatically. In fairness, I probably am too far in the direction of doing things because I "should", but I still don't think going to the other extreme is the right correction.
In particular, having read Mason Currey's Daily Rituals, I have a strong prior that even the most successful artists and scientists are at risk of developing akrasia and need to systematize their schedules heavily to ensure that they get their butts in the chair and work. Given this, wh...
Interesting Bill Thurston quote, sadly from his obituary:
I’ve always taken a “lazy” attitude toward calculations. I’ve often ended up spending an inordinate amount of time trying to figure out an easy way to see something, preferably to see it in my head without having to write down a long chain of reasoning. I became convinced early on that it can make a huge difference to find ways to take a step-by-step proof or description and find a way to parallelize it, to see it all together all at once—but it often takes a lot of struggle to be able to do that. I think it’s much more common for people to approach long case-by-case and step-by-step proofs and computations as tedious but necessary work, rather than something to figure out a way to avoid. By now, I’ve found lots of “big picture” ways to look at the things I understand, so it’s not as hard.
To prevent mis-interpretation, I think people often look at quotes like this (I've seen similar ones about Feynman) and think "ah yes, see anyone can do it". But IME the thing he's describing is much harder to achieve than the "case-by-case"/"step-by-step" stuff.
Blockchain idea inspired by 80,000 Hours's interview of Vitalik Buterin: a lot of podcasts either have terrible transcriptions or presumably pay a service to transcribe their sessions. However, even these services make minor typos such as "ASX" instead of "ASICs" in the linked interview.
Now, most people who read these transcripts presumably notice at least a subset of these typos but don't want to go through the effort of emailing podcasters to tell them about it. On the flip side, there's no good way for hosts to scalabl...
I've recently been obsessing over the idea of trying to "make math more like programming". I'm not sure if it's just because I feel fluent at programming and still not very fluent at abstract math or also because programming really does have a feedback loop that you don't get in math.
Regardless, based on my reading it seems like there's a general consensus in math that even the most modern theorem provers, like Lean and Coq, are much less efficient than typical "informal" math reasoning. That said, I wonder if this ignores some of the benefits that program
...I've been reading a bit about John Conway since his (unfortunate) death. One thing I keep noticing is that everyone seems to emphasize how core having fun was to John Conway's way of doing math. One question I'm interested in in general is how important fun and curiosity are for doing good research.
I've considered posting a question about this that uses John Conway as an example of someone who 1) was genuinely curious and fun-loving but 2) also had other gifts that played a large role in his ability to do great math. But, I don't want to be insensitive giv
...It seems like (unless I just haven't discovered it yet) there's a sore need for a framework for causal model comparison, analogous to Bayesian model comparison. If you read Pearl (and his students), they rightfully point out that you can't get causal claims without causal assumptions but don't talk much about how you actually formulate the causal model in the first place ("domain knowledge"). As a result, if you look at the literature, researchers mostly seem to use a small set of causal models that may or may not describe phenomena, e.g. the classic "inst
...Epistemic status: Thinking out loud.
Scientific puzzle I notice I'm quite confused about: what's going on with the relationship between thinking and the brain's energy consumption?
On one hand, I'd always been told that thinking harder sadly doesn't burn more energy than normal activity. I believed that and had even come up with a plausible story about how evolution optimizes for genetic fitness not intelligence, and introspective access is pretty bad as it is, so it's not that surprising that we can't crank up our brains' energy con
...ML-related math trick: I find it easier to imagine a 4D tensor, say of dimensions $a \times b \times c \times d$, as a big matrix with dimensions $a \times b$ within which are nested matrices of dimensions $c \times d$. The nice thing about this is, at least for me, it makes it easier to imagine applying operations over the matrices in parallel, which is something I've had to think about a number of times doing ML-related programming, e.g. trying to figure out how to write the code to apply a 1D convolution-like operation to an entire batch in parallel.
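As a rough sketch of this nested-matrix picture, here is a plain-Python version (nested lists only, so it runs without any array library). The shape letters `a, b, c, d` and the helper names `make_tensor`, `map_inner`, and `moving_sum_rows` are assumptions made up for illustration; the sliding-window sum merely stands in for "a 1D convolution-like operation".

```python
def make_tensor(a, b, c, d):
    """Build an a x b grid whose cells are c x d matrices (nested lists)."""
    return [[[[(i * b + j) * 100 + k * d + l for l in range(d)]
              for k in range(c)]
             for j in range(b)]
            for i in range(a)]


def map_inner(tensor, fn):
    """Apply fn to every inner c x d matrix; the outer a x b grid is unchanged."""
    return [[fn(inner) for inner in row] for row in tensor]


def moving_sum_rows(matrix, width=2):
    """Convolution-like op: sliding-window sum along each row of one matrix."""
    return [[sum(row[s:s + width]) for s in range(len(row) - width + 1)]
            for row in matrix]


# Hypothetical example: a 2 x 3 grid of 2 x 4 matrices.
tensor = make_tensor(2, 3, 2, 4)
out = map_inner(tensor, moving_sum_rows)

print(tensor[0][0][0])  # [0, 1, 2, 3]
print(out[0][0][0])     # [1, 3, 5]
```

The point of `map_inner` is that the operation only ever has to be written for one small matrix; the outer grid just says "do that everywhere". With an array library the loop would typically disappear into a single batched or broadcast call, but the mental picture is the same.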
In this fascinating article, Gary Marcus (now better known as a Deep Learning critic, for better or worse) profiles Jill Price, a woman who has an exceptional autobiographical memory. However, unlike others who studied Price, Marcus plays the role of the skeptic and comes to the conclusion that Price's memory is not exceptional in general, but instead only for the facts about her life, which she obsesses over constantly.
Now obsessing over autobiographical memories is not something I'd recommend to people, but re
...Sometimes there are articles I want to share, like this one, where I don't generally trust the author and they may have quite (what I consider) wrong views overall but I really like some of their writing. On one hand, sharing the parts I like without crediting the author seems 1) intellectually / epistemically dishonest and 2) unfair to the author. On the other hand, providing a lot of disclaimers about not generally trusting the author feels weird because I feel uncomfortable publicly describing why I find them untrustworthy.
Not really sure what to do her
...It seems like if we take self-supervised learning (plus a sprinkling of causality) seriously as key human functions, we can more directly enhance our learning by doing much more prediction / checking of predictions while we learn. (I think this is also what predictive processing implies but don't understand that framework as well.)
Weird thought I had based on a tweet about gradient descent in the brain: it seems like one under-explored perspective on computational graphs is the causal one. That is, we can view propagating gradients through the computational graph as assessing the effect of an intervention on some variable on all of a node's children.
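To make the analogy concrete, here is a minimal sketch on a toy graph x → h → y → loss. The graph, the function names, and the step size `eps` are all assumptions chosen for illustration. It compares the chain-rule gradient of the loss with respect to the intermediate node `h` against the effect of a small "intervention" that nudges `h` directly and recomputes only its descendants.

```python
def downstream_of_h(h):
    """Everything that depends on h: y = 3h + 1, loss = y^2."""
    y = 3.0 * h + 1.0
    return y ** 2


def forward(x):
    """Full forward pass; h = x^2 is the intermediate node we intervene on."""
    h = x ** 2
    return h, downstream_of_h(h)


def grad_loss_wrt_h(h):
    """Chain rule by hand: d(loss)/dy * dy/dh = 2y * 3."""
    y = 3.0 * h + 1.0
    return 2.0 * y * 3.0


def intervention_effect(x, eps=1e-6):
    """Set h to h + eps (ignoring how x produced it) and measure the change in loss."""
    h, loss = forward(x)
    return (downstream_of_h(h + eps) - loss) / eps


x = 2.0
h, loss = forward(x)
print(grad_loss_wrt_h(h))      # 78.0
print(intervention_effect(x))  # approximately 78
```

The two numbers agree up to finite-difference error, which is the sense in which a backpropagated gradient can be read as the (infinitesimal) effect of intervening on a node while holding its parents' influence fixed. This only illustrates the local, differentiable case; it says nothing about larger interventions.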
Reason to think this might be useful:
Reasons why this might not be useful:
If algebra's a deal with the devil where you get the right answer but don't know why, then geometric intuition's a deal with the devil where you always get an answer but don't know whether it's right.
Someone should write the equivalent of TAOCP for machine learning.
(Ok, maybe not literally the equivalent. I mean Knuth is... Knuth. So it doesn't seem realistic to expect someone to do something as impressive as TAOCP. And yes, this is authority worship and I don't care. He's Knuth goddamn it.)
Specifically, a book where the theory/math's rigorous but the algorithms are described in their efficient forms. I haven't found this in the few ML books I've read parts of (Bishop's Pattern Recognition and Machine Learning, MacKay's Information Theory, and Tibrisha
...Thank you. Yes it is a real problem, speaking from experience from the people I personally know. The reason these events are not talked about much is that any press just makes the problem worse—it gives a bunch of copycat muggers the same bright idea. So unfortunately you get a bunch of speculation and not a lot of observable evidence of the downsides of that speculation, so people don’t realize the harm that has been caused.
There are people who have been killed in attempted bitcoin muggings. Speculating on the Internet that someone is in possession of >1 million bitcoins is like tattooing a big target on their back they can’t get rid of.
Link post for a short post I just published describing my way of understanding Simpson's Paradox.
Thing I desperately want: tablet native spaced repetition software that lets me draw flashcards. Cloze deletions are just boxes or hand-drawn occlusions.
Today I attended the first of two talks in a two-part mini-workshop on Variational Inference. It's interesting to think about from the perspective of my recent musings about more science-y vs. engineering mindsets because it highlighted the importance of engineering/algorithmic progress in widening Bayesian methods' applicability.
The presenter, who's a fairly well known figure in probabilistic ML and has developed some well known statistical inference algorithms, talked about how part of the reason so much time was spent debating philosophical issues in the pa
...
In light of reading Hazard's Shortform Feed -- which I really enjoy -- based on Raemon's Shortform feed, I'm making my own. There be thoughts here. Hopefully, this will also get me posting more.