Note: this post leans heavily on metaphors and examples from computer programming, but I've tried to write it so it's accessible to a determined person with no programming background.
To summarize some info from computer processor design at very high density: there are a variety of ways to manufacture the memory used in modern computer processors. As a rule, the faster a kind of memory is to read from and write to, the more expensive it is. So modern computers have a hierarchical memory structure: a very small amount of memory that's very fast to do computation with ("the registers"), a larger amount of memory that's a bit slower to do computation with, an even larger amount of memory that's even slower to do computation with, and so on. The two layers immediately below the registers (the L1 cache and the L2 cache) are typically abstracted away from even the assembly language programmer. They store data that's been accessed recently from the level below them ("main memory"). The processor will do a lookup in the caches when accessing data; if the data is not already in the cache, that's called a "cache miss", and the data will get loaded into the cache before it's accessed.
(Please correct me in the comments if I got any of that wrong; it's based on years-old memories of an undergrad computer science course.)
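To make the hit-or-miss lookup concrete, here's a toy sketch in Python. It's purely an illustration of the idea, not how real hardware works: real caches are managed by the processor in fixed-size lines with smarter eviction policies, and the names here are my own.

```python
# Toy sketch of a small, fast cache sitting in front of a slower memory level.
class CachedMemory:
    def __init__(self, main_memory, cache_size=4):
        self.main_memory = main_memory   # the slow level below
        self.cache = {}                  # the small, fast level
        self.cache_size = cache_size

    def read(self, address):
        if address in self.cache:        # cache hit: fast path
            return self.cache[address]
        # Cache miss: fetch from the slow level and keep a copy in the cache.
        value = self.main_memory[address]
        if len(self.cache) >= self.cache_size:
            self.cache.pop(next(iter(self.cache)))  # evict the oldest entry
        self.cache[address] = value
        return value

memory = CachedMemory({"x": 1, "y": 2, "z": 3})
memory.read("x")   # miss: goes to main memory, copy kept in the cache
memory.read("x")   # hit: answered from the cache
```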
Lately I've found it useful to think of my memory in the same way. I've got working memory (7±2 items?), consisting of things that I'm thinking about in this very moment. I've got short-term memory and long-term memory. And if I can't find something after trying to think of it for a while, I'll look it up (frequently on Google). Cache miss for the lose.
What are some implications of thinking about memory this way?
Register limitations and chunking
When programming, I've noticed that sometimes I'll encounter a problem that's too big to fit in my working memory (WM) all at once. In the spirit of getting stronger, I'm typically tempted to attack the problem head-on, but I find that my brain just tends to flit around the details of the problem instead of actually making progress on it. So lately I've been toying with the idea of breaking off a piece of the problem that can be easily modularized and fits fully in my working memory, then solving that piece on its own. (Feynman: "What's the smallest nontrivial example?") You could turn this around and define a good software architecture as one that consists of modular components that can individually be made to fit completely into one's working memory when reading code.
As you write or read code modules, you'll come to understand them better and you'll be able to compress or "chunk" them so they take up less space in your working memory. This is why top-down programming doesn't always work that well. You're trying to fit the entire design in your working memory, but because you don't have a good understanding of the components yet (since you haven't written them), you aren't dealing with chunks but pseudochunks. This is true for concepts in general: it takes all of a beginner's WM to comprehend a for loop, but in a master's WM a for loop can be but one piece in a larger puzzle.
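As a contrived Python illustration (the function names and scenario are my own, not anything from the examples above): to a beginner, the first loop is the whole problem; once it's been chunked as "take an average", it becomes a single named piece of a larger design.

```python
# To a beginner, this loop can take up all of their working memory:
def average(numbers):
    total = 0
    for n in numbers:
        total += n
    return total / len(numbers)

# Once "loop and accumulate" has been chunked into "take the average",
# the same loop is just one named piece of a larger design:
def summarize_grades(grades_by_student):
    return {student: average(grades) for student, grades in grades_by_student.items()}

summarize_grades({"alice": [90, 85, 95], "bob": [70, 80]})
```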
Swapping
One thing to observe: you don't get alerted when memory at the top of your mental hierarchy gets overwritten. We've all had the experience of having some idea in the shower and having forgotten it by the time we get out. Similarly, if you're working on a delicate mental task (programming, math, etc.) and you get interrupted, you'll lose mental state related to the problem you're working on.
If you're having difficulty focusing, this can easily make doing a delicate mental task, like a complicated math problem, much less fun and productive. Instead of actually making progress on the task, your mind drifts away from it, and when you redirect your attention, you find that information related to the problem has swapped out of your working memory or short-term memory and must be re-loaded. If you're getting distracted frequently enough or you're otherwise lacking mental stamina, you may find that you spend the majority of your time context switching instead of making progress on your problem.
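To put rough numbers on this (the figures are invented purely for illustration): if reloading your mental state takes about ten minutes and interruptions arrive every fifteen, two-thirds of your time goes to reloading rather than to the problem itself.

```python
# Back-of-the-envelope sketch with made-up numbers: what fraction of your
# time goes to reloading mental state rather than to the actual problem?
reload_minutes = 10               # time to swap the problem back into working memory
minutes_between_interrupts = 15   # how often you get interrupted

productive_minutes = minutes_between_interrupts - reload_minutes
fraction_lost = reload_minutes / minutes_between_interrupts
print(f"{productive_minutes} productive minutes per interruption cycle")
print(f"{fraction_lost:.0%} of your time spent reloading context")  # 67%
```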
Adding an additional external cache level
Anecdotally, adding an additional brain cache level between long-term memory and Google seems like a pretty big win for personal productivity. My digital notebook (since writing that post, I've started using nvALT) has turned out to be one of my biggest wins where productivity is concerned; it's ballooned to over 700K words, and a decent portion of it consists of copy-pasted snippets that represent the best information from Google searches I've done. A co-worker wrote a tool that allows him to quickly look up how to use software libraries and reports that he's continued to find it very useful years after making it.
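I don't know how my co-worker's tool works internally, but here's one minimal sketch of the general idea, assuming your notes live as plain-text files in a single directory (the directory layout and names are my own invention):

```python
# Minimal sketch of a personal snippet-lookup tool: grep-style search over a
# directory of plain-text notes. Assumes notes are .txt files under ~/notes.
import pathlib
import sys

def search_notes(query, notes_dir=pathlib.Path.home() / "notes"):
    query = query.lower()
    for path in sorted(notes_dir.glob("*.txt")):
        for line_number, line in enumerate(path.read_text().splitlines(), start=1):
            if query in line.lower():
                print(f"{path.name}:{line_number}: {line.strip()}")

if __name__ == "__main__":
    # e.g. python search_notes.py regex lookahead
    search_notes(" ".join(sys.argv[1:]))
```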
Text is the most obvious example of an exobrain memory device, but here's a more interesting example: if you're cleaning a messy room, you probably don't develop a detailed plan in your head of where all of your stuff will be placed when you finish cleaning. Instead, you incrementally organize things into related piles, then decide what to do with the piles, using the organization of the items in your room as a kind of external memory aid that allows you to do a mental task that you wouldn't be able to do entirely in your head.
Would it be accurate to say that you're "not intelligent enough" to organize your room in your head without the use of any external memory aids? It doesn't really fit with the colloquial use of "intelligence", does it? But in the same way computers are frequently RAM-limited, I suspect that humans are also frequently RAM-limited, even on mental tasks we typically associate with "intelligence". For example, if you're reading a physics textbook and you notice that you're getting confused, you could write down a question that would resolve your confusion, then rewrite the question to be as precise as possible, then list hypotheses that would answer your question along with reasons to believe/disbelieve each hypothesis. By writing things down, you'd be able to devote all of your working memory to the details of a particular aspect of your confusion without losing track of the rest of it.
As a first step, I wouldn't put that much stock in Gwern's guides. I've found that Gwern has his own way of doing things, but it rarely seems to generalize, at least in my experience. Self-experimentation is good, but you can't get much out of an N=1 sample unless you are that particular person.
I find that going to any sort of persistent store is incredibly harmful to my flow state while programming, so I try to get as much as possible into Anki. I think you'll find that if you sum the time spent attempting recall and the 3-5 seconds per lookup, you'll also get far more than five minutes for any reasonably well-used concept.
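To spell that out with made-up numbers: at a few seconds per lookup, a concept you reach for a couple of times a week crosses the five-minute mark within a year or two.

```python
# Made-up numbers: how quickly small lookups add up for a well-used concept.
seconds_per_lookup = 4
lookups_per_week = 3
weeks = 104  # roughly two years

total_minutes = seconds_per_lookup * lookups_per_week * weeks / 60
print(f"{total_minutes:.0f} minutes spent on lookups")  # ~21 minutes
```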
I also find that the concepts in my Anki decks tend to be the ones that come up when I'm problem solving in general or trying to be creative. In a psychology sense (not neuroscience -- none of this is neuroscience, much like programming is unrelated to byte patterns except as an implementation detail), Anki is just generally raising the activation level of those concepts, and so when you try to think of things, you will think in terms of those concepts. That's why the self-programming cards thing works. But it also means that when you think about anything, you think in terms related to your Anki concepts.
The OP of the second post you linked doesn't seem to have used much of Anki's functionality. Anki's most popular plugin (maybe second most popular, since I think kanji is still implemented as a plugin) is image occlusion, which seems like it would mesh perfectly with flash cards. However, I still use spatial memory with Anki just by associating Anki values with directions. It's not hard to do.
Overall, I think it's something you should invest in. No matter what you say about its value, it is a reliable way to move things from RAM (let's say) into L2 cache. This is something you should have familiarity with.
You can also check my comment history for a small OCaml utility, Space, that automates some aspects of making Anki cards.