Math is like constructing a Lego set on a picnic table outside in the middle of a thunderstorm. I grope blindly in the pouring rain for the first piece, and finally put it in place, but by the time I’ve found the second piece and move to connect it to the first piece, the first piece has blown away and is nowhere to be found, and the instructions are sopping wet, and the picnic table has just been carried away by a tornado. - Scott Alexander, The Lottery of Fascinations

Introduction

It's not hard to remember a simple sentence in isolation.

The spacing effect is arguably the most replicable and robust finding from experimental psychology.

But just a little more information can make reading much more challenging.

In these studies, memory is typically tested by presenting learners with lists of words on two learning schedules, massed and spaced. Massed learning schedules present participants with learning events in immediate succession (i.e., one right after the other). In contrast, spaced learning schedules distribute learning events across time (i.e., separated by an operationally defined amount of time).

Even this is enough to give me that wall of text reaction. My eyes don't "latch on" to the first word in the paragraph and proceed from left to right. Instead, they stare at the middle of the paragraph, while I silently ponder whether to make myself read it. Or else they zip back and forth, grabbing a word or phrase here and there ("memory," "participants," "immediate succession"), while my brain forms this inchoate judgment of whether or not to commit to actually reading it.

It's similar to the experience of navigating a crowd at a festival where many bands are playing, speakers are talking, and vendors are hawking their wares, and deciding whether or not to pay attention to whatever's in front of me.

But if I do manage to start actually reading, it gets more challenging. Even a few dense sentences can be enough to overload my very limited, goldfish-like working memory.

For a while, I looked into memory competitions. The world record for memorizing digits in an hour is currently 4620 digits, and the record for memorizing words in 15 minutes is 335 random words. These admirable memorizers use all kinds of specialized visual-memory tricks, which you can read about in Moonwalking With Einstein. But these tricks don't seem too promising for vastly more complex real-world learning. Besides - we want to take advantage of the spacing effect, not fight against it!

So I wanted to find a method of reading that works with my goldfish brain, not against it, particularly when I'm complementing it with progressive highlighting and a spaced repetition flashcard system like Anki.

The method is pretty simple to explain, but runs counter to habits I expect many people have formed over a lifetime of study for an educational system that has woefully neglected to teach the way people learn best.

From Whale Reading...

When you read, you instinctively sift through the words to pluck out the most important bits. For example, try to read this sentence as if you were actually trying to learn and understand it:

Spaced learning schedules have been tested over a matter of seconds, days, and years.

If you're anything like me, you probably focused on a few bits:

Spaced learning schedules have been tested over a matter of seconds, days, and years.

You've probably heard the idea that we have a few "slots," maybe 4 or 7, to store bits of information in our working memory. This might not be a perfectly conceptually accurate way to describe it, but close enough. Notice how the bold words fit easily in that number of slots. It's easy to fit the whole idea in your mind.

Now we're going to look at a longer chunk. I'd like you to try actually reading it, and noticing what a strain it is to hold the whole thing in your mind.

For example, Kornell and Bjork (2008) presented participants with different paintings by relatively obscure artists on either a massed (immediate succession) or spaced schedule (18 seconds between presentations). After a 15s delay, participants were shown unfamiliar paintings by the same artists and asked to generalize an artist’s style to the unfamiliar paintings. Participants that were presented with paintings on a spaced schedule were more accurate in generalizing a painter’s style than participants on a massed schedule, suggesting that spaced presentations facilitated generalization more so than did massed presentations.

Once again, I'm going to bold the parts that you might have felt were important:

For example, Kornell and Bjork (2008) presented participants with different paintings by relatively obscure artists on either a massed (immediate succession) or spaced schedule (18 seconds between presentations). After a 15s delay, participants were shown unfamiliar paintings by the same artists and asked to generalize an artist’s style to the unfamiliar paintings. Participants that were presented with paintings on a spaced schedule were more accurate in generalizing a painter’s style than participants on a massed schedule, suggesting that spaced presentations facilitated generalization more so than did massed presentations.

Augh! Almost everything is important! That's 15 different bold bits, and some of them are pretty long! In retrospect, not all of them are important for what you might be trying to learn from it, but your intuitive brain doesn't know very well how to predict that as you go. How could it, when you don't know the content you're about to read?

The basic point of the passage is probably obvious from context. This is a piece of evidence the authors are claiming supports the spacing effect. But the authors are really trying to show you how the methods and results of this empirical study support that claim.

That's a complex task. To achieve it, a diligent student might try to stretch their working memory as far as they can, trying to take in and synthesize the content all at once. They're trying to expand their goldfish-like working memory into a whale, to swallow a gigantic mouthful of informational krill.

In practice, this "whale reading" might involve reading bits of the paragraph over and over again. If they're really focused, they might be thinking of questions, or have a running monologue to tie together the bits of information. For example, maybe they pause to consider why the researchers used paintings by obscure artists, try to visualize the procedure, or ask how participants "generalized an artist's style."

Their goal is to understand and, at least for the moment, remember the totality of this paragraph. How does it tie in to the rest of the paper? What kind of evidence does it provide for the conclusion?

This is just difficult to do in one sitting. It strains our natural limits. And this is in relatively easy material, psychological research, where our social and mental intuitions can serve as a guide. If you're learning biochemistry, or math, it gets even harder.

Yet we usually make this our goal, because we're looking to generate that feeling of deep understanding as quickly as possible. Usually we're trying to learn on a tight schedule, and honestly we are usually doing a lot more cramming and a lot less diligent review than we're told to do. It's hard to keep track of all we want to read, we're going off an impulse that led us to read the article in the first place, and we don't have the tools or concepts that might let us take a different approach.

Hence, we engage in "whale reading," over and over again, from the time we're young and throughout our adult life. We even take pride in it.

... To Goldfish Reading

In theory, reading could work like this:

  1. You read the book chapter or article, not trying to understand it all at once or remember anything. There is no strained attempt at "whale reading" because you're not concerned with trying to understand or remember big pieces all at once.
  2. Along the way, you highlight bits of information that seem important.
  3. When you're done, you go back and turn the highlighted bits into flashcards and store them in an automated flashcard system, like Anki.
  4. Every day, you practice your flashcards, taking no more than 2 seconds per view, slowly needing to review them less often. Occasionally, you might even go back and re-read the whole text, with the benefit of having memorized its key points.
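
To make "slowly needing to review them less often" concrete, here is a minimal sketch of an expanding review schedule. The doubling rule, the reset to one day after a miss, and the names `next_interval` and `review_dates` are all assumptions made for illustration. This is not how Anki actually schedules cards.

```python
from datetime import date, timedelta


def next_interval(days, remembered, growth=2):
    """Return the next gap between reviews, in days.

    Assumed toy rule (not Anki's real scheduler): multiply the gap
    when the card was remembered, reset it to one day when it wasn't.
    """
    if not remembered:
        return 1
    return days * growth


def review_dates(start, outcomes, growth=2):
    """List the dates a card comes up, given a series of recall outcomes."""
    dates = []
    interval = 1
    due = start + timedelta(days=interval)
    for remembered in outcomes:
        dates.append(due)
        interval = next_interval(interval, remembered, growth)
        due = due + timedelta(days=interval)
    return dates


# One card, first seen on 1 Jan: remembered twice, forgotten once, remembered again.
for day in review_dates(date(2024, 1, 1), [True, True, False, True]):
    print(day.isoformat())
```

The gaps grow as long as you keep remembering the card, and shrink back when you forget it, which is the whole idea behind working with the spacing effect instead of against it.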

If this were your habit, it would still take time, but hopefully no more (and possibly less) than your normal approach to reading. Consider all the time you've spent reading and re-reading a difficult passage, feeling that whatever you'd read before had "blown away and was nowhere to be found."

You've never measured that wasted time, but you know it exists. And it doesn't just waste your time; it produces only a temporary feeling of understanding. Back in school, how well did you remember the concepts and equations you'd labored to learn the semester before?

If you're unfamiliar with Anki, you need to understand how that software works. You also need to understand progressive highlighting in order to decide what flashcards to make. If you're sold on these tools to build and support your long-term memory, you're finally ready to learn how to read.

Fortunately, Goldfish Reading is the easy part. It works like this:

  1. Read just one phrase or sentence, until you've "picked out" around 3-5 bits of info. It should be an easy, "goldfish bite" of information, not a "whale-size mouthful."
  2. If that "goldfish bite" of information seems potentially flashcard-worthy, highlight it. Otherwise, let yourself forget about it and move on.
  3. Treat the next "goldfish bite" as more-or-less new, disconnected information.
  4. I say "more-or-less" because you might be able to hang on to one or two bits of information from the previous "bite" to help connect to the next "bite."
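
If you think in code, the "goldfish bite" idea can be sketched very roughly like this. Everything here is an assumption for illustration: the tiny stopword list, the function name `goldfish_bites`, and above all the idea that "a bit of info" can be approximated by "a word that isn't a stopword." Real reading is fuzzier than this, and your own bites will land in different places.

```python
import re

# Tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "in",
    "is", "it", "of", "on", "or", "that", "the", "to", "with",
}


def is_content_word(word):
    """Crude proxy for 'a bit of info': any word that isn't a stopword."""
    cleaned = re.sub(r"\W", "", word).lower()
    return bool(cleaned) and cleaned not in STOPWORDS


def goldfish_bites(text, max_bits=4):
    """Split text into bites holding at most max_bits content words each."""
    bites, current, bits = [], [], 0
    for word in text.split():
        counts = is_content_word(word)
        # Close the current bite before it would exceed the budget.
        if counts and bits == max_bits:
            bites.append(" ".join(current))
            current, bits = [], 0
        current.append(word)
        if counts:
            bits += 1
    if current:
        bites.append(" ".join(current))
    return bites


sentence = ("Spaced learning provides time in between learning "
            "presentations for learners to forget irrelevant information.")
for bite in goldfish_bites(sentence):
    print(bite)
```

The cut points are mechanical, so some bites end on a dangling "to" or "in." The point is only the budget: each bite carries a handful of bits, and nothing requires you to carry the previous bite forward.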

Here's an example. We'll use a new paragraph from the same article as the tough passage about the painting experiment from before. First, I'll show it to you "wall of text" style, but don't bother trying to actually read it. We're going to break it up into bite-size pieces to show about what you'd be trying to "hold in your head" at one time.

Recent research has proposed that spaced learning promotes generalization by supporting the abstraction of relevant and irrelevant features. Spaced learning provides time in between learning presentations for learners to forget irrelevant information. However, relevant features are likely to be present on subsequent learning presentations, reactivated in memory, and thus be forgotten to a lesser degree. Consequently, when the learner is required to make a generalization at a later point in time, the learner will remember relevant features and thus generalize based upon these characteristics. In the case of complex generalization, perceptual features are likely to be forgotten, whereas the abstract structure is likely to be remembered to a greater degree.

OK, wall of text, right? It's another whale-size mouthful. But here's what it might be like in "goldfish bites." I'm going to put a little goldfish picture in between each one to encourage you not to try and race ahead, tie them together, or understand the whole thing. This is all about delaying any understanding of the whole in order to more easily understand the little, tiny parts.

I'll bold the bits you might "pick out" for each part, and italicize any "bridge" words you might want to keep in mind as you go to the next part. This is just an artificial presentation to show you how it feels to practice Goldfish Reading. You don't, like, have to copy/paste goldfish pictures into everything you read from now on, I promise.

Recent research has proposed that spaced learning promotes generalization...

kewl

by supporting the abstraction of relevant and irrelevant features.

you might not even understand the parts! just notice the words that seem important, like "abstraction" and "features"

Spaced learning provides time in between learning presentations for learners to forget irrelevant information.

how appropriate!

However, relevant features are likely to be present on subsequent learning presentations, reactivated in memory, and thus be forgotten to a lesser degree

maybe i'll figure out what a "feature" is someday...

Consequently, when the learner is required to make a generalization at a later point in time...

yee-haw!

...the learner will remember relevant features and thus generalize based upon these characteristics.

what were we talking about again? oh whatever...

In the case of complex generalization, perceptual features are likely to be forgotten, whereas the abstract structure is likely to be remembered to a greater degree.

See how every bite has about 3-5 important bits of information? And how, if you focus on one part, it's not hard to hold them all in your mind, even though they don't always make a ton of sense? By the time you get to the end, the beginnings have probably "blown away and are nowhere to be found."

The difference is that now, you don't care! Rather than trying to strain your mental mouth wide open, or rescue your new knowledge from the tornado of forgetfulness, you're just letting it blow away into whatever inaccessible corner of your mind it ends up in. But look... do you recognize this sentence?

Recent research has proposed that spaced learning promotes generalization by supporting the abstraction of relevant and irrelevant features.

If it looks familiar to you, then you haven't forgotten it after all! It was just not in your working memory, and you didn't know how to get it back there. I just showed you a flashcard. We did a little bit of spaced repetition, right here in this article. Cool, huh? And you are going to remember that bit much better than if you'd tried to read it over again to keep it "top of mind," in a big whale-size mouthful of information, as you were reading the original full paragraph.

here are some nice purple carp to look at. let's take a breather.

Now, if you were using progressive highlighting and Anki, you'd also have been creating highlights in this section. You might have highlighted the whole paragraph, or just a few individual sentences.

When you're done reading for the day, you'll go back, look at the section you highlighted, and ask if it seems important enough to make into a flashcard. If so, maybe your flashcard will read something like this:

Q: What does spaced learning let learners do in between learning presentations, and why does that help them make generalizations at a later point in time?

A: With spaced learning, learners have time to forget irrelevancies in between presentations. This lets the relevant features "shine through" due to practice, so they're easier to remember and use as the basis for generalizations.

is this reeeeally easier? it seems... blub blub... harder >O|O:(

That's because you're learning a new skill! It's definitely going to be harder and seem strange at first. But I really believe it will grow to feel natural, easier, and more effective once you've gotten it down. Let's go over the workflow one more time:

  1. Goldfish Read the paragraph. Remember: quick, easy, always preferring "no strain" to "understanding the whole thing," but actually reading all the words, not just skimming.
  2. Highlight it if it seems potentially flashcard-worthy. Spend approximately zero thought on this, go with your gut and move on to Goldfish Read the next bit.
  3. Remove/ignore any highlights that no longer seem flashcard-worthy, and turn the rest into flashcards.

If you're comfortable with progressive highlighting and making flashcards, the only tough thing is teaching yourself how to Goldfish Read in the way that text is normally presented, in big walls of text. This is probably hard, because you're still accustomed to "whale reading," and your mind naturally wants to resist that kind of strain.

But remember how it felt earlier, to actually read all the words, but to focus in on just one "goldfish bite" of a few information bits at a time? And then to just let it go and forget about it when you move on to the next? I want you to try and Goldfish Read these next three paragraphs. It's going to be a bunch of text. And you're going to try and Goldfish Read the whole thing, just letting yourself forget as you go along, but again, you have to actually read all the words. As you go, imagine what phrases or sentences you would have highlighted as potentially flashcard-worthy. If you're reading this on a computer, you can use your mouse to temporarily highlight the text to simulate progressive highlighting.

Time to dive in...

Historically, there have been four classes of theories used to explain spacing effects: 1) deficient processing theories, 2) encoding variability theories, 3) consolidation theories,  and 4) study phase retrieval theories. To date, the most parsimonious and predominate collection of theories are study-phase retrieval theories. However, one limitation of spacing effect theories is that they have primarily been constructed to explain memory processes, not generalization processes.

For example, many deficient processing theories are based on the idea that massed presentations are encoded to a lesser degree than spaced presentations. Massed presentations are encoded to a lesser degree because, when presenting the exact same stimulus over and over again, learners habituate to the stimulus. However, in the case of generalization tasks, presentations are likely to be quite variable and, consequently, learners are less likely to habituate to massed presentations. In short, this work demonstrates that spaced learning promotes several levels of generalization, and thus current theories of the spacing effect must be revised in order to account for these findings.

Why do spaced learning schedules promote both simple and complex generalization? This is an open question and definitely an area for future research. One possibility is that spaced learning provides opportunities for forgetting between learning presentations. Relevant features are likely to be present on subsequent learning presentations, reactivated in memory, and thus be forgotten to a lesser degree than irrelevant features. Consequently, when the learner is required to generalize at a later point in time, the learner will remember relevant features and thus generalize based upon these characteristics. In the case of complex generalization, perceptual features are likely to be forgotten, whereas the underlying abstract structure is likely to be remembered to a greater degree. Indeed, the most basic mechanisms of memory (i.e., forgetting) may be the same mechanisms that support our most sophisticated forms of learning (i.e., complex generalization).

how'd it go, goldfish?

If you did this right, you don't remember very much at all of what you just read, let alone how it all fits together. But hopefully you found several sentences which, in the moment, seemed potentially flashcard-worthy. You probably don't recall what was in them now, and you couldn't necessarily have tied them in with the beginning and end of the paragraph. That's fine!

If we were actually trying to study this article and deeply understand it, we'd have been doing progressive highlighting all through it. Then we'd go back, delete a bunch of the highlights that no longer seemed as important, and make flashcards of the rest. We'd have a daily habit of reviewing flashcards covering not just this article, but lots of articles and book chapters we'd read in the past.

And we'd go over this article again in 4-10 days, with the benefit of some flashcard practice and time. With so many of the terms and concepts more deeply learned via flashcards, we could gain a much stronger synthesis with just as little strain as we applied during our first Goldfish Read.

Caveats and Conclusion

First, I want to be totally transparent. Goldfish Reading is, as far as I know, my own creation. I'm not a super-genius or a psychologist, and I don't have 1,001 studies to throw at you to prove it works. I'm building from my own experience as a student and teacher, and my understanding of key well-supported psychological theories, such as the spacing effect, to break down the task of reading in a way that fits with my best understanding of how human memory actually works.

So not only are you depending on me, you're depending on your own understanding of what I'm trying to convey here. I wouldn't be surprised if this hinders your reading in the short-run, while you're still figuring out how to make these ideas work for you. And they absolutely depend on progressive highlighting and flashcards. Without being willing to fully incorporate those into your reading habits, I'd expect Goldfish Reading will make you a strictly worse reader.

On a deeper level, you've had a lifetime of experience making reading work for your brain. So don't abuse yourself into totally distorting your approach to reading based on what some guy on the internet had to say. If this approach makes sense and you want to try it, go for it. The formula is:

  1. Goldfish reading
  2. Progressive highlighting
  3. Make flashcards
  4. Review flashcards and periodically re-read the text until it "clicks"

[1] Vlach, H. A., & Sandhofer, C. M. (2012). Distributing learning over time: The spacing effect in children’s acquisition and generalization of science concepts. Child Development, 83(4), 1137-1144.

The extensive quotations in this blog post are drawn from this paper.

Quick comment:  I noticed that in all of your examples above, I chunk substantially bigger and fewer pieces.  For example, in the "15 different bold bits" clip, I chunk it into about 8 pieces instead.

This is likely experience/background dependent; I happen to have a relatively strong background in ML and have read a stack of research papers recently, so I probably have both stronger noise filters and more complicated primitives available.

One possibly interesting side note:  I never once, in any of your examples, considered metadata about the topic relevant.  This includes things like the author names, "tested", "study proposed", etc.  I suspect I've learned that 1) author names are almost never important, 2) test procedures are only worth thinking about if they're very explicitly detailed (which was not the case above), and 3) even if the test procedures are ok, they're typically only relevant as a cleanup/sanitization pass once the main concept is understood.

These are great points! The chunks I included are not personalized, so as you point out, they include information a skilled reader would filter out and use multiple chunks where a skilled reader might just see one.

That sounds quite a bit like what I do. When I encounter an insight in an article that I want to keep I create an Anki card from it. Here is the latest one that came up in my Anki:

Q: Which people who say that they want to change actually will do?

A: An observation: People who blame a part of themselves for a failure do not change. If someone says "I've got a terrible temper", he will still hit. If he says "I hit my girlfriend", he might stop. If someone says "I have shitty executive function", he will still be late. If he says "I broke my promise", he might change.

http://lesswrong.com/r/discussion/lw/o13/open_thread_oct_17_oct_23_2016/dghr 

And for this article I will create:

Q: Goldfish Reading is

A: reading text without trying to fully understand (or memorize) all of it at once but focus on the key parts of it (and optionally making Anki cards out of it). A bit like this Anki deck but with more cards. 
https://www.lesswrong.com/posts/fSos4ZwdQmRuLLnwK/goldfish-reading

Maybe my method is really plankton reading?

I'd distinguish between how you read vs. what you extract. From your comment, it's hard to tell what your experience is like while you're actually reading these articles. But it looks like you go on to create a short overall summary of the main point and turn it into a flashcard, which is definitely a part of Goldfish Reading!

The main point I wanted to get across with this post was that people probably have reading habits that are adapted to studying without Anki. And that Anki doesn't just allow you to maintain long-term knowledge, but to simplify your approach to reading the material in the first place to be less strained.

About how I read: I have always been a fast reader easily willing to not think too much about things that seemed unimportant. Except for math where building a working model is key. 

Only skimmed an extremely short passage, but sounded similar to https://en.m.wikipedia.org/wiki/Incremental_reading.

This is kind of a "common core math" approach to reading. It decomposes a process and presents it step by step, in contrast to either

  1. Presenting a goal, and letting each learner figure out the how on their own
  2. Presenting the assembly line steps needed to mechanically generate the answer

Anecdotally, this is approximately what I do when reading (or listening), just all at once and without physically highlighting/flashcarding. But I'm still taking each piece, understanding how it fits into the greater whole (I visualize actual pieces of a tinker-toy like structure fitting together, but that's just me), and then moving on and forgetting the parts that don't matter. Two notes:

  1. This relies on the source material being laid out in a particular way, and with a specific density of information. Too dense (a list of facts or grocery items to memorize; a math textbook) and there is nothing to forget, and no repeated information to absorb. Too diffuse, and there's only one idea present in the first place, already separated by plenty of dead space, and repeated enough to get it through (basic communication theory: tell em what you're gonna say, say it, tell em what you said. Use examples.)
  2. I do not recommend trying this on casual conversation. People very often include a detail in their stream of thought that doesn't connect, and gets forgotten as irrelevant, but was the most important part to them. For example, their name.

These are great points. I have only tried this on a few subjects. It seems to work well for biochemistry. I'll have to see how it works for differential equations this coming quarter. And I would never have thought about applying this to casual conversation, but that's worth pointing out. Thanks for mentioning it.