BBE W1: HMCM and Notetaking Systems

Spiracular

HMCM and Notetaking Systems

BBE W1C1

(Build a Better Exobrain Week 1, Commentary 1)

Back in the old days, when the internet was bad for notetaking, some obsessive notetaker named Lion Kimbro wrote a Document.

A long, long time after that, the document came up on some LessWrong thread I was reading.*

This week, I decided to flip through it.

HMCMETYT

HMCMETYT is an abbreviation of How to Make a Complete Map of Every Thought You Think.

The title definitely wouldn't have gotten past a publishing editor. For sanity's sake, I'm going to shorten the abbreviation to HMCM henceforth.

I'm going to give an overall review of it here. If you don't care about that, skip ahead to Categorizing Notetaking Systems.

Reviewy Review

Its title makes a big claim, and that claim doesn't quite match with what I got out of reading it. That being said, I still think I got a lot out of it.

His abstractions were often fun to engage with, and I'm currently using a modified version of his tagging system (tags will probably be my next post). Try as I might, though, I never quite felt like I "grokked" the benefits of his mindmapping practice.

HMCM is very stream-of-consciousness at times, and describes a system of physical notetaking with laborious indexing. The abstractions aged pretty well; the physical instructions did not.

I would say that overall, I found HMCM to be an interesting book.** However, I think I was an exceptionally good audience for it, and I'm not sure that I'd recommend it to anyone who isn't notetaking-obsessed.

Glad That's Over

Turns out I don't like writing reviews! With the "review" out of the way, I will henceforth be jumping around and expanding on topics in a very non-linear fashion, according to my own whims and interests.

Let's talk about...

Categorizing Notetaking Systems

I am astonished that there isn't a field of study of notebooks. I have searched on the net, and while I have found a page here and there on some type of notebook method, it is almost ALWAYS one of the following two things:

The Diary: A bunch of entries, chronologically based, maybe with a TOC, in which a person keeps a record of their thoughts. AKA "The Journal".

The Category Bins: A bunch of notes, stuffed into category bins, maybe 2 or 3 levels deep.

That's IT. In all the world, people have only been putting their notes in the above two ways.

-- Lion Kimbro, Introduction***

To the world's credit, I think the situation for notetakers has improved a lot since 2003 (when this was written).

There's a lot of good notetaking software out there nowadays, and not all of it is best described as a Diary or a Category Bin.

Stephen Davies is a computer scientist who wrote extensively about the underlying format of Personal Knowledge Bases. A lot of modern notetaking still fits neatly into the taxonomy of data structures they described in 2005.

If I had to broadly categorize the notetakers I've seen, here are the standard categories I'd describe...

File System: Classic. A (mostly) acyclic tree whose nodes are folders, and leaves are distinct files. Acyclicity might be broken with shortcuts. (ex: computer file system, GoogleDocs)

The Outliner: A tree of nested pages or bullet-points, whose nodes are the same type as the leaves. Usually acyclic. The best examples can be almost 100% hands-on-keyboard. Good structure for code folding. (ex: Workflowy, most forums)

The Timeline: What it says on the tin. Another classic. Chronologically-organized. Typically a journal, or current-events-related. A variant is a multi-threaded timeline, such as a Gantt Chart. (ex: RSS feed)

The Calendar: Some might not call this notetaking software, but I've come to think of it like one. Chronological, like the timeline. Unlike the standard timeline, the priority is based largely on proximity to future due-dates. Fantastic for just-in-time reminders. (ex: GoogleCalendar)

Tag You're It: Everything has a pile of #hashtags or stickers. You pull up lists based on a common tag, possibly ordered by some other index. Often hybridized with other types. (ex: Category Bins, Evernote)

MindMap: Nodes and edges, where a lot of the value of information is in the connections drawn between concepts. The underlying structure is called a Spider Diagram. Usually hierarchical but may be permissive of cyclic references, sometimes in a limited way (lines pointing back up the chain). (ex: XMind)

The Wiki: Hyperlinks hyperlinks hyperlinks. Navigated as an extremely-cyclic web of hyperlinks, usually with some custom index files. Unlike with tags, the line between an article instance and an indexing instance isn't as clear or hierarchical. Probably supports backlinks. Not always public. (ex: Wikipedia, Tiddlywiki)

Flashcards: Two-sided association between one piece of data, and another. A dictionary data type, or a two-sided dictionary. Usually index-card sized, and most often used for memorization. (ex: Anki)

Annotator: A system for taking notes coupled to uploaded reading material, usually an ebook, PDF, or similar. The best of them capture something of the "writing/highlighting in the margins of books" experience. (ex: Kindle, some of Zotero)

Recommender System: An add-on for any of the above. These rank things according to some metric of quality, like ratings, relevance, or frequency of forwarding (ex: RSS feed curators)
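A couple of these categories reduce to very small data structures. Here is a minimal sketch, assuming a hypothetical in-memory `notes` dictionary (the note titles, tags, and links are all made up for illustration), of the tag lookup behind Tag You're It and the backlink lookup behind The Wiki. Real tools will store and index things differently.

```python
from collections import defaultdict

# Hypothetical notes: title -> (tags, titles this note links to).
notes = {
    "Outliners": ({"software", "trees"}, ["Code Folding"]),
    "Code Folding": ({"software"}, ["Outliners"]),
    "Reading List": ({"books"}, ["Outliners"]),
}

def build_tag_index(notes):
    """Tag You're It: map each tag to the sorted titles that carry it."""
    index = defaultdict(list)
    for title, (tags, _links) in notes.items():
        for tag in tags:
            index[tag].append(title)
    return {tag: sorted(titles) for tag, titles in index.items()}

def build_backlinks(notes):
    """The Wiki: map each title to the sorted titles that link to it."""
    backlinks = defaultdict(list)
    for title, (_tags, links) in notes.items():
        for target in links:
            backlinks[target].append(title)
    return {target: sorted(sources) for target, sources in backlinks.items()}

tag_index = build_tag_index(notes)
backlinks = build_backlinks(notes)
print(tag_index["software"])   # notes sharing the "software" tag
print(backlinks["Outliners"])  # notes that link to "Outliners"
```

Both lookups are the same trick (invert a mapping), which may be part of why tags and wikis hybridize so easily.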

Standardization Axis

Standardization of input is a key axis of divergence among notetakers. The level of standardization is an early software decision that has a lot of influence on what the software is good at, and how it is used. Think of this as the extent to which formatting is standardized, and to which random imports are allowed or integrated.

Standardized Format: All data is kept in a similar file format to one another. On the plus side, it's easier for people to program smart interactions between standardized notes. On the minus side, you might not be able to import older notes from another system. (ex: Workflowy)

And the Kitchen Sink: Can import and render many different data types, but usually with very limited interactions allowed between them, because they're harder to program in. In my experience, exporting data from a Kitchen Sink is usually a nightmare. (ex: Evernote)

Standard Compromises

Some commit very hard in one direction or the other, for the most part (ex: Workflowy tends very Standardized Format, Evernote tends very And the Kitchen Sink).

There are also a few common compromises.

A lot of generalist notetaking software reaches a compromise by having a standard format with added functionality, and some non-standard formats that can be stored, but lack most of the added functionality. We can call this Standard +.

Another type of compromise I've seen is having non-standardized data types, but standardized metadata. (ex: computer file systems exemplify this). I would call this Meta-Standardized. In order to warrant the name, I would also require that the standardized metadata be surfaced to the user in some way.
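To make Meta-Standardized a bit more concrete, here is a rough sketch under my own assumptions: every item carries the same metadata fields (the field names here are hypothetical, loosely inspired by what a file system tracks), while the payload is an opaque blob in whatever format it arrived in. The listing function is the "surfaced to the user" part.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Item:
    """One stored item: standardized metadata wrapped around an opaque payload."""
    name: str
    modified: datetime
    kind: str        # e.g. "text", "image"; the payload format itself is not standardized
    payload: bytes
    tags: set = field(default_factory=set)

def list_items(items, kind=None):
    """Surface the metadata: item names, newest first, optionally filtered by kind."""
    chosen = [i for i in items if kind is None or i.kind == kind]
    return [i.name for i in sorted(chosen, key=lambda i: i.modified, reverse=True)]

# Made-up example items.
items = [
    Item("budget.csv", datetime(2020, 5, 1), "text", b"a,b\n1,2\n"),
    Item("sketch.png", datetime(2020, 6, 1), "image", b"\x89PNG"),
    Item("todo.txt", datetime(2020, 7, 1), "text", b"buy ink"),
]
print(list_items(items))               # every item, newest first
print(list_items(items, kind="text"))  # only the text items
```

Sorting and filtering work on everything, because they only ever touch the metadata and never need to understand the payload.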

Length of Input

Another axis of divergence for standardized notetakers is the length of input that they encourage, or even force.

Some applications can get pretty ham-handed at forcing their preferred length upon your communiques, and will cut you off if it comes down to it.

If it's not forced, look to the size of the default textbox to see what length it is encouraging. Is there a whole blank page, waiting to be filled? An emulated index card? A single line?

On the shorter end: 3 words, the sentence, the index card, 200 characters

On the longer end: the essay, the blank page, the giant template, the large div, the endless scroll

Short encourages compression. Long, on the other hand, tends to encourage elaboration and expansion.

I've generally noticed that short is good for getting thoughts down, but long is often better for organizing and referencing. YMMV, though.****

Hide and Show

Code Folding can make very large but well-structured documents manageable to skim and read, by temporarily hiding the sections of a tree that you don't currently want to interact with.
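As a minimal sketch of the idea (the `Node` class and `render` function are hypothetical names of mine, not any particular outliner's internals): each node in the tree carries a `folded` flag, and rendering simply skips the children of folded nodes.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str
    children: list = field(default_factory=list)
    folded: bool = False

def render(node, depth=0):
    """Return the visible lines of an outline, skipping children of folded nodes."""
    # "+" marks a folded node that is hiding children; "-" marks everything else.
    marker = "+" if node.folded and node.children else "-"
    lines = ["  " * depth + f"{marker} {node.text}"]
    if not node.folded:
        for child in node.children:
            lines.extend(render(child, depth + 1))
    return lines

outline = Node("Notetaking", [
    Node("Categories", [Node("Outliner"), Node("Wiki")], folded=True),
    Node("Axes", [Node("Standardization"), Node("Length of Input")]),
])
print("\n".join(render(outline)))
```

Nothing is deleted when a section is folded; flipping `folded` back to `False` brings the hidden subtree back on the next render.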

Transclusion (coined by Ted Nelson) lets you reference and call a smaller note within a larger note. Tiddlywiki supports it, and it is easily one of my favorite features.
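Here is a small sketch of how transclusion can work, under assumptions of my own: notes live in a hypothetical dictionary, and a double-brace marker like `{{Title}}` stands for "include that note here". The marker and the `expand` function are chosen for illustration, and are not a claim about how any specific tool implements it.

```python
import re

# Hypothetical note store: title -> body. "{{Title}}" marks a transclusion.
notes = {
    "Definition": "Transclusion includes one note inside another by reference.",
    "Glossary": "Terms:\n{{Definition}}",
    "Index": "See the glossary.\n{{Glossary}}",
}

PATTERN = re.compile(r"\{\{([^{}]+)\}\}")

def expand(title, notes, _seen=()):
    """Return a note's body with every {{Title}} replaced by that note's expanded body."""
    if title in _seen:
        return f"[cycle: {title}]"    # refuse to recurse forever on circular references
    if title not in notes:
        return f"[missing: {title}]"  # leave a visible marker for dangling references

    def substitute(match):
        return expand(match.group(1).strip(), notes, _seen + (title,))

    return PATTERN.sub(substitute, notes[title])

print(expand("Index", notes))
```

Because the small note is included by reference rather than copied, editing "Definition" once changes every larger note that transcludes it.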

Miscellaneous

Summary

File System
Outliner
Timeline
- Gantt Chart (or multi-threaded)
Calendar
Tags
MindMap
Wiki
- Personal wiki
Flashcards
Annotator

Standardization? (Formats?)

Length of Input?

Footnotes

* Two nerds were probably geeking out about markdown notetaking software, or something. One of those nerds may have been me.

** The author also wrote 2 sections in Mindhacker. Smaller and polished. They are under the headers "Write in your Books" and "Write Magnificent Notes." I did not get much out of these sections, although other parts of the book seem potentially interesting.

*** He names a couple of notable exceptions: the thoughts of Ted Nelson, the man who coined the term "hypertext," and David Allen's "Getting Things Done" system.

**** I've heard some people swear that organized index cards were the best system for them, overall. I know I couldn't get anything lasting from that. I suspect the variance in people's needs here tends to be pretty wide.

Questions

What major types of notetaking software did I miss entirely?

What types of notetaking software do you use? What do you get out of them?

If you use more than one, how are the benefits different? What situations does one handle better than the other?

Do you have any import/export horror stories?

Today's Challenge

Write up your own notetaking desiderata

See the linked partner post for a walk-through.

I suggest making a distinction between non-programmable and programmable systems. We have non-searchable systems, like physical notebooks, and we have searchable systems, like wikis. Going from searchable systems to programmable systems is a similar quantum leap.

One might say that programmability goes beyond the bounds of notetaking, but if our larger domain (exobrain) includes both notetaking and programmability, do we want to mix them or keep them separate?

As a simple example, I can have Google Calendar email me every Thursday morning (programmability). Whatever I put in the event description (notetaking) appears in the body of the email.

Lately I am using Zim Desktop Wiki. I can make links that run batch files on my PC. Those batch files can launch applications or run little Python programs that access a database and generate Zim pages. This is very open-ended, but often with just a bit of programming I can add a feature that I will use a lot.

What do you mean by "programmable"? I keep my notes as a directory of markdown files in a git repo. I can manipulate these files with all the standard Unix command line tools that are specialized for manipulating text. In your mind, does that meet your threshold for programmability, or are you looking for something else?

What do you mean by "programmable"?

Is it possible to add new features you hadn't previously thought of? How easy?

It's as simple as doing any other sort of text manipulation with a shell script or Python script, or whatever other programming system one uses to manipulate text. It's remarkable what you can do with a simple combination of find and sed.
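For a concrete flavor of that, here is a rough Python stand-in for a find-plus-sed pass over a directory of Markdown notes. The function name, the `.md` extension, and the example tag rename are assumptions made for illustration; the same job could be done directly in the shell.

```python
from pathlib import Path
import re

def replace_in_notes(root, pattern, replacement):
    """Rewrite every Markdown file under root in place; return the paths that changed."""
    changed = []
    for path in sorted(Path(root).rglob("*.md")):   # the "find" half
        text = path.read_text(encoding="utf-8")
        new_text = re.sub(pattern, replacement, text)   # the "sed" half
        if new_text != text:
            path.write_text(new_text, encoding="utf-8")
            changed.append(path)
    return changed
```

For example, `replace_in_notes("notes", r"#todo\b", "#next")` would rename a tag across a hypothetical `notes` directory. Since this edits files in place, it is best run inside a version-controlled directory so a bad pattern can be reverted.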

I think this is a good question. Here are some probable components of programmability...

Did it surface most of its actual functionality to users?
- A couple different settings: Closed proprietary cloud software, API (how friendly or permissive is it?), downloadable open-source...
How easy (and safe!) is it to call relevant utility functions?
- Do you need to close the software to edit it? Did they merely surface the functionality, or did they also leave functions that were highly-exposed, labeled, well-documented, and easy to use? How well do they adhere to various standards, and therefore benefit from skill-transfer? Is it easy to screw up? To revert? What's the learning curve like?

I'm not sure I agree with the premise of the question. Correct me if I'm wrong, but it seems to me that the question is assuming there's a single program or system somewhere that is maintaining the wiki, and that this single monolithic system has certain characteristics (open vs. closed source, accessible vs. inaccessible API, etc, etc.). My response is to ask why do we want a single monolithic system in the first place?

In my mind, a personal knowledgebase is a set of texts which capture information that we want to store and retrieve later. Fortunately for us, Unix and Unix-like (by which I mean Linux, macOS and Windows-with-WSL) computer systems come pre-equipped with a plethora of tools that have been finely tuned for text processing over a period of decades. I've found that by combining the tools already available, I can do most of the things a monolithic wiki system would do with far less configuration and far more flexibility.

With that in mind, I find that my answer to most of your questions is, "Not applicable". Is it closed proprietary cloud software? It certainly can be, if you store your files in a proprietary service like Dropbox. However, if you store your files in a git repo, which you either self-host or use a more free service like GitLab or sr.ht, it doesn't have to be. The API, such as it is, is the same "API" you can use to interact with any other file on your computer: GNU command line tools, or if you choose to write scripts in some other programming language, whatever file manipulation API is exposed by the standard library for that language. Same with editing. I choose to use an open source text editor (namely, Visual Studio Code), but there are certainly many competent proprietary text editors, such as BBEdit or Sublime Text.

How easy is it to call relevant utility functions? Well, it's as easy as invoking any other shell command. Do I need to close the software in order to edit it? Once again, the answer is "not applicable", because I'm not editing a single piece of software, I'm composing multiple pieces of software, on the fly, to accomplish particular tasks.

Are the functions easy to use and standardized? While we can debate the usability of Unix command line tools for a long time, what cannot be denied is that they are quite well standardized. As for skill transfer, the skills are extremely transferable, insofar as they're exactly the same skills you'd be using to manage source code in any kind of even moderately sized codebase.

It can be easy to screw up. Command line tools are sharp, and can cut you if you don't use them appropriately. However, if you have your wiki in a version control system, reverts are nigh trivial. One command and your wiki (or any part of your wiki) is restored to a previous state of your choosing.

While the learning curve on command line tools is steep, I would argue that the advantages that one earns in flexibility, speed (both in terms of machine time and user time), and transferability to other tasks make it more than worthwhile. Of course, if one already knows how to use command line tools with a fair degree of proficiency (as many programmers and technically inclined people do), then the question becomes, why aren't you using these tools to manage your knowledgebase?

TextCards

I really wish there was better flashcard and annotation/marginalia software out there! It's kinda weird to me how limited the options seem to be for both. While plenty of things perform the core functionality, I haven't seen as many interesting experiments with it as I have with, say, outliners.

While writing this post, I developed a vague suspicion that there's something in-between Annotator and Flashcard that could be pretty valuable if someone actually implemented it. This seems as good a place as any to describe it. (And if someone has already done it, or wants to do it, cool!)

Annotators and Flashcards are both often tracking an underlying dictionary-ish data-type, and a lot of flashcards seem to originate from textbooks. I have a suspicion that there should exist a good standardized-format notetaker that goes something like... this?

TextCards: 3 linked items

A bounded section of highlighted textbook (Any size, from a section to entire chapter. Sometimes discontinuous.)
An index-card laconic description (or answer)
A title (or question)
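As a very rough sketch of what that record might look like (every name here is hypothetical, and addressing highlights by chapter and character offset is just one assumed scheme):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Highlight:
    """A contiguous span of the source text, addressed by chapter and character offsets."""
    chapter: int
    start: int
    end: int

@dataclass
class TextCard:
    title: str         # the title, or the question
    description: str   # the index-card-sized description, or the answer
    highlights: list   # one or more Highlight spans; several spans model a discontinuous section

    def as_flashcard(self):
        """View the card as a (front, back) pair for quizzing."""
        return (self.title, self.description)

    def chapters(self):
        """Chapters this card draws on, for jumping back to the source."""
        return sorted({h.chapter for h in self.highlights})

# Made-up example card with a discontinuous highlight spanning two chapters.
card = TextCard(
    title="Summarize Chapter 3",
    description="Short notes compress; long notes elaborate.",
    highlights=[Highlight(3, 120, 480), Highlight(3, 900, 1010), Highlight(4, 0, 50)],
)
print(card.as_flashcard())
print(card.chapters())
```

The flashcard view and the annotation view are the same record read two ways, which is the whole appeal.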

Sometimes, it could be used to pose standard quiz-questions (the highlighted section is just the part of the book the quiz came from, the title is the question, the description is an answer). But where it might really shine is in "Summarize Chapter X" questions; it encourages you to write along as you read the text, and if you miss something on a quiz, you can click right to the sections you were originally summarizing.

When rendered as marginalia, the small titles (until click) should make that experience more tolerable for frequent-margin-users. (Marginalia falling out of sync with the page seems like a really common problem, otherwise.)

For convenience, adding something that swipes all of the questions from a highlighted section of the text to form the front end of flashcards (that you then answer) seems pretty nice. For well-formatted answer-sections, you might even be able to get it to pair the two (but you'd probably need to highlight where to look). Additionally, it wouldn't be that hard for it to track which chapter's questions you're doing poorly on (and therefore which chapters you should re-read) if it knows where in the book you swiped them from. Bonus points if you can sort and cross-link notes by title, folder, tags, overlapping highlights, and/or order in text.
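The chapter-tracking part really is small. A minimal sketch, assuming quiz results arrive as hypothetical `(chapter, was_correct)` pairs and that "doing poorly" means a miss rate above some threshold (the 0.5 default is arbitrary):

```python
from collections import defaultdict

def chapters_to_reread(results, threshold=0.5):
    """results: iterable of (chapter, was_correct) pairs.

    Return the sorted chapters whose miss rate is strictly above threshold.
    """
    asked = defaultdict(int)
    missed = defaultdict(int)
    for chapter, was_correct in results:
        asked[chapter] += 1
        if not was_correct:
            missed[chapter] += 1
    return sorted(ch for ch in asked if missed[ch] / asked[ch] > threshold)

# Made-up quiz history: chapter 2 is missed twice out of three attempts.
results = [(1, True), (1, True), (2, False), (2, False), (2, True), (3, False), (3, True)]
print(chapters_to_reread(results))
```

The only thing this needs from the rest of the system is that each question remembers which chapter it was swiped from.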

Presumably this is usually harder than I think it should be, because PDFs are just awful (I've dragged tables from PDFs before; I feel so sorry for Tabula!). But HTML books and ebooks don't have that problem, and often simulate a textbook-like structure.

