I think the key element that determines how easy a piece of writing is to read is its density of novelty.

Novelty can be thought of as the writing equivalent of information. Anything the reader already knows doesn't have to be fully processed; it can just be recalled. Known words, idioms, and structures don't have to be relearned every time they appear. So only new information (novelty) has to be decoded by the reader.

The higher the density of novelty, the harder a piece of writing is to read.

Shakespeare vs Ordinary Speech

Shakespeare is relatively difficult for modern readers because there are lots of unfamiliar words, linguistic structures, and styles of expression. The reader has to process novel elements like blank verse, Elizabethan English, and poetic creativity.

Will all great Neptune's ocean wash this blood
Clean from my hand? No; this my hand will rather
The multitudinous seas incarnadine,
Making the green one red.

Macbeth Act 2, Scene 2, 54–60

This was a little easier for people in Shakespeare’s time to follow, because they were more familiar with the linguistic and artistic tropes of the period.

Contrast that with the effortlessness of parsing ordinary conversation:

Hello, how are you?
Fine, thanks. And you?
I’m doing well.

Ordinary conversation barely registers as information to our minds because it's so familiar.
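As a rough illustration of this contrast, the sketch below treats density of novelty as the share of words a reader does not already know. It is only a word-level proxy under stated assumptions: the `known` vocabulary and the `novelty_density` function are hypothetical, and unfamiliar structures or ideas, which matter just as much, are not measured.

```python
import re

def novelty_density(text, known_words):
    """Return the fraction of word tokens in `text` that are not in `known_words`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    novel = [t for t in tokens if t not in known_words]
    return len(novel) / len(tokens)

# Hypothetical reader vocabulary, kept tiny for the example.
known = {"hello", "how", "are", "you", "fine", "thanks", "and", "the", "of", "this"}

easy = "Hello, how are you? Fine, thanks. And you?"
hard = "The multitudinous seas incarnadine"

print(novelty_density(easy, known))  # 0.0
print(novelty_density(hard, known))  # 0.75
```

A real reader's vocabulary is far larger and fuzzier than a set of strings, so the numbers only show the direction of the effect.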

Readable writing falls somewhere between these two extremes, maintaining a comfortable density of novelty for the reader. I’ll give several examples of writing with a high density of novelty (hard to read), and writing with a low density of novelty (easy to read). In general, it's better to have a lower density of novelty if you want to communicate clearly.

High Density of Novelty (Hard)

The sub-relations and sur-relations of quads span partonomic hierarchies, where each element can be defined by its parts. This is different from a taxonomic (“is-a”) hierarchy, where the elements are categories made up of sub-categories.

 - Joscha Bach’s PhD Thesis “Principles of Synthetic Intelligence”

Admittedly, this quote is taken out of context from a long PhD thesis, and Joscha Bach is actually a very clear communicator. But on its own, without the proper background knowledge, this passage is unreadable (to me) because there are too many unfamiliar terms, and not enough context to infer their meaning.

Low Density of Novelty (Easy)

“A lot of the commentators say Moloch represents capitalism. This is definitely a piece of it, even a big piece. But it doesn’t quite fit.”

 - Scott Alexander’s “Meditations On Moloch”

Notice that there’s only one uncommon word here (Moloch), and it’s the focus of the essay. Scott Alexander uses lots of familiar words to provide context, so that even when a new word goes undefined, it’s usually possible to infer its meaning. The essay covers many complex ideas, but it is long, and the novel ideas are spaced out with illustrations and explanations, which gives the reader room to process what’s being said.

High Density of Novelty (Hard)

You need to realize right now that the existence which is had through and through by Buddha Nature is beyond the existence of ‘existing versus not existing’. ‘Having It through and through’ is the Buddha’s term. It is the Tongue of Buddhas. It is the Eye of the Buddhas and Ancestors. It is the Nose of mendicant monks.

 - Dogen’s “Shobogenzo”

For modern English-speaking non-Buddhists, the unfamiliar phrases and concepts here are impenetrable. To make this readable, a translator could go through each of these terms one by one and explain their cultural and religious significance.

Low Density of Novelty (Easy)

“We shall fight in France,
we shall fight on the seas and oceans,
we shall fight with growing confidence and growing strength in the air,
we shall defend our island, whatever the cost may be.
We shall fight on the beaches,
we shall fight on the landing grounds,
we shall fight in the fields and in the streets,
we shall fight in the hills;
we shall never surrender.”

 - Winston Churchill

Lots of famous speeches use repetition to make parsing the novel bits easier. A repeated phrase doesn’t have to be decoded again each time, so the focus is on the part that changes.

High Density of Novelty (Hard)

The end of man (as a factual anthropological limit) is announced to thought from the vantage of the end of man (as a determined opening or the infinity of a telos). Man is that which is in relation to his end, in the fundamentally equivocal sense of the word. Since always.

 - Jacques Derrida

When a philosopher adopts an idiosyncratic way of communicating, the result is often difficult to read. It's better to surround novel ideas with clear explanations for the sake of comprehensibility.

Low Density of Novelty (Easy)

I try to write using ordinary words and simple sentences. That kind of writing is easier to read, and the easier something is to read, the more deeply readers will engage with it. The less energy they expend on your prose, the more they'll have left for your ideas.

 - Paul Graham

Paul Graham communicates novel ideas using common words and structures. This gives the reader lots of familiarity to work with. More novel ideas need to be balanced out by more familiar words to provide relevant context.

 

Note

This seems like a moderate deviation from the typical model of readability, which is something like simplicity, brevity, and directness. These are often useful for clarity, but I think the fundamental factor is really density of novelty, and that clear writers manage a balance between novelty and familiarity.

4 comments

I am a lawyer, and in legal writing we are taught the concept of "old to new waterfall." This means, essentially, that every sentence, every paragraph, and the document as a whole, should start by reminding the reader of something they already know, and then proceed to new information that's related. The important bit is that, once some "new" information is introduced, it becomes "old" information in-scope.

So you can build up complex points/arguments by building old -> new, old -> new, over and over. You can see this even at the sentence level. Here's an example:

The moon is made of green cheese. The cheese smells very bad, and the bad smell wafts through nearby space. The smell repels aliens from attacking Earth.

The example above shows how to do this; the one below shows how not to. Note that both have the same semantic content, but the first "flows" better and is thus easier to read.

The moon is made of green cheese. Nearby space is filled with bad smell coming from the cheese. Aliens don't attack Earth, because they are repelled by the smell.
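The difference between the two versions can be roughly checked in code. The sketch below is a crude heuristic, not an established readability measure: for each sentence after the first, it reports how far into the sentence the first "old" word (one already used in the previous sentence) appears. The `STOPWORDS` list and the function names are assumptions made for this example.

```python
import re

# Hypothetical stopword list: words too common to count as "old" information.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "by", "from", "with", "they"}

def words(sentence):
    """Lowercase word tokens of a sentence."""
    return re.findall(r"[a-z']+", sentence.lower())

def old_word_positions(text):
    """For each sentence after the first, return the relative position
    (0 = first word) of the first content word that also appeared in the
    previous sentence, or None if no word is shared."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    positions = []
    for prev, cur in zip(sentences, sentences[1:]):
        old = set(words(prev)) - STOPWORDS
        cur_words = words(cur)
        idx = next((i for i, w in enumerate(cur_words) if w in old), None)
        positions.append(None if idx is None else idx / len(cur_words))
    return positions

good = ("The moon is made of green cheese. The cheese smells very bad, and "
        "the bad smell wafts through nearby space. The smell repels aliens "
        "from attacking Earth.")
bad = ("The moon is made of green cheese. Nearby space is filled with bad "
       "smell coming from the cheese. Aliens don't attack Earth, because "
       "they are repelled by the smell.")

print(old_word_positions(good))  # small values: old information comes first
print(old_word_positions(bad))   # values near 1: old information comes last
```

Exact word matching is a weak stand-in for "something the reader already knows", so this only works on toy examples where the same word is repeated.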

I'm very interested in this topic. I think you point to something important, but the model seems quite incomplete.

First off, the examples are somewhat strange because a text with unknown terms in it isn't harder to understand; it's impossible to understand by itself. Using examples where explanations are included may make more sense. I'm also skeptical that some of your hard examples are, in fact, hard rather than meaningless.

If terms are freshly defined, I think there's more to be said about what makes them difficult.

Second, I think prose is a separate dimension from novelty; I can imagine sentences with difficult prose but simple words, and sentences with simple prose but unknown terms.

TLW

The reader has to process novel elements like blank verse, Elizabethan English, and poetic creativity.

Much[1] of this is simply formatting. Capitalizing the first word of every line makes my parser initially assume it missed a question mark at the end of the first line, which is made worse by the fact that "Will all great Neptune's ocean wash this blood?" is close enough to a valid sentence.

I find said quote significantly easier to read if you reformat it somewhat:

Will all great Neptune's ocean wash this blood
  clean from my hand? No; this my hand will rather
  the multitudinous seas incarnadine,
  making the green one red.

(That is, 2-space indent at the start of any lines that don't start a sentence.)

(Of course, poetry fans will kill me[2] for reformatting poetry.)

[1] Not all.

[2] Not literally.