The RAIN Framework for Informational Effectiveness

ozziegooen

[Epistemic status: Medium-Uncertainty. I've only spent a few days thinking about this, but it seems to fit well for some specific problem so far.]

The following describes one possible framework for understanding the usefulness of different sources of information. It's particularly meant to help value source types such as books, academic articles, blog posts, online comments, and mathematical models. I think it could be a useful starting point but would guess that there are better alternatives upon further deliberation.

The framework factors are robustness, importance, novelty, and accessibility.

Simple Use Cases

There are multiple things this could be useful for, I'm sure most of which I haven't yet considered. For a start, I would hope that it could be used when discussing options on either writing information for others or deciding what materials to encourage.

Some possible discussion quotes relating to this framework
"These blog posts are quite novel, but I think they aren't very robust."
"This video may not be very dense, but it is highly accessible."
"This paper has a lot of equations but they don't seem useful to the point. It's both inaccessible and not robust."
"I think that you can change your paper to make it more accessible without sacrificing any robustness."

Informational Effectiveness

People use informational resources (books, videos, etc), in part, to learn information. There are many important attributes of such resources that will impact the quality and magnitude of such learning. Here I wrap these in the total term "information effectiveness."

Information effectiveness, when most narrowly estimated, is context-specific to an agent or group of readers or writers. It is specific to a set of topics; for instance, a particular article by George about politics may be considered ineffective on the topic of politics, but highly effective in it's revealed information about George's beliefs.

Informational effectiveness could be judged for any quantity of information; an entire book, a "per-page average", a "per-bit average", or similar.

In this document, we focus on "reader information effectiveness", which seeks to understand the effectiveness of information to readers. Similar frameworks could be made for writers; for instance, they may have goals such as persuading readers of specific claims or generating status.

To give a simple example, if you were to read a document that was enjoyable, seemed trustworthy, and became significantly life-changing in a positive way, that would be considered to have high informational effectiveness. If you were to read a boring archaic tome by a highly unreliable author about a topic not at all important to you, then that would be considered to have low information effectiveness. To be clear, this says more about the relationship between yourself and the text than about the text itself; in each case, it's possible other readers could have had very different reactions.

While reader informational effectiveness varies per reader, there are expected to be strong correlations between readers on many dimensions. For example, one article may be highly biased. This may not be a big deal for a reader incredibly well read on the particular author's biases, but would likely be a significant deterrent for most readers. Therefore such an article could be rated as having "low expected informational effectiveness" for a collection of possible readers.

RAIN Factors

The RAIN framework lists four factors that I think may be relatively intuitive, mutually-exclusive, relatively exhaustive, and relatively low in internal correlations. These factors are robustness, importance, novelty, and accessibility.

Hypothetically one could directly calculate the expected value of all information sources on all agents on all tasks, but this would be challenging and may not break down into the most intuitive substructures. This framework may provide a more pragmatic approach.

Robustness

Robustness describes how valid the information understood by a reader would be expected to be. This could mean a few different things in different contexts. If a reader is reading an article expressing several claims, robustness would refer to the expected validity of those claims. If the reader is reading a table of data, robustness would refer to the expected validity of that data. If the reader is reading an article by an obviously biased source, but is reading it for information unimpacted from that bias, then that information can be robust.

Robustness can itself be broken down into further factors.

Verifiability
If claims or data are described, can those be easily verified? One way of doing this is by being able to explicitly falsify this information.

Bias (Noise)
Based on the author's background, the medium, and the intended audience, can this information be expected to be false, misleading, or selectively chosen to create bias that would be disliked by the reader?

Scrutinization
If the information came from a human author, did it go through rounds of scrutinization by unbiased and qualified parties? Will other parties pay attention to it and be able to disprove questionable claims? Even in specific cases where scrutinization itself is not obvious, the threat of it could promote accuracy.

Accessibility

Accessibility covers the relative cost or benefit of obtaining information. Typically learning bears costs, but not always. There are some educational information sources that are highly enjoyable and preferable even if not for the information value; these would be considered to have negative learning cost.

Unlike with the other three primary attributes, accessibility determines both costs and benefits. An unnecessarily difficult-to-read book would probably have readers struggle more per unit learned (a cost), but also have them give up before learning all the available content (a lack of benefit).

As with robustness, accessibility can be broken down further.

Availability
Information may not be easily available to many possible learners. This could be because it is behind a paywall, only shared within an exclusive group, or difficult to discover. In cases like video, it may not be available in websites that offer variable speeds. There could also be substantial parts missing.

Understandability
Even when information is technically available, it may be difficult for some readers to understand. This could be reader-specific; a technical article may have high understandability, and thus information effectiveness, for some readers, but not others.

Most documents take a lot of time to understand, and then have some expected limit of understanding for a given reader. Both of these cost considerations can be significant and would go under the title of understandability.

Enjoyability
If information is strongly unenjoyable, that would count as a cost for the learner. Enjoyment could come from many traits such as simplicity, elegance, low required mental effort, and humor. There could also be personally beneficial factors such as reinforcing the learner's identity or making them feel intelligent.

Compactness
Compactness describes the density of relevant information.

Importance

Importance here is very similar to the Importance attribute of the ITN framework. Information content is important to a reader if it describes information that is highly decision relevant to the reader. This is very similar to it having high "value of information", though it is not constrained to any one decision the reader may be facing. Note that it is possible that information content could be high in importance but still useless; for instance, if the reader already knows all of that information.

Novelty

Information is novel to a learner if that learner does not yet know that information. If the reader does know that information, it would have zero educational value. I believe this is pretty self-evident.

Common Trade-Offs

I think there are some common correlations between the four factors, and that these come about for different reasons.

Robustness vs. Accessibility

Some common ways of making information sources more robust include things that would make them less generally accessible.

High-Robustness, Low-Accessibility Example
Technical papers with lots of proofs, citations, and carefully described terminology.

Low-Robustness, High-Accessibility Example
Short emotionally-charged opinion pieces.

Importance vs. Accessibility

People generally seem to like it when information is useful to them, but on the other hand, the most accessible information for them is generally not the most important.

High-Importance, Low-Accessibility Example
Facts involving difficult truths. For a group at war, this could be, "You are very likely to lose, and if you really should surrender immediately."

Low-Importance, High-Accessibility Example
Writings about the lives of cultural celebrities.

Robustness vs. Novelty

When information is not novel, the learner would have a greater ability to validate it against their existing knowledge. Also, if one believes there is generally a much wider variety of false information than true information, then on average the false information would be more novel.

High-Robustness, Low-Novelty Example
Scientific statements that can be reasonably verified, because almost all are already known well by the readers.

Low-Robustness, High-Novelty Example
Sophisticated conspiracy theories complete new to the readers but very unlikely to be true.

Accessibility vs. Novelty

If information is not at all novel it may be boring, which would reduce accessibility. On the other hand, if it is too novel, it may be mentally challenging to process, also reducing accessibility.

High-Accessibility, Low-Novelty Example
A movie that the viewers have seen before but still enjoy. They don't have to struggle to follow it because they already know it well.

Low-Accessibility, High-Novelty Example
A 120-minute, highly-dense academic seminar on a very new topic to the audience.

Density vs. Accessibility

This is similar to the accessibility/novelty tradeoff. Very dense and very sparse information is typically low in accessibility.

High-Density, Low-Accessibility Example
A dense math logic textbook with derivations but very few explanations.

Low-Density, High-Accessibility Example
An extensive video series on a relatively simple subject.

Using RAIN for Content Evaluation

If one wanted to use this framework in order to evaluate all blog posts of LessWrong, for instance, I would recommend using it as a starting point, but modifying it for the use case. A few things to consider:

Does the audience fall into clusters? Content may be important or novel to some clusters but not others.
It's a much higher bar to be novel to experts than to be novel to most readers. Work that is novel to experts can be considered "innovative," will work that is novel to most readers can be considered "informative" or similar.
Total length likely matters, even though it is not technically part of this framework.

Future Work

The current framework is not tied to any specific mathematical model. I think that one is possible, though it may not map 1-1 with the accessibility term specifically.

It would be interesting to attempt to provide rubrics or quantifications for each factor. I'd also be interested, of course, in applying this framework in different ways to various available information sources.

For any specific in-progress informational work, there would be an effective "Pareto frontier" of RAIN factors. Understanding how to weight these factors for future works could be quite useful.

Many thanks to Ondřej Bajgar, Jan Kulveit, and Carina Prunkl for feedback and discussion on this post.

This is great, I can see it being really helpful for me to consciously think about which of these I'm optimizing for (or am willing to sacrifice) when writing. I got confused by the introduction of the term 'density' in the section on trade-offs, as this isn't represented in the RAIN framework. Is density just a sub-consideration of accessibility or are you considering it in its own right?

Thanks, and good to hear!

Density applies only to some situations; it just depends on how you look at things. It felt quite different from the other attributes.

For instance, you could rate a document "per information content" according to the RAIN framework, in which you would essentially decouple it from density. Or you could rate it per "entire document", in which case the density would matter.

Would be interesting to use this framework for articles on LessWrong. Most people don't spend time arguing why they downvote or upvote posts. It would be useful to know that the community e.g. had downvoted a post mostly based on e.g. enjoyability, robustness, or novelty. There are probably many other ways one could measure, but this one still seems simple yet very useful.

One could of course post one's ranking using text as a comment, but that doesn't aggregate the community's judgment effectively.

If you liked this post you’d probably like this facebook post that Ozzie wrote recently on a similar topic:

https://www.facebook.com/722750362/posts/10165839328500363/?d=n

This framework reminded me of this quote from Bret Victor's talk "The Humane Representation of Thought" (timestamp included in link)

I've transcribed it approximately here (with some styling and small corrections to make it easier to read).

"There are many things, especially kind of modern things that we need to talk about nowadays which are not well-represented in spoken language.

One of those is systems. We live in an era of systems:

Natural systems:

Environment
Ecosystems
Biological systems
Pathological systems
etc

Systems that we make:

Political
Economic
Infrastructural systems
Things we make out of concrete, metal, electronics.
etc

The wrong way to understand a system is to talk about it, to describe it.

The right way to understand a system is to get in there, model it and explore it. You can't do that in words.

What we have is that people are using very old tools, explaining and convincing through reasoning and rhetoric instead of these newer tools of evidence and explorable models. We want a medium that supports that."

A quick sketch of how the RAIN Framework approximately maps to Bret's model of a good communication medium

Rigor: Quick sketch, not exhaustive. To start a conversation.

Epistemic Status: Moderate. I think Bret Victor has a lot of good insights, but I haven't done an extensive research to see if the cognitive science research supports his claims.

Explorable Models

Accessibility

Availability

No big difference to text, most texts are accessible these days and explorable models can be too, on the internet. However, making native high-performance explorable models is trickier with software if you want it to work on all platforms. WebAssembly could improve this in the future, hopefully.

Understandability

Well-designed explorable models of systems could give some insights much faster than reading about the systems could. E.g. innovation of mathematical notation was a big step forward in expressing some things more effectively than words previously could, although with a steep learning curve for many people. However, I could imagine badly designed explorable models that wouldn't effectively guide you toward specific insights either, so it's important to compare high-quality texts with high-quality explorable models.

Compactness

Explorable models can be dynamic, so they could e.g. be personalized to specific audiences. Instead of writing multiple texts for multiple audiences, one model could be made that could adapt on parameters to fit different audiences. Personalization could tie insights better to your current needs and motivations and make you more likely to pursue learning about challenging but valuable information.

Enjoyability

Explorable models could engage more senses than only the symbolic visual. Personalization probably increases enjoyability too.

Evidence

Robustness

Could include system features to provide verifiability, more openness on bias & noise, and e.g. making models open source for scrutinization.

Explorable models could include being able to explore the data and source, not only the output program.

Importance

If we had explorable models of moral uncertainty which brought together a diversity of intrinsic values representing the cultures of the world, then (to my knowledge) that is the closest approach we have to find "evidence" of what is most important.

Effective Altruism organizations like e.g. Global Priorities Institute could use explorable models to make their information & evidence more accessible, easier to give feedback to, and then improve further in their judgment on what is most important.

I'm curious what you think about Bret Victor's claims. How big could the effect be if people used explorable models more? The technology to make it cheap to author these kinds of models isn't here yet, so if we consider tradeoffs then writing might still be a better option. I'm personally more excited about skill-building to make explorable models than I am about perfecting my writing, but maybe I'm overestimating the value. I used to read a lot on Less Wrong but these days I often find it hard to choose which articles are worth my time reading, perhaps a lot based on the enjoyability and compactness factors. But maybe I'm letting my vision of how good information intake could be irrationally demotivates me to read and write texts in the LessWrong norm format.

Thanks for this clear framework, it's really useful for me right now!

I haven't read Kahneman's book 'Noise' yet, just listened to a podcast episode where he described how it is important to distinguish between noise and bias. I'm curious if that distinction is important in this framework and if I should read "Bias (Noise)" as "Bias & Noise" or something else instead?

Minor comment/correction - VoI isn't necessarily linked to a single decision, but the way it is typically defined in introductory works, it implicit that it is limited to one decision. This is mostly because (as I found out when trying to build more generalized VoI models for my dissertation,) it's usually quickly intractable for multiple decisions.

Good to know. Can you link to another resource that states this? Wikipedia says "the amount a decision maker would be willing to pay for information prior to making a decision", LessWrong has something similar "how much answering a question allows a decision-maker to improve its decision".

https://en.wikipedia.org/wiki/Value_of_information https://www.lesswrong.com/posts/vADtvr9iDeYsCDfxd/value-of-information-four-examples

The works on decision theory tend to be general, but I need my textbooks to find better resources - I'll see if I have the right ones at home. Until then, Andrew Gelmans' BDA3 explicitly formulates VoI as a multi-stage decision tree in section 9.3, thereby making it clear that the same procedure is generalizable. And Jaynes doesn't call it VoI in PT:LoS, but his discussion in the chapter on simple applications of decision theory leaves the number of decision implicitly open.

This is great.

Could you link to some specific examples of content that hits the different sweet spots of the framework?

Glad you like it :)

There's definitely a ton of stuff that comes to mind, but I don't want to spend too much time on this (have other posts to write), but a few quick thoughts.

Novelty
The Origin of Consciousness in the Breakdown of the Bicameral Mind
On the Origin of Species

Accessibility
Zen and the Art of Motorcycle Maintenance
3Blue1Brown's Videro Series

Robustness
Euclid's Elements
The Encyclopædia Britannica

Importance is more relative to the reader and is about positive expected value, so is harder to say. Perhaps one good example is an 80,000 Hours article that gets one to change their career.

I'm also interested in others here have recommendations or good examples.

What's the difference between RAIN, RINA, ARIN, NIRA, Iran...

I quite like ARIN.

Could you standardise the order of the headings to match the acronym?

There's no real difference, RAIN just a quick choice because I figured people may prefer it. Happy to change if people have preferences, would agree ARIN sounds cool.

Do other commenters have thoughts here?

I don't care about the name but please order the headings to match whichever name you choose.

Changed. I originally moved accessibility to the bottom, out of order, just because the other three are more similar to each other, but don't have a strong preference.

RAIN is easiest and most memorable.

Thanks, and good to hear!

Density applies only to some situations; it just depends on how you look at things. It felt quite different from the other attributes.

One could of course post one's ranking using text as a comment, but that doesn't aggregate the community's judgment effectively.

If you liked this post you’d probably like this facebook post that Ozzie wrote recently on a similar topic:

https://www.facebook.com/722750362/posts/10165839328500363/?d=n

This framework reminded me of this quote from Bret Victor's talk "The Humane Representation of Thought" (timestamp included in link)

I've transcribed it approximately here (with some styling and small corrections to make it easier to read).

"There are many things, especially kind of modern things that we need to talk about nowadays which are not well-represented in spoken language.

One of those is systems. We live in an era of systems:

Natural systems:

Environment
Ecosystems
Biological systems
Pathological systems
etc

Systems that we make:

Political
Economic
Infrastructural systems
Things we make out of concrete, metal, electronics.
etc

The wrong way to understand a system is to talk about it, to describe it.

The right way to understand a system is to get in there, model it and explore it. You can't do that in words.

A quick sketch of how the RAIN Framework approximately maps to Bret's model of a good communication medium

Rigor: Quick sketch, not exhaustive. To start a conversation.

Epistemic Status: Moderate. I think Bret Victor has a lot of good insights, but I haven't done an extensive research to see if the cognitive science research supports his claims.

Explorable Models

Accessibility

Availability

Understandability

Compactness

Enjoyability

Explorable models could engage more senses than only the symbolic visual. Personalization probably increases enjoyability too.

Evidence

Robustness

Could include system features to provide verifiability, more openness on bias & noise, and e.g. making models open source for scrutinization.

Explorable models could include being able to explore the data and source, not only the output program.

Importance

Thanks for this clear framework, it's really useful for me right now!

https://en.wikipedia.org/wiki/Value_of_information https://www.lesswrong.com/posts/vADtvr9iDeYsCDfxd/value-of-information-four-examples

This is great.

Could you link to some specific examples of content that hits the different sweet spots of the framework?

Glad you like it :)

There's definitely a ton of stuff that comes to mind, but I don't want to spend too much time on this (have other posts to write), but a few quick thoughts.

Novelty
The Origin of Consciousness in the Breakdown of the Bicameral Mind
On the Origin of Species

Accessibility
Zen and the Art of Motorcycle Maintenance
3Blue1Brown's Videro Series

Robustness
Euclid's Elements
The Encyclopædia Britannica

Importance is more relative to the reader and is about positive expected value, so is harder to say. Perhaps one good example is an 80,000 Hours article that gets one to change their career.

I'm also interested in others here have recommendations or good examples.

What's the difference between RAIN, RINA, ARIN, NIRA, Iran...

I quite like ARIN.

Could you standardise the order of the headings to match the acronym?

There's no real difference, RAIN just a quick choice because I figured people may prefer it. Happy to change if people have preferences, would agree ARIN sounds cool.

Do other commenters have thoughts here?

I don't care about the name but please order the headings to match whichever name you choose.

Changed. I originally moved accessibility to the bottom, out of order, just because the other three are more similar to each other, but don't have a strong preference.

RAIN is easiest and most memorable.

LESSWRONG
LW

LESSWRONG
LW

37

The RAIN Framework for Informational Effectiveness

37

Simple Use Cases

Informational Effectiveness

RAIN Factors

Robustness

Accessibility

Importance

Novelty

Common Trade-Offs

Robustness vs. Accessibility

Importance vs. Accessibility

Robustness vs. Novelty

Accessibility vs. Novelty

Density vs. Accessibility

Using RAIN for Content Evaluation

Future Work

37

A quick sketch of how the RAIN Framework approximately maps to Bret's model of a good communication medium

Explorable Models

Evidence

37

A quick sketch of how the RAIN Framework approximately maps to Bret's model of a good communication medium

Explorable Models

Evidence