I regularly get asked by friends and colleagues for recommendations of good resources to study epistemology. And whenever that happens, I make an internal (or external) pained "Eeehhh" sound.

For I can definitely point to books and papers and blog posts that inspired me, excited me, and shaped my worldview on the topic. But there is no single resource that encapsulates my full model of this topic.

To be clear, I have tried to write that resource — my hard drive is littered with such attempts. It's just that I always end up shelving them: because I don't have enough time, because I'm not sure exactly how to make it legible, because I haven't ironed out everything.

Well, the point of this new blog was to lower the activation energy of blog post writing, by quickly sharing what I find exciting. So let's try the simplest possible account I can make of my model.

And keep in mind that this is indeed a work in progress.

The Roots of Epistemology

My model of epistemology stems from two obvious facts:

  • The world is complex
  • Humans are not that smart

Taken together, these two facts mean that humans have no hope of ever tackling most problems in the world in the naive way — that is, by just simulating everything about them, in the fully reductionist ideal.

And yet human civilization has figured out how to reliably cook tasty meals, build bridges, predict the minutest behaviors of matter... So what gives?

The trick is that we shortcut these intractable computations: we exploit epistemic regularities in the world, additional structure which means that we don't need to do all the computation.[1]

As a concrete example, think about what you need to keep in mind when cooking relatively simple meals (nothing like the most advanced chefs' meals).

  • You can approximate many tastes through a basic palette (sour, bitter, sweet, salty, umami), and then consider the specific touches (lemon juice vs vinegar, for example, and which vinegar, changes the shade of sourness you get)
  • You don't need to model your ingredients at the microscopic level; most of the transformations that happen are readily visible and understandable at the macro level: cutting, mixing, heating…
  • You don't need to consider all the possible combinations of ingredients and spices; if you know how to cook, you probably know many basic combinations of ingredients and/or spices that you can then pimp or adapt for different dishes.

All of these are epistemic regularities that we exploit when cooking.

Similarly, when we do physics, when we build things, when we create art, insofar as we can reliably succeed, we are exploiting such regularities.

If I had to summarize my view of epistemology in one sentence, it would be: The art and science of finding, creating, and exploiting epistemic regularities in the world to reliably solve practical problems.

The Goals of Epistemology

If you have ever read anything about the academic topic called "Epistemology", you might have noticed something lacking from my previous account: I didn't focus on knowledge or understanding.

This is because I take a highly practical view of epistemology: for me, epistemology teaches us how to act in the world, how to intervene, how to make things.

While doing so, we might end up needing some knowledge, or needing to understand various divisions in knowledge, types of models, and things like that. But the practical application is always the end.

(This is also why I am completely uninterested in the whole realism debate: whether most hidden entities truly exist or not is a fake question that doesn't really teach me anything, and probably cannot be answered. The kind of realism I'm interested in is the realism of usage, where there's a regularity (or lack thereof) which can be exploited in some cases and not in others, whether or not I wish it to be different.)[2]

So the interesting question becomes: What are the possible end goals of epistemology? What sort of goals do we want to accomplish, and how do they impact what we need from epistemology?

Currently I have the following three categories:[3]

  • Prediction
  • Intervention
  • Construction

Prediction: How To Know What Will Happen

Given an existing system (physics, a bridge, a language…), you want to predict some property that you haven't yet found or measured: maybe its exact trajectory, whether it breaks, whether it eventually reaches equilibrium.

The obvious class of situations where prediction is the main goal is the natural sciences, but it's totally possible to attempt prediction of social or man-made phenomena (either for their own sake, or as instrumental steps toward the latter goals).

Note also that prediction is not just about the causal future (as in predicting what the weather will be tomorrow), but also about predicting new things that you don't know but might learn or find later. This is particularly true of the historical sciences in general: although they focus on what has already happened, their models and theories can make predictions about what will be discovered about the past in the future.[4]

Intervention: How To Shift The Situation

Here, compared to prediction, we don't just want to observe the system, but also to act on it: repairing a broken motorcycle, adding features to a programming language, fixing the Great Depression…

So here, you have a system, a desired end result, a range of interventions, and you search for the interventions that lead to the end result, or at least sufficiently approximate it.

Construction: How To Make Things That Work

Last but not least, this category of goals is about creating something from scratch (probably from existing components though).

Various examples include writing software for filing taxes, cooking a tasty meal, inventing a new language, designing a drug to cure a disease…

The Three Parts of Epistemology

Finally, I want to explain a bit what the parts of epistemology are, as I see them. By this, I mean that in order to build a full model of epistemology that lets you tackle the goals discussed above, in accordance with the roots of epistemology (that the world is too complex for humans to naively simulate it), I believe that you need to build a model of these three parts:

  • The Regularities
    • What are the existing epistemic regularities, how do you find them, and how do you exploit them?
  • The Cognition
    • What are the cognitive limits of human minds, and what constraints follow for the kinds of frames and models we can use and think with?
  • The Languages
    • What are the components and structures of our cognitive models that exploit the epistemic regularities?

The Regularities: What The World Offers (Or We Impose)

As mentioned before, epistemic regularities are the computational shortcuts that we exploit to reach our goals. Fundamentally, they tell us that we don't need to care about most dependencies, that out of all the possible options only a few really matter, that only a handful of parameters truly influence what we care about.

For example, most of classical physics (and a decent chunk of non-classical physics) is replete with such regularities:

  • Decoupling of Scales
    • Phenomena at one scale can usually be modeled without tracking the details of much smaller (or much larger) scales.
  • Interchangeability
    • When your system has many components, you can often consider them all interchangeable, or reduce their differences to a few parameters.
  • Dull Function Hypothesis[5]
    • By default, if there's a numerical relation that you need to guess in most of physics, it's going to look like a reasonable function such as a polynomial, an exponential, or a logarithm, not some crazy, pathological function (see the sketch after this list).
  • Stable Phenomena
    • Experimenting on a given phenomenon doesn't change it irremediably (if you run an experiment on the gas laws, you don't change them in doing so).
  • Independence From Models
    • Creating a theory or model of a phenomenon doesn't change the phenomenon itself.
  • And many more
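
To make the dull function hypothesis concrete, here is a minimal Python sketch on made-up data: exploiting the regularity collapses an intractable search over all possible functions into a search over a handful of reasonable candidate forms (all names and numbers here are invented for illustration).

```python
import numpy as np

# Hypothetical noisy measurements of some relation y = f(x)
# (the "true" law here is secretly quadratic).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 50)
y = 3.0 * x**2 + rng.normal(scale=2.0, size=x.size)

# The dull function hypothesis: we only need to try a few reasonable
# functional forms, not arbitrary functions with billions of parameters.
candidates = {
    "linear":      np.column_stack([x, np.ones_like(x)]),
    "quadratic":   np.column_stack([x**2, x, np.ones_like(x)]),
    "exponential": np.column_stack([np.exp(x / 10.0), np.ones_like(x)]),
    "logarithmic": np.column_stack([np.log(x), np.ones_like(x)]),
}

for name, design in candidates.items():
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    error = np.sum((design @ coeffs - y) ** 2)
    print(f"{name:12s} sum of squared errors: {error:12.1f}")

# The quadratic form wins by a huge margin: four candidates replaced
# an intractable search over all possible functions.
```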

What’s really, really important to get here is that these are properties of the phenomena studied in most of physics. So they don’t have to hold in other fields; indeed, for each of these regularities, I can point to settings where they definitely do not hold:

  • Decoupling of scales is broken by chaotic systems like the weather, due to sensitivity to initial conditions.
  • Interchangeability breaks down in social settings, where humans are often not easily interchangeable.[6]
  • To see where the dull function hypothesis fails (or at least doesn't work despite a lot of effort), just look at modern ML, particularly generative AI: these neural nets are basically massive function approximators (mapping text to text or text to image, say), and they have literally billions of parameters.
  • Stable phenomena would be great in many medical and social fields; unfortunately, economic interventions alter the economy, treatments mutate diseases, language guidelines alter languages…
  • And independence from models is broken in the case of most social sciences: as people learn about models in psychology, economics, and sociology, they often update their behaviors (by following the model or opposing it), which biases the whole system!

So this already tells us something essential: when we try to move a method, a trick, an approach from one setting to another, the real question is whether the new setting also has the relevant epistemic regularities.

This is why I expect by default that methods from physics will fail to generalize: they come from the most auspicious epistemic landscape known to man, where almost everything aligns to allow computational tractability. But most fields and problems have worse conditions than that, and unless an argument is made that the relevant regularities are maintained, the technique will just fail.

Does that mean that we’re completely lost if no good regularity can be found?

No. Another essential insight is that when we have control over a system (whether by intervening or designing it), we can bake in some of these regularities.

This is what is done with programming languages: often the options are restricted to allow various forms of static analysis. It also happens in economics, where moving various settings towards pure free markets makes them easier to model and understand with known tools.
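
As a toy illustration of baking in a regularity, here is a minimal Python sketch (entirely invented, not any real language) of a tiny expression language: because it is restricted to constants, one input variable, addition, and multiplication, with no loops or recursion in programs, a simple static analysis can bound a program's output without ever running it.

```python
from dataclasses import dataclass
from typing import Union

# A tiny expression language: only constants, one input variable,
# addition, and multiplication. Programs have no loops and no recursion,
# so every expression terminates, and interval arithmetic can
# statically bound the result.

@dataclass
class Const:
    value: float

@dataclass
class Var:  # the single input, with statically known bounds
    pass

@dataclass
class Add:
    left: "Expr"
    right: "Expr"

@dataclass
class Mul:
    left: "Expr"
    right: "Expr"

Expr = Union[Const, Var, Add, Mul]

def bounds(e: Expr, lo: float, hi: float) -> tuple[float, float]:
    """Statically bound e's value, given that the input lies in [lo, hi]."""
    if isinstance(e, Const):
        return (e.value, e.value)
    if isinstance(e, Var):
        return (lo, hi)
    a, b = bounds(e.left, lo, hi)
    c, d = bounds(e.right, lo, hi)
    if isinstance(e, Add):
        return (a + c, b + d)
    corners = [a * c, a * d, b * c, b * d]
    return (min(corners), max(corners))

# (x + 2) * x, with x known to lie in [0, 10]:
expr = Mul(Add(Var(), Const(2.0)), Var())
print(bounds(expr, 0.0, 10.0))  # (0.0, 120.0)
```

The point is the restriction itself: add unbounded loops and this guarantee evaporates, which is exactly the trade-off real static analyses navigate.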

But whether we just search for regularities in existing systems, or bake them into our own creations, the biggest missing piece in my model of epistemology is a detailed list, classification, and analysis of known epistemic regularities that have been successfully exploited throughout history.[7]

I have tried to write it a handful of times, but I usually give up in the face of the enormous scope of the work. I have published some very topical analyses along these lines, but nothing with the depth and breadth I really want…

Maybe I will at least end up writing an attempt!

The Cognition: What Our Brains Can Handle

Next, it’s important to understand how human cognition works specifically, and notably which computational constraints it must deal with.

This is because the way we exploit regularities (our theories, models, tricks, tools…) must be simple enough for us to learn and use them well. And so understanding just how simple and structured they need to be requires a good model of human cognition.

Honestly, I have dabbled a bit in this part, but I don’t have anything deep to share.

I’m quite convinced that strong memory complexity bounds (our limited short-term memory) are a massive factor, which makes various compression and externalization devices (notes, tools, presets, notations…) absolutely necessary.

I also expect that exaptation of core cognitive machinery (notably visual, spatial, and language) plays a big role in overcoming our brute computational limitations.

But I don’t have much evidence or detailed models for this. Maybe a topic for future exploration?

The Languages: How We Exploit Regularities

Last but not least, we need strategies to exploit the regularities while satisfying our cognitive constraints.

I think the best way to think about these strategies for exploiting regularities is to cast them as languages, specifically as Domain-Specific Languages.

When I cook, for example, I'm using a language of options, of constraints, of things to be careful about, that I have learned from various sources (my mom, French cookbooks, YouTube, friends…). What I have is really a way to turn the blinding complexity of cooking into a puzzle, a simplified game with rules that I understand, and a complexity I can manage.
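
To give a flavor of what I mean, here is a toy sketch of such a cooking language in Python, with entirely invented rules: a small vocabulary of memorized base combinations, plus a checker for flavor constraints, instead of a simulation of food chemistry.

```python
# A cognitive DSL for cooking, with made-up rules: a vocabulary of
# memorized base combinations and a few flavor constraints. Checking a
# dish against these rules replaces simulating the underlying chemistry.

BASES = {
    "soffritto": {"onion", "carrot", "celery"},
    "holy_trinity": {"onion", "bell_pepper", "celery"},
}

TASTE_SOURCES = {
    "sour":  {"lemon", "vinegar"},
    "salty": {"salt", "soy_sauce"},
    "sweet": {"honey", "sugar"},
    "umami": {"parmesan", "mushroom", "soy_sauce"},
}

def check_dish(base: str, seasonings: set[str]) -> list[str]:
    """Return the rules this dish breaks (an empty list means it passes)."""
    problems = []
    if base not in BASES:
        problems.append(f"unknown base: {base}")
    covered = {taste for taste, sources in TASTE_SOURCES.items()
               if seasonings & sources}
    if "salty" not in covered:
        problems.append("no salt source: the dish will taste flat")
    if "sour" in covered and not covered & {"sweet", "umami"}:
        problems.append("sourness with nothing to balance it")
    return problems

print(check_dish("soffritto", {"salt", "lemon", "parmesan"}))  # []
print(check_dish("soffritto", {"lemon"}))                      # two problems
```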

This is related to paradigms, frames, any concept that captures what is considered relevant, what is considered irrelevant, and what the rules of the game are: the grammar, the syntax, the semantics.

On this front too, I unfortunately only have pointers and intuitions rather than detailed models:

  • I think that Programming Language Theory is a great frame for building a general model of these cognitive DSLs: notably the thinking about fragments, about features and how they interact with each other, and about constraints and their relationship with the possibility of static analysis.
  • I see notation as an essential tool in cognitive DSLs, and I have many books and papers that I want to explore on the uses of notation.[8]
  • I also believe that tools embed a lot of our cognitive DSLs in their procedures and structure, which means that studying physical tools and software tools is a great way to reverse engineer the strategies that are used to exploit regularities.[9]

So I need to spend more time digging into this.

Conclusion

To summarize, I see epistemology as the art of finding and exploiting epistemic regularities (natural or enforced) in order to accomplish various goals: prediction, intervention, construction.

This is clearly only a sketch, and I'm not sure how many lifetimes (even without AI risk to work on…) it would take to fully flesh it out. I still hope that it gives you some food for thought, if only through the examples and their organization.

  1. ^

    This framing comes from Physics Avoidance by Mark Wilson. Note though that it’s a very philosophically focused book, with an opinionated style and a bunch of philosophy of language interspersed with the philosophy of science. Still, if the style doesn’t put you off, I think it’s probably the best exploration of epistemic regularities out there.

  2. ^

    See this post for more reflections. To read the best treatment of this notion of realism, I recommend Hasok Chang's Realism for Realistic People.

  3. ^

    Note that these are related, but not in a tight waterfall way. Prediction helps with intervention, but is not strictly necessary (see for example generative AI). And prediction and intervention help with construction, but are not fully needed (for example, the Apollo Program had much less prediction ability than you might imagine; they mostly tested what they built, a lot, and didn't encounter the worst possible situations out there).

  4. ^

    One of my favorite examples comes from historical linguistics: the laryngeal theory. Basically, Ferdinand de Saussure and others predicted at the end of the 19th century, purely from noticing structural patterns, that there existed in Proto-Indo-European (the reconstructed language from which all Indo-European languages stem) phonemes (sounds) that were missing from every known language. This was mostly considered wild conjecture, until the Hittite language was discovered, which showed evidence of having these hidden phonemes!

  5. ^

    This great name comes from Fly By Night Physics by Anthony Zee, a treasure trove of epistemic regularities and how to exploit them in physics.

  6. ^

    Bret Devereaux makes the case nicely in this post, where he disputes (correctly) the application of methods from statistical physics to history; although he doesn’t use these terms, his argument boils down to “the regularities that you want to exploit don’t exist here”.

  7. ^

    It would also discuss specifically which regularities are valuable for which kinds of goals. Because the example of generative AI, and to a lesser extent many fields of engineering, shows that there are regularities which are good for construction and intervention without being useful for prediction.

  8. ^

    As a starting point, see this great GitHub repo pointing to various forms of notation.

  9. ^

    A nice book on this is Image and Logic by Peter Galison, on the different frames, assumptions, and methodologies embedded in two traditions of particle physics (the image tradition, with for example bubble chambers, and the logic tradition, with for example Geiger counters).
