This is a special post for quick takes by testingthewaters.

Note to self: If you think you know where your unknown unknowns sit in your ontology, you don't. That's what makes them unknown unknowns.

If you think that you have a complete picture of some system, you can still find yourself surprised by unknown unknowns. That's what makes them unknown unknowns.

If your internal logic has almost complete predictive power, plus or minus a tiny bit of error, your logical system (but mostly not your observations) can still be completely overthrown by unknown unknowns. That's what makes them unknown unknowns.

You can respect unknown unknowns, but you can't plan around them. That's... You get it by now.

Therefore I respectfully submit that anyone who presents me with a foolproof, worked-out plan for the next ten/hundred/thousand/million years has failed to take some unknown unknowns into account.

I could feel myself instinctively disliking this argument, and I think I figured out why.

Even though the argument is obviously true, and here it is used to argue for something I agree with, I've historically mostly seen it used to argue against things I agree with: specifically, to dismiss experts, or to argue that nuclear power should never be built, no matter how safe it looks. Now, that explains my gut reaction, but not whether it's a good argument.

Thinking it through, my real problem with the argument is the following: while it's technically true, it doesn't help locate any useful crux or resolution to a disagreement. Essentially, it naturally leads to a situation where one party estimates the unknown unknowns to be much larger than the other party does, and that estimate is the crux. To make things worse, often one party doesn't want to argue for their estimate of the size of the unknown unknowns. But we need to estimate the size of unknown unknowns, otherwise I can troll people with "tic-tac-toe will never be solved because of unknown unknowns".

I therefore feel better about arguments for why unknown unknowns may be large, compared to just arguing for a positive probability of unknown unknowns. For example, society has historically been extremely chaotic when viewed at large time scales, and we have numerous examples of similar past predictions which failed because of unknown unknowns. So I have a tiny prior probability that anyone can accurately predict what society will look like far into the future.

Yeah, definitely. My main gripe where I see people disregarding unknown unknowns is similar to yours: people who present definite, worked-out pictures of the future.

This seems like an interesting paper: https://arxiv.org/pdf/2502.19798

Essentially: use developmental-psychology techniques to get LLMs to develop a more well-rounded, human-friendly persona that involves reflecting on their actions, while gradually escalating the moral difficulty of the dilemmas presented, as a kind of phased training. I see it as a sort of cross between RLHF, CoT, and the recent work on low-example-count fine-tuning, but for moral rather than mathematical intuitions. A purely schematic sketch of the phased-curriculum idea as I read it is below.
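The sketch below is my own schematic reading, not the paper's actual setup: the dilemmas, `generate_reflection`, and `finetune` are placeholders standing in for whatever prompting and fine-tuning machinery the authors actually use.

```python
# Purely schematic sketch of a phased moral curriculum; everything here is a
# placeholder, not the paper's actual code or data.

easy = ["Is it okay to take the last cookie without asking?"]
medium = ["Should you report a friend who cheated on an exam?"]
hard = ["A trolley-style dilemma with uncertain, high-stakes outcomes."]

def generate_reflection(model_state, dilemma):
    # stand-in for prompting the model to respond and then reflect on its response
    return f"[{model_state}] reflection on: {dilemma}"

def finetune(model_state, examples):
    # stand-in for a small supervised fine-tuning step on the collected reflections
    return f"{model_state}+tuned_on_{len(examples)}"

model_state = "base_model"
for phase in (easy, medium, hard):  # escalate moral difficulty phase by phase
    reflections = [generate_reflection(model_state, d) for d in phase]
    model_state = finetune(model_state, reflections)

print(model_state)  # traces which phases the (toy) model was tuned on
```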

I think I've just figured out why decision theories strike me as utterly pointless: they get around the actual hard part of making a decision. In general, decisions are not hard because you are weighing payoffs, but because you are dealing with uncertainty.

To operationalise this: a decision theory usually assumes that you have some number of options, each with some defined payout. Assuming payouts are fixed, all decision theories simply advise you to pick the outcome with the highest utility. "Difficult problems" in decision theory are problems where the payout is determined by some function that contains a contradiction, which is then resolved by causal/evidential/functional decision theories each with their own method of cutting the Gordian knot. The classic contradiction, of course, is that "payout(x1) == 100 iff predictor(your_choice) == x1; else payout(x1) == 1000".

Except this is not at all what makes real-life decisions hard. If I am planning a business and ever get to the point where I know a function for exactly how much money two different business plans will give me, I've already gotten past the hard part of making a business plan. Similarly, if I'm choosing between two doors on a game show, the difficulty is not that the host is a genius superpredictor who will retrocausally change the posterior goat/car distribution, but the simple fact that I do not know what is behind the doors. Almost all decision theories just skip past the part where you resolve uncertainty and gather information, which makes them effectively worthless in real life.

Or, worse, they try to make the uncertainty go away: if I have 100 dollars and can donate to a local homeless shelter I know well, or try to give it to a malaria net charity I don't know much about, I can be quite certain the homeless shelter will not misappropriate the funds or mismanage their operation, and less so about the faceless malaria charity. This is entirely missing from the standard EA arguments for allocation of funds. Uncertainty matters.
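To make the complaint concrete, here is a minimal sketch (my framing, not any particular decision theory text): once every option comes with a known payoff, the "decision theory" step is a one-line argmax, and all the hard work of producing the numbers is assumed away. The plan names and payoffs are hypothetical.

```python
# With payoffs known, the "decision" is trivial; the hard part, producing
# the numbers, happens before this function is ever called.

def decide(payoffs: dict[str, float]) -> str:
    """Pick the option with the highest known payoff."""
    return max(payoffs, key=payoffs.get)

payoffs = {"business_plan_a": 120_000.0, "business_plan_b": 95_000.0}  # hypothetical
print(decide(payoffs))  # -> business_plan_a
```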

I think unpacking that kind of feeling is valuable, but yeah, it seems like you've been assuming we use decision theory to make decisions, when we actually use it as an upper-bound model: to derive principles of decision-making that may be more specific to human decision-making, to anticipate the behavior of idealized agents, or (as with the distinction between CDT and FDT) as an allegory for toxic consequentialism in humans.

> To operationalise this: a decision theory usually assumes that you have some number of options, each with some defined payout. Assuming payouts are fixed, all decision theories simply advise you to pick the outcome with the highest utility.

The theories typically assume that each choice option has a number of known, mutually exclusive (and jointly exhaustive) possible outcomes, and that the agent assigns each outcome a utility and a subjective probability. So uncertainty is in fact modelled, insofar as the agent can assign subjective probabilities to those outcomes occurring. The expected utility of an option is then something like the probability-weighted sum of the utilities of its outcomes.
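A toy version of that textbook picture (my illustration; the options, probabilities, and utilities are all made up, loosely echoing the shelter-vs-malaria-charity example above):

```python
options = {
    "local_shelter": [      # (subjective probability, utility) per outcome
        (0.95, 10.0),       # funds used well
        (0.05, 0.0),        # funds wasted
    ],
    "malaria_charity": [
        (0.70, 30.0),
        (0.30, 0.0),
    ],
}

def expected_utility(outcomes):
    # probability-weighted sum over mutually exclusive outcomes
    return sum(p * u for p, u in outcomes)

scores = {name: expected_utility(outs) for name, outs in options.items()}
print(scores, max(scores, key=scores.get))
# note: the probabilities are inputs the agent must supply, not something
# the theory itself provides
```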

Other uncertainties are not covered in decision theory. E.g. 1) if you are uncertain what outcomes are possible in the first place, 2) if you are uncertain what utility to assign to a possible outcome, 3) if you are uncertain what probability to assign to a possible outcome.

I assume you are talking about some of the latter uncertainties?

https://research.google/blog/deciphering-language-processing-in-the-human-brain-through-llm-representations/

Activations in LLMs are linearly mappable to activations in the human brain. Imo this is strong evidence for the idea that LLMs/NNs in general acquire extremely human-like cognitive patterns, and that the common "shoggoth with a smiley face" meme might just not be accurate.
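For concreteness, here is a rough sketch of what "linearly mappable" typically cashes out to in this kind of study (my own toy illustration with synthetic data, not the paper's pipeline): fit a ridge regression from per-word LLM activations to recorded brain activity and measure held-out predictivity.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_words, d_llm, d_brain = 500, 768, 50                  # hypothetical sizes
llm_acts = rng.normal(size=(n_words, d_llm))            # stand-in for LLM hidden states
true_map = rng.normal(size=(d_llm, d_brain)) * 0.1
brain = llm_acts @ true_map + rng.normal(size=(n_words, d_brain))  # synthetic "recordings"

X_tr, X_te, y_tr, y_te = train_test_split(llm_acts, brain, test_size=0.2, random_state=0)
model = Ridge(alpha=10.0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# per-"electrode" correlation between predicted and actual held-out activity
r = [np.corrcoef(pred[:, i], y_te[:, i])[0, 1] for i in range(d_brain)]
print(f"mean held-out correlation: {np.mean(r):.2f}")
```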
