I interpreted this concept of wholesomeness to be at least somewhat objective, but perhaps that's not the intention.
Could you clarify how much wholesomeness is a subjective property relative to one's values, vs being a more objective property that would hold constant under different values?
For example, say I lay out a business plan to use X natural resources to build Y buildings that will be used for Z purpose. Would you expect to be able to rate my wholesomeness without knowing how much I value things like nature, humanity, industrial progress, rural/urban lifestyle, etc.?
(assuming this business plan only covers some of these things, because considering all things isn't possible)
I don't think it's coherent to define this idea of wholesomeness as paying attention to the whole system.
"Paying attention to the whole system" isn't a thing that can be done. There aren't a finite number of things to consider. And even as you consider each marginal thing, the effort spent considering trades off against actually getting anything done. (Getting things done is also a thing to consider.)
All of these examples of "unwholesomeness" are just examples of someone making a tradeoff; disregarding one thing to benefit another thing. You could call any of these tradeoff decisions bad, but "bad" is only coherent relative to a set of values.
So it sounds to me like this concept of wholesomeness is reasonably summarized as "consciously optimized for a vague set of traditional values."
I think it's reasonable to advocate for shared values. If 95% of people in a town like neat square buildings, then they should probably plan their construction accordingly.
But putting a set of values behind the vague curtain of "wholesomeness" implies that these values are somehow objective and universal. Which of course they aren't, as evidenced by the way they change across cultures and over time.
I think this is more of a disagreement on messaging than a disagreement on facts.
I don't see anyone disputing the claim that "the AI is about as unpredictable as weather," but it's quite a stretch to summarize that as "we have no idea how the AI works."
I understand that abbreviated and exaggerated messaging can be optimal for public communication, but I don't think this post draws a clear enough distinction between direct in-group claims and examples of public messaging.
I would break this into three parts, to avoid misunderstandings from poorly contextualized language:
1. What is our level of understanding of AIs?
2. What is the general public's expectation of our level of understanding?
3. What's the best messaging to resolve this probable overestimation?
I think it’s important to recognize that there are in fact nontrivial things that can be done with increased aural discernment. You might notice subtleties of emotional expression, or small choices in musical interpretation that aren’t perceptible to a novice listener.
If you assume there are only trivial benefits to increased aural discernment, then of course the discussion falls apart: the drawbacks of finding amateur music off-putting aren’t trivial, so aural discernment would just be a straightforward net loss in utility.
Sure, there may be larger and more obvious benefits to wearing glasses than to training your musical ear, but it’s arguably just a matter of degree. To me, this is the point of the parallel here: to consider how rectifying impaired fidelity of perception and increasing beyond typical fidelity of perception are both examples of a difficult-to-calculate pattern of utility changes.
What's supposed to happen with this sentence, cognitively speaking, is that you read the sentence, slot it into a preexisting model of how deterrence and red lines work.
I think it’s a mistake to characterize this interpretation as an example of following decoupling norms. Deterrence and red lines aren’t mentioned in Eliezer’s comment at all; they’re just extra context that you’ve decided to fill in. That’s generally what people do when they read things under contextualizing norms. Interpreting this comment as a suggestion to consider initiating a nuclear exchange is also a contextualized reading, just with a different context filled in.
A highly-decoupled reading, by contrast, would simply interpret “some risk of nuclear exchange” as, well, some unquantified/unspecified risk.
This post doesn't convince me away from the "tautological" objection mentioned at the end.
Caplan is presenting a contrarian position: "Standard signaling models are incorrect. The true cost of making friends is astronomically lower than what most people think it is." It's an intriguing take, but what are those actual and perceived costs?
To me, it's borderline ridiculous to simplify the costs down to "impulse control." Making friends is often difficult, complicated, and at times genuinely risky. It takes so much more effort and commitment than simply saying hello with a smile. Even for the "sincerely friendly" people who have good social intuitions, there's clearly an opportunity cost to having friends.
I'd be interested in an analysis of the perceived vs actual costs of friendship, but I don't see any of that explored in detail here. Even the core claim that "people are impulsive" doesn't seem particularly well supported.
First of all, wow, great read! The non-technical explanations in the first half made it easy to map those concepts to the technical notation in the second half.
The hardest part to understand for me was the idea of absolute bits of optimization. I resolved my confusion after some closer examination, but I’d suggest two changes that would strengthen the explanation:
1. The animated graph showing increasing bits of optimization has a very unfortunate zoom level. This, combined with the fat left side of the graph, gave me the false impression that the area was simply reducing by a third of its total as the optimization bits counted up to a maximum of ~3 (2.93 to be exact). I eventually realized there’s presumably a long tail extending to the right past the range we can see, but that wasn’t clear from the start. Even knowing this, it’s still hard to get my brain to parse the area as being progressively cut in half. I would hesitate to change the shape of this graph (it’s important to understand that it’s the area being halved, not just horizontal progress), but I think zooming out the domain would make it feel much more like “approaching infinite optimization”.
2. This is a minor point, but I think the “in terms of” in this sentence is needlessly vague:
Then, the absolute optimization of a specific state is in terms of the probability mass above it, that is, to the right of it on the x-axis.
It took a bit for me to understand, “oh, bits of optimization is just calculated as the inverse of the probability mass above the state”. Maybe it would be clearer to say that from the start? Not sure what the exact math terms are, but something like:
Then, the absolute optimization of a state is [the proportional inverse of] the probability mass above it, that is, to the right of it on the x-axis.
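For concreteness, here’s the relationship as I eventually understood it (my own notation, not the post’s): writing $p$ for the probability mass above the achieved state,

$$\text{absolute optimization} = \log_2 \frac{1}{p} \ \text{bits,}$$

so a state in the top 1/8 of the distribution has $\log_2 8 = 3$ bits of optimization, and each additional bit halves the remaining area.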
Yeah, I think there’s an important distinction to make regarding the intent of the speaker making this remark. Sometimes it’s intended as evidence that the argument is wrong, and yes, that’s a dirty rhetorical trick. But as you’ve listed here, there are several motivations for giving this response as a criticism of someone’s genuinely bad/harmful persuasion tactics.
So I guess when hearing this, it’s worth taking a moment to check whether the speaker is using fallacious rhetoric, or attempting to give helpful social advice. (And then, accordingly, “go for the throat,” or kindly thank them for the presentation feedback.)
Agreed.
It seems like the moral of this parable should be “don’t make foolish, incoherent hedges” — however, the final explanations given by Eternidad don’t touch on this at all. I would be more satisfied by this parable if the concluding explanations focused on the problems of naive data aggregation.
The “three reasons” given are useful ideas, but the king’s decision in this story is foolish even if this scenario were all three of those things: a closed game, an iterated game, and purely a betting situation. (Just imagine betting on a hundred coin flips that the coin will land on its edge every time.)
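To put rough numbers on that parenthetical (the per-flip odds are just an illustrative assumption): even granting a generous 1-in-6000 chance of an edge landing on any single flip, the chance of a hundred edges in a row is

$$\left(\frac{1}{6000}\right)^{100} \approx 10^{-378},$$

so no payoff structure, closed or iterated, can make that hedge sensible.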
So I've been trying to get a clearer picture of what you mean by wholesomeness. So far I have:
* Make an attempt to pay attention to the whole system, but stop at whatever point feels reasonable.
* Don't exclude any important domains from things you care about.
* Make these judgements based on your own values, and also the values that are felt-to-be-important by a good number of other people.
* Wholesomeness is subject to individual interpretation, so there aren't definitive right answers.
* Certain tradeoffs of values are objectively unwholesome. There are definitive wrong answers.
I don't think this is a useful model. The devil of all of this is in the interpretations of "reasonable" and "important" and "good."
You say it's unwholesome when someone ignores what you think is important by saying "I don't value this". But this is exactly what your model encourages: consider everything, but stop whenever you personally feel like you've considered everything you value. The only safeguard against this is a bias toward the status quo: labeling things unwholesome whenever enough people disagree.