Perhaps the most insightful comment I ever read on Hacker News went something like,
One of the big problems for startup founders who are immigrants is not knowing what they're expected to lie about, and what they're absolutely forbidden to lie about. If you cook the books, you go to jail for fraud. But if you're very honest when a VC asks you, 'Who else is thinking of investing in you?' and you answer, 'It's only you, no one else is interested' — then you're never going to get investment. You're expected to lie and say, 'Oh, there's a lot of interest.'
I can't find the exact comment but I found that very insightful.
The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their verbal feet, and can respond to the question "How are you?" by saying "Getting along" instead of "Horribly" or with an awkward silence while they try to think of something technically true. (Because often "I'm fine" is false, you see. If this has never bothered you then you are perhaps not in the target audience for this essay.)
I think this is missing the important role of question-substitution in basic social encounters. When you ask "How are you?" and I respond with "I'm fine," the question I'm actually answering is not about my physical or emotional state but instead "Do you need assistance?" or "Is there anything I should know?", with a response that codes to "no". So if my knee is hurting, in a normal conversation I might respond with "I'm fine" because I expect that information to not be useful to them (and sharing it would be a bid for a demonstration of care that I'm not interested in), whereas in cases where this might af...
The issue of defining "literal honesty" seems pretty subtle. Allowing reasonable stretching of "literal" to "my concept of what people meant by the question" is ... well, reasonable, but also not very consistent with drawing a clear line separating 'honest' from 'dishonest'. Another issue is that the definition of honesty
"Don't say things that you believe to be literally false in a context where people will (with reasonably high probability) persistently believe that you believe them to be true."
seems to admit lying when everyone knows you are lying. I.e., someone who everyone assumes to be a liar is "literally honest" no matter what they say! I take this to suggest that our definition has to include intent, not just expectation. But how to modify the definition to avoid further trouble is unclear to me.
Given the difficulties, it seems like one who wishes to adhere to 'literal honesty' had better err on the side of literalness, clarifying any issues of interpretation as they arise. Being very literal in your answers to "how are you" may be awkward in an individual case, but as a pattern, it sets up expectations about the sort of replies you give to questions.
On the object level, I disagree about the usual meaning of "how are you?" -- it seems to me like it is more often used as a bid to start a conversation, and the expected response is to come up with smalltalk about your day / what you've been up to / etc.
The first time I read this, I think my top-level personal takeaway was: 'Woah, this is complicated. I can barely follow the structure of some of these sentences in section 7, and I definitely don't feel like I've spent enough time meditating on my counterfactual selves' preferences or cultivating a wizard's metacognitive habits to be able to apply this framework in a really principled way. This hard-to-discuss topic seems like even more of a minefield now.'
My takeaways are different on a second read:
1. Practicing thinking and talking like a wizard seems really, really valuable. (See Honesty: Beyond Internal Truth.) Bells and whistles like "coming up with an iron-clad approach to Glomarization" seem much less important, and shouldn't get in the way of core skill-building. It really does seem to make me healthier when I'm in the mindset of treating "always speak literal truth" as my fallback, complicated meta-honesty schemes aside.
2. It's possible to do easier modified versions of the thing Eliezer's talking about. E.g., if I'm worried that I'm not experienced or fast enough on my feet to have a meta-hon...
Here are my thoughts.
Also, not everyone may be familiar with this “Glomarization” thing, so here’s a Wikipedia link:
Promoted to curated, here are my thoughts:
I think Robby's comment captures a lot of my thoughts on this post. This was the third time I read this post, and I think it was the first time that I started getting any grasp on the core concepts of the post. I think there are two primary reasons for this:
1. The concepts are indeed difficult, and are recursive and self-referential in a way that requires a good amount of unpacking and time to understand them
2. The focus of the post shifts very quickly from "here is a crash-course introduction to the wizard's code" to "here is a crash-course introduction to meta-honesty" to "here is a discussion about whether meta-honesty is good" and "here is an introduction to the considerations around meta-meta-honesty".
I think it's good to have a post that tries to give some kind of overview of the considerations around meta-honesty and rational honesty in general, and that that is better than only having a single educational introductory post that can't give a sense of the bigger picture.
But I do think that the next natural step after this high-level discussion, if we think meta-honesty i...
Yeah, my feeling on re-reading this post is that it would have worked well as a sequence, since it breaks down into a bunch of parts that are important to consider and digest in their own right. (And since it would have benefited from more background, motivation, exercises, examples, etc.)
Also, to give a personal +1 to bounties and to this particular goal, I'll give another $40 to whoever collects Oliver's bounty, as judged by Oliver.
In practice, I've found that it's possible to keep a lot of information secret without the need to either lie, or do lots of extra Glomarizing to avoid the act of Glomarizing giving too much away. Rather than have a stated, deterministic solution to each possible question or problem, I do what seems practical given the situation and who is asking.
One tactic I've found necessary is to just not talk about entire topics, or not write entire posts, that would put me in a position where I'd be backed into a corner and Glomarizing would be too suspicious to pull off without giving the game away, and of course not talking about which posts/topics those are. The alternative, to do sufficient Glomarizing 'in the open,' would cut off a lot more discussion/information on net and also be more socially costly.
In general, there's a temptation to do things that are game-theoretically robust - where if they could see your source code and decision algorithm, you'd still be all right, or at least do as well as possible given the circumstances. This is of course hugely important to various scenarios important for AI, where you actually do face such circumstances. But in reality, it's usually right to do things that are hugely exploitable if they could see what you're up to - e.g. to not Glomarize 'enough' even though that opens you up to problems.
Thanks for explicitly giving me the out not to answer, I think that's 'doing it right' here.
Not being able to talk about things really sucks! Especially because the things you're actually thinking about a lot, and are the most interesting to you, are more likely to include information you can't share, for various reasons.
On the flip side, there are also topics one can't talk about because of worry that it would expose information about one's opinions rather than secret facts. This can be annoying, but it's also a good way to avoid things that one should, for plenty of other reasons, know better than to waste one's time on!
My question is: there seems to be a good deal of context missing. What was the motivation for this post? What conversation context was it taken from? It’s difficult to interpret it, without that information.
Some context from Eliezer's Honesty: Beyond Internal Truth (in 2009):
[...] What I write is true to the best of my knowledge, because I can look it over and check before publishing. What I say aloud sometimes comes out false because my tongue moves faster than my deliberative intelligence can look it over and spot the distortion. Oh, we're not talking about grotesque major falsehoods - but the first words off my tongue sometimes shade reality, twist events just a little toward the way they should have happened...
From the inside, it feels a lot like the experience of un-consciously-chosen, perceptual-speed, internal rationalization. I would even say that so far as I can tell, it's the same brain hardware running in both cases - that it's just a circuit for lying in general, both for lying to others and lying to ourselves, activated whenever reality begins to feel inconvenient.
There was a time - if I recall correctly - when I didn't notice these little twists. And in fact it still feels embarrassing to confess them, because I worry that people will think: "Oh, no! Eliezer lies without even thinking! He's a pathological liar!" For they ha...
There's something I've seen some rationalists try for, which I think Eliezer might be aiming at here, which is to try and be a truly robust agent.
Be the sort of person that Omega (even a version of Omega who's only 90% accurate) can clearly tell is going to one-box.
Be the sort of agent who cooperates when it is appropriate, defects when it is appropriate, and can realize that cooperating-in-this-particular-instance might look superficially like defecting, but avoid falling into a trap.
Be the sort of agent who, if some AI engineers were whiteboarding out the agent's decision making, they would see that the agent makes robustly good choices, such that those engineers would choose to implement that agent as software and run it.
Not sure if that's precisely what's going on here but I think it is at least somewhat related. If your day job is designing agents that could be provably friendly, it suggests the question of "how can I be provably friendly?"
The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their verbal feet, and can respond to the question "How are you?" by saying "Getting along" instead of "Horribly" or with an awkward silence while they try to think of something technically true.
If you are trying to be unusually honest as a matter of policy, there are some things it is worth lying for under some circumstances. This is not one of them. Quoting Calvin and Hobbes: "I don't know which is worse: that everyone has his price, or that the price is always so low."
This comment thread contributed to a substantial personal update for me over the weekend. I noticed ways in which I was out of integrity with myself. I've moved a lot closer to something like a radical honesty practice over the weekend, and it has worked out pretty well so far.
I stopped blocking my perception of my own suffering, and noticed that my mind-body is full of grief and fatigue. Naturally, noticing this caused it to start showing up in my body language, and I also started talking about things that were bad for me. It turned out that while this was sometimes upsetting for the people around me, it also allowed us to negotiate in better faith than before.
I think I'd been suppressing this in part because it seemed like the people around me couldn't handle it. This still might turn out to be the case sometimes, but I myself have sufficient privilege to be able to safely handle other people not being able to handle it, so I may as well not destroy my soul :)
I feel physically better.
Thank you for holding me to account. Jessica, I know you didn't explicitly target your intervention at me, but your comments here were sufficiently interpretable for someone trying to learn from them to apply them to their personal situation anyway.
Why does it seem unusually clear-cut to you that everyone should take part in a ritual in which one person acts like they care about the other's emotional state by asking about it, the other one lies about it in order to maintain the narrative that Things Are Fine (even in cases where things aren't fine and that would be important information to someone who actually cared about them), then sometimes they switch roles?
While I still endorse my description of what's going on there (and thanks for linking it!), in hindsight it seems like I'm describing something that has some pretty substantial costs - as I mentioned in another thread it literally prevents me from actually asking the question "how are you?".
It does make sense to exercise some care here, as one of the effects of this social ritual is to compel disprivileged people to participate in creating a shared narrative in which they're fine, everything is fine, can't complain or else I'll be socially attacked, how are you? Asking such people not to lie in response to "how are you?" may sometimes not be a reasonable request.
Update: I moved to Berkeley last week and noticed a huge difference in how the rationalist/EA community deals with these sorts of conversations and how the rest of the world does. Yesterday I was talking to someone I had barely met and they asked "how are you doing?" I said "you just opened a whole can of worms" and we ended up having an interesting discussion, including about how the conversational norms are different here from elsewhere. In general, I think people in this community are both more likely to give an honest answer to such questions, and less likely to ask them if they aren't interested in an honest answer.
[This is written as a moderator, and is a suggestion to Said Achmiz, jessicata, and others, posted here because this is currently the highest-placed comment on the page that follows a particular pattern.]
Not paying attention to the semantic content of this comment, but rather its structure, notice that it is a series of quotes, often of a single sentence, followed by similarly short replies. While this is a standard technique in forum arguments, I claim that this is mostly for undesirable reasons (like it being optimized for "scoring points"), and have found that it's not a particularly helpful method of discussion.
My suggestion (and it is only a suggestion) is that you try an approach where you share your understanding of your interlocutor's whole point or position with a paraphrase, attempt to identify the most fruitful part of the disagreement to work on, and then devote the remainder of the comment to that point. This keeps discussions focused on moving forward, doesn't give an edge to the party with more attention to spare to the discussion, makes it harder to talk past one another repeatedly, and makes it easier to notice when core points are simply droppe...
I made a proposal for a moderator tool that seems like it might have been helpful to this thread, partly in response to your bracketed text, and I'd be curious to hear your thoughts. https://github.com/LessWrong2/Lesswrong2/issues/610
Another question to ask, with regards to launching into unprompted explanations of one's personal life, is whether the other person actually wants that information. Like it or not, most people subscribe to the Copenhagen Interpretation of Ethics, which means by telling people of your problems, you are implicitly making it their problem (else why would you bother sharing?).
If I said, "Hi, how are you," and your response was a 5-minute long explanation of how your aunt died, your car broke down and how your dog needs surgery, my reaction would be awkward silence, not because I have no sympathy for your plight, but because I am wondering whether there is any obligation for me to step in and help.
This was the post that got the concept of a "robust agent" to click into place in my mind. It also more concretely got me thinking about what it means to be honest, and how to go about that in a world that sometimes punished honesty.
Since then, I've thought a lot about meta-honesty as well as meta-trust (in contexts that are less about truth/falsehood). I have some half-finished posts on the topic I hope to share at some point.
This also had some concrete impacts on how I think about the LessWrong team's integrity, which made its way into several conversations that (I'd guess?) made their way into habryka's post on Integrity, as well as my Robust Agency for People and Organizations.
I prefer not to lie, but there are so many cases where the weight of projected futures is overwhelmingly in favor of lying that I can't call it a rule, or give much moral weight to it.
Lying has costs (it's unpleasant and, if found out, reduces trust, etc.). Truth-telling has costs (hurt feelings, punishments, etc.). Silence (including a Glomar response) has a cost (much the same as both lying and truth-telling). Weighing costs and benefits of actions (including communication and signals to other entities) is what we do.
Any sane decision theory will choose "lie" in some inputs, "truth" in others, and "silence" in still others.
Note: I do subscribe to the (rejected by you) notion that
rationalists ought to practice lying, so that they could separate their internal honesty from any fears of needing to say what they believed.
Belief is different from communication, which is different from signaling/manipulation. They are all mixed up in different proportions in different contexts, and trying to generally solve for one without acknowledging the others is likely to lead to pain.
I also think that the idea of "honesty" AND by extensio...
My own honesty pledge would be summarised as something like this:
1) I will try hard to not mislead you, unless the circumstances are extreme (Jews in the attic).
2) Consequently, I will lie in vacuous social interactions ("How was my play?" "It was fine"), but will tell only the truth if pressed.
3) I may, and will, refuse to answer your question, for reasons that are valid or just randomly. This may not involve explicit Glomarizing; but I will explicitly Glomarize if pressed.
4) I won't meta-lie.
On point 3), I think semi-inconsistent Glomarizing is almost as good as carefully strategic Glomarizing - indeed, it may be better, if it makes you less predictable.
Wasn't your old rule officially "don't lie to someone unless you would also feel good about slashing their tires given the opportunity?" Or something very close to that? That already solves the standard Kantian problems.
This chunk felt like the biggest difference between meta-honesty and "tire slash":
Harry shook his head. "No," said Harry, "because then if we weren't enemies, you would still never really be able to trust what I say even assuming me to abide by my code of honesty. You would have to worry that maybe I secretly thought you were an enemy and didn't tell you.
If I'm following the old rule, you probably want to know in what situations I'd feel good slashing your tires. If I actually felt okay slashing your tires, I'd probably also be invested in making you falsely believe I wouldn't slash your tires. This makes it hard to super soundly, within one's honesty code, let someone know when you would or wouldn't be lying to them.
If I'm following meta-honesty, it seems like I can say, "I wouldn't lie to you about being on your side unless XYZ doomsday scenario", and that claim is as sound as my claim to be meta-honest. Now, if I say I'm on your side (not going to slash tires / lie), and you trust my claim to be meta-honest, you can believe me with whatever probability you assign to us not currently being in a doomsday scenario.
I don't recommend this post for the Best-of-2018 Review.
It's an exploration of a fascinating idea, but it's kind of messy and unusually difficult to understand (in the later sections). Moreover, the author isn't even sure whether it's a good concept or one that will be abused, and in addition worries about it becoming a popularized/bastardized concept in a wider circle. (Compare what happened to "virtue signaling".)
one that will be abused, and in addition worries about it becoming a popularized/bastardized concept in a wider circle. (Compare what happened to "virtue signaling".)
This is a terrible rationale! Our charter is to advance the art of human rationality—to discover the correct concepts for understanding reality. I just don't think you can optimize for "not abusable/bastardizable if marketed in the wrong way to the wrong people" without compromising on correctness.
Concepts like "the intelligence explosion" or "acausal negotiation" are absolutely rife for abuse (as we have seen), but we don't, and shouldn't, let that have any impact on our work understanding AI takeoff scenarios or how to write computer programs that reason about each other's source code.
And likewise "virtue signaling." Signaling is a really important topic in economics and evolution and game theory more generally. If we were doing a Best-of-2014 review and someone had written a good post titled "Virtue Signaling", I would want that post to be judged for its contribution to our collective understanding, not on whatever misuse or confusion someone, somewhere might subsequently have attached to the same two-word phrase.
...(Because often "I'm fine" is false, you see. If this has never bothered you then you are perhaps not in the target audience for this essay.)
This does bother me, but I’ve come to the conclusion that “How are you?” usually isn’t really a question - it’s a protocol, and the password you’re supposed to reply with is “Fine.” Almost no-one will take this to mean that you actually are fine, in my experience - they will take it to mean that you are following the normal rules of conversation, which is true. It’s much like how I can tell jokes, use idio...
This is probably the post I got the most value out of in 2018. This is not so much because of the precise ideas (although I have got value out of the principle of meta-honesty, directly), but because it was an attempt to understand and resolve a confusing, difficult domain. Eliezer explores various issues facing meta-honesty – the privilege inherent in being fast-talking enough to remain honest in tricky domains, and the various subtleties of meta-honesty that might make it too subtle a set of rules to coordinate around.
This illustration of "how to contend w
...Given how people actually act, a norm of "no literal falsehoods, but you can say deceptive but literally true things" will encourage deception in a way that "no deception unless really necessary" will not. "It's literally true, so it isn't lying" will easily slip to "it's literally true, so it isn't very deceptive", which will lead to people being more willing to deceive.
It's also something that only Jedi, certain religious believers, autists, Internet rationalists, and a few other odd groups would think is a good idea. "It isn't lying because what I said was literally true" is a proposition that most people see as sophistry.
Eliezer mostly talks about the idea that 'No literal lies' isn't morally necessary, but I take it from the "your sentences never provided Bayesian evidence in the wrong direction" goal that he also wouldn't consider this morally sufficient.
I tend to separate the topics of "I prefer to be honest" and "I prefer others to be honest". Both are true, but I approach them very differently. For myself, I pretty much set the default but allow that I'll deviate if I think it's long-term more valuable to do so.
For others, I try to approach it as "permission to be honest". I let people know that I prefer the truth, and I will do my best not to punish them for delivering it efficiently. This is similar to Crocker's Rules not being automatically symmetric...
One of my favorite posts, that encouraged me to rethink and redesign my honesty policy.
Used as research for my EA/rationality novel, I found this interesting and useful (albeit very meta and thus sometimes hard to follow).
I request that we stop using the Nazis as an example of a go-to fantasy adversary like vampires or zombies. The Gestapo was an actual institution that did real things for particular reasons. "Jews in the attic" shouldn't be parsed as a weird hypothetical like Kant's "murderer at the door" - it's a historical event. You can go to the library and read a copy of Anne Frank's diary.
On another post there was recently a demon thread in which I'm partially at fault, but an important contributing factor was that I was trying to point out specif...
Upvoted, and I agree with this concern, though I also think I'd have had a harder time digesting and updating on Eliezer's example if he'd picked something more fantastical. Using historical examples, even when a lot of the historical particulars are irrelevant, helps remind my brain that things in the discussed reference class actually occur in my environment.
I agree that historical examples can be helpful. I suspect these can be even more helpful if people vary the examples so they don't wear down into tropes, and check whether the details plausibly match. My reply to Zvi here seems relevant:
It seems to me as though when people evaluate the "Jews in the attic" hypothetical, "Gestapo" isn't being mapped onto the actual historical institution, but to a vague sense of who's a sufficiently hated adversary that it's widely considered legitimate to "slash their tires." In Nazi Germany, this actually maps onto Jews, not the Gestapo. It maps onto the Gestapo for post-WWII Americans considering a weird hypothetical.
To do the work of causing this to reliably map onto the Gestapo in Nazi Germany, you have to talk about the situation in which almost everyone around you seems to agree that the Gestapo might be a little scary but the Jews are dangerous, deceptive fantasy villains and need to be rooted out. Otherwise you just get illusion of transparency.
Finally!
Allow me an excursion which is not meant to subsume Eliezer Yudkowsky under Immanuel Kant or vice versa. It is intended to depict what I regard as a related thought process, and point out where I see people often getting sidetracked with regards to what's actually the issue (to my understanding).
Back in the Philosophy seminar on Kant's prohibition on lying I felt everyone was missing the point and that this (to my understanding) was it:
Sometimes there is no "right thing" you can easily choose. Sometimes your choice is between the bad and the worse. ...
Eliezer discusses the fact that replying “I’m fine” to “How are you?” is literally false. In case anyone’s interested, one answer I've taken to using in response to "How are you?" is “High variance,” which is helpfully vague about the direction of the variance.
This has definitely been among the top posts that have stuck with me. My instincts are very strongly towards wanting to always be maximally honest, but of course that's not perfectly practical. This post works to recover a principled relationship to truth-telling and honesty even in the face of real-world necessity to sometimes not maximally promote truth.
I'm afraid this would not work for me as too much information would be "leaking through the side channels". By that I mean that while I can probably do the reasoning and give non-revealing replies in writing, anyone good at noticing emotions, small movements, delays, etc. would probably be able to learn what I'm trying or not trying to hide on the object level quite easily.
(This may mean that if you are too honest a person on the S1 level, it's plausible you cannot use some strategies on the S2 level.)
"Would I be willing to publicly defend this as a situation in which unusually honest people should lie, if somebody posed it as a hypothetical?" Maybe that just gets turned into "It's permissible to lie so long as you'd be honest about whether you'd tell that lie if anyone asks you that exact question and remembers to say they're invoking the meta-honesty code," because people can't process the meta-part correctly.
Thank you for this direct contrast! It gave me the opportunity to understand why you added this part in the first place.
(The difference between ...
Ummm... if the feds are questioning you about a potential criminal act, Glomarization is almost always the best answer, because the 5th Amendment gives you a right to do it. And yes, they will take that to mean you did the thing, but they can't legally do anything with that. So worst case is they bs around that, dig a little harder to find evidence on you they probably would have found anyway, and pretend they were always going to try that hard to find the evidence. But the practical reality is the FBI usually only asks you questions they know the ans...
...It’s a human kind of thinking to verbally insist that “Don’t kill” is an absolute rule, why, it’s right up there in the Ten Commandments. Except that what soldiers do doesn’t count, at least if they’re on the right side of the war. And sure, it’s also okay to kill a crazy person with a gun who’s in the middle of shooting up a school, because that’s just not what the absolute law “Don’t kill” means, you know!
Why? Because any rule that’s not labeled “absolute, no exceptions” lacks weight in people’s minds. So you have to perform that the “Don’t kill” com
Let me reformulate this essay in one paragraph:
Glomarization is good, but sometimes we can't use it because others don't understand the principle of Glomarization, or because you have too many counterfactual selves, and some of them won't like just telling the truth. Therefore, when you are asked about Jews in the attic, it is acceptable to lie, but when you are asked if you would lie about Jews in the attic, you must ALWAYS tell the truth. So meta-honesty is just a way to use Glomarization as often as you want.
If everyone in town magically receives the same speedup in their "verbal footwork", is that good for meta-honesty? I would like some kind of story explaining why it wouldn't be neutral.
Point for yes:
Sure seems like being able to quickly think up an appropriately nonspecific reference class when being questioned about a specific hypothetical does not make it harder for anyone else to do the same.
Point against:
...The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their v
- Most people, even most unusually honest people, wander about their lives in a fog of internal distortions of reality. Repeatedly asking yourself of every sentence you say aloud to another person, "Is this statement actually and literally true?", helps you build a skill for navigating out of your internal smog of not-quite-truths. For that is our mastery.
I think some people who read this post ought to reverse this advice. The advice I would give to those people is: if you're constantly forcing every little claim you make through a literalism filter, you mig...
If you actually follow the advice about Glomarization, it is no longer improbable that you will be interrogated by someone who has read the rationalist literature on the subject and thought through the consequences. Investigators do their homework, and being committed enough to Glomarize frequently enough to do the intended work is a feature that will stick out like a sore thumb when your associates are interviewed, and will immediately send the investigator out to read the literature.
Now maybe most investigators aren’t anywhere near this thorough, but if you are facing an investigator who doesn’t even bother looking into your normal behavior, your Glomarization is irrelevant anyway.
...I theoretically ought to answer “I can’t confirm or deny what I was doing last night” because some of my counterfactual selves were hiding fugitive marijuana sellers from the Feds. ...
This seems easy to fix in principle. If, conditioned on the info that's known, or that probabilistically might be known to your asker, your counterfactual selves were especially likely to hide fugitives, you ought to say "I can’t confirm or deny"; otherwise, you can be truthful, and accept the consequence that some negligible fraction of your counterfactual se...
Regarding meta-honesty:
I'm going to flip the usual jargon on its head and say that I "agree connotatively, but disagree denotatively".
Meta-honesty - that is, "honesty about honesty" - is, like many meta-concepts, interesting to think about, but I don't quite understand why it needs to be formulated as some sort of "code". As you've presented it here, this "meta-honesty code" seems largely intractable in normal communication, and comes across as an overly-complicated way of simply refusing to hold up "Do not lie&...
(Cross-posted from Facebook.)
0: Tl;dr.
A rule which seems to me more "normal" than the wizard's literal-truth rule, more like a version of standard human honesty reinforced around the edges, would be as follows:
"Don't lie when a normal highly honest person wouldn't, and furthermore, be honest when somebody asks you which hypothetical circumstances would cause you to lie or mislead—absolutely honest, if they ask under this code. However, questions about meta-honesty should be careful not to probe object-level information."
I've been tentatively calling this "meta-honesty", but better terminology is solicited.
1: Glomarization can't practically cover many cases.
Suppose that last night I helped hide a fugitive marijuana seller from the Feds. You ask me what I was doing last night, and I, preferring not to emit false statements, reply, "I can't confirm or deny what I was doing last night."
We now have two major problems here:
This doesn't mean that Glomarization is never helpful. If you ask me whether my submarine is carrying nuclear weapons, or whether I'm secretly the author of "The Waves Arisen", I think most listeners would understand if I replied, "I have a consistent policy of not saying which submarines are carrying nuclear weapons, nor whether I wrote or helped write a document that doesn't have my name on it." An ordinary honest person does not need to lie on these occasions because Glomarization is both theoretically possible and pragmatically practical, so one should adopt a consistent Glomarization rather than lie.
But that doesn't work for hiding fugitives. Or any other occasion where an ordinary high-honesty person would consider it obligatory to lie, in answer to a question where the asker is not expecting evasion or Glomarization.
(I'm sure some people reading this think it's all very cute for me to be worried about the fact that I wouldn't tell the truth all the time. Feel free to state this in the comments so that we aren't confused about who's using which norms. Smirking about it, or laughing, especially conveys important info about you.)
2: The law of no literal falsehood.
One formulation of my automatic norm for honesty, the one that feels like the obvious default from which any departure requires a crushingly heavy justification, was given by Ursula K. Le Guin in A Wizard of Earthsea:
Or in simpler summary, this policy says:
Don't say things that are literally false.
Or with some of the unspoken finicky details added back in: "Don't say things that you believe to be literally false in a context where people will (with reasonably high probability) persistently believe that you believe them to be true." Jokes are still allowed, even jokes that only get revealed as jokes ten seconds later. Or quotations, etcetera ad obviousum.
The no-literal-falsehood code of honesty has three huge advantages:
From Frank Herbert's Dune Messiah, writing about Truthsayers, people who had trained to extreme heights the ability to tell when others were lying and who also never lied themselves:
This is probably not true in normal human practice for detecting other people's lies. I'd expect a lot of con artists are better than a lot of honest people at that.
But the phrase "It requires you have an inner agreement with truth which allows ready recognition" is something that resonates strongly with me. It feels like it points to the part that's good for your soul. Saying only true things is a kind of respect for the truth, a pact that you forge with it.
3: The privilege of truthtelling.
I've never suggested to anyone else that they adopt the wizard's code of honesty.
The code of literal truth only lets people navigate anything like ordinary social reality to the extent that they are very fast on their verbal feet, and can respond to the question "How are you?" by saying "Getting along" instead of "Horribly" or with an awkward silence while they try to think of something technically true. (Because often "I'm fine" is false, you see. If this has never bothered you then you are perhaps not in the target audience for this essay.)
So I haven't advocated any particular code of honesty before now. I was aware of the fact that I had an unusually high verbal SAT score, and also, that I spend little time interfacing with mundanes and am not dependent on them for my daily bread. I thought it wasn't my place for me to suggest to anyone else that they try their hand at saying only true things all the time, or for me to act like this conveys moral virtue. I'm only even describing the wizard's code publicly now that I can think of at least one alternative.
I once heard somebody claim that rationalists ought to practice lying, so that they could separate their internal honesty from any fears of needing to say what they believed. That is, if they became good at lying, they'd feel freer to consider geocentrism without worrying what the Church would think about it. I do not in fact think this would be good for the soul, or for a cooperative spirit between people. This is the sort of proposed solution of which I say, "That is a terrible solution and there has to be a better way."
But I do see the problem that person was trying to solve. One can also be privileged in stubbornness when it comes to overriding the fear of other people finding out what you believe. I can see how telling fewer routine lies than usual would make that fear even worse, exacerbating the pressure it can place on what you believe you believe; especially if you didn't have a lot of confidence in your verbal agility. It's one more reason not to pressure people (even a little) into adopting the wizard's code, but then it would be nice to have some other code instead.
4: Literal-truth as my automatic norm, maybe not shared.
This set of thoughts started, as so many things do, with a post by Robin Hanson.
In particular Robin tweeted the paper: "The surprising costs of silence: Asymmetric preferences for prosocial lies of commission and omission."
This got me wondering whether my default norm of the wizard's code is something other people will even perceive as prosocial. Yes, indeed, I feel like not saying things is much more law-abiding than telling literal falsehoods. But if people feel just as wounded, or more wounded, then that policy isn't really benefiting anyone else. It's just letting me feel ethical and maybe being good for my own personal soul.
Robin commented, "Mention all relevant issues, even if you have to lie about them."
I don't think this is a bullet I can bite in daily practice. I think I still want to emit literal truths for most dilemmas short of hiding fugitives. But it's one more argument worth mentioning against trying to make an absolute wizard's code into a bedrock solution for interpersonal reliability.
Robin also published a blog post about "automatic norms" in general:
This made me realize that the wizard's code of honesty I grew up with is, indeed, an automatic norm for me. Which meant I was probably overestimating and eliezeromorphizing the degree to which other people even cared at all, or would think I was keeping any promises by doing it. Again, I don't see this as a good reason to give up on emitting literally true sentences almost all of the time, but it's one more reason I feel more open to alternatives than I would've ten years ago. That said, I do expect a lot of people reading this also have something like that same automatic norm, and I still feel like that makes us more like part of the same tribe.
5: Counterargument: The problem of non-absolute rules.
A proposal like this one ought to come with a lot of warning signs attached. Here's one of them:
There's a passage in John M. Ford's Web of Angels, when the protagonist has finally killed someone even after all the times his mentor taught him to never ever kill. His mentor says:
Surprise! Really the mentor just meant to try to get him to wait before killing people instead of jumping to that right away.
Humans are kind of insane, and there are all sorts of insane institutions that have evolved among us. A fairly large number of those institutions are twisted up in such a way that something explodes if people try to talk openly about how they work.
It's a human kind of thinking to verbally insist that "Don't kill" is an absolute rule, why, it's right up there in the Ten Commandments. Except that what soldiers do doesn't count, at least if they're on the right side of the war. And sure, it's also okay to kill a crazy person with a gun who's in the middle of shooting up a school, because that's just not what the absolute law "Don't kill" means, you know!
Why? Because any rule that's not labeled "absolute, no exceptions" lacks weight in people's minds. So you have to perform that the "Don't kill" commandment is absolute and exceptionless (even though it totally isn't), because that's what it takes to get people to even hesitate. To stay their hands at least until the weight of duty is crushing them down. A rule that isn't even absolute? People just disregard that whenever.
(I speculate this may have to do with how the human mind reuses physical ontology for moral ontology. I speculate that brains started with an ontology for material possibility and impossibility, and reused that ontology for morality; and it internally feels like only the moral reuse of "impossible" is a rigid moral law, while anything short of "moral-impossible" is more like a guideline. Kind of like how, if something isn't absolutely certain, people think that means it's okay to make up their own opinion about it, because if it's not absolutely certain it must not be the domain of Authority. But I digress, and it's just a hypothesis. We don't need to know exactly what is the buried cause of the surface craziness to observe that the craziness is in fact there.)
So you have to perform that the Law is absolute in order to make the actual flexible Law exist. That doesn't mean people lie about how the Law applies to the edge cases—that's not what I mean to convey by the notion of "performing" a statement. More like, proclaim the Law is absolute and then just not talk about anything that contradicts the absoluteness.
And when that happens, it's one more little chunk of insanity that nobody can talk about on the meta-level without it exploding.
Now, you will note that I am going ahead and writing this all down explicitly, because... well, because I expect that in the long run we have to find a way that doesn't require a little knot of madness that nobody is allowed to describe faithfully on the meta-level. So we might as well start today.
I trust that you, the reader, will be able to understand that "Don't kill" is the kind of rule where you give it enough force-as-though-of-absoluteness that it actually takes a deontology-breaking weight of duty to crush down your hands, as opposed to you cheerfully going "oh well I guess there's a crushing weight now! let's go!" at the first sign of inconvenience.
Actually, I don't trust that everyone reading this can do that. That's not even close to literally true. But most of you won't ever be called on to kill, and society frowns upon that strongly enough to discourage you anyway. So I did feel it was worth the risk to write that example explicitly.
"Don't lie" is more dangerous to mess with. That's something that most people don't take as an exceptionless absolute to begin with, even in the sense of performing its absoluteness so that it will exist at all. Even extremely honest people will agree that you can lie to the Gestapo about whether you are hiding any Jews in the attic, and not bother to Glomarize your response either; and I think they will mostly agree that this is in fact a "lie" rather than trying to dance around the subject. People who are less than extremely honest think that "I'm fine" is an okay way to answer "How are you?" even if you're not fine.
So there's still a very obvious thing that could go wrong in people's heads, a very obvious way that the notion of "meta-honesty" could blow up, or any other code besides "don't say false things" could blow up. It's why the very first description in the opening paragraphs says "Don't lie when a normal highly honest person wouldn't, and furthermore…" and you should never omit that preamble if you post any discussion of this on your own blog. THIS IS NOT THE IDEA THAT IT'S OKAY TO LIE SO LONG AS YOU ARE HONEST ABOUT WHEN YOU WOULD LIE IF ANYONE ASKS. It's not an escape hatch.
If anything, meta-honesty is the idea that you should be careful enough about when you break the rule "Don't lie" that, if somebody else asked the hypothetical question, you would be willing to PUBLICLY DEFEND EVERY ONE OF THOSE EXTRAORDINARY EXCEPTIONS as times when even an unusually honest person should lie.
(Unless you were never claiming to be unusually honest, and your pattern of meta-honest responses to hypotheticals openly shows that you lie about as much as an average person. But even here, I'd worry that anyone who lets themselves be as wicked as they imagine the 'average' person to be, would be an unusually wicked person indeed. After all, if Robin Hanson speaks true, we are constantly surrounded by people violating what seem to us like automatic norms.)
6: Meta-honesty, the basics.
Okay, enough preamble, let's speak of the details of meta-honesty, which may or may not be a terrible idea to even talk about, we don't know at this point.
The basic formulation of meta-honesty would be:
"Be at least as honest as an unusually honest person. Furthermore, when somebody asks for it and especially when you believe they're asking for it under this code, try to convey to them a frank and accurate picture of the sort of circumstances under which you would lie. Literally never swear by your meta-honesty that you wouldn't lie about a hypothetical situation that you would in fact lie about."
My first horrible terminology for this was the "Bayesian code of honesty", on the theory that this code meant your sentences never provided Bayesian evidence in the wrong direction. Suppose you say "Hey, Eliezer, what were you doing last night?" and I reply "Staying at home doing the usual things I do before going to bed, why?" If you have a good mental picture of what I would lie about, you have now definitely learned that I was not out watching a movie, because that is not something I would lie about. A very large number of possibilities have been ruled out, and most of your remaining probability mass should now be on me having stayed home last night. You know that I wasn't on a secret date with somebody who doesn't want it known we're dating, because you can ask me that hypothetical and I'll say, "Sure, I'd happily hide that fact, but that isn't enough to force me to lie. I would just say 'Sorry, I can't tell you where I was last night,' instead of lying."
You have not however gained any Bayesian evidence against my hiding a fugitive marijuana seller from the Feds, where somebody's life or freedom is at stake and it's vital to conceal that a secret even exists in the first place. Ideally we'd have common knowledge of that, and hopefully we'd agree that it was fine to lie in that case to a friend who asks a casual-seeming question.
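To make the arithmetic of that example concrete, here is a minimal sketch of the update. All of the numbers are invented for illustration, and the hypothesis names and the posterior helper are hypothetical labels of mine, not part of any code of honesty. The only inputs are a prior over what I might have been doing, and your model of which answers I would give in each case: I answer truthfully about movies, I refuse to answer about the secret date, and I lie outright only about the fugitive.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Apply Bayes' rule over a finite set of mutually exclusive hypotheses."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Illustrative prior over what the speaker was doing last night (assumed numbers).
prior = {
    "stayed_home": Fraction(900, 1000),
    "watched_movie": Fraction(80, 1000),
    "secret_date": Fraction(19, 1000),
    "hid_fugitive": Fraction(1, 1000),
}

# Probability of hearing "I stayed home" under each hypothesis, given the
# listener's picture of when the speaker would lie (an assumed policy):
#   stayed_home   -> says so truthfully
#   watched_movie -> would truthfully say "I saw a movie", never "I stayed home"
#   secret_date   -> would say "I can't tell you", never "I stayed home"
#   hid_fugitive  -> would lie and say "I stayed home"
likelihood_of_reply = {
    "stayed_home": Fraction(1),
    "watched_movie": Fraction(0),
    "secret_date": Fraction(0),
    "hid_fugitive": Fraction(1),
}

post = posterior(prior, likelihood_of_reply)

# Odds of "hid a fugitive" relative to "stayed home", before and after the reply.
prior_odds = prior["hid_fugitive"] / prior["stayed_home"]
posterior_odds = post["hid_fugitive"] / post["stayed_home"]

for hypothesis, probability in post.items():
    print(f"{hypothesis}: {float(probability):.4f}")
print("odds unchanged:", prior_odds == posterior_odds)
```

Under these assumed numbers the movie and the secret date drop to zero, nearly all the remaining mass lands on my having stayed home, and the odds between "stayed home" and "hid a fugitive" are exactly what they were before I spoke. The reply moved your beliefs only in directions that a correct picture of my lying policy licenses.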
Let's be clear, although this is a kind of softening of deception, it's still deception. Even if somebody has extensively discussed your code of honesty with you, they aren't logically omniscient and won't explicitly have the possibility in mind every time. That's why we should go on holding ourselves to the standard of, "Would I defend this lie even if the person I was defending it to had never heard of meta-honesty?"
"Eliezer," you say, "if you had a temporary schizophrenic breakdown and robbed a bank and this news hadn't become public, would you lie to keep it from becoming public?"
And this would cause me to stop and think and agonize for a bit (which itself tells you something about me, that my answer is not instantly No or Yes). I do have important work to do which should not be trashed without strong reason, and this hypothetical situation would not have involved a great deliberate betrayal on my part; but it is also the sort of thing that you could reasonably argue an unusually honest person ought not to lie about, where lies do not in general serve the social good.
I think in the end I might reply something like "I wouldn't lie freely and would probably try to use at least technical truth or Glomarize, but in the end I might conceal that event rather than letting my work be trashed for no reason. I think I'd understand if somebody else had done likewise, if I thought they were doing good work in the first place. Except that obviously I'd need to tell various people who are engaged in positive-sum trades with me, where it's a directly important issue to them whether I can be trusted never to have mental breakdowns, and remove myself from certain positions of trust. And if it happened twice I'd be more likely to give up. If it got to the point where people were openly asking questions I don't imagine myself as trying to continue a lie. I also want to caveat that I'm describing my ethical views, what I think is right in this situation, and obviously enough pressure can make people violate their own ethics and it's not always predictable how much pressure it takes, though I generally consider myself fairly strong in that regard. But if this had actually happened I would have spent a lot more time thinking about it than the two minutes I spent writing this paragraph." And this would help give you an accurate picture of the sort of person that I am in general, and what I take into account in considering exceptions.
Insofar as you are practicing a mental discipline in being meta-honest, the discipline is to be explicitly aware of every time you say something false, and to ask yourself, "Would I be okay publicly saying, if somebody asked me the hypothetical, that this is a situation where a person ought to lie?"
I still worry that this is not the thing that people need to do to establish their inner pact with truth. Maybe you could pick some friends to whom you just never tell any kind of literal falsehood, in the process of becoming initially aware of how many false things you were just saying all the time… but I don't actually know if that works either. Maybe that's like trying to stop smoking cigarettes on odd-numbered days. It'd be something to notice if the experimental answer is "In reality, meta-honesty turns out not to work for practicing the respect of truth."
Meta-honesty should be for people who are comfortable, not with absolute honesty, but with not trying to appear any more honest than they are. This itself is not the ordinary equilibrium, and if you want to do things the standard human way and not forsake a well-tested and somewhat enforced social equilibrium in pursuit of a bright-eyed novel idealistic agenda, then you should not declare yourself meta-honest, or should let somebody else try it first.
7: Consistent object-level glomarization in meta-level honest responses.
Glomarization can be workable when restricted to special cases, such as only questions about nuclear weapons and submarines. Meta-honesty is such a special case and, if we're doing this, we should all Glomarize it accordingly. In particular meta-questions are not to be used to extract object-level data, and we should all respect that in our questions, and consistently Glomarize about it in our answers, including some random times when Glomarization seems silly.
Some key responses that need to be standard:
And if you clearly say that you "irrevocably worry" about any of these things, it means the meta-honest conversation has crashed; the other person is not supposed to keep pressing you, and if they do, you can lie. Ideally, this is something you should consistently do in any case where a substantial measure of your counterfactual selves as the other person might imagine them would be feeling pressured to the point of maybe meta-lying. That is, you should not only say "irrevocably worry" in cases where you actually have something to conceal, you should say it in cases where the discussion would be pressuring somebody who did have something to conceal and this seems high-enough-probability to you or to your model of the person talking to you.
For example: "Eliezer, would you lie about having robbed a bank?"
I consider whether this sounds like an attempt to extract object-level information from some of my counterfactual selves, and conclude that you probably place very little probability on my having actually robbed a bank. I reply, "Either it is the case that I did rob a bank and I think it is okay to lie about that, or alternatively, my reply is as follows: I wouldn't ordinarily rob a bank. It seems to me that you are postulating some extraordinary circumstance which has driven me to rob a bank, and you need to tell me more about this extraordinary circumstance before I tell you whether I'd lie about it. Or you're postulating a counterfactual version of me that's fallen far enough off the ethical rails that he'd probably stop being honest too."
Some additional statements that ought to be taken as praiseworthy:
This is not supposed to be a clever way to extract information from people and you should shut down any attempt to use it that way.
"Harry," says HPMOR!Dumbledore, "I ask you under the code of meta-honesty (which we have just anachronistically acquired): Would you lie about having robbed the Gringotts Bank?"
Harry thinks, Maybe this is about the Azkaban breakout, and says, "Do you in fact suspect me of having robbed a bank?"
"I think that if I suspected you of having robbed a bank," says Dumbledore, "and I did not wish you to know that, I would not ask you if you had robbed a bank. Why do you ask?"
"Because the circumstances under which you're invoking meta-honesty have something to do with how I answer," says Harry (who has suddenly acquired a view on this subject that some might consider implausibly detailed). "In particular, I think I react differently depending on whether this is basically about you trying to construct a new mutually beneficial arrangement with the person you think I am, or if you're in an adversarial situation with respect to some of my counterfactual selves (where the term 'counterfactual' is standardly taken to include the actual world as one that is counterfactually conditioned on being like itself). Also I think it might be a good idea generally that the first time you try to have an important meta-honest conversation with someone, you first spend some time having a meta-meta-honest conversation to make sure you're on the same page about meta-honesty."
"I am not sure I understood all that," said Dumbledore. "Do you mean that if you think we have become enemies, you might meta-lie to me about when you would lie?"
Harry shook his head. "No," said Harry, "because then if we weren't enemies, you would still never really be able to trust what I say even assuming me to abide by my code of honesty. You would have to worry that maybe I secretly thought you were an enemy and didn't tell you. But the fact that I'm meta-honest shouldn't be something that you can use against me to figure out whether I… sneaked into the girls' dorm and wrote in somebody's diary, say. So if I'm in that situation I've got to protect my counterfactual selves and Glomarize harder. Whereas if this is more of a situation where you want to know if we can go to Mordor together, then I'd feel more open and try to give you a fuller picture of me with more detail and not worry as much about Glomarizing the specific questions you ask."
"I suspect," Dumbledore said gravely, "that those who try to be honest at all will always be at something of a disadvantage relative to the most ready liars, at least if they've robbed Gringotts. But yes, Harry, I am afraid that this is more of a situation where I am… concerned… about some of your counterfactual selves. But then why would you answer at all, in such a case?"
"Because sometimes people are honest and have good intentions," answered Harry, "and I think that if in general they can have an accurate picture of the other person's honesty, everybody is on net a bit better off. Even if I had robbed a bank, for example, you and I would both still not want anything bad to happen to Britain. And some of my counterfactual selves are innocent, and they're not better off if you think I'm more dishonest than I am."
"Then I ask again," said Dumbledore, "under the code of meta-honesty, whether you would lie about having robbed a bank."
"Then my answer is that I wouldn't ordinarily rob a bank," Harry said, "and I'd feel even worse about lying about having robbed a bank, than having robbed a bank. And I'd know that if I robbed a bank I'd also have to lie about it. So whatever weird reason made me rob the bank, it'd have to be weird enough that I was willing to rob the bank and willing to lie about it, which would take a pretty extreme situation. Where it should be clear that I'm not trying to answer about having specifically robbed a bank, I'm trying to give you a general picture of what sort of person I am."
"What if you had been blackmailed into robbing the bank?" inquired Dumbledore. "Or what if things crept up on you bit by bit, so that in the end you found yourself in an absurd situation you'd never intended to enter?"
Harry shrugged helplessly. "Either it's the case that I did end up in a weird situation and I don't want to let you know about that, or alternatively, I feel like you're describing a very broad range of possibilities that I'd have to think about more, because I haven't yet ended up in that kind of situation and I'm not quite sure how I'd behave… I think I'd have in mind that just telling the Headmaster the truth can prevent big problems from blowing up any further, but there'd be cases extreme enough that I wouldn't do that either… I mean, the basic answer is, yes, there's things that would make me lie right to your face, but, I wouldn't do that just for having stolen candy from the kitchen, I don't think. I'd just be like 'I consistently Glomarize when people ask me if I've stolen candy from the kitchen.'"
"Would you lie to protect a fugitive from the law who you believed was being unjustly pursued?" said Dumbledore. "Say that Sirius Black was innocent after all, as the Quibbler would have it; would you hide him and then lie to Minerva about what you were doing on the night you hid him?"
"Yes," Harry said unhesitatingly. "That's something I think is clearly right and the sort of thing I would ordinarily do."
"What if you thought that somebody was being unjustly pursued by the law?" said Dumbledore. "Is that worth lying for?"
Harry frowned. "I think I maybe shouldn't answer questions about what in particular I think ought not to be illegal," Harry said, "but if the basic question is whether I'd lie to my friends to keep somebody who'd only used marijuana out of Azkaban, then I would. I don't think my friends should expect otherwise; they should want me to do that."
"We are not being meta-honest any more, only honest," said Dumbledore. "Were you involved in the breakout from Azkaban?"
Harry stared at Dumbledore a moment, and said, "I think I have to consistently not confirm or deny that, but my counterfactual and possibly actual self who wasn't involved would really like to know at this point what the heck you think might've happened."
If somehow the Gestapo has heard of meta-honesty and they ask you if you would lie to hide Jews in your attic, they should be expecting a law-abiding meta-honest Nazi citizen to Glomarize, abstract the question in such a way as to not reveal the object-level information that might be sought, or say "I irrevocably worry that some of my counterfactual selves are not in a positive-sum situation." That, I now proclaim, ought to be done by Nazis and resistance fighters alike, when you are faced with a question that might be meant to reveal object-level information about what happened.
"Eliezer," says the hypothetical Gestapo officer who has somehow heard about my meta-honesty code, "it happens that I'm a person who's heard of meta-honesty. Now, are you the sort of person who would lie about having Jews hidden in your attic?"
This hypothetical Gestapo officer has a gun. Most people asking you meta-honest questions won't have a gun. In fact I bet this will literally never happen until the end of the world. Let's suppose he has a gun anyway.
"I am the following sort of person," I reply. "If I was hiding the Führer in my attic to protect him from Jewish assassins, I'd lie about that to the assassins. It's clear you know about my code of meta-honesty, so you should understand that is a very innocent thing to say. But these circumstances and the exact counterfactual you are asking make me nervous, so I'm afraid to utter the words I think you may be looking for, namely the admission that if I were the kind of person who'd hide Jews in his attic then I'd be the kind of person who would lie to protect them. Can I say that I believe that in respect to your question as you mean it, I think that is no more and no less true of me than it is true of you?"
"My, you are fast on your verbal feet," says the Gestapo officer. "If somebody were less fast on their verbal feet, would you tell them that it was acceptable for a meta-honest person to just meta-lie to the Jewish assassins in order to hide the Führer?"
"If they didn't feel that their counterfactual loyal Nazi self would think that their counterfactual disloyal self was being pressured and clearly state that fact irrevocably," I say, "I'd say that, just like their counterfactual loyal self, they should make some effort to reveal the general limits of their honesty without betraying any of their counterfactual selves, but say they irrevocably couldn't handle the conversation as soon as they thought their alternate loyal self would think their alternate's counterfactual disloyal self couldn't handle the conversation. It's not as if the Jewish assassins would be fooled if they said otherwise. If the Jewish assassins do continue past that point, which is blatantly forbidden and everyone should know that, they may lie."
"I see," says the Gestapo officer. "If you are telling me the truth, I think I have grasped the extent of what you claim to be honest about." He turns to his subordinates. "Go search his attic."
"Now I'm curious," I say. "What would you have done if I'd sworn to you that I was an absolutely loyal German citizen, and that my character was such that I would certainly never lie about having Jews in my attic even if I were the sort of disloyal citizen who had Jews in his attic in the first place?"
"I would have detailed twice as many men to search your house," says the Gestapo officer, "and had you detained, for that is not the response I would expect from an honest Nazi who knew how meta-honesty was supposed to work. Now I ask you meta-meta-honestly, why haven't you said that you are irrevocably worried that I am abusing the code? Obviously I put substantial probability on you being a traitor, meaning I am deliberately pressuring you into a meta-conversation and trying to use your code of honesty against those counterfactual selves. Why didn't you just shut me down?"
"Because you do have a gun, sir," I say. "I agree that it's what the rules called for me to say, but I thought over the situation and decided that I was comfortable with saying that in general this was a sort of situation where that rule could be bent so as for me to not end up being shot—and I tell you meta-meta-honestly that I do believe the situation has to be that extreme in order for that rule to even be bent."
Really the principle is that it is not okay to meta-ask what the Gestapo officer is meta-asking here. This kind of detailed-edge-case-checking conversation might be appropriate for shoring up the edges of an interaction intended to be mutually beneficial, but absolutely not for storming in looking for Jews in the attic of a person who in your mind has a lot of measure on having something to hide.
But I do want to have trustworthy foundations somewhere.
And I think it's reasonable to expect that over the course of a human lifetime you will literally never end up in a situation where a Gestapo officer who has read this essay is pointing a gun at you and asking overly-object-level-probing meta-honesty questions, and will shoot you if you try to glomarize but will believe you if you lie outright, given that we all know that everyone, innocent or guilty, is supposed to glomarize in situations like that. Up until today I don't think I've ever seen any questions like this being asked in real life at all, even hanging out with a number of people who are heavily into recursion.
So if one is declaring the meta-honesty code at all, then one shouldn't meta-lie, period; I think the rules have been set up to allow that to be absolute. I don't want you to have to worry that maybe I think I'm being pressured, or maybe I thought you meta-asked the wrong thing, so now I think it's okay to meta-lie even though I haven't given any outward sign of that. To that end, I am willing to sacrifice the very tiny fraction of the measure of my future selves who will end up facing an extremely weird Gestapo officer. To me, for now, there doesn't seem to be any real-life circumstance where you should lie in response to a meta-honesty question—rather than consistently glomarize that kind of question, consistently abstract that kind of question, consistently answer in an analogy rather than the original question, or consistently say "I believe some counterfactual versions of me would say that cuts too close to the object level." (It being a standard convention that counterfactuals may include the actual.)
I also think we can reasonably expect that from now until the end of the world, honest people should literally absolutely never need to evade or mislead at all on the meta-meta-level, like if somebody asks if you feel like the meta-level conversation has abided by the rules. (And just like meta-honesty doesn't excuse object-level dishonesty, by saying that meta-meta-honesty seems like it could be everywhere open and total, I don't mean to excuse meta-level lies. We should all still regard meta-lies as extremely bad and a Code Violation and You Cannot Be Trusted Anymore.)
If there's a meta-honest discussion about someone's code of honesty, and a discussion of what they think about the current meta-meta conditions of how the meta-honesty code is being used, and it sounds to you like they think things are fine… then things should be fine, period. If you ask, do they think that any pressure strong enough to potentially shake their meta-honesty is potentially around, do they think that the overall situation here would have treated any of their plausible counterfactual selves in a negative-sum way, and they say no it's all fine—then that is supposed to be absolute under the code. That ought to establish a foundation that's as reliable as the person's claim to be meta-honest at all.
If you go through all that and lie and meta-lie and meta-meta-lie after saying you wouldn't, you've lied under some of the kindest environments that were ever set up on this Earth to let people not lie, among people who were trying to build trust in that code so we could all use it together. You are being a genuinely awful person as I'd judge that, and I may advocate for severe social sanctions to apply.
Assuming this ends up being a thing, that is. I haven't run it past many people yet and this is the first public discussion. Maybe there's some giant hole in it I haven't spotted.
If anybody ever runs into an actual real circumstance where it seems to them that meta-honesty as they tried to use it was giving the essay-reading Gestapo too much power or too much information, maybe because they weren't fast enough on their verbal feet, please email me about it so I can consider whether to modify or backtrack on this whole idea. I will try to protect your anonymity under all circumstances up to and including the end of the world unless you say otherwise. The previous sentence is not the sort of thing I would lie about.
8: Counterargument: Maybe meta-honesty is too subtle.
I worry that the notion of meta-honesty is too complicated and subtle. In that it has subtleties in it, at all.
This concept is certainly too subtle for Twitter. Maybe it's too subtle for us too.
Maybe "meta-honesty" is just too complicated a concept to be made part of a culture's Law, compared to the standard-twistiness-compliant performance of saying "Always be honest!" and waiting for the weight of duty to crush down people's hands, or saying "Never say anything false!" and just-not-discussing all the exceptions that people think obviously don't count.
(But of course that system also has disadvantages, like people having different automatic norms about what they think are obvious exceptions.)
I've started to worry more, recently, about which cognitive skills have other cognitive skills as prerequisites. One of the reasons I hesitated to publish Inadequate Equilibria (before certain persons yanked it out of my drafts folder and published it anyway) was that I worried that maybe the book's ideas were useless or harmful without mastery of other skills. Like, maybe you need to have developed a skill for demotivating cognition, and until then you can't reason about charged political issues or your startup idea well enough for complicated thoughts about Nash equilibria to do more good than harm. Or maybe unless you already know a bunch of microeconomics, you just stare at society and see a diffuse mass of phenomena that might or might not be bad equilibria, and you can't even guess non-wildly in a way that lets you get started on learning.
Maybe meta-honesty contains enough meta, in that it has meta at all, that it just blows up in most people's heads. Sure, people in our little subcommunity tend to max out the Cognitive Reflection Test and everything that correlates with it. But compared to scoring 3 out of 3 on the CRT, the concept of meta-honesty is probably harder to live in real life—stopping and asking yourself "Would I be willing to publicly defend this as a situation in which unusually honest people should lie, if somebody posed it as a hypothetical?" Maybe that just gets turned into "It's permissible to lie so long as you'd be honest about whether you'd tell that lie if anyone asks you that exact question and remembers to say they're invoking the meta-honesty code," because people can't process the meta-part correctly. Or maybe there's some subtle nonobvious skill that a few people have practiced extensively and can do very easily, and that most people haven't practiced extensively and can't do that easily, and this subskill is required to think about meta-honesty without blowing up. Or maybe I just get an email saying "I tried to be meta-honest and it didn't work because my verbal SAT score was not high enough, you need to retract this."
If so, I'm not sure there's much that could be done about it, besides me declaring that Meta-Honesty had turned out to be a terrible idea as a social innovation and nobody should try that anymore. And then that might not undo the damage to the law-as-absolute performance that makes something be part of the Law.
But I'd outright lie to the Gestapo about Jews in my attic. And even to friends, I can't consistently Glomarize about every point in my life where one of my counterfactual selves could possibly have been doing that. So I can't actually promise to be a wizard, and I want there to exist firm foundations somewhere.
Questions? Comments?