Some quick reactions:
So my overall position here is something like: we should use religions as a source of possible deep insights about human psychology and culture, to a greater extent than LessWrong historically has (and I'm grateful to Alex for highlighting this, especially given the social cost of doing so).
But we shouldn't place much trust in the heuristics recommended by religions, because those heuristics will often have been selected for some combination of:
Where the difference between a heuristic and an insight is something like the difference between "be all-forgiving" and "if you are all-forgiving, it'll often defuse a certain type of internal conflict". Insights are about what to believe; heuristics are about what to do. Insights can be cross-checked against the rest of our knowledge; heuristics are much less legible, because in general they don't explain why a given thing is a good idea.
IMO this all remains true even if we focus on the heuristics recommended by many religions, i.e. the pluralistic focus Alex mentions. And it remains true even given the point Alex made near the end: that "for people in Christian Western culture, I think using the language of Christianity in good ways can be a very effective way to reach the users." Because if you understand the insights that Christianity is built upon, you can use those to reach people without the language of Christianity itself. And if you don't understand those insights, then you don't know how to avoid incorporating the toxic parts of Christianity.
I think this:
Religions can be seen as solutions to the coordination problem of how to get many very different people to trust each other.
is the actual reason religions matter so much, and, to a large extent, why they were created in the first place.
In slogan form, religion turns prisoner's dilemmas into stag hunts.
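The slogan above can be made concrete with payoff orderings. Here is a minimal sketch (the payoff numbers and the sanction cost `g` are illustrative assumptions, not from the dialogue): a shared norm that penalizes defecting against a cooperator can flip the payoff ordering of a prisoner's dilemma into that of a stag hunt, where mutual cooperation becomes a stable equilibrium.

```python
def classify(T, R, P, S):
    """Classify a symmetric 2x2 game by its payoff ordering.
    T: temptation (defect against a cooperator), R: reward (mutual cooperation),
    P: punishment (mutual defection), S: sucker (cooperate against a defector)."""
    if T > R > P > S:
        return "prisoner's dilemma"  # defection strictly dominates
    if R > T >= P > S:
        return "stag hunt"           # mutual cooperation is a stable equilibrium
    return "other"

# Baseline payoffs: a classic prisoner's dilemma.
T, R, P, S = 5, 3, 1, 0
assert classify(T, R, P, S) == "prisoner's dilemma"

# Model a shared norm (e.g. a religious sanction) as a cost `g` paid
# for defecting against a cooperator. With g large enough, the ordering flips.
g = 3
assert classify(T - g, R, P, S) == "stag hunt"
```

The classification here only checks payoff orderings; a fuller treatment would compute the equilibria directly.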
So my overall position here is something like: we should use religions as a source of possible deep insights about human psychology and culture, to a greater extent than LessWrong historically has (and I'm grateful to Alex for highlighting this, especially given the social cost of doing so).
Thanks a lot for the kind words!
IMO this all remains true even if we focus on the heuristics recommended by many religions, i.e. the pluralistic focus Alex mentions.
I think we're interpreting "pluralism" differently. Here are some central illustrations of what I consider to be the pluralist perspective:
I don't think "lots of religions recommend X" means the pluralist perspective thinks X is good. If anything, the pluralist perspective is actually pretty uncommon / unusual among religions, especially these days.
Because if you understand the insights that Christianity is built upon, you can use those to reach people without the language of Christianity itself. And if you don't understand those insights, then you don't know how to avoid incorporating the toxic parts of Christianity.
I think this doesn't work for people with IQ <= 100, which is about half the world. I agree that an understanding of these insights is necessary to avoid incorporating the toxic parts of Christianity, but I think this can be done even using the language of Christianity. (There's a lot of latitude in how one can interpret the Bible!)
I think we're interpreting "pluralism" differently. Here are some central illustrations of what I consider to be the pluralist perspective:
- the Catholic priest I met at the Parliament of World Religions who encouraged someone who had really bad experiences with Christianity to find spiritual truth in Hinduism
- the passage in the Quran that says the true believers of Judaism and Christianity will also be saved
- the Vatican calling the Buddha and Jesus great healers
If I change "i.e. the pluralistic focus Alex mentions" to "e.g. the pluralistic focus Alex mentions" does that work? I shouldn't have implied that all people who believe in heuristics recommended by many religions are pluralists (in your sense). But it does seem reasonable to say that pluralists (in your sense) believe in heuristics recommended by many religions, unless I'm misunderstanding you. (In the examples you listed these would be heuristics like "seek spiritual truth", "believe in (some version of) God", "learn from great healers", etc.)
I think this doesn't work for people with IQ <= 100, which is about half the world. I agree that an understanding of these insights is necessary to avoid incorporating the toxic parts of Christianity, but I think this can be done even using the language of Christianity. (There's a lot of latitude in how one can interpret the Bible!)
I personally don't have a great way of distinguishing between "trying to reach these people" and "trying to manipulate these people". In general I don't even think most people trying to do such outreach genuinely know whether their actual motivations are more about outreach or about manipulation. (E.g. I expect that most people who advocate for luxury beliefs sincerely believe that they're trying to help worse-off people understand the truth.) Because of this I'm skeptical of elite projects that have outreach as a major motivation, except when it comes to very clearly scientifically-grounded stuff.
If I change "i.e. the pluralistic focus Alex mentions" to "e.g. the pluralistic focus Alex mentions" does that work? I shouldn't have implied that all people who believe in heuristics recommended by many religions are pluralists (in your sense). But it does seem reasonable to say that pluralists (in your sense) believe in heuristics recommended by many religions, unless I'm misunderstanding you. (In the examples you listed these would be heuristics like "seek spiritual truth", "believe in (some version of) God", "learn from great healers", etc.)
If your main point is "don't follow religious heuristics blindly, only follow them if you actually understand why they're good" I'm totally with you. I think I got thrown off a bit because, AFAIU, the way people tend to come to adopt pluralist views is by doing exactly that, and thereby coming to conclusions that go against mainstream religious interpretations. (I am super impressed that the Pope himself seems to have been going in this direction. The Catholic monks at the monastery I visited generally wished the Pope were a lot more conservative.)
I personally don't have a great way of distinguishing between "trying to reach these people" and "trying to manipulate these people".
I use heuristics similar to those for communicating to young children.
In general I don't even think most people trying to do such outreach genuinely know whether their actual motivations are more about outreach or about manipulation. (E.g. I expect that most people who advocate for luxury beliefs sincerely believe that they're trying to help worse-off people understand the truth.) Because of this I'm skeptical of elite projects that have outreach as a major motivation, except when it comes to very clearly scientifically-grounded stuff.
This is why I mostly want religious pluralist leaders who already have an established track record of trustworthiness in their religious communities to be in charge of getting the message across to the people of their religion.
I was telling one of the Catholic priests there about my experience of Jesus during an ayahuasca ceremony and he was just like, "I don't know what ayahuasca is, but the story you told sounds super legit and you are super lucky to have had that experience at such a young age. I've only had this experience after decades and decades of going deep into Catholicism and all the rites and rituals. All the doctrines of Catholicism are really about having that kind of experience. And you just had it directly." And another one who heard it was just like, "Whatever you're doing, Alex, keep doing it. It sounds like you're on the right track."
Reminded me of this:
In the early 1980s Father Thomas Keating, a Catholic priest, sponsored a meeting of contemplatives from many different religions. The group represented a few Christian denominations as well as Zen, Tibetan, Islam, Judaism, Native American & Nonaligned. They found the meeting very productive and decided to have annual meetings. Each year they have a meeting at a monastery of a different tradition, and share the daily practice of that tradition as a part of the meetings. The purpose of the meetings was to establish what common understandings they had achieved as a result of their diverse practices. The group has become known as the Snowmass Contemplative Group because the first of these meetings was held in the Trappist monastery in Snowmass, Colorado.
When scholars from different religious traditions meet, they argue endlessly about their different beliefs. When contemplatives from different religious traditions meet, they celebrate their common understandings. Because of their direct personal understanding, they were able to comprehend experiences which in words are described in many different ways. The Snowmass Contemplative Group has established seven Points of Agreement that they have been refining over the years:
- The potential for enlightenment is in every person.
- The human mind cannot comprehend ultimate reality, but ultimate reality can be experienced.
- The ultimate reality is the source of all existence.
- Faith is opening, accepting & responding to ultimate reality.
- Confidence in oneself as rooted in the ultimate reality is the necessary corollary to faith in the ultimate reality.
- As long as the human experience is experienced as separate from the ultimate reality it is subject to ignorance, illusion, weakness and suffering.
- Disciplined practice is essential to the spiritual journey, yet spiritual attainment is not the result of one’s effort but the experience of oneness with ultimate reality. [...]
Contemplatives from different traditions generally agree that there is a transforming experience they agree to call enlightenment. They agree that enlightenment is attained as a result of controlling the mind with various forms of practice.
Thank you for recording and posting these. I feel like I learned a lot, both about how to have conversations and from lots of little details, like the restaurant thing as a proto preference synthesizer, the trauma/cancer analogy, the Muhammad story, and the disendorsing-all-judgements/resentments thing.
which is being able to ground the apparently contradictory metaphysical claims across religions into a single mathematical framework.
Is there a minimal operationalized version of this? Something like the smallest formal or empirical result that would count, to you, as progress towards this goal?
I'm not sure how much this answers your question, but:
Thanks, this was clarifying. I'm wondering whether you agree with the following (focusing on the predictive processing parts, since that's my background):
There are important insights and claims from religious sources that seem to capture psychological and social truths that aren't yet fully captured by science. At least some of these phenomena might be formalizable via a better understanding of how the brain and the mind work, and to that end predictive processing (and other theories of that sort) could be useful to explain the phenomena in question.
You spoke of wanting formalization, but I wonder if the main thing is really the creation of a science, though of course math is a very useful tool to do science with and to create a more complete understanding. At the end of the day we want our formalizations to comport with reality - whatever aspects of reality we are interested in understanding.
There are important insights and claims from religious sources that seem to capture psychological and social truths that aren't yet fully captured by science. At least some of these phenomena might be formalizable via a better understanding of how the brain and the mind work, and to that end predictive processing (and other theories of that sort) could be useful to explain the phenomena in question.
Yes, I agree with this claim.
You spoke of wanting formalization, but I wonder if the main thing is really the creation of a science, though of course math is a very useful tool to do science with and to create a more complete understanding. At the end of the day we want our formalizations to comport with reality - whatever aspects of reality we are interested in understanding.
That feels resonant. I think the kind of science I'm hoping for is currently bottlenecked by us not yet having the right formalisms, kind of like how Newtonian physics was bottlenecked by not having the formalism of calculus. (I would certainly want to build things using these formalisms, like an ungameable steel-Arbital.)
An idealized version would be like a magic box that's able to take in a bunch of people with conflicting preferences about how they ought to coordinate (for example, how they should govern their society), figure out a synthesis of their preferences...
(I didn't read most of the dialogue so this may be addressed elsewhere)
I think this is subtly but importantly wrong. I think what you're actually supposed to be trying to get at is more like creating preferences than reconciling preferences.
I'm not sure how you're interpreting the distinction between creating a preference vs reconciling a preference.
Suppose Alice wants X and Bob wants Y, and X and Y appear to conflict, but Carol shows up and proposes Z, which Alice and Bob both feel like addresses what they'd initially wanted from X and Y. Insofar as Alice and Bob both prefer Z over X and Y and hadn't even considered Z beforehand, in some sense Carol created this preference for them; but I also think of this preference for Z as reconciling their conflicting preferences X and Y.
I'm saying that a religious way of being is one where the minimal [thing that can want, in the fullest sense] is a collective.
I don't really get how what you just said relates to creating vs reconciling preferences. Can you elaborate on that a bit more?
I'll try a bit but it would take like 5000 words to fully elaborate, so I'd need more info on which part is unclear or not trueseeming.
One piece is thinking of individual humans vs collectives. If an individual can want in the fullest sense, then a collective's want is some sort of combination of wants from constituents--a reconciliation. If an individual can't want in the fullest sense, but a collective can, then when you take several individuals with their ur-wants and create a collective with proper wants, a proper want has been created de novo.
The theogenic/theopoetic faculty points at creating collectives-with-wants, but it isn't a want itself. A flowerbud isn't a flower.
The picture is complicated of course. For example, individual humans can do this process on their own somewhat, with themselves. And sometimes you do have a want, and you don't understand the want clearly, and then later come to understand the want more clearly. But part of what I'm saying is that many episodes that you could retrospectively describe that way are not really like that; instead, you had a flowerbud, and then by asking for a flower you called the flowerbud to bloom.
Thanks for the elaboration. Your distinction about creating vs reconciling preferences seems to hinge on the distinction between "ur-want" and "proper want". I'm not really drawing a type-level distinction between "ur-want" and "proper want", and think of each flower as itself being a flowerbud that could further bloom. In my example of Alice wanting X, Bob wanting Y, and Carol proposing Z, I'd thought of X and Y as both "proper wants" and "ur-wants that bloomed into Z".
In physics, the objects of study are mass, velocity, energy, etc. It’s natural to quantify them, and as soon as you’ve done that you’ve taken the first step in applying math to physics. There are a couple reasons that this is a productive thing to do:
Together this means that you benefit from even very simple math, and can scale up smoothly to more sophisticated tools: from simply adding masses, to F=ma, to Lagrangian mechanics and beyond.
It’s not clear to me that those virtues apply here:
Perhaps these concerns would be addressed by examples of the kind of statement you have in mind.
It could also help for zhukeepa to give any single instance of such a 'Rosetta Stone' between different ideologies or narratives or (informal) worldviews. I do not currently know what to imagine, other than a series of loose analogies, which can be helpful but are a difficult target to point at, and not something I expect to capture with a mathematical framework.
It's relevant that I think of the type signature of religious metaphysical claims as being more like "informal descriptions of the principles of consciousness / the inner world" (analogously to informal descriptions of the principles of the natural world) than like "ideology or narrative". Lots of cultures independently made observations about the natural world, and Newton's Laws in some sense could be thought of as a "Rosetta Stone" for these informal observations about the natural world.
Perhaps these concerns would be addressed by examples of the kind of statement you have in mind.
I'm not sure exactly what you're asking -- I wonder how much my reply to Adam Shai addresses your concerns?
I will also mention this quote from the category theorist Lawvere, whose line of thinking I feel pretty aligned with:
It is my belief that in the next decade and in the next century the technical advances forged by category theorists will be of value to dialectical philosophy, lending precise form with disputable mathematical models to ancient philosophical distinctions such as general vs. particular, objective vs. subjective, being vs. becoming, space vs. quantity, equality vs. difference, quantitative vs. qualitative etc. In turn the explicit attention by mathematicians to such philosophical questions is necessary to achieve the goal of making mathematics (and hence other sciences) more widely learnable and useable. Of course this will require that philosophers learn mathematics and that mathematicians learn philosophy.
I think getting technical precision on philosophical concepts like these will play a crucial role in the kind of math I'm envisioning.
Ben Pace
Can you say in slightly more detail how you think the preference synthesizer thing is supposed to work?
zhukeepa
Well, yeah. An idealized version would be like a magic box that's able to take in a bunch of people with conflicting preferences about how they ought to coordinate (for example, how they should govern their society), figure out a synthesis of their preferences, and communicate this synthesis to each person in a way that's agreeable to them.
...
Ben Pace
Okay. So, you want a preference synthesizer, or like a policy-outputter that everyone's down for?
zhukeepa
Yes, with a few caveats, one being that I think preference synthesis is going to be a process that unfolds over time, just like truth-seeking dialogue that bridges different worldviews.
...
zhukeepa
Yeah. I think the thing I'm wanting to say right now is a potentially very relevant detail in my conception of the preference synthesis process, which is that to the extent that individual people in there have deep blind spots that lead them to pursue things that are at odds with the common good, this process would reveal those blind spots while also offering the chance to forgive them if you're willing to accept it and change.
I may be totally off, but whenever I read you (zhukeepa) elaborating on the preference synthesizer idea I kept thinking of democratic fine-tuning (paper: What are human values, and how do we align AI to them?), which felt like it had the same vibe. It's late night here so I'll butcher their idea if I try to explain them, so instead I'll just dump a long quote and a bunch of pics and hope you find it at least tangentially relevant:
We report on the first run of “Democratic Fine-Tuning” (DFT), funded by OpenAI. DFT is a democratic process that surfaces the “wisest” moral intuitions of a large population, compiled into a structure we call the “moral graph”, which can be used for LLM alignment.
- We show bridging effects of our new democratic process. 500 participants were sampled to represent the US population. We focused on divisive topics, like how and if an LLM chatbot should respond in situations like when a user requests abortion advice. We found that Republicans and Democrats come to agreement on values it should use to respond, despite having different views about abortion itself.
- We present the first moral graph, generated by this sample of Americans, capturing agreement on LLM values despite diverse backgrounds.
- We present good news about their experience: 71% of participants said the process clarified their thinking, and 75% gained substantial respect for those across the political divide.
- Finally, we’ll say why moral graphs are better targets for alignment than constitutions or simple rules like HHH. We’ll suggest advantages of moral graphs in safety, scalability, oversight, interpretability, moral depth, and robustness to conflict and manipulation.
In addition to this report, we're releasing a visual explorer for the moral graph, and open data about our participants, their experience, and their contributions.
...
Our goal with DFT is to make one fine-tuned model that works for Republicans, for Democrats, and in general across ideological groups and across cultures; one model that people all around the world can all consider “wise”, because it's tuned by values we have broad consensus on. We hope this can help avoid a proliferation of models with different tunings and without morality, fighting to race to the bottom in marketing, politics, etc. For more on these motivations, read our introduction post.
To achieve this goal, we use two novel techniques: First, we align towards values rather than preferences, by using a chatbot to elicit what values the model should use when it responds, gathering these values from a large, diverse population. Second, we then combine these values into a “moral graph” to find which values are most broadly considered wise.
Example moral graph, which "charts out how much agreement there is that any one value is wiser than another":
Also, "people endorse the generated cards as representing their values—in fact, as representing what they care about even more than their prior responses. We paid for a representative sample of the US (age, sex, political affiliation) to go through the process, using Prolific. In this sample, we see a lot of convergence. As we report further down, people overwhelmingly felt well-represented with the cards, and say the process helped them clarify their thinking", which is why I paid attention to DFT at all.
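For concreteness, here is a minimal sketch of what a "moral graph" data structure might look like, going off the description above: a directed graph whose edges record how much agreement there is that one value is wiser than another. The value names, counts, and the simple net-score aggregation are hypothetical placeholders, not the actual DFT methodology.

```python
from collections import defaultdict

class MoralGraph:
    def __init__(self):
        # edges[(a, b)] = number of participants judging value b wiser than value a
        self.edges = defaultdict(int)

    def add_judgment(self, less_wise, wiser, count=1):
        self.edges[(less_wise, wiser)] += count

    def wisdom_scores(self):
        """Score each value by its net incoming 'wiser than' judgments."""
        scores = defaultdict(int)
        for (a, b), n in self.edges.items():
            scores[b] += n
            scores[a] -= n
        return dict(scores)

g = MoralGraph()
# Hypothetical example values, in the spirit of the abortion-advice case:
g.add_judgment("follow the rules", "understand the person's situation", 40)
g.add_judgment("understand the person's situation", "help them find their own answer", 25)

scores = g.wisdom_scores()
assert max(scores, key=scores.get) == "help them find their own answer"
```

A real implementation would need a more careful aggregation than net scores (the report discusses robustness to manipulation), but the shape of the structure is the point here.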
Yeah, I also see broad similarities between my vision and that of the Meaning Alignment people. I'm not super familiar with the work they're doing, but I'm pretty positive on the little bits of it I've encountered. I'd say that our main difference is that I'm focusing on ungameable preference synthesis, which I think will be needed to robustly beat Moloch. I'm glad they're doing what they're doing, though, and I wouldn't be shocked if we ended up collaborating at some point.
Really appreciated this exchange, Ben & Alex have rare conversational chemistry and ability to sense-make productively at the edge of their world models.
I mostly agree with Alex on the importance of interfacing with extant institutional religion, though less sure that one should side with pluralists over exclusivists. For example, exclusivist religious groups seem to be the only human groups currently able to reproduce themselves, probably because exclusivism confers protection against harmful memes and cultural practices.
I'm also pursuing the vision of a decentralized singleton as an alternative to Moloch or turnkey totalitarianism, although it's not obvious to me how the psychological insights of religious contemplatives are crucial here, rather than skilled deployment of social technology like the common law, nation states, mechanism design, cryptography, recommender systems, LLM-powered coordination tools, etc. Is there evidence that "enlightened" people, for some sense of "enlightened", are in fact better at cooperating with each other at scale?
If we do achieve existential security through building a stable decentralized singleton, it seems much more likely that it would be the result of powerful new social tech, rather than the result of intervention on individual psychology. I suppose it could be the result of both with one enabling the other, like the printing press enabling the Reformation.
In other words, there's a question about how to think about truth in a way that honors perspectivalism while also not devolving into relativism. And the way Jordan and I were thinking about this was to have each filter bubble -- with its own standards of judgment for what's true and what's good -- be fed the best content from the other filter bubbles by the standards within each filter bubble, rather than the worst content, which is more like what we see with social media today.
Seems like Monica Anderson was trying to do something like that with BubbleCity. (pdf, podcast)
Introduction from Ben
Zhukeepa is a LessWronger who I respect and whose views I'm interested in. In 2018 he wrote the first broadly successful explication of Paul Christiano's research ideas for AI alignment, has spent a lot of time interviewing people in AI about their perspectives, and written some more about neuroscience and agent foundations research. He came first in the 2012 US Math Olympiad, and formerly worked on a startup called AlphaSheets that raised many millions of dollars and then got acquihired by Google.
He has also gone around saying (in my opinion) pretty silly-sounding things like he believes in his steelman of the Second Coming of Christ. He also extols the virtues of various psychedelics, and has done a lot of circling and meditation. As a person who thinks most religions are pretty bad for the world and would like to see them die, and thinks many people trick themselves into false insights with spiritual and psychological practices like those Alex has explored, I was interested in knowing what this meant to him and why he was interested in it, and get a better sense of whether there's any value here or just distraction.
So we sat down for four 2-hour conversations over the course of four weeks, either written or transcribed, and have published them here as an extended LessWrong dialogue.
I think of this as being more of an interview about Zhukeepa's perspective, with me learning and poking at various parts of it. While I found it interesting throughout, this is a meandering conversation that many may prefer to skip unless they too are especially curious about Zhukeepa's perspective or have a particular interest in the topics discussed. You can skim through the table of contents on the left to get a sense of the discussion, and also read Zhu's introductory thoughts immediately below.
Introduction from Alex
Despite the warnings and admonishments against doing so, I’d decided 5 years ago to venture off to the Dangerous Foreign Land of Religion and Spirituality, after becoming convinced that something in that land was crucial for thinking clearly about AI alignment and AI coordination. Since embarking on that journey, I’ve picked up a lot of customs and perspectives that the locals here on LessWrong are highly suspicious of.
A few months ago, I caught up on a walk with my old friend Ben Pace, who expressed a lot of skepticism toward my views but nevertheless remained kind, patient, respectful, and curious about understanding where I was coming from. He also seemed to have a lot of natural aptitude in making sense of my views. This gave me hope that:
… which is what motivated me to begin an extended series of dialogues with him. Below are a couple of excerpts going into some points that I'm particularly glad surfaced over the course of this dialogue:
[...]
[...]
I also managed to clarify a couple of my core beliefs over the course of this dialogue, that I'd like to summarize here:
Conversation 1 — April 6th 2024
Alex and Ben had gone on a walk to discuss religion, and decided to continue the discussion over a LessWrong dialogue.
Alex's steelman of the Second Coming of Christ
Why is Alex interested in religions in the first place?
What has Alex gotten from religion?
Resentment, forgiveness, and acceptance
Conversation 2 – April 13th 2024
At the next meeting we recorded audio and had it transcribed, and then lightly edited. This was 2 hours, and Ben was quite underslept.
What Alex gets from religious stories
Orientation toward death
What useful things do old religious stories have to say about how to live life well today?
Forgiving yourself, and devoting your life to something
Changing the minds of all of humanity / steel-Arbital
How Alex thinks about changing the world
Coordinating with the rest of the world sounds annoying
Maybe religious pluralists are sane?
<bathroom break>
A mathematical synthesis of religious metaphysics?
Conversation 3 — April 20th, 2024
We continued our dialogue over Zoom on April 20th 2024. The audio was also transcribed and lightly edited.
On the nature of evil
Religious prophets vs the Comet King on the problem of evil
Evil is like cancer, maybe
Mistake vs conflict theory on cancer / evil
Forgiving evil in ourselves
Does mistake theory toward sociopaths really make sense?
Rescuing old religions vs competing with old religions
Alex doesn't personally find institutionalized religion compelling
Mistake vs conflict theory toward existing religions
Religious pluralism coalition memetically outcompeting religious exclusivism
Why Alex doesn't want to start a new religion
Does the "Allah will protect me" story suck?
Jesus's empirical claim about evil
Alex doesn't want to tell people how to live
In closing
Conversation 4 — April 28th 2024
We continued our dialogue over Zoom on April 28th, 2024. The audio was also transcribed and lightly edited.
Intro
Apparently: Agent Foundations = Religious Metaphysics = SteelArbital
Is Steel-Arbital harder than coordinating around not building AGI?
Steel-Arbital as a preference synthesizer
Killing Moloch with a decentralized singleton running on Steel-Arbital
(What would you do with a trusted preference synthesizer?)
Steel-Arbital should reveal and forgive blind spots
Religion as a tool for coordinating with the masses
Steel-Arbital vs religious visions for a fixed world
What religions say about starting new religions
Revisiting agent foundations vs religious metaphysics vs steel-Arbital
Cruxes around coordinating with the masses
Ben finally gets why Zhu doesn't want to start a new religion
Going forward