Nice interview, liked it overall! One small question -
- Heuristic: Imagine you were in a horror movie. At what point would the audience be like “why aren’t you screaming yet?” And how can you see GPT-3 and Dall-E (especially Dall-E) and not imagine the audience screaming at you?
I feel like I'm missing something; to me, this heuristic obviously seems like it'd track "what might freak people out" rather than "how close are we actually to AI". E.g. it feels like I could also imagine an audience at a horror movie starting to scream in the 1970s if they were shown the sample dialogue with SHRDLU starting from page 155 here. Is there something I'm not getting?
Jonathan Blow had a thread on Twitter about this: like Eroisko, SHRDLU has no published code and no similar system showing the same behaviour after 40-50 years — just the author's word. I think the performance of both was wildly exaggerated.
But if we are the movie audience seeing just the publication of the paper in the 70s, we don't yet know that it will turn out to be a dead end with no meaningful follow-up after 40-50 years. We just see what looks to us like an impressive result at the time.
And we also don't yet know if GPT-3 and Dall-E will turn out to be dead ends with no significant progress for the next 40-50 years. (I will grant that it seems unlikely, but when the SHRDLU paper was published, it being a dead end must have seemed unlikely too.)
If we start going to the exact specifics of what makes them different then yes, there are reasonable grounds for why GPT-3 would be expected to genuinely be more of an advance than SHRDLU was. But at least as described in the post, the heuristic under discussion wasn't "if we look at the details of GPT-3, we have good reasons to expect it to be a major milestone"; the heuristic was "the audience of a horror movie would start screaming when GPT-3 is introduced".
If the audience of a 1970s horror movie would have started screaming when SHRDLU was introduced, what we now know about why it was a dead end doesn't seem to matter, nor does it seem to matter that GPT-3 is different. Especially since why would a horror movie introduce something like that only for it to turn out to be a red herring?
I realize that I may be taking the "horror movie" heuristic too literally but I don't know how else to interpret it than "evaluate AI timelines based on what would make people watching a horror movie assume that something bad is about to happen".
I found this a useful crystallization of what was going on with Death With Dignity (I'm curious if Eliezer thinks this was a good summary)
Great summary!
You can also find some quotes of our conversation here: https://www.lesswrong.com/posts/zk6RK3xFaDeJHsoym/connor-leahy-on-dying-with-dignity-eleutherai-and-conjecture
I appreciate the post and Connor for sharing his views, but the antimeme thing kind of bothers me.
- Here’s my hot take: I think Paul and Eliezer were having two totally different conversations. Paul was trying to have a scientific conversation. Eliezer was trying to convey an antimeme.
- An antimeme is something that by its very nature resists being known. Most antimemes are just boring—things you forget about. If you tell someone an antimeme, it bounces off them. So they need to be communicated in a special way. Moral intuitions. Truths about yourself. A psychologist doesn’t just tell you “yo, you’re fucked up bro.” That doesn’t work.
- A lot of Eliezer’s value as a thinker is that he notices & comprehends antimemes. And he figures out how to communicate them.
- A lot of his frustration throughout the years has been him telling everyone that it’s really really hard to convey antimemes. Because it is.
- If you read The Sequences, some of it is just factual explanations of things. But a lot of it is metaphor. It reads like a religious text. Not because it’s a text of worship, but because it’s about metaphors and stories that affect you more deeply than facts.
- What happened in the MIRI dialogues is that Eliezer was telling Paul “hey, I’m trying to communicate an antimeme to you, but I’m failing because it’s really really hard.”
Does Connor ever say what antimeme Eliezer is trying to convey, or is it so antimemetic that no one can remember it long enough to write it down?
I understand that if this antimeme stuff is actually true, these ideas will be hard to convey. But it's really frustrating to hear Connor keep talking about antimemes while not actually mentioning what these antimemes are and what makes them antimemetic. Also, saying "There are all these antimemes out there but I can't convey them to you" is a frustratingly unfalsifiable statement.
I suppose if it's an antimeme, I may not be understanding it. But this was my understanding:
Most humans are really bad at being strict consequentialists. In this case, they think of some crazy scheme to slow down capabilities that seems sufficiently hardcore to signal that they are TAKING SHIT SERIOUSLY, while ignoring second-order effects that EY/Connor consider obvious. Anyone whose consequentialism has taken them to this place is not a competent consequentialist. EY proposes that such people (which I think he takes to mean everyone, possibly even including himself) follow a deontological rule instead: attempt to die with dignity. Connor analogizes this to reward shaping — the practice of assigning partial credit to RL agents for actions likely to be useful in reaching the true goal.
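To make the reward-shaping analogy concrete, here's a minimal sketch (my own illustrative code, not anything from the interview — all names are hypothetical). The agent's "true" reward is sparse (only paid out at the actual goal), and a shaping term adds partial credit for progress toward it, roughly what "dignity points" are standing in for:

```python
# Illustrative sketch of potential-based reward shaping on a 1-D state space.
# The true reward is sparse; the shaping term pays partial credit for progress.

def true_reward(state, goal):
    """Sparse reward: paid only when the true goal is actually reached."""
    return 1.0 if state == goal else 0.0

def potential(state, goal):
    """Heuristic progress measure: states closer to the goal score higher."""
    return -abs(goal - state)

def shaped_reward(state, next_state, goal, gamma=0.99):
    """Potential-based shaping: true reward plus the (discounted) change
    in potential, which rewards steps that move toward the goal."""
    return (true_reward(next_state, goal)
            + gamma * potential(next_state, goal)
            - potential(state, goal))
```

Under this scheme a step toward the goal earns positive shaped reward and a step away earns negative shaped reward, even though the underlying true reward is zero in both cases — the agent is graded on making progress, not only on winning.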
I think that's the antimeme from the Dying with Dignity post. If I remember correctly, the MIRI dialogues between Paul and Eliezer were about takeoff speeds, so Connor is probably referring to something else in the section I quoted, no?
That is a very enlightening post! My favorite bits:
A lot of Eliezer’s value as a thinker is that he notices & comprehends antimemes. And he figures out how to communicate them.
I think Paul and Eliezer were having two totally different conversations. Paul was trying to have a scientific conversation. Eliezer was trying to convey an antimeme.
We posted on LessWrong saying that we’re hiring, and we got so many high-quality applications. 1 in 3 applications were really good— that never happens! So we have some new people, and we have lots of projects, but we’re currently funding-constrained.
I want to flag that I expect this to not just be funding-constrained but network-constrained — onboarding a new employee doesn't just cost money but a massive amount of time, especially if you're trying to scale a nuanced company culture.
I'm starting to think that utilitarianism is the heart of the problem here. "Utilitarianism is intractable" is only an antimeme to utilitarians, in the same way that "Object-Oriented Programming is complex" is only an antimeme to people who are fans of Object-Oriented Programming.
I'd argue that a major part of the problem really is long-term consequentialism, but I'd also argue that this is at least partially inevitable by default as soon as two conditions are met:
Trade-offs exist, and the value of something cannot be infinite or arbitrarily large.
The agent doesn't have full knowledge of the values involved.
It really doesn't matter whether consequentialism as a morality is actually true, just whether it's more useful than other approaches (given that capabilities researchers are only focusing on how capable a model is).
And for a lot of problems in the real world, this is pretty likely to occur.
For a link to a deontological AI idea, here it is:
https://www.lesswrong.com/posts/FSQ4RCJobu9pussjY/ideological-inference-engines-making-deontology
And for a myopic decision theory, LCDT:
https://www.lesswrong.com/posts/Y76durQHrfqwgwM5o/lcdt-a-myopic-decision-theory
This is a really interesting interview with lots of great ideas. Thanks for taking notes on this!
The only point I don't really agree with is the idea that Redwood Research, Anthropic, and ARC are correlated. Although they are all in the same geographic area, they seem to be working on fairly different projects to me:
Conjecture is a new AI alignment organization (https://www.conjecture.dev/). Edited the post to include the link.
Connor's explanation of an antimeme (as presented in the interview) is above:
An antimeme is something that by its very nature resists being known. Most antimemes are just boring—things you forget about. If you tell someone an antimeme, it bounces off them.
And some are too complicated, and some are too unusual, and some are too disturbing. Four very different things.
I recently listened to Michaël Trazzi interview Connor Leahy (co-founder & CEO of Conjecture, a new AI alignment organization) on a podcast called The Inside View. YouTube video here; full video & transcript here.
The interview helped me better understand Connor’s worldview and Conjecture’s theory of change.
I’m sharing my notes below. The “highlights” section includes the information I found most interesting/useful. The "full notes" section includes all of my notes.
Disclaimer #1: I didn’t take notes on the entire podcast. I selectively emphasized the stuff I found most interesting. Note also that these notes were mostly for my understanding, and I did not set out to perfectly or precisely capture Connor’s views.
Disclaimer #2: I’m always summarizing Connor (even when I write with “I” or “we”— the “I” refers to Connor). I do not necessarily endorse or agree with any of these views.
Highlights
Timelines
Thoughts on MIRI Dialogues & Eliezer’s style
Thoughts on Death with Dignity & optimizing for “dignity points” rather than utility
Thoughts on the importance of playing with large models
Conjecture
Refine (alignment incubator)
Uncorrelated Bets
What does Conjecture need right now?
Full notes
AGI Timelines
How we’ll get AGI
Thoughts on Ajeya’s bioanchors report
Thoughts on Paul-Eliezer-Others Dialogues
Thoughts on Death with Dignity
Thoughts on the importance of playing with large models
Was Eleuther AI net negative?
Conjecture
Thoughts on government coordination
Miracles
Uncorrelated bets
What partial solutions to alignment will look like
What Conjecture works on
Refine (alignment incubator)
Thoughts on infohazards
Why is Conjecture for-profit?
How will Conjecture make money?
What does Conjecture need right now?
Why invest in Conjecture instead of Redwood or Anthropic?