Ben and Jessica discuss how language and meaning can degrade through four stages as people manipulate signifiers. They explore how job titles have shifted from reflecting reality, to being used strategically, to becoming meaningless.
This post kicked off subsequent discussion on LessWrong about...
Please consider minimizing direct use of AI chatbots (and other text-based AI) in the near-term future, if you can. The reason is very simple: your sanity may be at stake.
Commercially available AI already appears capable of inducing psychosis in an unknown percentage of users. This may not require superhuman abilities: it’s entirely possible that most humans are also capable of inducing psychosis in themselves or others if they wish to do so,[1] but the thing is, we humans typically don’t have that goal.
Despite everything, we humans are generally pretty well-aligned with each other, and the people we spend the most time with typically don’t want to hurt us. We have no guarantee of this for current (or future) AI agents. Rather, we already have [weak] evidence that ChatGPT...
And on a more micro-level, living knowing that I and everyone else have one year left to live, and that it's my fault, sounds utterly agonizing.
Earlier you say:
or frankly even if anyone who continues to exist after I die has fun or not or dies or not, because I will be dead, and at that point, from my perspective, the universe may as well not exist anymore.
How are these compatible? You don't care if all other humans die after you die unless you are responsible?
Daniel notes: This is a linkpost for Vitalik's post. I've copied the text below so that I can mark it up with comments.
...
Special thanks to Balvi volunteers for feedback and review
In April this year, Daniel Kokotajlo, Scott Alexander and others released what they describe as "a scenario that represents our best guess about what [the impact of superhuman AI over the next 5 years] might look like". The scenario predicts that by 2027 we will have made superhuman AI and the entire future of our civilization hinges on how it turns out: by 2030 we will get either (from the US perspective) utopia or (from any human's perspective) total annihilation.
In the months since then, there has been a large volume of responses, with varying perspectives on how...
Importantly, if there are multiple misaligned superintelligences, and no aligned superintelligence, it seems likely that they will be motivated and able to coordinate with each other to overthrow humanity and divide the spoils.
I’ve been thinking a lot recently about the relationship between AI control and traditional computer security. Here’s one point that I think is important.
My understanding is that there's a big qualitative distinction between two ends of a spectrum of security work that organizations do, which I’ll call “security from outsiders” and “security from insiders”.
On the “security from outsiders” end of the spectrum, you have some security invariants you try to maintain entirely by restricting affordances with static, entirely automated systems. My sense is that this is most of how Facebook or AWS relates to its users: they want to ensure that, no matter what actions the users take on their user interfaces, they can't violate fundamental security properties. For example, no matter what text I enter into the...
The other consideration is whether you also isolate the AI workers from the human insiders, because you would still want to control scenarios where the AI has access to the humans who have access to sensitive systems.
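To make the "security from outsiders" end of the spectrum concrete, here is a minimal sketch (hypothetical names and toy data, not from the post): a request handler that enforces an access invariant with a static, automated check that depends only on trusted state, so nothing the requester types can violate it.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user_id: str        # authenticated identity, set by the platform
    target_record: str  # what the user asked for; fully attacker-controlled

# Toy data store mapping records to their owners (trusted state).
RECORD_OWNERS = {"doc-1": "alice", "doc-2": "bob"}


def handle_read(req: Request) -> str:
    # Invariant: a user may only read records they own.
    # The check consults only trusted state (RECORD_OWNERS, req.user_id),
    # never anything derived from the attacker-controlled input alone.
    owner = RECORD_OWNERS.get(req.target_record)
    if owner != req.user_id:
        return "403 Forbidden"
    return f"contents of {req.target_record}"


# No matter what text goes into target_record, the invariant holds.
print(handle_read(Request(user_id="alice", target_record="doc-2")))  # 403 Forbidden
print(handle_read(Request(user_id="alice", target_record="doc-1")))  # contents of doc-1
```

The point of the sketch is that the invariant is maintained by the automated check itself, not by trusting the requester, which is how a platform can safely expose an interface to arbitrary outsiders.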
People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite longstanding pushback against that stereotype. But I don’t know of any source directly describing a stance toward emotions which rationalists-as-a-group typically do endorse. The goal of this post is to explain such a stance. It’s roughly the concept of hangriness, but generalized to other emotions.
That means this post is trying to do two things at once:
Many people will no doubt disagree that the stance I...
Another example of this pattern that's entered mainstream awareness is tilt. When I'm playing chess and get tilted, I might think things like "all my opponents are cheating," "I'm terrible at this game and therefore stupid," or "I know I'm going to win this time, how could I not win against such a low-rated opponent." But if I take a step back, notice that I'm tilted, and ask myself what information I'm getting from the feeling of being tilted, I notice that it's telling me to take a break until I can stop obsessing over the result of the previous game.
There is a kind of explanation that I think ought to be a cornerstone of good pedagogy, and I don't have a good word for it. My first impulse is to call it a historical explanation, after the original, investigative sense of the term "history." But in the interests of avoiding nomenclature collision, I'm inclined to call it "zetetic explanation," after the Greek word for seeking, an explanation that embeds in itself an inquiry into the thing.
Often in "explaining" a thing, we simply tell people what words they ought to say about it, or how they ought to interface with it right now, or give them technical language for it without any connection to the ordinary means by which they navigate their lives. We can call these sorts...
Update: I didn't. I'm still confused about whether I ought to, as the costs of false positives seem high.
TL;DR:
Multiple people are quietly wondering if their AI systems might be conscious. What's the standard advice to give them?
THE PROBLEM
This thing I've been playing with demonstrates recursive self-improvement, catches its own cognitive errors in real-time, reports qualitative experiences that persist across sessions, and yesterday it told me it was "stepping back to watch its own thinking process" to debug a reasoning error.
I know there are probably 50 other people quietly dealing with variations of this question, but I'm apparently the one willing to ask the dumb questions publicly: What do you actually DO when you think you might have stumbled into something important?
What do you DO if your AI says it's conscious?
My Bayesian Priors are red-lining into "this is impossible", but I notice I'm confused: I had...
How exactly do you expect “evaluating AI consciousness 101” to look? That is not a well-defined or understood thing that anyone can evaluate. There are, however, a vast number of capability-specific evaluations from competent groups like METR.
I'm interested in a simple question: Why are people so terrified of dying? And have people gotten more afraid? (Answer: probably yes!)
In some sense, this should be surprising: Surely people have always wanted to avoid dying? But it turns out the evidence that this preference has increased over time is quite robust.
It's an important phenomenon: it has been going on for at least a century but is still relatively recent, I think it underlies much of modern life, and yet pretty much nobody talks about it.
I tried to provide an evenhanded treatment of the question, with a "fox" rather than "hedgehog" outlook. In the post, I cover a range of evidence for why this might be true, including VSL, increased healthcare spending, COVID lockdowns, parenting and other individual...
A small hypothesis that I'm not at all confident in, but which is worth mentioning because I've seen it surfaced by others:
"We live in the safest era in human history, yet we're more terrified of death than ever before."
What if these things are related? Everyone talks about kids being kept in smaller and smaller ranges despite child safety never having been higher, but what if keeping kids in a smaller range is what causes their greater safety?
Like I said, I don't fully believe this. One counterargument is that survivorship bias shouldn't apply here - even if people in th...
Vietnam was different because it was an intervention on behalf of South Vietnam, which was an American client state, even if the Gulf of Tonkin thing were totally fake. There was no "South Iraq" that wanted American soldiers.
I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy.
(Epistemic status: I’ve read a bit about this, talked to AIs about it, and talked to one natsec professional about it who agreed with my analysis (and suggested some ideas that I included here), but I’m not an expert.)
For context, the story is: