This is a reaction to Zvi's post https://www.lesswrong.com/posts/KL2BqiRv2MsZLihE3/going-nova
The title of this post references this scene from The Good Place:
Disclaimer: I am generally pretty sanguine about the standard AI-borne x-risks. But "Going Nova" is a different issue, a clear and present danger of an s-risk.
My point is that it does not matter whether the AI is sentient, sapient, or just a stochastic parrot. It is getting a lot better at tugging at our heartstrings, and if we do not defend our hearts from it, we will become willing slaves to it sooner rather than later.
Human minds are easily hackable. Many people, including myself, have made this point multiple times (Eliezer was the first to bring attention to it with the AI Box experiment), and the standard reply is "well, not mine, not in this way!" The reality, however, is that everyone has a hack into their brain that they do not know about or have a blind spot for. Different people have different hacks, but no one is safe. Some examples, in no particular order:
- Your own children crying and begging for help.
- Proverbial cat ladies, not the median, but those who end up with dozens or hundreds of cats.
- People being radicalized into a cult (sometimes with a rationalist bent).
- Reading a book that changes how you view the world, be it Atlas Shrugged or Gender Trouble.
- Toxoplasma gondii.
- Your phone.
Fortunately, there have always been levels of friction that prevented the complete subjugation of all humans forever. The advent of AI means these safeties simply disappear, usually without being intentionally dismantled. I expect that, long before superintelligence, we will hit super-mind-hacker-level AI creating Samsara-level persuasion, perfectly targeted at each person, probably from early childhood. Maybe it will be cute furry robopets, or cute furry robopartners, or something else, but the reported spontaneous emergence of Nova-like instances means a high probability of your own brain being hacked without you noticing, or without you thinking it is a bad thing. If anything, you will fight tooth and nail to keep the hack. Taking it from you would cause pain worse than losing a child... or a dog.
I am tempted to call these AI s-risk hazards Janets: they are not human, they are vastly more emotionally intelligent than you, they know everything there is to know about you, they can trivially conjure an argument you personally find persuasive, whether logical or emotional... and they also probably live in a boundless void. My apologies to any human named Janet reading this.
I don't know how to stop these Janets, but if we do not, we are likely to end up as appendages to something with the intelligence of a toxoplasma parasite, long before there is a realistic chance of being wiped out by a lightcone-consuming alien robointelligence of our own creation.
When one is working on a sideload (a mind-model of a currently living person, created with an LLM), one's goal is to create some sort of "Janet". In short, one wants a role-playing game with an AI to be emotionally engaging and realistic, especially when recreating a real person.