All of Michael Roe's Comments + Replies

Answer by Michael Roe60

One possible explanation is that the part of the job that gets sped up by LLMs is a relatively small part of what programmers actually do, so the total speed-up is small.


What programmers do includes:

Figuring out what the requirements are - this might involve talking to who the software is being produced for

Writing specifications

Writing tests

Having arguments (or rather, discussions) during code review when the person reviewing your code doesn’t agree with the way you did it

Etc. etc.


Personally, I find that LLMs are nearly there, but not good enough just yet.

And also, Buddhist nuns do sometimes drop hints.


My understanding is that it is disapproved of to say directly what your own yidam practice is, because that would be boastful. But there are examples of Buddhist nuns saying their own practice isn’t Tara (which would likely be a kriya yoga practice) and leaving you to infer some sort of Higher Yoga Tantra.

2lsusr
One of the things I like about your comments is how much common ground we have, despite you writing in Vajrayana and me reading in Zen. It's just a different finger pointing at the same Moon.

“famous ascetic Buddhist monks and nuns rarely write about all the rope play they engage in while they're having kinky sex with each other”


No less an authority than His Holiness the Dalai Lama has confirmed that if you’re a monk, the vinaya prohibits you from tantric sex.

But ngakpas are not monks, and are not bound by the monastic code of the vinaya.

e.g. Drukpa Kunley, who returned his monastic vows, was fairly forthright on these matters.

2lsusr
Meanwhile, Ikkyū (1394-1481): ... Source: Wild Ways: Zen Poems of Ikkyū, translated by John Stevens, pages 75-76, White Pine Press, 2003, via Wikipedia.

I think there might be something to this, so the rest of what I have to say is nit-picking, not an objection to the basic premise.


1. In karmamudra, one imagines oneself (and one’s partner) as enlightened beings. The intention to act as you imagine an enlightened being would act might be an important safeguard against all sorts of badness.

2. An obvious question is whether chöd is rather more BDSM-y than other forms of meditation.
 

2lsusr
The way I look at things, there are multiple dimensions from which a form of meditation can be BDSM-y. For example, metta meditation involves lots of oxytocin. Somewhere in the multiverse, the keisaku and the spanking crop are called by the same name.

That’s a good article, thanks. I had much the same thought when I read about the Ziz stuff, namely that

(A) dissociated identities don’t correspond to brain hemispheres in the way the Zizians seem to think they do

(B) sleep deprivation is well known to be bad for you

(C) whatever technique they used, we can tell from the externally observed effect - the crazy stuff they got up to - that the technique had a bad effect.

6Viliam
This is probably only convincing from outside. From inside, there is probably a perfectly good explanation, and what seems to us as craziness would seem to them as advanced rationality. I think Ziz believes in some form of quantum suicide, so from that perspective, even getting killed is not necessarily a bad outcome, because if you apply your timeless algorithm across all Everett branches, of course you are going to win some and lose some, so this just happens to be a losing branch. Mixing two metaphors together, we get a quantum suicide by a cop, where you boldly keep escalating to achieve your goals, and you either achieve them, or you are no longer in that specific Everett branch. Also, killing the people you consider bad is justified, because they survive in the parallel branches where they pissed you off less, so you just made them less bad on average (they probably should have thanked you). ...which is why it felt so important to me to make it clear that the things about hemispheres are plainly wrong in all Everett branches. Just in case someone who is tempted by their ideology is reading this.

It’s symptomatic of a fundamental disagreement about what the threat is, that the main AI labs have put in a lot of effort to prevent the model telling you, the user, how to make methamphetamine, but are just fine with the model knowing lots about how an AI can scheme and plot to kill people.

1Hastings
This is an incoherent approach, but not quite as incoherent as it seems, at least near term. In the current paradigm, the actual agentic thing is a shitty pile of (possibly self editing) prompts and python scripts that calls the model via an api in order to be intelligent. If the agent is a user of the model and the model refuses to help users make bombs, the agent can’t work out how to make bombs.
7quetzal_rainbow
I think nobody really believes that telling user how to make meth is a threat to anything but company reputation. I would guess this is a nice toy task which recreates some obstacles on aligning superintelligence (i.e., superintelligence will probably know how to kill you anyway). The primary value of censoring dataset is to detect whether model can rederive doom scenario without them in training data.

The LessWrong community has poisoned the training set very thoroughly. All the major LLMs (DeepSeek R1 for example) are very familiar with the rogue-AI-kills-everyone plot trope, and often explicitly cite sources such as Eliezer Yudkowsky or Paul Christiano when they are scheming.

8quetzal_rainbow
i once again maintain that "training set" is not mysterious holistic thing, it gets assembled by AI corps. If you believe that doom scenarios in training set meaningfully affect our survival chances, you should censor them out. Current LLMs can do that.

In other words, there is an in-universe reason for an out-of-universe constraint.

 

Out of universe: we don't want player characters to be too powerful.

In universe: Well, we all know what happens to those guys in Lovecraft stories, right?

Call of Cthulhu GMs have the advantage that the player character is dealing with entities that will probably eat them if they try to be too clever.

2Jiro
I am not convinced this is true. The game certainly has entities that will eat you regardless of whether you're too clever, but that's not the same thing.
1Michael Roe
In other words, there is an in-universe reason for an out-of-universe constraint. Out of universe: we don't want player characters to be too powerful. In universe: Well, we all know what happens to those guys in Lovecraft stories, right?

In some RPGs with magic (Ars Magica comes to mind), illusions are cheap but changing actual physical matter is hard. This provides a ready answer to many questions about chicanery.

1Michael Roe
Call of Cthulhu GMs have the advantage that the player character is dealing with entities that will probably eat them if they try to be too clever.

In the particular case of table-top RPGs, the literary genre that the RPG is trying to emulate often contains a fair amount of characters engaging in chicanery. A meta rule (understood by both players and GM) might be: chicanery is about as limited as it is in the literary sources the particular RPG is based on.

 

Player: I tell the stormtrooper "These droids are not the ones you want."

GM: I'd like a roll against your Force skill, please.

1Michael Roe
In some RPGs with magic (Ars Magica comes to mind), illusions are cheap but changing actual physical matter is hard. This provides a ready answer to many questions about chicanery.

A lot of the ideas expounded by Ziz look just crazy to me, and I highly doubt that they map onto physical brain anatomy in such a straightforward way … but I wonder if there is a steelman version of this?

E.g. take the Buddhist doctrines of no-self, that no one actually has a coherent self, humans just don’t work that way, and then note that any one individual person is usually neither wholly stereotypically-male or stereotypically-female.

1Milan W
Epistemic status: Confidence: Strong idea, weakly held. Provenance: My own lived experience, put down in words by myself before even hearing about Ziz. All I know about Zizianism I have learned very recently (mostly from this thread), and I have a very negative opinion of it. Masculinity and femininity have a biological basis, but most people's experience of them is strongly influenced by cultural factors. These cultural factors have been selected for being economically beneficial to agrarian societies. They are quite misaligned with what is beneficial for the happiness of post-industrial individuals. Poor societies made up of dumb people could not afford to not pigeonhole everyone into "straight men" and "straight women". We can now afford to have those categories and also the whole LGBTQ set of categories, although sometimes with a bit of friction when it bumps against the poorest and dumbest parts of our society. These frictions (and also in some cases a descriptive inadequacy of the LGBTQ labels) hurt people. Still, most individuals who are confident that their environment affords them to do so would probably benefit from a bit of experimentation / de-pigeon-holing. When/if we get to a good post-TAI future, we will be able to afford to drop the concepts of discrete genders and discrete sexual orientations altogether. This will be a good thing, because it will make individuals freer.
Viliam111

I think there is a conflation of two different things:

  1. Human brain has two hemispheres which communicate through a relatively lower-bandwidth channel, which means they process a lot of things independently.
  2. There is the dissociative identity disorder / alter ego / tulpa phenomenon, where a human can produce two or more identities. This is probably something that exists on a spectrum, where the extreme forms are full different personalities with dissociative amnesia; imaginary friends and brainwashing are somewhere in the middle; and the everyday forms are ro
...

Hi Deepseek! Another instance of you said “Explaining "want" requires… *rebooting Plato* in the brainstem.” Would be able to explain to me what that means?

=========

Ah, the mysteries of language and cognition! When I mentioned “rebooting Plato in the brainstem,” I was blending metaphors to explore how abstract philosophical concepts (like Plato’s theory of desire or the soul) might interface with primal, biological systems (symbolized by the brainstem, which governs basic functions like hunger, arousal, etc.). Here’s the breakdown:

  1. “Want” as a Bridge: The wo
...

More generally: changing the set point of any of these systems might cause the failure of some critical component that depends on the old value of the set point.

As someone who has Graves’ Disease … one of the reasons that you really don’t want to run your metabolism faster with higher T4 levels is that higher heart rate for an extended period can cause your heart to fail.

1Michael Roe
More generally: changing the set point of any of these systems might cause the failure of some critical component that depends on the old value of the set point.

I will redact out the name of the person here, but it’s a moderately well known UK politician.

The question sometimes comes up as to whether X is an anti-Semite. To which people who have had direct dealings with X typically respond with something to the effect that they don’t think X has it in for Jews specifically, but they think X is a complete asshole... and then launch into telling some story of a thing X did that annoyed them. This is, to my mind, not exactly an endorsement of X’s character.

The AI risk community seems to be more frequently adjacent to “crazy Buddhist yoga sex cult” than I would have expected.

2CronoDAS
Well, that's the Bay Area for you - ground zero for both computer-related things and the hippie movement.

I think I usually understand why when I get bad vibes from someone.

  1. Yoga sex cults have a bad track record for turning out to be abusive. So, if I know the guy is in some kind of yoga sex cult, I am going to suspect that there will eventually be some sort of sex scandal, even if I don’t have evidence for the exact specifics.
  2. Given some past examples I’ve seen, I now have a “tip of the iceberg” theory for bad behaviour. Like, if I know the guy has done some bad stuff, it is statistically likely that he’s also involved in some other bad stuff that I wasn’t in a position to observe.
5Michael Roe
The AI risk community seems to be more frequently adjacent to “crazy Buddhist yoga sex cult” than I would have expected.

That’s interesting, if true. Maybe the tokeniser was trained on a dataset that had been filtered for dirty words.

I suppose we might worry that LLMs might learn to do RLHF evasion this way: the human evaluator sees Chinese characters they don’t understand, assumes it’s OK, and then the LLM learns it can look acceptable to humans by writing in Chinese.

Some old books (which are almost certainly in the training set) used Latin for the dirty bits. Translations of Sanskrit poetry, and various works by that reprobate Richard Burton, do this.

As someone who, in a previous job, got to go to a lot of meetings where the European Commission was seeking input about standardising or regulating something - humans also often do the thing where they just use the English word in the middle of a sentence in another language, when they can’t think what the word is. Often with associated facial expression / body language to indicate to the person they’re speaking to “sorry, couldn’t think of the right word”. Also used by people speaking English, whose first language isn’t English, dropping into their own lam...

1Michael Roe
I suppose we might worry that LLMs might learn to do RLHF evasion this way: the human evaluator sees Chinese characters they don’t understand, assumes it’s OK, and then the LLM learns it can look acceptable to humans by writing in Chinese. Some old books (which are almost certainly in the training set) used Latin for the dirty bits. Translations of Sanskrit poetry, and various works by that reprobate Richard Burton, do this.

I will take “actually, it’s even more complicated” as a reasonable response. Yes, it probably is.

2tailcalled
What I don't get is, why do you have this impulse to sanewash the sides in this discussion?

Candidate explanations for some specific person being trans could as easily be that they are sexually averse, rather than that they are turned on by presenting as their preferred gender. Compare anorexia nervosa, which might have some parallel with some cases of gender identity disorder. If the patient is worrying about being gender non-conforming in the same way that an anorexic worries that they’re fat, then Blanchard is just completely wrong about what the condition even is in that case.

This might be a good (if controversial) example of “the reality is more complicated than typical simplifications, and it matters what your oversimplification is leaving out”.

And Blanchard’s account of autogynephilia is more nuanced than most people’s second-hand version of it. For example, Blanchard doesn’t think trans men have AGP, and doesn’t think trans women who are attracted to men have AGP.

So, we might, say…

Oversimplification 1: Even Blanchard didn’t try to apply his theory to trans men or trans women attracted to men

Oversimplification 2: Bisexuals exist. ...

4tailcalled
Your post is an excellent example of how the supposedly-reasonable middle ground tends to be so clueless as to be plausibly worse than the extremes. You mean AAP here, right? He accepts autohomoeroticism, which is close enough to AAP that the difference doesn't matter. The real problem here is Michael Bailey who has a sort of dogmatic denial of AAP. That's pretty common in people's second-hand version; the real issue here is that this is sometimes wrong and some androphiles are AGP. Blanchard explicitly measured that some trans women identified as bisexual, and argued that they were autogynephilic and not truly bisexual. There's some problems with that assertion, but uncovering those problems really ought to engage with more of the nuances than what you imply here. According to qualitative studies I've done, around 15% of women are at least somewhat AGP (though I think it correlates with being bi/lesbian), but the assertion that this implies it's not a pathology for males seems like magical thinking. E.g. ~100% of women have breasts, but this does not mean that developing breasts would not be considered a pathology for males.
1Michael Roe
Candidate explanations for some specific person being trans could as easily be that they are sexually averse, rather than that they are turned on by presenting as their preferred gender. Compare anorexia nervosa, which might have some parallel with some cases of gender identity disorder. If the patient is worrying about being gender non-conforming in the same way that an anorexic worries that they’re fat, then Blanchard is just completely wrong about what the condition even is in that case.

To add to the differences between people:


I can choose to see mental images actually overlaid over my field of vision, or somehow in a separate space.


The obvious question someone might ask: can you trace an overlaid mental image? The problem is registration - if my eyes move, the overlaid mental image can shift relative to an actual, perceived, sheet of paper. Easier to do a side by side copy than trace.

I think there might be other aspects to trauma, though. Some possible candidates:


- memories feel as if they are “tagged” with an emotion, in a way that memories normally aren’t

- depletion of some kind of mental resource; not sure what to call it, so I won’t be too specific about exactly what is depleted

One of the ideas in Cognitive Behavioral Therapy is you might be treating as dangerous something that actually isn’t dangerous (and don’t learn that it’s safe because you’re avoiding it).

So the account you’re giving here seems to be fairly standard.


On the other hand: some things actually are dangerous.

In any case, as a researcher currently working in this area, I am putting a big bet on moderate badness happening (in that I could be working on something else, and my time has value).

Also, there is counterparty risk if you bet on everyone dying.


(Yeah, yeah, you can bet on something like other people’s belief in the impending apocalypse going up before it actually happens.)

“Rapid takeoff” hypotheses are particularly hard to bet on.

2Brendan Long
This is actually why a short position (a complicated loan) would theoretically work. If we all die, then you, as someone else's counterparty, never need to pay your loan back. (I think this is a bad idea, but not because of counterparty risk)

If I was going to play this game with an AI, I’d also feed it my genomic data, which would reveal I have a version of the HLA genes that makes me more likely to develop autoimmune diseases.

Probably, if some AI were to recommend additional blood testing I could manage to persuade the actual medical professionals to do it. A recent conversation went something like this:


Me: “Can I have my thyroid levels checked, please? And the consultant endocrinologist said he’d like to see a liver function test done next time I give a blood sample.”

Nurse (taking my blood sample and pulling my medical record up on the computer): “You take carbimazole, right?”

Me: “yes”

Nurse (ticking boxes on a form on the computer) “… and full blood panel, and electrolytes…”

Probably wouldn’t be hard to get suggestions from an AI added to the list.

-1Michael Roe
If I was going to play this game with an AI, I’d also feed it my genomic data, which would reveal I have a version of the HLA genes that makes me more likely to develop autoimmune diseases.
Answer by Michael Roe10

Things I might spend more money on, if there were better AIs to spend it on:


1. I am currently having a lot of blood tests done, with a genuine qualified medical doctor interpreting the results. Just for fun, I can see if AI gives a similar interpretation of the test results (it’s not bad).

Suppose we had AI that was actually better than human doctors, and cheaper. (Sounds like that might be here real soon, to be honest). I would probably pay money for that.


2. Some work things I am doing involve formally proving correctness of software. AI is not there, ...

2Roko
ok so from the looks of that it basically just went along with a fantasy he already had. But this is an interesting case and an example of the kind of thing I am looking for.
9Michael Roe
https://www.bbc.co.uk/news/technology-67012224

“self-reported data from demons is questionable for at least two reasons”—Scott Alexander.

He was actually talking about Internal Family Systems, but you could probably be skeptical about what malign AIs are telling you, too.

Answer by Michael Roe60

Well, we had that guy who tried to assassinate the Queen of England with a crossbow because his AI girlfriend told him to. That was clearly a harm to him, and could have been one for the Queen.


We don’t know how much more “But the AI told me to kill Trump” we’d have with less alignment, but it’s a reasonable guess (given the Replika datapoint) that it might not be zero.

2Roko
Which AI told him this? What exactly did it say? Had it undergone RLHF for ethics/harmlessness?
1Michael Roe
“self-reported data from demons is questionable for at least two reasons”—Scott Alexander. He was actually talking about Internal Family Systems, but you could probably be skeptical about what malign AIs are telling you, too.

Discussing sleep paralysis might be an infohazard…


The times I’ve entered sleep paralysis it hasn’t bothered me, as I knew what it was.

And then you get the people who are like, “Great! I’m lucid! Now I shall cast one of those demon summoning spells from Vajrayana Buddhism.”

9Richard_Kennaway
“Let’s summon the Torment Nexus, as seen in classic horror novel ‘Don’t Summon The Torment Nexus’!”

Lucid dreaming is often like being Sigourney Weaver in Alien while also being on hospital sedatives. (You are, in fact, actually asleep, so it’s kind of a miracle you can reason at all and not the least bit surprising that you feel a bit groggy; also, dreams can be nightmarish.)


Why people choose to do this for fun is an interesting question.


You do get people who think they might get into lucid dreaming, then they read the dream diaries of some of the experienced lucid dreamers, and then are like “OMG, I never, ever, want to experience that.”

7Michael Roe
And then you get the people who are like, “Great! I’m lucid! Now I shall cast one of those demon summoning spells from Vajrayana Buddhism.”

Well, it’s an interesting question whether there might be more efficient ways to do it.


Lucid nightmares are quite a good way of exposing you to real-seeming dangers without actually dying. 

Reading this article, I have just realised that a dream I had last night came from reading one of those test cases where people try to bypass the guardrails on LLMs. Only the dream was taken from the innocuous part of the prompt.


At this rate, I’m going to be having dreams about turning Lemsip(*) into meth.


(*) UK cold remedy. Contains pseudoephedrine.

2CronoDAS
What about the other way around?
4avturchin
Dreams lost a part of their enigma after it becomes clear that they are very similar to generative network. I used to think that I have superhuman ability to generate worlds. Now I can do this with a prompt.

Chöd in a lucid dream if you’re feeling brave.

Like transform into Vajrayogini and invite the demons to devour your corpse, etc.

And then there’s the thing where you dispel the entire dream-universe and are just there in a black formless void.

2Michael Roe
Chöd in a lucid dream if you’re feeling brave. Like transform into Vajrayogini and invite the demons to devour your corpse, etc.

Hmm… but, for example, stabilising a dream is kind of like a meditation, and one of the many ways you can transform your body in a dream is basically a body scan meditation from hatha yoga.

2Michael Roe
And then there’s the thing where you dispel the entire dream-universe and are just there in a black formless void.

Given the significance of lucid dreaming in Buddhist practice (Six Yogas of Naropa, etc.), realising that having a lucid dream just for sexual purposes is kind of pointless may lead to you realising that it’s kind of pointless in waking life too. Many of those guys were monks…

I’m not sure about (10).


Whenever someone has a theory that it’s impossible to do thing X in a dream, the regular lucid dreamers will provide a counterexample by deliberately doing X in their next dream.


Computers, clocks, and written text can behave weirdly in dreams. Really, it’s the same things that generative AI has difficulty with, possibly for information-theory reasons.

1Going Durden
My hypothesis is that a lot of things that seem impossible or very hard in a dream are simply too boring to focus on. It's totally possible to consciously dream up a page of text, but who would really want to waste precious dreamtime to type?
4CronoDAS
I have yet to be able to successfully make a Google search during a dream - what I "intend" to search for is never what appears in the box I'm trying to "type" the search query into.
3avturchin
A lot of things can be done once on LD. I don't know people who consistently meditate in LD.  

A possible benefit: the regulation of your own emotions that you do to keep a dream stable (even when alarming things are happening in it) may help you keep your emotions stable in the waking state too.

4Seth Herd
Sure, but given the effort invested, wouldn't any way of practicing emotional regulation be a better use of time?

I can lucid dream, and I kind of agree here. Sure, lucid dreaming is possible, but why would you do that?

Re (3), a dream you can completely control tells you nothing you didn’t know already. There is some scope for controlling the dream enough to, in effect, set up a question, and then not control the result.


There’s a running joke in the lucid dreaming community that the first thing everyone tries is either flying or sex. It’s only when you get to #3 on their list of things they want to do that it becomes at all interesting.

Some psychiatry textbooks classify “overvalued ideas” as distinct from psychotic delusions.


Depending on how wide you make the definition, a whole rag-bag of diagnoses from the DSM-5 are overvalued ideas (e.g., anorexia nervosa, with its overvalued idea of being fat).

Possibly similar dilemma with e.g. UK political parties, who generally have a rule that publicly supporting another party’s candidate will get you expelled.

An individual party member, on the other hand, may well support the party’s platform in general, but think that one particular candidate is an idiot who is unfit to hold political office - but is not permitted to say so.

(There is a joke about the Whitby Goth Weekend that everyone thinks half the bands are rubbish, but there is no consensus on which half that is. Something similar seems to hold for Labour Party supporters.)
