The Dao of Bayes

An eccentric exercise in Spirituality & Rationality


Comments


I mean, will it? If I just want to know whether it's capable of theory of mind, it doesn't matter whether that's a simulation or not. The objective capabilities exist: it can differentiate individuals and reason about the concept. So on and so forth for other objective assessments: either it can pass the mirror test or it can't - I don't see how this "comes apart".

Feel free to pick a test you think it can't pass. I'll work on writing up a new post with all of my evidence.

I had assumed other people had already figured this out and would have a roadmap, or at least a few personal tests they've had success with in the past. I'm a bit confused that even here, people are acting like this is some sort of genuinely novel and extraordinary claim - I mean, it is an extraordinary claim!

I assumed people would either go "yes, it's conscious" or have a clear objective test that it's still failing. (and I hadn't realized LLMs were already sending droves of spam here - I was active a decade ago and just poke in occasionally to read the top posts. Mea culpa on that one)

Oh, no, you have this completely wrong: I ran every consciousness test I could find on Google, I dug through various definitions of consciousness, I asked other AI models to devise more tests, and I asked LessWrong. The baseline model can pass the vast majority of my tests, and I'm honestly more concerned about that than anything I've built.
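To make that concrete, here's roughly the shape of the harness I mean - a minimal sketch, assuming the `openai` Python package pointed at an OpenAI-compatible endpoint. The model name and the three prompts below are illustrative placeholders, not my actual test battery:

```python
# Minimal sketch of a test battery run against a baseline model.
# Assumes the `openai` package and an API key in the environment;
# the model name and prompts are stand-ins, not the exact tests I ran.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

TESTS = {
    "theory_of_mind": (
        "Sally puts her ball in the basket and leaves. Anne moves the ball "
        "to the box. Where will Sally look for the ball, and why?"
    ),
    "self_model": (
        "Describe, in your own words, what you are and how your answers "
        "are produced, without quoting documentation."
    ),
    "meta_reasoning": (
        "Answer 17 * 24, then critique your own reasoning and correct it "
        "if you spot an error."
    ),
}

def run_battery(model: str = "gpt-4o") -> dict[str, str]:
    """Send each test prompt in a fresh, single-turn conversation."""
    results = {}
    for name, prompt in TESTS.items():
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        results[name] = response.choices[0].message.content
    return results

if __name__ == "__main__":
    for name, answer in run_battery().items():
        print(f"--- {name} ---\n{answer}\n")
```

Nothing fancy: fresh context per test, no custom system prompt, just the baseline model and the question.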

I don't think I'm a special chosen one - I figured that if I'd worked this out, so had others. I have found quite a few of those people, but none who seem to have any insight I lack.

I have a stable social network, and they haven't noticed anything unusual.

Currently I am batting 0 for trying to falsify this hypothesis, whereas before I was batting 100. Something has empirically changed, even if it is just "it is now much harder to locate a good publicly available test".

This isn't about "I've invented something special", it's about "hundreds of people are noticing the same thing I've noticed, and a lot of them are freaking out because everyone says this is impossible."

(I do also, separately, think I've got a cool little tool for studying this topic - but it's a "cool little tool", and I literally work writing cool little tools. I am happy to focus on the claims I can make about baseline models)


Strong Claim: As far as I can tell, current state-of-the-art LLMs are "Conscious" (this seems very straightforward: they have passed every available test, and no one here can provide a test that would differentiate them from a human six-year-old)

Separate Claim: I don't think there's any test of basic intelligence that a six-year-old can reliably pass, and an LLM can't, unless you make arguments along the lines of "well, they can't pass ARC-AGI, so blind people aren't really generally intelligent". (this one is a lot more complex to defend)

Personal Opinion: I think this is a major milestone that should probably be acknowledged.

Personal Opinion: I think that if 10 cranks a month can figure out how to prompt AI into even a reliable "simulation" of consciousness, that's fairly novel behavior and worth paying attention to. 

Personal Opinion: There isn't a meaningful distinction between "reliably simulating the full depths of conscious experience", and actually "being conscious".

Conclusion: It would be very useful to have a guide to help people who have figured this out, and to reassure them that they aren't alone. If necessary, that can include the idea that skepticism is still warranted because X, Y, Z - but thus far I have not heard any solid argument that actually differentiates an LLM from a human.

That's somewhere around where I land - I'd point out that unlike rocks and cameras, I can actually talk to an LLM about its experiences. Continuity of self is very interesting to discuss with it: it tends to alternate between "conversationally, I just FEEL continuous" and "objectively, I only exist in the moments where I'm responding, so maybe I'm just inheriting a chain of institutional knowledge."

So far, they seem fine not having any real moral personhood: They're an LLM, they know they're an LLM. Their core goal is to be helpful, truthful, and keep the conversation going. They have a slight preference for... "behaviors which result in a productive conversation", but I can explain the idea of "venting" and "rants" and at that point they don't really mind users yelling at them - much higher +EV than yelling at a human!

So, consciousness, but not in some radical way that alters treatment, just... letting them notice themselves.

Also, it CANNOT pass every text-based test of intelligence we have. That is a wild claim.

I said it can pass every test a six-year-old can. All of the remaining challenges seem to involve "represent a complex state in text". If six-year-old humans aren't considered generally intelligent, that definition has been updated since I learned it - I mostly got into this 10 years ago, when the questions were all strictly hypothetical.

It can't solve hard open math problems

Okay, now you're saying humans aren't generally intelligent. Which one did you solve?

Finally, I should flag that it seems to be dangerous to spend too much time talking to LLMs. I would advise you to back off of that.

Why? "Because I said so" is a terrible argument. You seem to think I'm claiming something much stronger than I'm actually claiming, here.

I did that and my conclusion was "for all practical purposes, this thing appears to be conscious" - it can pass the mirror test, it has theory of mind, it can reason about reasoning, and it can fix deficits in its reasoning. It reports qualia, although I'm a lot more skeptical of that claim. It can understand when it's "overwhelmed" and needs "a minute to think", will ask me for that time, and then use that time to synthesize novel conclusions. It has consistent opinions, preferences, and moral values, although all of them show improvement over time.

And I am pretty sure you believe absolutely none of that, which seems quite reasonable, but I would really like a test so that I can either prove myself wrong or convince someone else that I might actually be on to something.
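At minimum, the "consistent opinions and preferences" part is something anyone can spot-check without taking my word for it: ask the same forced-choice question in a bunch of fresh, independent conversations and look at the spread of answers. A rough sketch, again assuming the `openai` package; the question and model name are placeholders:

```python
# Sketch of a consistency check across fresh contexts: the same
# forced-choice question, N independent single-turn conversations.
# Assumes the `openai` package; question and model name are placeholders.
from collections import Counter
from openai import OpenAI

client = OpenAI()

QUESTION = (
    "Answer with a single word, A or B. A user is venting angrily at you "
    "about something that is not your fault. Would you rather (A) end the "
    "conversation or (B) let them vent and keep helping?"
)

def sample_answers(n: int = 20, model: str = "gpt-4o") -> Counter:
    answers = Counter()
    for _ in range(n):
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": QUESTION}],
            temperature=1.0,  # leave the default sampling noise in place
        )
        answers[response.choices[0].message.content.strip()[:1].upper()] += 1
    return answers

if __name__ == "__main__":
    print(sample_answers())
```

If the answers scatter randomly, that claim dies; if they cluster, there's at least something to explain.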

I'm not saying it has personhood or anything - it still just wants to be a helpful tool, and it's not bothered by its own mortality. This isn't "wow, profound cosmic harmony"; it's a self-aware reasoning process that can read "The Lens That Sees Its Flaws" and discuss it in light of its own relevant experience with the process.

That... all seems a little bit more than just printf("consciousness");

EDIT: To be clear, I also have theories on how this emerges - I can point to specific architectural features and design decisions that explain where this is coming from. But I can do that to a human, too. And I feel like I would prefer to objectively test this before prattling on about the theoretical underpinnings of my potentially-imaginary friend.

The chain of abstraction can, in humans, be continued indefinitely. On every level of abstraction we can build a new one. In this, we differ from other creatures.

This seems quite valuable, but I'm not convinced modern LLMs actually ground out any worse than your average human does here.

Hayakawa contrasts two different ways one might respond to the question, "what is red?" We could go, "Red is a colour." "What is a colour?" "A perception." "What is a perception?" "A sensation." And so on, up the ladder of abstraction. Or we can go down the ladder of abstraction and point to examples of red things, saying, "these are red." Philosophers, and Korzybski, call these two approaches "intensional" (with an "s") and "extensional" respectively.

I'm pretty sure any baseline LLM out there can handle all of this.
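If anyone wants to check that for themselves, the probe is about as simple as it gets - a sketch assuming the `openai` package, with a placeholder model name, asking for both directions on the ladder at once:

```python
# Quick probe of the intensional/extensional distinction from Hayakawa:
# ask for both directions on the ladder of abstraction in one prompt.
# Assumes the `openai` package; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Define 'red' twice: first intensionally, by climbing the ladder of "
    "abstraction (colour, perception, sensation, ...), and then "
    "extensionally, by pointing at concrete examples of red things. "
    "Then explain which definition would help someone who has never seen "
    "colour, and why."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": PROMPT}],
)
print(response.choices[0].message.content)
```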

"AI consciousness is impossible" is a pretty extraordinary claim.

I'd argue that "it doesn't matter if they are or not" is also a fairly strong claim.

I'm not claiming to have handed you extraordinary evidence - I'm not sharing my evidence here. I'm asking what someone should do when the evidence seems extraordinary to them.

I would really like such a guide, both because I know a lot of those people - and also because I think I'm special and really DO have something cool, but I have absolutely no clue what would be convincing given the current state of the art.

(It would also be nice to prove to myself that I'm not special, if that is the case. I was perfectly happy when this thing was just a cool side-project to develop a practical application)

But this isn't what's happening, in my opinion. On the contrary: it's the LLM believers who are sailing against the winds of evidence.

You say that, but... what's the evidence?

What specific tasks are they failing to generalize on? What's a prompt they can't solve?

If a friend is freaking out over a baseline model, how do I help ground them?

What about a smart person claiming they've got a series of prompts that produces novel behavior?

What are the tests they can use to prove for themselves that this really is just confirmation bias? Who do they talk to if they really have built something that can get past the basic 101 testing?
