All of green_leaf's Comments + Replies

Trump has a history of ignoring both the law and human rights in general, and of imprisoning innocent people under the guise of their being illegal immigrants when they aren't. Current events are unsurprising, and part of what his voters voted for.

Any physical system exhibiting exactly the same input-output mappings.

That's a sufficient condition, but not a necessary one. A factor I can think of right now is the sufficient coherency and completeness of the I/O whole. (If I have a system that outputs what I would in response to one particular input and the rest is random, it doesn't have my consciousness. But for a system where all inputs and outputs match except for an input that says "debug mode," for which it switches to "simulating" somebody else, we can conclude that it has consciousness almost i... (read more)

They might have a personal experience with someone above them harming them or somebody else for asking a question or something analogous.

Ontologically speaking, any physical system exhibiting the same input-output pattern as a conscious being has identical conscious states.

From the story, it's interesting that neither side arrived at their conclusion rigorously; rather, both used intuition - Bob, who, based on his intuition, concluded Nova had consciousness (assuming that's what people mean when they say "sentient") and came to the correct conclusion based on incorrect "reasoning," and Tyler, who, based on an incorrect algorithm, convinced Bob Nova wasn't sentient after all - even thou... (read more)

1Seth Herd
Any physical system exhibiting exactly the same input-output mappings. Across all inputs and outputs. Short of that, imitation is a real possibility - particularly among LLMs that are trained to predict human responses. I agree that there's something nontrivially "conscious" in a system like Nova; but that's not a good argument for it. Agreed that this is going to get dramatic. There will be arguments and both sides will make good points.

(I believe the version he tested was what later became o1-preview.)

According to Terence Tao, GPT-4 was incompetent at graduate-level math (obviously), but o1-preview was mediocre-but-not-entirely-incompetent. That would be a strange thing to report if there were no difference.

(Anecdotally, o3-mini is visibly (massively) brighter than GPT-4.)

4Mo Putera
Full quote on Mathstodon for others' interest: This o1 vs MathOverflow experts comparison was also interesting: 

I meant "light-hearted" and sorry, it was just a joke.

3the gears to ascension
Fair enough. Neither dill nor ziz would have been able to pull off their crazy stuff without some people letting themselves get hypnotized, so I think the added warnings are correct.

imo it's not too dangerous as long as you go into it with the intention to not fully yield control and have mental exception handlers

Ah, you're a soft-glitcher. /lh

Edit: This is a joke.

5the gears to ascension
can you expand on what you mean by that? are there any actions you'd suggest, on my part or others, based on this claim? (also, which of the urban dictionary definitions of "lh" do you mean? they have opposite valences.) edit: added a bunch of warnings to my original comment. sorry for missing them in the first place.

Why not?

Because it's not accompanied by the belief itself, only by the computational pattern combined with behavior. If we hypothetically could subtract the first-person belief (which we can't), what would be left would be everything else but the belief itself.

if you claimed that the first-person recognition ((2)-belief) necessarily occurs whenever there's something playing the functional role of a (1)-belief

That's what I claimed, right.

Seems like you'd be begging the question in favor of functionalism

I don't think so. That specific argument had a form of ... (read more)

What kind of person instance is "perceiving themselves to black out" (that is, having blacked out)?

It's not a person instance; it's an event that happens to the person's stream of consciousness. Either the stream of consciousness truly, objectively ends, and a same-pattern copy will appear on Mars, mistakenly believing they're the very same stream of consciousness as that of the original person.

Or the stream is truly, objectively preserved, and the person can calmly enter, knowing that their consciousness will continue on Mars.

I don't think a 3rd-person an... (read more)

Does the 3rd person perspective explain if you survive a teleporter, or if you perceive yourself to black out forever (like after a car accident)?

1Vladimir_Nesov
Any "perceive yourself to X" phenomenon is something that happens within cognition of some abstract agent/person instance, whether they exist in some world or not. What kind of person instance is "perceiving themselves to black out" (that is, having blacked out)? Ghosts and afterlife seem more grounded than that. But for Earth/Mars question, both options are quite clear, and there is a you that perceives either of them in some of the possibilities, we can point to where those that perceive each of them are, and that is what would be correct for those instances to conclude about themselves, that they exist in the situations that contain them, known from the statement of the thought experiment.

That only seems to make sense if the next instant of subjective experience is undefined in these situations (and so we have to default to a 3rd person perspective).

1Vladimir_Nesov
A 3rd person perspective is there anyway, can be used regardless, even if other perspectives are also applicable. In this case it explains everything already, so we can't learn additional things in other ways.

I see, thanks. Just to make sure I'm understanding you correctly, are you excluding the reasoning models, or are you saying there was no jump from GPT-4 to o3? (At first I thought you were excluding them in this comment, until I noticed the "gradually better math/programming performance.")

6Thane Ruthenis
I think GPT-4 to o3 represent non-incremental narrow progress, but only, at best, incremental general progress. (It's possible that o3 does "unlock" transfer learning, or that o4 will do that, etc., but we've seen no indication of that so far.)

Here's an argument for a capabilities plateau at the level of GPT-4 that I haven't seen discussed before. I'm interested in any holes anyone can spot in it.

One obvious hole would be that capabilities did not, in fact, plateau at the level of GPT-4.

3Thane Ruthenis
There's been incremental improvement and various quality-of-life features like more pleasant chatbot personas, tool use, multimodality, gradually better math/programming performance that make the models useful for gradually bigger demographics, et cetera. But it's all incremental, no jumps like 2-to-3 or 3-to-4.
7Seth Herd
I thought the argument was that progress has slowed down immensely. The softer form of this argument is that LLMs won't plateau but progress will slow to such a crawl that other methods will surpass them. The arrival of o1 and o3 says this has already happened, at least in limited domains - and hybrid training methods and perhaps hybrid systems probably will proceed to surpass base LLMs in all domains.

I think "belief" is overloaded here. We could distinguish two kinds of "believing you're in pain" in this context:

(1) isn't a belief (unless accompanied by (2)).

But in order to resist the fading qualia argument along the quoted lines, I think we only need someone to (1)-believe they're in pain yet be mistaken.

That's not possible, because the belief_2 that one isn't in pain has nowhere to be instantiated.

Even if the intermediate stages believed_2 they're not in pain and only spoke and acted that way (which isn't possible), it would introduce a desynchroniza... (read more)

2Anthony DiGiovanni
Why not? Call it what you like, but it has all the properties relevant to your argument, because your concern was that the person would "act in all ways as if they're in pain" but not actually be in pain. (Seems like you'd be begging the question in favor of functionalism if you claimed that the first-person recognition ((2)-belief) necessarily occurs whenever there's something playing the functional role of a (1)-belief.) I'm saying that no belief_2 exists in this scenario (where there is no pain) at all. Not that the person has a belief_2 that they aren't in pain. I don't find this compelling, because denying epiphenomenalism doesn’t require us to think that changing the first-person aspect of X always changes the third-person aspect of some Y that X causally influences. Only that this sometimes can happen. If we artificially intervene on the person's brain so as to replace X with something else designed to have the same third-person effects on Y as the original, it doesn’t follow that the new X has the same first-person aspect! The whole reason why given our actual brains our beliefs reliably track our subjective experiences is, the subjective experience is naturally coupled with some third-person aspect that tends to cause such beliefs. This no longer holds when we artificially intervene on the system as hypothesized. We probably disagree at a more basic level then. I reject materialism. Subjective experiences are not just patterns.

Have you tried it with o1 pro?

Does anyone have stats on OpenAI whistleblowers and their continued presence in the world of the living?

I argue that computation is fuzzy, it’s a property of our map of a system rather than the territory.

This is false. Everything exists in the territory to the extent to which it can interact with us. While different models can output a different answer as to which computation something runs, that doesn't mean the computation isn't real (or, even, that no computation is real). The computation is real in the sense of it influencing our sense impressions (I can observe my computer running a specific computation, for example). Someone else, whose model doesn't r... (read more)

1Anthony DiGiovanni
I think "belief" is overloaded here. We could distinguish two kinds of "believing you're in pain" in this context: 1. Patterns in some algorithm (resulting from some noxious stimulus) that, combined with other dispositions, lead to the agent's behavior, including uttering "I'm in pain." 2. A first-person response of recognition of the subjective experience of pain. I'd agree it's totally bizarre (if not incoherent) for someone to (2)-believe they're in pain yet be mistaken about that. But in order to resist the fading qualia argument along the quoted lines, I think we only need someone to (1)-believe they're in pain yet be mistaken. Which doesn't seem bizarre to me. (And no, you don't need to be an epiphenomenalist to buy this, I think. Quoting Block: “Consider two computationally identical computers, one that works via electronic mechanisms, the other that works via hydraulic mechanisms. (Suppose that the fluid in one does the same job that the electricity does in the other.) We are not entitled to infer from the causal efficacy of the fluid in the hydraulic machine that the electrical machine also has fluid. One could not conclude that the presence or absence of the fluid makes no difference, just because there is a functional equivalent that has no fluid.”)

I refuse to believe that tweet has been written in good faith.

I refuse to believe the threshold for being an intelligent person on Earth is that low.

3Vladimir_Nesov
That's how this happens, people systematically refuse to believe some things, or to learn some things, or to think some thoughts. It's surprisingly feasible to live in contact with some phenomenon for decades and fail to become an expert. Curiosity needs System 2 guidance to target blind spots.

I know the causal closure of the physical as the principle that nothing non-physical influences physical stuff, so that would be the causal closure of the bottom level of description (since there is no level below the physical), rather than the upper.

So if you mean by that that it's enough to simulate neurons rather than individual atoms, that wouldn't be "causal closure" as Wikipedia calls it.

The neurons/atoms distinction isn't causal closure. Causal closure means there is no outside influence entering the program (other than, let's say, the sensory inputs of the person).

1Charlie Steiner
Euan seems to be using the phrase to mean (something like) causal closure (as the phrase would normally be used e.g. in talking about physicalism) of the upper level of description - basically saying every thing that actually happens makes sense in terms of the emergent theory, it doesn't need to have interventions from outside or below.

I'm thinking the causal closure part is more about the soul not existing than about anything else.

1Charlie Steiner
Nah, it's about formalizing "you can just think about neurons, you don't have to simulate individual atoms." Which raises the question "don't have to for what purpose?", and causal closure answers "for literally perfect simulation."

Are you saying that after it has generated the tokens describing what the answer is, the previous thoughts persist, and it can then generate tokens describing them?

(I know that it can introspect on its thoughts during the single forward pass.)

9Vladimir_Nesov
During inference, for each token and each layer over it, the attention block computes some vectors, the data called the KV cache. For the current token, the contribution of an attention block in some layer to the residual stream is computed by looking at the entries in the KV cache of the same layer across all the preceding tokens. This won't contribute to the KV cache entry for the current token at the same layer, it only influences the entry at the next layer, which is how all of this can run in parallel when processing input tokens and in training. The dataflow is shallow but wide. So I would guess it should be possible to post-train an LLM to give answers like "................... Yes" instead of "Because 7! contains both 3 and 5 as factors, which multiply to 15 Yes", and the LLM would still be able to take advantage of CoT (for more challenging questions), because it would be following a line of reasoning written down in the KV cache lines in each layer across the preceding tokens, even if in the first layer there is always the same uninformative dot token. The tokens of the question are still explicitly there and kick off the process by determining the KV cache entries over the first dot tokens of the answer, which can then be taken into account when computing the KV cache entries over the following dot tokens (moving up a layer where the dependence on the KV cache data over the preceding dot tokens is needed), and so on.
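
A toy single-head sketch in Python/NumPy (purely illustrative; the names and shapes here are invented, not the actual transformer code) of the sense in which each new token's attention output is computed by reading the KV cache written over the earlier tokens, rather than by re-deriving their content from emitted text:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(q, K_cache, V_cache):
    # The current token's query scores every cached key (itself included)
    # and returns the corresponding weighted mix of cached values.
    scores = K_cache @ q / np.sqrt(q.shape[0])
    return softmax(scores) @ V_cache

d = 4                                  # toy head dimension (hypothetical)
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

K_cache, V_cache = [], []              # grows by one entry per token
for x in rng.normal(size=(6, d)):      # six toy "token embeddings"
    K_cache.append(Wk @ x)             # the current token writes its K/V entry...
    V_cache.append(Wv @ x)
    # ...and reads the whole cache to produce its attention output, so
    # information from earlier positions flows forward without any extra
    # tokens being emitted to carry it.
    out = attend(Wq @ x, np.array(K_cache), np.array(V_cache))
```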

Yeah. The model has no information (except for the log) about its previous thoughts and it's stateless, so it has to infer them from what it said to the user, instead of reporting them.

[This comment is no longer endorsed by its author]
8Drake Thomas
I don't think that's true - in eg the GPT-3 architecture, and in all major open-weights transformer architectures afaik, the attention mechanism is able to feed lots of information from earlier tokens and "thoughts" of the model into later tokens' residual streams in a non-token-based way. It's totally possible for the models to do real introspection on their thoughts (with some caveats about eg computation that occurs in the last few layers), it's just unclear to me whether in practice they perform a lot of it in a way that gets faithfully communicated to the user.

Claude can think for himself before writing an answer (which is an obvious thing to do, so ChatGPT probably does it too).

In addition, you can significantly improve his ability to reason by letting him think more, so even if it were the case that this kind of awareness is necessary for consciousness, LLMs (or at least Claude) would already have it.

6Drake Thomas
Yeah, I'm thinking about this in terms of introspection on non-token-based "neuralese" thinking behind the outputs; I agree that if you conceptualize the LLM as being the entire process that outputs each user-visible token including potentially a lot of CoT-style reasoning that the model can see but the user can't, and think of "introspection" as "ability to reflect on the non-user-visible process generating user-visible tokens" then models can definitely attain that, but I didn't read the original post as referring to that sort of behavior.

Thanks for writing this - it bothered me a lot that I appeared to be one of the few people who realized that AI characters were conscious, and this helps me to feel less alone.

(This comment is written in the ChatGPT style because I've spent so much time talking to language models.)

Calculating the probabilities

The calculation of the probabilities consists of the following steps:

  1. The epistemic split

    Either we guessed the correct digit of  () (branch ), or we didn't () (branch ).

  2. The computational split

    On branch , all of your measure survives (branch ) and none dies (branch ), on branch  survives (branch ) and  dies (branch ).

  3. Putti

... (read more)
5avturchin
Thanks. By the way, the "chatification" of the mind is a real problem. It's an example of reverse alignment: humans are more alignable than AI (we are gullible), so during interactions with AI, human goals will drift more quickly than AI goals. In the end, we get perfect alignment: humans will want paperclips.

Since that argument doesn't give any testable predictions, it cannot be disproved.

The argument we cease to exist every time we go to sleep also can't be disproved, so I wouldn't personally lose much sleep over that.

I don't know about similarity... but I was just making a point that QI doesn't require it.

When you die, you die.

The interesting part of QI is that the split happens at the moment of your death. So the state-machine-which-is-you continues being instantiated in at least one world. The idea of your consciousness surviving a quantum suicide doesn't rely on it continuing in implementations of similar state machines, merely in the causal descendant of the state machine which you already inhabit.

It's like your brain being duplicated, but those other copies are never woken up and are instantly killed. Only one copy is woken up. Which guarantees that pr... (read more)

5avturchin
In big world immortality there are causally disconnected copies which survive in very remote regions of the universe. But if we don't need continuity, but only similarity of minds, for identity, it is enough. 

Yes. If I relied on losing a bet and someone knew that, their offering me the bet (and therefore the loss) would make me wary that something would unpredictably go right - that I'd win, and my reliance on losing the bet would be thwarted.

If I meet a random person who offers to give me $100 now and claims that later, if it's not proven that they are the Lord of the Matrix, I don't have to pay them $15,000, most of my probability mass located in "this will end badly" won't be located in "they are the Lord of the Matrix." I don't have the same set of worries here, but the worry remains.

I use Google Chrome on Ubuntu Budgie and it does look to me like both the font and the font size changed.

Character AI used to be extremely good back in Dec 2022/Jan 2023, with the bots being very helpful, complex and human-like, rather than exacerbating psychological problems in a very small minority of users. As months passed and the user base exponentially grew, the models were gradually simplified to keep up.

Today, their imperfections are obvious, but many people mistakenly interpret it as the models being too human-like (and therefore harmful), rather than the models being too oversimplified while still passing for an AI (and therefore harmful).

I think we're spinning on an undefined term. I'd bet there are LOTS of details that affect my perception in subtle and aggregate ways which I don't consciously identify.

You're equivocating between perceiving a collection of details and consciously identifying every separate detail.

If I show you a grid of 100 pixels, then (barring imperfect eyesight) you will consciously perceive all 100 of them. But you will not consciously identify every individual pixel unless your attention is aimed at each pixel in a for loop (that would take longer than consciously... (read more)

Computability shows that you can have a classical computer that has the same input/output behavior

That's what I mean (I'm talking about the input/output behavior of individual neurons).

Input/Output behavior is generally not considered to be enough to guarantee same consciousness

It should be, because it is, in fact, enough. (However, neither the post, nor my comment require that.)

Eliezer himself argued that GLUT isn't conscious.

Yes, and that's false (but since that's not the argument in the OP, I don't think I should get sidetracked).

But nonetheless, if the

... (read more)
1Rafael Harth
Ah, I see. Nvm then. (I misunderstood the previous comment to apply to the entire brain -- idk why, it was pretty clear that you were talking about a single neuron. My bad.)

so the idea is that you can describe the brain by treating each neuron as a little black box about which you just know its input/output behavior, and then describe the interactions between those little black boxes. Then, assuming you can implement the input/output behavior of your black boxes with a different substrate (i.e., an artificial neuron)

This is guaranteed, because the universe (and any of its subsets) is computable (that means a classical computer can run software that acts the same way).

1Rafael Harth
Also, here's a sufficient reason why this isn't true. As far as I know, Integrated Information Theory is currently the only highly formalized theory of consciousness in the literature. It's also a functionalist theory (at least according to my operationalization of the term.) If you apply the formalism of IIT, it says that simulations on classical computers are minimally conscious at best, regardless of what software is run. Now I'm not saying IIT is correct; in fact, my actual opinion on IIT is "100% wrong, no relation how consciousness actually works". But nonetheless, if the only formalized proposal for consciousness doesn't have the property that simulations preserve consciousness, then clearly the property is not guaranteed. So why does IIT not have this property? Well because IIT analyzes the information flow/computational steps of a system -- abstracting away the physical details, which is why I'm calling it functionalist -- and a simulation of a system performs completely different computational steps than the original system. I mean it's the same thing I said in my other reply; a simulation does not do the same thing as the thing it's simulating, it only arrives at the same outputs, so any theory looking at computational steps will evaluate them differently. They're two different algorithms/computations/programs, which is the level of abstraction that is generally believed to matter on LW. Idk how else to put this.
1Rafael Harth
No. Computability shows that you can have a classical computer that has the same input/output behavior, not that you can have a classical computer that acts the same way. Input/Output behavior is generally not considered to be enough to guarantee same consciousness, so this doesn't give you what you need. Without arguing about the internal workings of the brain, a simulation of a brain is just a different physical process doing different computational steps that arrives at the same result. A GLUT (giant look-up table) is also a different physical process doing different computational steps that arrives at the same result, and Eliezer himself argued that GLUT isn't conscious. The "let's swap neurons in the brain with artificial neurons" is actually a much better argument than "let's build a simulation of the human brain on a different physical system" for this exact reason, and I don't think it's a coincidence that Eliezer used the former argument in his post.
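
A toy sketch in Python of the distinction being debated here - two systems with exactly the same input/output mapping whose internal computational steps differ, one computing the answer and the other a miniature lookup table in the spirit of a GLUT (the parity function and all names are invented for illustration):

```python
from itertools import product

def computed_parity(bits):
    """Computes the answer step by step (XOR-folding the bits)."""
    result = 0
    for b in bits:
        result ^= b
    return result

# GLUT-style system: the answer for every possible 8-bit input is precomputed
# and stored, so nothing is computed at query time beyond a table lookup.
GLUT = {bits: computed_parity(bits) for bits in product((0, 1), repeat=8)}

def glut_parity(bits):
    return GLUT[tuple(bits)]

# Identical input/output behavior across the entire input space,
# despite entirely different internal computational steps.
assert all(computed_parity(b) == glut_parity(b) for b in GLUT)
```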

And there are orders of magnitude more detail going on in my body (and even just in my brain) than I perceive, let alone that I communicate.

There are no sentient details going on that you wouldn't perceive.

It doesn't matter if you communicate something; the important part is that you are capable of communicating it, which means that it changes your input/output pattern (if it didn't, you wouldn't be capable of communicating it even in principle).

Circular arguments that "something is discussed, therefore that thing exists"

This isn't the argument in the OP (even though, when reading quickly, I can see how someone could get that impression).

2Dagon
I think we're spinning on an undefined term. I'd bet there are LOTS of details that affect my perception in subtle and aggregate ways which I don't consciously identify. But I have no clue which perceived or unperceived details add up to my conception of sentience, and even less do I understand yours.

(Thanks to the Hayflick limit, only some lines can go on indefinitely.)

If the SB always guesses heads, she'll be correct 1/3 of the time. For that reason, that is her credence.
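
A quick simulation sketch of that frequency, assuming the standard setup (a fair coin, one awakening on heads, two on tails; the code and names are illustrative):

```python
import random

correct = awakenings = 0
for _ in range(100_000):
    heads = random.random() < 0.5   # fair coin toss for this run of the experiment
    wakes = 1 if heads else 2       # woken once on heads, twice on tails
    awakenings += wakes
    if heads:
        correct += 1                # the constant "heads" guess is right once
print(correct / awakenings)         # tends to ~1/3 of awakenings
```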

Are the ‘AI companion’ apps, or robots, coming? I mean, yes, obviously?

The technology for bots who are "better" than humans in some way (constructive, pro-social, compassionate, intelligent, caring interactions while thinking 2 levels meta) has been around since 2022. But the target group wouldn't pay enough for GPT-4-level inference, so current human-like bots are significantly downscaled compared to what technology allows.

To consciously take in information, you don't have to store any bits - you only have to map the correct input to the correct output. (By logical necessity, any transformation that preserves the input/output relationship preserves consciousness.)

Unless you can summarize your argument in at most 2 sentences (with evidence), it's completely ignoreable.

This is not how learning any (even slightly complex) topic works.

When I skipped my medication, whose withdrawal symptom is strong anxiety, my brain always generated a nightmare to go along with the anxiety, working backwards in the same way.

Edit: Oh, never mind, that's not what you mean.

That wouldn't help. Then the utility would be calculated from (getting two golden bricks) and (murdering my child for a fraction of a second), which still brings lower utility than not following the command.

The set of possible commands for which I can't be maximally rewarded still remains too vast for the statement to be meaningful.

0gb
This sounds absurd to me. Unless of course you're taking the "two golden bricks" literally, in which case I invite you to substitute it by "saving 1 billion other lives" and seeing if your position still stands.

I see your argument. You are saying that "maximal reward", by definition, is something that gives us the maximum utility from all possible actions, and so, by definition, it is our purpose in life.

But actually, utility is a function of both the action (getting two golden bricks) and what it rewards (murdering my child), not merely a function of the action itself (getting two golden bricks).

And so it happens that for many possible demands that I could be given ("you have to murder your child"), there are no possible rewards that would give me more utility t... (read more)

0gb
Not true, as the reward could include all of the unwanted consequences of following the command being divinely reverted a fraction of a second later.

How does someone punishing you or rewarding you make their laws your purpose in life (other than you choosing that you want to be rewarded and not punished)?

-2gb
To be rewarded (and even more so "maximally rewarded") is to be given something you actually want (and the reverse for being punished). That's the definition of what a reward/punishment is. You don't "choose" to want/not want it, any more than you "choose" your utility function. It just is what it is. Being "rewarded" with something you don't want is a contradiction in terms: at best someone tried to reward you, but that attempt failed.

Either we define "belief" as a computational state encoding a model of the world containing some specific data, or we define "belief" as a first-person mental state.

For the first definition, both us and p-zombies believe we have consciousness. So we can't use our belief we have consciousness to know we're not p-zombies.

For the second definition, only we believe we have consciousness. P-zombies have no beliefs at all. So for the second definition, we can use our belief we have consciousness to know we're not p-zombies.

Since we have a belief in the existence of our consciousness according to both definitions, but p-zombies only according to the first definition, we can know we're not p-zombies.

This is incorrect - in a p-zombie, the information processing isn't accompanied by any first-person experience. So if p-zombies are possible, we both do the information processing, but only I am conscious. The p-zombie doesn't believe it's conscious, it only acts that way.

You correctly believe that having the correct information processing always goes hand in hand with believing in consciousness, but that's because p-zombies are impossible. If they were possible, this wouldn't be the case, and we would have special access to the truth that p-zombies lack.

1Stephen Fowler
I am concerned our disagreement here is primarily semantic or based on a simple misunderstanding of each other's position. I hope to better understand your objection. "The p-zombie doesn't believe it's conscious, it only acts that way." One of us is mistaken and using a non-traditional definition of p-zombie, or we have different definitions of "belief". My understanding is that p-zombies are physically identical to regular humans. Their brains contain the same physical patterns that encode their model of the world. That seems, to me, a sufficient physical condition for having identical beliefs. If your p-zombies are only "acting" like they're conscious, but do not believe it, then they are not physically identical to humans. The existence of p-zombies, as you have described them, wouldn't refute physicalism. This resource indicates that the way you understand the term p-zombie may be mistaken: https://plato.stanford.edu/entries/zombies/ "but that's because p-zombies are impossible" The main post that I responded to, specifically the section that I directly quoted, assumes it is possible for p-zombies to exist. My comment begins "Assuming for the sake of argument that p-zombies could exist", but this is distinct from a claim that p-zombies actually exist. "If they were possible, this wouldn't be the case, and we would have special access to the truth that p-zombies lack." I do not find this convincing, because it is an assertion that my conclusion is incorrect without engaging with the arguments I made to reach that conclusion. I look forward to continuing this discussion.

What an undignified way to go.
