JBlack

Comments

JBlack20

The second type of preference seems to apply to anticipated perceptions of the world by the agent - such as the anticipated perception of eating ice cream in a waffle cone. It doesn't have to be so immediately direct, since it could also apply to instrumental goals such as doing something unpleasant now for expected improved experiences later.

The first seems to be more like a "principle" than a preference, in that the agent judges outcomes by whether needless suffering exists in them, regardless of whether that suffering has any effect on the agent at all.

To distinguish them, we could imagine a thought experiment in which such a person could choose to accept or decline some ongoing benefit for themselves that causes needless suffering in some distant world, with their memory of the decision and any psychological consequences of it immediately negated regardless of which option they choose.

JBlack20

It's even worse than that. Maybe I would be happier with my ice cream in a waffle cone the next time I have ice cream, but actually this is just a specific expression of being happier eating a variety of tasty things over time and it's just that I haven't had ice cream in a waffle cone for a while. The time after that, I will likely "prefer" something else despite my underlying preferences not having changed. Or something even more complex and interrelated with various parts of history and internal state.

It may be better to distinguish instances of "preferences" that are specific to a given internal state and history, and an agent's general mapping over all internal states and histories.
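One loose way to picture that distinction (my own framing, with purely illustrative names): the underlying preference is the whole mapping from internal state and history to a choice, while a "preference" in the everyday sense is just that mapping evaluated on one particular occasion.

```python
# Illustrative sketch only; the types and the toy "variety" rule are my own.
from typing import Callable, Tuple

InternalState = str          # e.g. current mood, cravings
History = Tuple[str, ...]    # e.g. recent meals
Choice = str

# The agent's underlying preference: one fixed mapping over all states and histories.
UnderlyingPreference = Callable[[InternalState, History], Choice]

def variety_seeker(state: InternalState, history: History) -> Choice:
    # Toy rule: pick the waffle cone only if it hasn't come up recently.
    return "ice cream in a waffle cone" if "waffle cone" not in history[-5:] else "something else tasty"

# An instance "preference" is just the mapping applied to today's state and history;
# it can change from occasion to occasion without the underlying mapping changing.
print(variety_seeker("hungry", ("sorbet", "cake", "fruit")))       # waffle cone today
print(variety_seeker("hungry", ("waffle cone", "cake", "fruit")))  # something else next time
```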

JBlack20

Yes, such an agent will self-modify if it is presented with a Newcomb game before Omega determines how much money to put into the boxes. It will even self-modify if there is only a 1-in-1000 credence that Omega has not yet done so (or might change their mind).
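To make the arithmetic behind that threshold explicit, here is a rough sketch under assumed standard Newcomb payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one), a perfectly accurate Omega, and an agent that would otherwise two-box; none of this is from the original post, it is just an illustration.

```python
# Hypothetical payoff model, purely illustrative.
BIG = 1_000_000   # opaque box, filled only if Omega predicts one-boxing
SMALL = 1_000     # transparent box, always present

def ev_self_modify(p: float) -> float:
    """Expected value of self-modifying into a one-boxer, where p is the
    credence that Omega has not yet made its prediction. If Omega has not
    decided, it will see a one-boxer and fill the opaque box; if it already
    predicted two-boxing, the newly modified one-boxer opens an empty box."""
    return p * BIG + (1 - p) * 0

def ev_stay() -> float:
    """Expected value of remaining a two-boxer: Omega predicts two-boxing
    either way, so only the transparent box pays out."""
    return SMALL

for p in (0.01, 0.001, 0.0001):
    print(f"p={p}: self-modify {ev_self_modify(p):,.0f} vs stay {ev_stay():,.0f}")
# Self-modifying wins whenever p * 1,000,000 exceeds 1,000, i.e. for any
# credence above 1-in-1000 (with exact indifference at 1-in-1000).
```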

At this point considerations come in such as what will happen if such an agent expects that they will face Newcomb-like games in the future but aren't yet certain what form they will take or what the exact payoffs will be. Should they self-modify to something UDT-like now?

JBlack20

Is it a necessary non-epistemic truth? After all, it has a very lengthy partial proof in Principia Mathematica, and maybe they got something wrong. Perhaps you should check?
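(For what it's worth, here is what "checking" might look like today, assuming the proposition in question is something like 1 + 1 = 2, the result Principia Mathematica famously takes so long to reach; a proof assistant's kernel verifies it mechanically rather than taking anyone's word for it.)

```lean
-- Lean 4 sketch: the kernel accepts this because both sides reduce to the same numeral.
example : 1 + 1 = 2 := rfl
```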

But then maybe you're not using a formal system to prove it, but just taking it as an axiom or maybe as a definition of what "2" means using other symbols with pre-existing meanings. But then if I define the term "blerg" to mean "a breakfast product with non-obvious composition", is that definition in itself a necessary truth?

Obviously if you mean "if you take one object and then take another object, you now have two objects" then that's a contingent proposition that requires evidence. It probably depends upon what sorts of things you mean by "objects" too, so we can rule that one out.

Or maybe "necessary non-epistemic truth" means a proposition that you can "grok in fullness" and just directly see that it is true as a single mental operation? Though, isn't that subjective and also epistemic? Don't you have to check to be sure that it is one? Was it a necessary non-epistemic truth for you when you were young enough to have trouble with the concept of counting?

So in the end I'm not really sure exactly what you mean by a necessary truth that doesn't need any checking. Maybe it's not even a coherent concept.

JBlack41
  1. We don't know how consciousness arises, in terms of what sort of things have subjective experience. Your assertion is one reasonable hypothesis, but you don't support it or comment on any of the other possible hypotheses.
  2. I don't think many people use "better than every human in every way" as a definition for the term "AGI". However, LLMs are fairly clearly not yet AGI even for less extreme meanings of the term, such as "at least as capable as an average human at almost all cognitive tasks". It is pretty clear that current LLMs are still quite a lot less capable than fairly average humans in many important ways, despite being as capable or even more capable in other important ways.
    They do meet a very loose definition of AGI such as "comparable or better in most ways to the mental capabilities of a significant fraction of the human population", so saying that they are AGI is at least somewhat justifiable.
  3. LLMs emit text consistent with the training corpus and tuning processes. If that means using a first-person pronoun ("I am an ...") instead of a third-person description ("This text is produced by an ..."), then that doesn't say anything about whether the LLM is conscious or not. Even a 1-line program can print "I am a computer program but not a conscious being", and have that be a true statement to the extent that the pronoun "I" can be taken to mean "whatever entity produced the sentence" and not "a conscious being that produced the sentence".
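For concreteness, the kind of 1-line program meant here could be as trivial as the following (Python, my own example):

```python
print("I am a computer program but not a conscious being")
```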

To be clear, I am not saying that LLMs are not conscious, merely that we don't know. What we do know is that they are optimized to produce outputs that match those from entities that we generally believe to be conscious. Using those outputs as evidence to justify a hypothesis of consciousness is begging the question to a much greater degree than looking at outputs of systems that were not so directly optimized.

JBlack61

Nothing would stop me from paying for an LLM, since I already do not pay for any LLM. All the organizations offering paid LLM access are engaging in highly unsafe race dynamics, regardless of what they say they are doing, and I will not contribute to that nor incentivize it. I can accept the minor reduction in my local utility this brings.

JBlack20

Yes, that was my first guess as well. Increased income from employment is most strongly associated with major changes, such as promotion to a new position with changed (and usually increased) responsibilities, or leaving one job and starting work somewhere else that pays more.

It seems plausible that these are not the sorts of changes that women are likely to seek out at the same rate when planning to devote a lot of time in the very near future to being a first-time parent. Some may, but all? That seems unlikely. Men, given gender differences in child-rearing roles, seem more likely to keep pursuing such opportunities at their usual rate.

JBlack20

I don't expect this to "cash out" at all, which is rather the point.

The only really surprising part would be that we had any way at all to determine for certain whether some other system is conscious. That is, I would have very similar (high) levels of surprisal for either "ems are definitely conscious" or "ems are definitely not conscious", but the ratio between them would be nowhere near "what the fuck" level.

As it stands, I can determine that I am conscious but I do not know how or why I am conscious. I have only a sample size of 1, and no way to access a larger sample. I cannot determine that you are conscious. I can't even determine for certain when or whether I was conscious in the past, and there are some time periods for which I am very uncertain. I have hypotheses regarding all of these uncertainties, but there are no prospects of checking whether they're actually correct.

So given that, why would I be "what the fuck" surprised if some of my currently favoured hypotheses such as "ems will be conscious" were actually false? I don't have anywhere near the degree of evidence required to justify that level of prior confidence. I am quite certain that you don't either. I would be very surprised if other active fleshy humans weren't conscious, but still not "what the fuck" surprised.

JBlack20

Eh, I'm not doing anything else important right now, so let's beat this dead horse further.

"As defined, a universe state either satisfies or does not satisfy a proposition. If you're referring to propositions that may vary over space or time, then when modelling a given situation you have two choices"

Which I neither disagree with nor have anything interesting to add to.

This is the whole point! That's why I pointed it out as the likely crux, and you're saying that's fine, no disagreement there. Then you reject one of the choices.

You agree that any non-empty set can be the sample space for some probability space. I described a set: that of universe states labelled by the time of being asked for a credence.

I chose my set to be a Cartesian product of the two relevant properties that Beauty is uncertain of on any occasion when she awakens and is asked for her credence: what day it is on that occasion of being asked (Monday or Tuesday), and what the coin flip result was (Heads or Tails). On any possible occasion of being asked, it is either Monday or Tuesday (but not both), and either Heads or Tails (but not both). I can set the credence for (Tuesday, Heads) to zero since Beauty knows that's impossible by the setup of the experiment.

If Beauty knew which day it was on any occasion when she is asked, then she should give one of two different answers for credences. These correspond to the conditional credences P(Heads | Monday) and P(Heads | Tuesday). Likewise, knowing what the coin flip was would give different conditional credences P(Monday | Heads) and P(Monday | Tails).

All that is mathematically required of these credences is that they obey the axioms of a measure space with total measure 1, because that's exactly the definition of a probability space. My only claim in this thread - in contrast to your post - is that they can.
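As a concrete illustration that such a probability space exists, here is a quick sketch (entirely my own; the particular numbers use a thirder-style assignment only for definiteness, and any non-negative assignment summing to 1 with zero weight on (Tuesday, Heads) would satisfy the axioms just as well):

```python
# Illustrative construction of the sample space and one admissible credence assignment.
from itertools import product

days = ["Monday", "Tuesday"]
flips = ["Heads", "Tails"]
sample_space = list(product(days, flips))  # Cartesian product of the two uncertain properties

credence = {
    ("Monday", "Heads"): 1 / 3,
    ("Monday", "Tails"): 1 / 3,
    ("Tuesday", "Heads"): 0.0,   # ruled out by the setup of the experiment
    ("Tuesday", "Tails"): 1 / 3,
}

# Measure-space axioms: non-negativity and total measure 1.
assert all(p >= 0 for p in credence.values())
assert abs(sum(credence.values()) - 1) < 1e-12

def conditional(pred_a, pred_b):
    """P(A | B) over this finite sample space, by summing point masses."""
    p_b = sum(p for w, p in credence.items() if pred_b(w))
    p_ab = sum(p for w, p in credence.items() if pred_a(w) and pred_b(w))
    return p_ab / p_b

print(conditional(lambda w: w[1] == "Heads", lambda w: w[0] == "Monday"))  # P(Heads | Monday)
print(conditional(lambda w: w[0] == "Monday", lambda w: w[1] == "Heads"))  # P(Monday | Heads)
```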

JBlack-30

"More specifically this is the experiment where awakenings on Monday and Tuesday are mutually exclusive during one trial, such as No-Coin-Toss or Single-Awakening"

No, I specifically was referring to the Sleeping Beauty experiment. Re-read my comment. Or not. At this point it's quite clear that we are failing to communicate in a fundamental way. I'm somewhat frustrated that you don't even comment on those parts where I try to communicate the structure of the question, but only on the parts which seem tangential or merely about terminology. There is no need to reply to this comment, as I probably won't continue participating in this discussion any further.
