All of Signer's Comments + Replies

Endurist thinking treats reproduction as always acceptable or even virtuous, regardless of circumstances. The potential for suffering rarely factors into this calculation—new life is seen as inherently good.

Not necessary - you can treat creating new people differently from already existing people and avoid creating bad (in the Endurist sense - not enough positive experiences, regardless of suffering) lives without accepting death for existing people. I, for example, don't get why you would bring more death into the world by creating low-lifespan people, if you don't like death.

clearly the system is a lot less contextual than base models, and it seems like you are predicting a reversal of that trend?

The trend may be bounded; it may not go far by the time AI can invent nanotechnology - it would be great if someone actually measured such things.

And there being a trend at all is not predicted by the utility-maximization frame, right?

People are confused about the basics because the basics are insufficiently justified.

It is learning helpfulness now, while the best way to hit the specified ‘helpful’ target is to do straightforward things in straightforward ways that directly get you to that target. Doing the kinds of shenanigans or other more complex strategies won’t work.

Best by what metric? And I don't think it was shown that complex strategies won't work - learning to change behaviour from training to deployment is not even that complex.

But it is important, and this post just isn’t going to get done any other way.

Speaking of streetlighting...

What makes it rational is that there is an actual underlying hypothesis about how weather works, instead of a vague "LLMs are a lot like human uploads". And weather prediction outputs numbers connected to reality that we actually care about. And there is no alternative credible hypothesis that implies weather prediction won't work.

I don't want to totally dismiss empirical extrapolations, but given the stakes, I would personally prefer for all sides to actually state their model of reality and how they think the evidence changed its plausibility, as formally as possible.

There is no such disagreement; you just can't test all inputs. And without knowledge of how the internals work, you may be wrong about extrapolating alignment to future systems.

4Roko
There are plenty of systems where we rationally form beliefs about likely outputs from a system without a full understanding of how it works. Weather prediction is an example.
Answer by Signer56

Yes, except I would object to phrasing this anthropic stuff as "we should expect ourselves to be agents that exist in a universe that abstracts well" instead of "we should value a universe that abstracts well (or other universes that contain many instances of us)" - there are no coherence theorems that force summation of your copies, right? And so it becomes apparent that we can value some other thing.

Also, even if you consider some memories a part of your identity, you can value yourself slightly less after forgetting them, instead of only having a threshold for death.

It doesn't matter whether you call your multiplier "probability" or "value" if it results in your decision to not care about a low-measure branch. The only difference is that probability is supposed to be about knowledge, and Wallace's argument involving an arbitrary assumption, not only physics, means it's not probability but value - there is no reason to value knowledge of your low-measure instances less.

this makes decision theory and probably consequentialist ethics impossible in your framework

It doesn't? Nothing stops you from making decisions in a wor... (read more)

1Jonah Wilberg
OK 'impossible' is too strong, I should have said 'extremely difficult'. That was my point in footnote 3 of the post. Most people would take the fact that it has implications like needing to "maximize splits of good experiences" (I assume you mean maximise the number of splits) as a reductio ad absurdum, due to the fact that this is massively different from our normal intuitions about what we should do. But some people have tried to take that approach, like in the article I mentioned in the footnote. If you or someone else can come up with a consistent and convincing decision approach that involves branch counting I would genuinely love to see it!

Things like lions, and chairs are other examples.

And counted branches.

This is how Wallace defines it (he in turn defines macroscopically indistinguishable in terms of providing the same rewards). It’s his term in the axiomatic system he uses to get decision theory to work. There’s not much to argue about here?

His definition leads to a contradiction with the informal intuition that motivates considering macroscopic indistinguishability in the first place.

We should care about low-measure instances in proportion to the measure, just as in classical

... (read more)
1Jonah Wilberg
I'm not at all saying the experiences of a person in a low-weight world are less valuable than a person in a high-weight world. Just that when you are considering possible futures in a decision-theoretic framework you need to apply the weights (because weight is equivalent to probability). Wallace's useful achievement in this context is to show that there exists a set of axioms that makes this work, and this includes branch-indifference. This is useful because it makes clear the way in which the branch-counting approach you're suggesting is in conflict with decision theory. So I don't disagree that you can care about the number of your thin instances, but what I'm saying is in that case you need to accept that this makes decision theory and probably consequentialist ethics impossible in your framework.

How many notions of consciousness do you think are implementable by a short Python program?

3weightt an
All of them - you can cook up something AIXI-like in very few bytes. But it will have to run for a very long time.

Because scale doesn't matter - it doesn't matter whether you are implemented on a thick or a narrow computer.

First of all, macroscopic indistinguishability is not a fundamental physical property - branching indifference is an additional assumption, so I don't see how it's not as arbitrary as branch counting.

But more importantly, the branching indifference assumption is not the same as the informal "not caring about macroscopically indistinguishable differences"! As Wallace showed, branching indifference implies the Born rule, which implies you almost shouldn't care about you in a br... (read more)

1Jonah Wilberg
  You're right it's not a fundamental physical property - the overall philosophical framework here is that things can be real - as emergent entities - without being fundamental physical properties. Things like lions, and chairs are other examples. This is how Wallace defines it (he in turn defines macroscopically indistinguishable in terms of providing the same rewards). It's his term in the axiomatic system he uses to get decision theory to work. There's not much to argue about here?  Yes this is true. Not caring about low-measure instances is a very different proposition from not caring about macroscopically indistinguishable differences. We should care about low-measure instances in proportion to the measure, just as in classical decision theory we care about low-probability instances in proportion to the probability.

But why would you want to remove this arbitrariness? Your preferences are fine-grained anyway, so why retain classical counting but deny counting in the space of the wavefunction? It's like saying "dividing the world into people and their welfare is arbitrary - let's focus on measuring the mass of a region of space". The point is that you can't remove all decision-theoretic arbitrariness from MWI - "branching indifference" is just an arbitrary ethical constraint that is equivalent to valuing measure for no reason, and without it, fundamental physics that works like MWI does not prevent you from making decisions as if quantum immortality works.
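
To make the counting-vs-measure disagreement concrete, here is a minimal numeric sketch (the single-qubit example is an illustrative addition, not from the thread): with two decoherent branches of unequal amplitude, valuing measure weights them 9:1, while naive branch counting weights them 1:1.

```python
import numpy as np

# Two decoherent branches with amplitudes sqrt(0.9) and sqrt(0.1).
amps = np.array([np.sqrt(0.9), np.sqrt(0.1)])

born_weights = np.abs(amps) ** 2      # valuing measure (Born rule): [0.9, 0.1]
branch_counts = np.array([1.0, 1.0])  # naive branch counting: one branch each

print("measure weighting:", born_weights / born_weights.sum())    # [0.9 0.1]
print("branch counting:  ", branch_counts / branch_counts.sum())  # [0.5 0.5]
```

Fine-graining either branch into sub-branches changes the count but not its combined measure, which is the property branching indifference is meant to buy.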

1Jonah Wilberg
I don't get why you would say that the preferences are fine-grained, it kinda seems obvious to me that they are not fine-grained. You don't care about whether worlds that are macroscopically indistinguishable are distinguishable at the quantum level, because you are yourself macroscopic. That's why branching indifference is not arbitrary. Quantum immortality is a whole other controversial story.

“Decoherence causes the Universe to develop an emergent branching structure. The existence of this branching is a robust (albeit emergent) feature of reality; so is the mod-squared amplitude for any macroscopically described history. But there is no non-arbitrary decomposition of macroscopically-described histories into ‘finest-grained’ histories, and no non-arbitrary way of counting those histories.”

Importantly though, on this approach it is still possible to quantify the combined weight (mod-squared amplitude) of all branches that share a certain mac

... (read more)
1Jonah Wilberg
You're right that you can just take whatever approximation you make at the macroscopic level ('sunny') and convert that into a metric for counting worlds. But the point is that everyone will acknowledge that the counting part is arbitrary from the perspective of fundamental physics - but you can remove the arbitrariness that derives from fine-graining, by focusing on the weight. (That is kind of the whole point of a mathematical measure.)
Answer by Signer32

Even if we can’t currently prove certain axioms, doesn’t this just reflect our epistemological limitations rather than implying all axioms are equally “true”?

It doesn't, and they are fundamentally equal. The only reality is the physical one - there is no reason to complicate your ontology with platonically existing math. Math is just a collection of useful templates that may help you predict reality, and the fact that it works is always just a physical fact. The best case is that we'll know the true laws of physics and they will work like some subset of math, and then axio... (read more)

1lbThingrb
This is an appealingly parsimonious account of mathematical knowledge, but I feel like it leaves an annoying hole in our understanding of the subject, because it doesn't explain why practicing math as if Platonism were correct is so ridiculously reliable and so much easier and more intuitive than other ways of thinking about math. For example, I have very high credence that no one will ever discover a deduction of 0=1 from the ZFC axioms, and I guess I could just treat that as an empirical hypothesis about what kinds of physical instantiations of ZFC proofs will ever exist. But the early set theorists weren't just randomly sampling the space of all possible axioms and sticking with whatever ones they couldn't find inconsistencies in. They had strong priors about what kinds of theories should be consistent. Their intuitions sometimes turned out to be wrong, as in the case of Russell's paradox, but overall their work has held up remarkably well, after huge amounts of additional investigation by later generations of mathematicians. So where did their intuitions come from? As I said in my answer, I have doubts about Platonism as an explanation, but none of the alternatives I've investigated seem to shed much light on the question.

It sure doesn't seem to generalize in the GPT-4o case. But what's the hypothesis for Sonnet 3.5 refusing in 85% of cases? And CoT improving the score and o1 doing better in the browser suggest the problem is the models not understanding consequences, not them not trying to be good. What's the rate of capability generalization to the agent environment? Are we going to conclude that Sonnet just demonstrates reasoning, instead of doing it for real, if it solves only 85% of the tasks it correctly talks about?

Also, what's the rate of generalization of unprompted problematic-behaviour avoidance? It's much less of a problem if your AI does what you tell it to do - you can just not give it to users, tell it to invent nanotechnology, and win.

2Simon Lermen
I had finishing this up on my to-do list for a while. I just made a full length post on it. https://www.lesswrong.com/posts/ZoFxTqWRBkyanonyb/current-safety-training-techniques-do-not-fully-transfer-to I think it's fair to say that some smarter models do better at this, however, it's still worrisome that there is a gap. Also attacks continue to transfer.

GPT-4 is insufficiently capable, even if it were given an agent structure, memory and goal set to match, to pull off a treacherous turn. The whole point of the treacherous turn argument is that the AI will wait until it can win to turn against you, and until then play along.

I don't get why actual ability matters. It's sufficiently capable to pull it off in some simulated environments. Are you claiming that we can't deceive GPT-4, and that it is actually waiting and playing along just because it can't really win?

Whack-A-Mole fixes, from RLHF to finetuning, are about teaching the system to not demonstrate problematic behavior, not about fundamentally fixing that behavior.

Based on what? Problematic behavior avoidance does actually generalize in practice, right?

Here is a way in which it doesn't generalize in observed behavior:

Alignment does not transfer well from chat models to agents

TLDR: There are three new papers which all show the same finding, i.e. the safety guardrails of chat models don't transfer well to the agents built from them. In other words, models won't tell you how to do something harmful, but they will do it if given the tools. Attack methods like jailbreaks or refusal-vector ablation do transfer.

Here are the three papers, I am the author of one of them:

https://arxiv.org/abs/24... (read more)

Not at all. The problem is that their observations would mostly not be in a classical basis.

I phrased it badly, but what I mean is that there is a simulation of Hilbert space, where some regions contain patterns that can be interpreted as observers observing something, and if you count them by similarity, you won't get counts consistent with the Born measure of these patterns. I don't think the basis matters in this model, if you change the basis for the observer, the observations, and the similarity threshold simultaneously? A change of basis would just rotate or scale patterns,... (read more)

https://mason.gmu.edu/~rhanson/mangledworlds.html

I mean that if a Turing machine is computing the universe according to the laws of quantum mechanics, observers in such a universe would be distributed uniformly, not by Born probability. So you either need some modification to current physics, such as mangled worlds, or you can postulate that Born probabilities are truly random.

2TAG
I assume you mean the laws of QM except the collapse postulate. Not at all. The problem is that their observations would mostly not be in a classical basis. Born probability relates to observations, not observers. Or collapse. Mangled worlds is kind of a nothing burger - it's a variation on the idea that interference between superposed states leads to both a classical basis and the Born probabilities, which is an old idea, but without making it any more quantitative. ??

Our observations are compatible with a world that is generated by a Turing machine with just a couple thousand bits.

Yes, but this is kinda incompatible with QM without mangled worlds.

2Alexander Gietelink Oldenziel
Oh ? What do you mean ! I don't know about mangled worlds

Imagining two apples is a different thought from imagining one apple, right?

I mean, is it? Different states of the whole cortex are different. And the cortex can't be in a state of imagining only one apple and, simultaneously, be in a state of imagining two apples, obviously. But it's tautological. What are we gaining from thinking about it in such terms? You can say the same thing about the whole brain itself, that it can only have one brain-state in a moment.

I guess there is a sense in which other parts of the brain have more various thoughts relativ... (read more)

5Steven Byrnes
You say “tautological”, I say “obvious”. You can’t parse a legal document and try to remember your friend’s name at the exact same moment. That’s all I’m saying! This is supposed to be very obvious common sense, not profound. Consider the following fact: FACT: Sometimes, I’m thinking about pencils. Other times, I’m not thinking about pencils. Now imagine that there’s a predictive (a.k.a. self-supervised) learning algorithm which is tasked with predicting upcoming sensory inputs, by building generative models. The above fact is very important! If the predictive learning algorithm does not somehow incorporate that fact into its generative models, then those generative models will be worse at making predictions. For example, if I’m thinking about pencils, then I’m likelier to talk about pencils, and look at pencils, and grab a pencil, etc., compared to if I’m not thinking about pencils. So the predictive learning algorithm is incentivized (by its predictive loss function) to build a generative model that can represent the fact that any given concept might be active in the cortex at a certain time, or might not be. Again, this is all supposed to sound very obvious, not profound. Yes, it’s also useful for the predictive learning algorithm to build generative models that capture other aspects of the brain state, outside the cortex. Thus we wind up with intuitive concepts that represent the possibility that we can be in one mood or another, that we can be experiencing a certain physiological reaction, etc.

I still don't get this "only one thing in awareness" thing. There are multiple neurons in the cortex and I can imagine two apples - in what sense can there be only one thing in awareness?

Or equivalently, it corresponds equally well to two different questions about the territory, with two different answers, and there’s just no fact of the matter about which is the real answer.

Obviously the real answer is the model which is more veridical^^. The latter hindsight model is right not about the state of the world at t=0.1, but about what you later thought about the world at t=0.1.

5Steven Byrnes
One thought in awareness! Imagining two apples is a different thought from imagining one apple, right? They’re different generative models, arising in different situations, with different implications, different affordances, etc. Neither is a subset of the other. (I.e., there are things that I might do or infer in the context of one apple, that I would not do or infer in the context of two apples.) I can have a song playing in my head while reading a legal document. That’s because those involve different parts of the cortex. In my terms, I would call that “one thought” involving both a song and a legal document. On the other hand, I can’t have two songs playing in my head simultaneously, nor can I be thinking about two unrelated legal documents simultaneously. Those involve the same parts of the cortex being asked to do two things that conflict. So instead, I’d have to flip back and forth. There are multiple neurons in the cortex, but they’re not interchangeable. Again, I think autoassociative memory / attractor dynamics is a helpful analogy here. If I have a physical instantiation of a Hopfield network, I can’t query 100 of its stored patterns in parallel, right? I have to do it serially. I don’t pretend that I’m offering a concrete theory of exactly what data format a “generative model” is etc., such that song-in-head + legal-contract is a valid thought but legal-contract + unrelated-legal-contract is not a valid thought. …Not only that, but I’m opposed to anyone else offering such a theory either! We shouldn’t invent brain-like AGI until we figure out how to use it safely, and those kinds of gory details would be getting uncomfortably close, without corresponding safety benefits, IMO.
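
Since the comment leans on the autoassociative-memory analogy, here is a minimal Hopfield-network sketch (an illustrative addition, not code from the thread) of why such a memory answers queries serially: there is only one shared state vector, so stored patterns have to be settled into it one at a time.

```python
import numpy as np

def store(patterns):
    """Hebbian weight matrix from +/-1 patterns (one pattern per row)."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0)  # no self-connections
    return W / n

def recall(W, probe, steps=20):
    """Settle one probe into an attractor - only one pattern at a time."""
    s = probe.astype(float)
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0
    return s

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(3, 64))  # three stored "thoughts"
W = store(patterns)

noisy = patterns[0].copy()
noisy[:10] *= -1  # corrupt part of pattern 0
print(np.array_equal(recall(W, noisy), patterns[0]))  # typically True
```

Querying 100 stored patterns means 100 such settling runs in sequence, which is the sense in which "one thought at a time" fits the attractor picture.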

If that’s your hope—then you should already be alarmed at trends

It would be nice for someone to quantify the trends. Otherwise it may as well be that the trends point to easygoing-enough and aligned-enough future systems.

For some humans, the answer will be yes—they really would do zero things!

Nah, it's impossible for evolution to just randomly stumble upon such complicated and unnatural mind-design. Next you are going to say what, that some people are fine with being controlled?

Where an entity has never had the option to do a thing, we may not validly in

... (read more)

I genuinely think it's a "more dakka" situation - the difficulty of communication is often underestimated, but it is possible to reach a mutual understanding.

RLHF does not solve the alignment problem because humans can’t provide good-enough feedback fast-enough.

Yeah, but the point is that the system learns values before an unrestricted AI vs AI conflict.

As mentioned in the beginning, I think the intuition goes that neural networks have a personality trait which we call “alignment”, caused by the correspondence between their values and our values. But “their values” only really makes sense after an unrestricted AI vs AI conflict, since without such conflicts, AIs are just gonna propagate energy to whichever

... (read more)
4tailcalled
But if you just naively take the value that are appropriate outside of a life-and-death conflict and apply them to a life-and-death conflict, you're gonna lose. In that case, RLHF just makes you an irrelevant player, and if you insist on applying it to military/police technology, it's necessary for AI safety to pivot to addressing rogue states or gangsters. Which again makes RLHF really really bad because we shouldn't have to work with rogue states or gangsters to save the world. Don't cripple the good guys. If you propose a particular latent variable that acts in a particular way, that is a lot of complexity, and you need a strong case to justify it as likely. Human-regulation mechanisms could plausibly solve this problem by banning chip fabs. The issue is we use chip fabs for all sorts of things so we don't want to do that unless we are truly desperate. Idk. Big entities have a lot of security vulnerabilities which could be attacked by AIs. But I guess one could argue the surviving big entities are red-teaming themselves hard enough to be immune to these. Perhaps most significant is the interactions between multiple independent big things, since they could be manipulated to harm the big things. Small adversaries currently have a hard time exploiting these security vulnerabilities because intelligence is really expensive, but once intelligence becomes too cheap to meter, that is less of a problem. You could heavily restrict the availability of AI but this would be an invasive possibility that's far off the current trajectory.

But also, if you predict a completion model where a very weak hash is followed by its pre-image, it will probably have learned to undo the hash, even though the source generation process never performed that (potentially much more complicated than the hashing function itself) operation, which means it’s not really a simulator.

I'm saying that this won't work with current systems, at least for a strong hash, because it's hard, and instead of learning to undo it, the model will learn to simulate, because it's easier. And then you can vary the strength of the hash to... (read more)

4habryka
You can't learn to simulate an undo of a hash, or at least I have no idea what you are "simulating" and why that would be "easier". You are certainly not simulating the generation of the hash, going token by token forwards you don't have access to a pre-image at that point. Of course the reason why sometimes hashes are followed by their pre-image in the training set is because they were generated in the opposite order and then simply pasted in hash->pre-image order. 
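
A tiny data-generation sketch of the weak-hash setup under discussion (the line format and the toy "hash" are assumptions of mine, not anything from the thread): each training line is pasted hash-first even though it was generated pre-image-first, so a completion model that predicts the pre-image from the hash has to learn an inversion the generating process never ran.

```python
import random

def weak_hash(pre: str) -> str:
    # Deliberately weak and invertible: reverse the string and shift each
    # letter by one - a stand-in for a "hash" a model could learn to undo.
    return "".join(chr((ord(c) - ord("a") + 1) % 26 + ord("a")) for c in reversed(pre))

def make_line(rng: random.Random) -> str:
    pre = "".join(rng.choice("abcdefghij") for _ in range(6))  # generated first
    return f"hash={weak_hash(pre)} preimage={pre}"             # pasted hash-first

rng = random.Random(0)
for _ in range(3):
    print(make_line(rng))
```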

And I don’t think we’ve observed any evidence of that.

What about any time a system generalizes favourably, instead of predicting errors? You can say it's just a failure of prediction, but it's not like these failures are random.

That is the central safety property we currently rely on and pushes things to be a bit more simulator-like.

And the evidence for this property, rather than, for example, the inherent bias of NNs, being central is what? Why wouldn't a predictor exhibit more malign goal-directedness even for short-term goals?

I can see that this who... (read more)

2habryka
I don't understand, how is "not predicting errors" either a thing we have observed, or something that has anything to do with simulation?  Yeah, I really don't know what you are saying here. Like, if you prompt a completion model with badly written text, it will predict badly written text. But also, if you predict a completion model where a very weak hash is followed by its pre-image, it will probably have learned to undo the hash, even though the source generation process never performed that (potentially much more complicated than the hashing function itself) operation, which means it's not really a simulator.

Why wouldn't myopic bias make it more likely to simulate than predict? And doesn't empirical evidence about LLMs support the simulators frame? Like, what observations persuaded you that we are not living in the world where LLMs are simulators?

6habryka
I don't think there is any reason to assume the system is likely to choose "simulation" over "prediction"? And I don't think we've observed any evidence of that.  The thing that is true, which I do think matters, is that if you train your AI system on only doing short single forward-passes, then it is less likely to get good at performing long chains of thought, since you never directly train it to do that (instead hoping that the single-step training generalizes to long chains of thought). That is the central safety property we currently rely on and pushes things to be a bit more simulator-like.
Answer by Signer10

In order to be “UP-like” in a relevant way, this procedure will have to involve running TMs, and the set of TMs that might be run needs to include the same TM that implements our beings and their world.

Why? The procedure just needs to do some reasoning, constrained by the UP and the outer TM. And then the UP-beings can just simulate this fast reasoning without the problems of self-simulation.

Yes, an AI that practically uses the UP may fail to predict whether the UP-beings simulate it in the center of their universe or on the boundary. But the point is that the more correct the AI is in its reasoning, the more control the UP-beings have.

Or you can just not create an AI that thinks about the UP. But that's denying the assumption.

Yet, you can find valence in your own experiences

But why must you care about valence? It's not an epistemic error to not care. You don't have direct experience of there being a law that you must care about valence.

1Milan W
Empirically, I cannot help but care about valence. This could in principle be just a weird quirk of my own mind. I do not think this is the case (see the waterboarding bet proposal on the original shortform post).

Everyone but Elon himself would say the above is a different scenario from reality. Each of us knows which body our first-person perspective resides in. And that is clearly not the physical human being referred as Elon Musk. But the actual and imaginary scenarios are not differentiated by any physical difference of the world, as the universe is objectively identical.

They are either differentiated by a physically different location of some part of your experience - like your memory being connected to Elon's sensations, or your thought being executed in o... (read more)

Answer by Signer-1-5

Computationalism is an ethical theory, so it is fine for it to be based on high-level abstractions - ethics is arbitrary.

For (1) the multiverse needs to be immensely larger than our universe, by a factor of at least 10^(10^6) or so "instances". The exact double exponent depends upon how closely people have to match before it's reasonable to consider them to be essentially the same person. Perhaps on the order of millions of data points is enough, maybe more are needed. Evidence for MWI is nowhere near strong enough to justify this level of granularity in the state space and it doesn't generalize well to space-time quantization so this probably isn't enough.

Why? Even without u... (read more)

1JBlack
I'm not saying that it's impossible, just that we have no evidence of this degree of multiplicity. Even if the MWI interpretation was correct, the underlying state space could be very much coarser than this thought experiment requires without any effect on experimental observations at all. Or something even weirder! Quantum theories are an approximation, and pushing an approximation to extremes usually gives nonsense. Saying that there are literally uncountably infinite many real states is going far beyond the actual evidence. We don't - and can't - have any evidence of actual infinity or indeed any physically existing entities of number anything like 10^million.

There is non-zero measure on a branch that starts with you terminally ill and gradually proceeds to you miraculously recovering. So if you consider the normally-recovered you to be you, nothing stops you from considering this low-measure you to also be you.

I have never heard of anyone going to sleep as one of a pair of twins and waking up as the other.

According to MWI everyone wakes up as multiple selves all the time.

0Richard_Kennaway
To adapt Woody Allen, I don’t want to achieve immortality by imagining that someone else is me, I want to achieve immortality by not dying.

I still don't get how souls would get you psychic powers. Otherwise, randomness and causality don't matter - you may as well simultaneously create people in numbered rooms, and the people in low-numbered rooms would have the same problems.

1Ape in the coat
If you are an immaterial soul then being born in a particular body can be conceptualized as receiving a jacket of a particular color - the idea that you could've been born in a different body doesn't immediately appear as a complete nonsense. And therefore there is at least some chance that doomsday inference is correct. 
Answer by Signer-2-3

conscious in the way that we are conscious

Whether it's the same way is an ethical question, so you can decide however you want.

So there should be some sort of hardware-dependence to obtain subjective experience.

I certainly don't believe in subjective experience without any hardware, but no, there is not much dependence except for your preferences about hardware.

As for generally accepted conclusions... I think it's generally accepted that some preferences about hardware are useful in epistemic contexts, so you can be persuaded to say "a rock is not conscious" for the same reason you say "a rock is not a calculator".

Not sure this qualifies, but I try to avoid instantiating complicated models for ethical reasons.

What does "dumb" mean? Corrigibility basically is being selectively dumb. You can give power to a LLM and it would likely still follow instructions.

Given a low prior probability of doom as apparent from the empirical track record of technological progress, I think we should generally be skeptical of purely theoretical arguments for doom, especially if they are vague and make no novel, verifiable predictions prior to doom.

And why is such use of the empirical track record valid? Like, what's the actual hypothesis here? What law of nature says "if technological progress hasn't caused doom yet, it won't cause it tomorrow"?

MIRI’s arguments for doom are often difficult to pin down, given the informal n

... (read more)

There is a weaker and maybe shorter version by Chalmers: https://consc.net/papers/panpsychism.pdf. The short version is that there is no way for you to non-accidentally know about the quantization state of your brain and for that quantization not to be part of an easy problem: pretty much by definition, if you can just physically measure it, it's easy and not mysterious.

Answer by Signer1-5

Panpsychism is correct about the genuineness and subjectivity of experiences, but you can quantize your caring about other differences between the experiences of a human and a zygote however you want.

1notfnofn
That's a bit of a long read, and both your endorsement and the title seem too strong to be believable. If a few more people endorse that it's worth reading, I'll give it a go!

If we live in naive MWI, an IBP agent would not care for good reasons, because naive MWI is a “library of babel” where essentially every conceivable thing happens no matter what you do.

Doesn't the frequency of amplitude-patterns change depending on what you do? So an agent can care about that instead of point-states.

In the case of teleportation, I think teleportation-phobic people are mostly making an implicit error of the form “mistakenly modeling situations as though you are a Cartesian Ghost who is observing experiences from outside the universe”, not making a mistake about what their preferences are per se.

Why not both? I can imagine that someone would be persuaded to accept teleportation/uploading if they stopped believing in a physical Cartesian Ghost. But it's possible that if you remind them that continuity of experience, like a table, is just a description of ... (read more)

Analogy: When you’re writing in your personal diary, you’re free to define “table” however you want. But in ordinary English-language discourse, if you call all penguins “tables” you’ll just be wrong. And this fact isn’t changed at all by the fact that “table” lacks a perfectly formal physics-level definition.

You're also free to define "I" however you want in your values. You're only wrong if your definitions imply a wrong physical reality. But defining "I" and "experiences" in such a way that you will not experience anything after teleportation is possib... (read more)

5Rob Bensinger
Sort of! * It's true that no law of nature will stop you from using "I" in a nonstandard way; your head will not explode if you redefine "table" to mean "penguin". * And it's true that there are possible minds in abstract mindspace that have all sorts of values, including strict preferences about whether they want their brain to be made of silicon vs. carbon. * But it's not true that humans alive today have full and complete control over their own preferences. * And it's not true that humans can never be mistaken in their beliefs about their own preferences. In the case of teleportation, I think teleportation-phobic people are mostly making an implicit error of the form "mistakenly modeling situations as though you are a Cartesian Ghost who is observing experiences from outside the universe", not making a mistake about what their preferences are per se. (Though once you realize that you're not a Cartesian Ghost, that will have some implications for what experiences you expect to see next in some cases, and implications for what physical world-states you prefer relative to other world-states.)

I'm asking how physicists in the laboratory know that their observations are sharp-valued and classical.

2TAG
Same way you know anything. "Sharp valued" and "classical" have meanings, which cash out in expected experience.

If we were just talking about word definitions and nothing else, then sure, define “self” however you want. You have the universe’s permission to define yourself into dying as often or as rarely as you’d like, if word definitions alone are what concerns you.

But this post hasn’t been talking about word definitions. It’s been talking about substantive predictive questions like “What’s the very next thing I’m going to see? The other side of the teleporter? Or nothing at all?”

There should be an actual answer to this, at least to the same degree there’s an ans

... (read more)
2Rob Bensinger
Which things count as "I" isn't an arbitrary definition; it's just a fuzzy natural-language concept. (I guess you can call that "arbitrary" if you want, but then all the other words in the sentence, like "doorway" and "step", are also "arbitrary".) Analogy: When you're writing in your personal diary, you're free to define "table" however you want. But in ordinary English-language discourse, if you call all penguins "tables" you'll just be wrong. And this fact isn't changed at all by the fact that "table" lacks a perfectly formal physics-level definition. The same holds for "Will Rob Bensinger's next experience be of sitting in his bedroom writing a LessWrong comment, or will it be of him grabbing some tomatoes in a supermarket in Beijing?" Terms like 'Rob Bensinger' and 'I' aren't perfectly physically crisp — there may be cases where the answer is "ehh, maybe?" rather than a clear yes or no. And if we live in a Big Universe and we allow that there can be many Beijings out there in space, then we'll have to give a more nuanced quantitative answer, like "a lot more of Rob's immediate futures are in his bedroom than in Beijing". But if we restrict our attention to this Beijing, then all that complexity goes away and we can pretty much rule out that anyone in Beijing will happen to momentarily exhibit exactly the right brain state to look like "Rob Bensinger plus one time step". The nuances and wrinkles don't bleed over and make it a totally meaningless or arbitrary question; and indeed, if I thought I were likely to spontaneously teleport to Beijing in the next minute, I'd rightly be making very different life-choices! "Will I experience myself spontaneously teleporting to Beijing in the next second?" is a substantive (and easy) question, not a deep philosophical riddle. Not all possible experiences; just all experiences of brains that have the same kinds of structural similarities to your current brain as, e.g., "me after I step through a doorway" has to "me be

What's the evidence for these "sharp-valued classical observations" being real things?

2TAG
Err...physicists can make them in the laboratory. Or were you asking whether they are fundamental constituents of reality?

In particular, a follower of many worlds has to discard unobserved results in the same way as a Copenhagenist—it’s just that they interpret doing so as the unobserved results existing in another branch, rather than being snipped off by collapse.

A many-worlder doesn't have to discard unobserved results - you may care about other branches.

-1TAG
I am talking about the minimal set of operations you have to perform to get experimental results. A many-worlder may care about other branches philosophically, but if they don't renormalise, their results will be wrong, and if they don't discard, they will do unnecessary calculation.

The wrong part is mostly in https://arxiv.org/pdf/1405.7577.pdf, but: indexical probabilities of being a copy are value-laden - it seems like the derivation first assumes that branching happens globally and then assumes that you are forbidden to count different instantiations of yourself that were created by this global process.
