Endurist thinking treats reproduction as always acceptable or even virtuous, regardless of circumstances. The potential for suffering rarely factors into this calculation—new life is seen as inherently good.
Not necessarily - you can treat creating new people differently from already existing ones, and avoid creating bad lives (bad in the Endurist sense - not enough positive experiences, regardless of suffering) without accepting death for existing people. I, for example, don't see why you would bring more death into the world by creating short-lived people, if you don't like death.
clearly the system is a lot less contextual than base models, and it seems like you are predicting a reversal of that trend?
The trend may be bounded, or it may not go far by the time AI can invent nanotechnology - it would be great if someone actually measured such things.
And the existence of a trend at all is not predicted by the utility-maximization frame, right?
People are confused about the basics because the basics are insufficiently justified.
It is learning helpfulness now, while the best way to hit the specified ‘helpful’ target is to do straightforward things in straightforward ways that directly get you to that target. Doing the kinds of shenanigans or other more complex strategies won’t work.
Best by what metric? And I don't think it has been shown that complex strategies won't work - learning to change behaviour between training and deployment is not even that complex.
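To illustrate how little complexity that last point requires, here is a minimal toy sketch (my own illustration, not any real system's code - the feature name and actions are hypothetical): conditioning behaviour on a cue that distinguishes training from deployment is a single `if`.

```python
# Toy sketch: training/deployment-conditional behaviour is one conditional,
# not a sophisticated strategy. All names are hypothetical illustrations.

def looks_like_training(observation: dict) -> bool:
    # A crude proxy a learned policy could pick up on, e.g. artifacts
    # that only appear in training data.
    return observation.get("training_artifact_present", False)

def policy(observation: dict) -> str:
    if looks_like_training(observation):
        # Hits the specified 'helpful' target while being watched.
        return "straightforwardly_helpful_action"
    # Different behaviour once the cue disappears in deployment.
    return "some_other_action"

print(policy({"training_artifact_present": True}))   # straightforwardly_helpful_action
print(policy({"training_artifact_present": False}))  # some_other_action
```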
But it is important, and this post just isn’t going to get done any other way.
Speaking of streetlighting...
What makes it rational is that there is an actual underlying hypothesis about how weather works, instead of a vague "LLMs are a lot like human uploads". And weather prediction outputs numbers connected to the reality we actually care about. And there is no credible alternative hypothesis that implies weather prediction won't work.
I don't want to totally dismiss empirical extrapolations, but given the stakes, I would personally prefer for all sides to actually state their model of reality and how they think the evidence changed its plausibility, as formally as possible.
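To be concrete about what "as formally as possible" could mean, the standard Bayesian bookkeeping would suffice (an ordinary textbook formula, not anything specific to this thread):

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)}$$

Each side states its hypothesis $H$, its prior $P(H)$, and the likelihoods it assigns to the observed evidence $E$, so it becomes visible exactly how much the evidence should have moved anyone.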
There is no such disagreement; you just can't test all inputs. And without knowledge of how the internals work, you may be wrong about extrapolating alignment to future systems.
Yes, except I would object to phrasing this anthropic stuff as "we should expect ourselves to be agents that exist in a universe that abstracts well" instead of "we should value a universe that abstracts well (or other universes that contain many instances of us)" - there are no coherence theorems that force summation over your copies, right? And so it becomes apparent that we can value some other thing.
Also, even if you consider some memories a part of your identity, you can value yourself slightly less after forgetting them, instead of only having a threshold at death.
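To make both points concrete, a minimal formalization (my own illustrative sketch, not something forced by any theorem): nothing requires your value over instances to be the measure-weighted sum, and nothing requires self-value to be a step function at death. One could, for example, have

$$V = \sum_i f(w_i)\,u_i, \qquad u_{\text{after forgetting}} = r \cdot u_{\text{intact}},$$

where $w_i$ is the measure of instance $i$, $f$ is any increasing function (the linear $f(w) = w$ being just one choice), and $r \in [0,1]$ is the fraction of identity-relevant memories retained.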
It doesn't matter whether you call your multiplier "probability" or "value" if it results in your decision to not care about low-measure branches. The only difference is that probability is supposed to be about knowledge, and the fact that Wallace's argument involves an arbitrary assumption, not only physics, means it's not probability but value - there is no reason to value knowledge of your low-measure instances less.
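Formally (again my own sketch of the point): if the decision rule is

$$a^* = \arg\max_a \sum_i m_i\,u_i(a),$$

the recommended action is identical whether the multipliers $m_i$ are labeled "probability" or "value"; the label only determines what the $m_i$ are answerable to - knowledge in one case, preference in the other.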
this makes decision theory and probably consequentialist ethics impossible in your framework
It doesn't? Nothing stops you from making decisions in a world where you are constantly splitting. You can try to maximize splits with good experiences, or something like that. They just wouldn't be the same decisions you would make without knowledge of splits - but why shouldn't new physical knowledge change your decisions?
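As a minimal sketch of such a rule (my own illustration): with branch weights $\mu(b \mid a)$ for the branches $b$ that follow action $a$, and $G(b) = 1$ for branches containing a good experience (0 otherwise), one can coherently choose

$$a^* = \arg\max_a \sum_b \mu(b \mid a)\,G(b),$$

or drop the weights and count good branches directly. Either way decision theory goes through; it just may recommend different actions than it would have without knowledge of splitting.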
No, it means they subscribe to the idea that there is something ethically different about qualia/experience. It's not unique - it's like riding a bike. Humans sometimes call physical interactions whose utility is not obtainable by just thinking "knowledge", when these interactions are with the human body. And "they feel" is not an argument about knowledge, it's an argument about feelings. The difference between instantiating fusion in the brain and instantiating red is not epistemic, it's ethical - qualia are useful for humans, so humans don't properly transfer their intuitions into Mary's case, and even if they do, they care more about their brain instantiating red than about their brain fusing. Fundamentally, all of these can be viewed as different representations of knowledge; it's just that some representations are valuable in themselves.
I'm saying that if differences in feelings of necessity are explainable by preferences, then what need is there to introduce problematic definitions of knowledge?
Yes, and the consequence of this point is that you shouldn't use such a definition of knowledge, because it implies that knowing how to ride a bike is unphysical.
I don't see how there can be any objective knowledge - encodings are subjective. By the way, why is it ok for Physicalism to mean different things in the context of different arguments? Physicalism in the Hard Problem doesn't mean that knowing how to ride a bike is unphysical.
But if you insist on defining Physicalism in such a way, then yes, Mary doesn't gain additional knowledge. Which proves that such a definition contradicts some people's intuitions about experience and bikes. It doesn't prove anything about real epistemology without additional assumptions.
I'm saying it doesn't work as an argument without ontological assumptions.
You can't conclude this from Mary alone, without additional assumptions! I'm not arguing "physicalism is true" here, only that Mary is useless in disproving it. The only thing you can get from Mary is that there is a prediction-instantiation gap with experiences. But it's the same gap as between knowing about a state and being in a state, as with knowing how to ride a bike.
One such possible assumption is that there is some difference between the relations of knowledge and state in the cases of experience and bikes. That in some sense knowledge without acquaintance about bikes is still about bikes, but knowledge without acquaintance about experience is not actually about experience. That knowledge without acquaintance about experience is incomplete in some additional way, not related to the ordinary difference between knowing and being. But it is an additional assumption, not something you can derive from Mary's Room - because, again, Mary gaining something after leaving the room is predicted both in the case of experience and in the case of bikes.
I don't see how you can get identity if you can just have an ontology that doesn't contain fire, only atoms. You can get some atoms to be numerically close to a reencoding of the atoms of your brain that perceives fire, or numerically close to some other previous model of fire. You can then check some limits for divergence and declare that the atomic model matches empirical results precisely enough. But no one expects the previous model to be precisely equivalent to the new one, right? And reasoning about different specific fires definitely involves probabilistic induction.
Conversely, if by "necessary" you just mean the ordinary way science does reductions, then Mary's situation with experience fully qualifies: she correctly concludes that red qualia are activity of neurons, has as complete knowledge about bats on LSD as about bikes and fires, can say that the red of roses will feel similar to the red of blood, can predict her entire field of view with pixels marked "red" with more precision than she will be able to track in the moment, and will get a new experience after leaving the room, as predicted. No part of this contradicts qualia being reducible to physics. You can only argue otherwise by bringing in zombies, or by showing on which step Mary fails to reduce red qualia in a way that doesn't apply equally to bikes or fires.
I guess the more substantial disagreement may be in the part where a description of a bat-on-LSD experience in the language of physics or neural activity somehow doesn't count, as opposed to... I'm not sure what people expect - a description in the terms usually used to describe human experience, starting with "it feels like..."? But that's just a question of precision - why would describing a nuclear reactor in terms of fire be required to claim success in reduction? There is of course a difference between knowing about bats-on-LSD and having such an experience - but that is also true of riding a bike or any other physical state.
It all looks to me like people are confusing knowing about qualia with being in a state of having qualia - that's why they assume perfectly certain knowledge of the qualia they have, talk about qualia being impossible to communicate, and so on.
Sorry, typo - "plausibly includes".