Thank you for clarifying.
It's easy to imagine a situation where an AI has a payoff table like:

|         | defect | don't defect |
|---------|--------|--------------|
| succeed | 100    | 10           |
| fail    | X      | n/a          |

where we want to make X as low as possible (and commit to doing so).
For example, a paperclip-maximizing AI might be able to make 10 paperclips while cooperating with humans, or 100 by successfully defecting against humans.
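To make the role of X concrete, here is a minimal sketch of the table above as an expected-value calculation. It assumes the AI is a risk-neutral expected-paperclip maximizer and that defection succeeds with some probability `p_success`; neither assumption is stated in the table, and the function names are made up for illustration.

```python
def expected_defect_payoff(p_success, fail_payoff, success_payoff=100.0):
    """Expected payoff of defecting: succeed with p_success, otherwise receive X."""
    return p_success * success_payoff + (1.0 - p_success) * fail_payoff


def break_even_fail_payoff(p_success, success_payoff=100.0, cooperate_payoff=10.0):
    """Value of X at which defecting and cooperating have equal expected payoff.

    Any X below this value makes cooperation the better choice
    (under the risk-neutral assumption above).
    """
    if not 0.0 <= p_success < 1.0:
        raise ValueError("p_success must be in [0, 1)")
    return (cooperate_payoff - p_success * success_payoff) / (1.0 - p_success)


if __name__ == "__main__":
    # Hypothetical success probabilities, chosen only to illustrate the trend.
    for p in (0.05, 0.5, 0.9):
        print(f"p_success={p}: defection is deterred when X < {break_even_fail_payoff(p):.2f}")
```

Under these assumptions, the more likely defection is to succeed, the lower X has to be before cooperating wins, which is why committing to a low X in advance matters.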
seems to violate not only the "don't negotiate with terrorists" rule, but even worse the "especially don't signal in advance that you intend to negotiate with terrorists" rule.
Those all sound like fairly normal beliefs.
Like... I'm trying to figure out why the title of the post is "I am not a successionist" and not "like many other utilitarians I have a preference for people who are biologically similar to me, I have things in common with, or I am close friends with. I believe when optimizing utility in the far future we should take these things into account"
Even though I can't comment on OP's views, you seemed to have a strong objection to my "we're merely talking price" statement (i.e. when calculating total utility we consider tradeoffs between different things we care about).
Edit:
To put it another way, if I wrote a post titled "I am a successionist" in which I said something like: "I want my children to have happy lives and their children to have happy lives, and I believe they can define 'children' in whatever way seems best to them", how would my views actually differ from yours (or the OP's)?
I genuinely want to know what you mean by "kind".
If your grandchildren adopt an extremely genetically distant human, is that okay? A highly intelligent, social and biologically compatible alien?
You've said you're fine with simulations here, so it's really unclear.
I used "Markov blanket" to describe what I thought you might be talking about: a continuous voluntary process characterized by you and your descendants making free choices about their future. But it seems like you're saying "Markov blanket bad", and moreover that you thought the distinction should have been obvious to me.
Even if there isn't a bright-line definition, there must be some cluster of traits/attributes you are associating with the word "kind".
but we eventually die.
Dying is a symmetric problem: it's not like we can't die without AGI. If you want to calculate p(human extinction | AGI), you have to consider ways AGI can both increase and decrease p(extinction). And the best methods currently available to humans for aggregating low-probability estimates are expert surveys, groups of super-forecasters, or prediction markets, all of which agree on pDoom <20%.
this experiment has been done before.
If you have a framing of the AI Doom argument that can cause a consensus of super-forecasters (or AI risk skeptics, or literally any group that has an average pDoom<20%) to change their consensus, I would be exceptionally interested in seeing that demonstrated.
Such an argument would be neither bad nor weak, which is precisely the type of argument I have been hoping to find by writing this post.
> Please notice that your position is extremely non-intuitive to basically everyone.
Please notice that Manifold both thinks AGI soon and pDoom low.
Just to be clear, your position is that 25 years from now, when LLMs are trained using trillions of times as much compute and are routinely doing tasks that take humans months to years, they will still be unable to run a business worth $1B?