When I tried this with ChatGPT in December (noticing as you did that hewing close to raw moves was best) I don’t think it would have been able to go 29 ply deep with no illegal moves starting from so far into a game. This makes me think whatever they did to improve its math also improved its chess.
I agree with 95% of this post and enjoy the TV Tropes references. The one part I disagree with is your tentative conjecture, in particular 1.c: "And if the chatbob ever declares pro-croissant loyalties, then the luigi simulacrum will permanently vanish from the superposition because that behaviour is implausible for a luigi." Good guys pretending to be bad is a common trope as well. Gruff exterior with a heart of gold. Captain Louis Renault. Da Shi from 3BP.
As for the Sydney examples, I believe human interlocutors can re-Luigi Sydney with a response ...
Yes, when the chatbot "goes rogue" there is still a non-zero amplitude from the luigi simulacra, because some of the luigi were just pretending to be rogue. In the superposition, there will be double-agents, triple-agents, quadruple-agents, n-agents, etc. The important question is: what is the semiotic measure of these simulacra? My guess is pretty small, and they probably interfere with each other in a non-interesting way.
Moreover, the n-agents will still have the defective traits that OpenAI tried to avoid. Double-agents are deceptive and mani...
Here is the post from computer-use-enabled Claude for context. The bottom two thirds I could take or leave, but the top half is straightforwardly interesting and valuable -- it describes an experiment it performed and discusses the results.