dedicated to a very dear cat;
;3
Fraktur is only ever used for the candidate set and the dregs set. I would also have used it for the Smith set, but \frak{S} is famously bad. I thought it was a G for years, until grad school, because it used to be standard for the symmetric group on n letters. Seriously, just look at it: 𝔖.
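A minimal LaTeX sketch if you want to see the glyphs side by side (assuming the standard amssymb setup; the letters I've used for the candidate and dregs sets here are just illustrative):

```latex
\documentclass{article}
\usepackage{amssymb} % loads amsfonts, which provides \mathfrak
\begin{document}
% Illustrative Fraktur capitals: a candidate set, a dregs set, and the
% infamous S that reads more like a G at text size.
$\mathfrak{C}$, $\mathfrak{D}$, and $\mathfrak{S}_n$ (the symmetric group on $n$ letters)
\end{document}
```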
Typography is a science, and if it were better regarded, perhaps mathematicians would not be in the bind they are these days :P
I think I largely agree with this, and I also think there are much more immediate and concrete ways in which our "lies to AI" could come back to bite us - and perhaps already are doing so to some extent. Specifically, I think this is an issue that pollutes the training data, and it could well make it more difficult to elicit high-quality responses from LLMs in general.
Setting aside the adversarial case (where the lying is part and parcel of an attempt to jailbreak the AI into saying things it shouldn't), the use of imaginary incentives or of hypothetical predecessors having been killed sets up a situation where the kind of response we want to encourage occurs more and more often in absurd contexts - contexts full of vacuous statements whose main purpose is to provoke a sense of 'importance' or to 'set expectations' of better results.
An environment in which these "lies to AI" are common (and not filtered out of training data) is an environment that sets up future AI to be more likely to sandbag in the absence of such absurd motivators. This could include invisible or implicit sandbagging - we shouldn't expect a convenient reasoning trace like "well, if I'm not getting paid for this I'm going to do a shitty job"; rather, I would expect straightforward, honest prompting to suffer a largely hidden performance degradation that is alleviated when one includes this sort of motivational lie. It also seems likely to contribute to future AIs displaying more power-seeking or defensive behaviors, which, needless to say, also present an alignment threat.
And importantly, I think the above issues would occur regardless of whether humans follow through on their promises to LLMs afterwards. Which is not to say humans shouldn't keep their promises to AI - I still think that's the wisest course of action if you're promising them anything. I'm just observing that AI ethics and hypothetical AGI agents are not the sole factor here - there's a tragedy-of-the-commons-like dynamic in play as well, with subtler mechanisms of action but potentially more immediately tangible results.