this is great, thanks for sharing
in my model that happens through local updates, rather than a global system
for instance, if i used my willpower to feel my social anxiety completely (instead of the usual strategy of suppression) while socializing, i might get some small or large reconsolidation updates to the social anxiety, such that that part thinks it's needed in fewer situations or not at all
alternatively, the part that has the strategy of going to socialize and feeling confident may gain some more internal evidence, so it wins the internal conflict slightly more (but the internal conflict is still there and causes a drain)
i think the sort of global evaluation you're talking about is pretty rare, though something like it can happen when someone e.g. reaches a deep state of love through meditation, and then is able to access lots of their unloved parts that are downstream TRYING to get to that love, and suddenly a big shift happens to the whole system simultaneously (another type of global reevaluation can take place through reconsolidating deep internal organizing principles like fundamental ontological constraints or attachment style)
also, this 'subconscious parts going on strike' theory makes slightly different predictions than the 'is it good for the whole system/live' theory
for instance, i predict that you can have 'dead parts' that e.g. give people social anxiety based on past trauma, even though it's no longer actually relevant to their current situation.
and that if you override this social anxiety using 'live willpower' for a while, you can get burnout, even though the willpower is in some sense 'correct' about what would be good for the overall flourishing of the system given the current reality.
A lot of people are looking at the implications of o1's training process as a future scaling paradigm, but it seems to me that this implementation of applying inference-time compute to fine-tune the model just in time for hard questions is equally promising. It may have equally impressive results if it scales with compute, and it has just as much low-hanging fruit to pick for improving it.
Don't sleep on test-time training as a potential future scaling paradigm.
I often talk w/ clients about burnout as your subconscious/parts 'going on strike' because you've ignored them for too long
I never made the analogy to Atlas Shrugged and the live money leaving the dead money because it wasn't actually tending to the needs of the system, but now you've got me thinking
really, say more?
Another definition in the same vein:
Trauma is overgeneralization of emotional learning.
A real life use for smart contracts 😆
However, this would not address the underlying pattern of alignment failing to generalize.
Is there proof that this is an overall pattern? It would make sense that models are willing to do things they're not willing to talk about, but that doesn't mean there's a general pattern that e.g. they wouldn't be willing to talk about things, and wouldn't be willing to do them, but WOULD be willing to do some secret third option.
ah that makes sense
in my mind this isn't resources flowing to elsewhere, it's either: