Level 4 makes each of the other levels into partially-grounded Keynesian beauty contests - a concept from economics originally intended to model the stock market - which I think is where a lot of "status signaling" stuff comes from. But that doesn't mean there isn't a real beauty contest underneath.
Yes!
I wrote a low-edit post about how individual interactions give rise to consistent status hierarchies, a few months ago. (That blog is only for quick, low-edit writing of mine. Those are called Tumblr posts?)
Briefly, people help people who can help them. A person who has many people who want to help them can be more helpful, so more people want to help them.
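A toy caricature of that loop, just to make it concrete (the preferential-help exponent, the decay rate, and everything else here are arbitrary constants I picked, not claims about real dynamics):

```python
import numpy as np

# Toy loop: agents direct their willingness-to-help preferentially toward
# whoever already looks most helpful (the exponent > 1 makes the loop
# self-reinforcing), and being helped raises your capacity to help.
rng = np.random.default_rng(0)
n = 20
help_capacity = 1.0 + 0.01 * rng.random(n)   # nearly equal starting points

for _ in range(200):
    # share of everyone's help-attention each agent attracts this round
    attention = help_capacity**2 / (help_capacity**2).sum()
    # having more people willing to help you makes you more able to help
    help_capacity = 0.9 * help_capacity + 0.1 * n * attention

print(np.round(np.sort(help_capacity), 2))   # tiny initial differences end up as a steep hierarchy
```

The point is just that near-equal starting points plus "help the most helpful" is enough to produce a steep, consistent hierarchy.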
My current thinking about how to implement this without having to build full-sized agents is to make little stateful reinforcement-learner-type things in a really simple agent-world, something like a typed-message-passing setup, possibly with 2d or 3d locations and falloff of action effects by distance. Then each agent can take actions, can learn to map agent to reward, etc.
Could use really small neural networks, I guess, or maybe just a linear matrix over [agents, actions] and then MCMC-sample from actions taken and so on?
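Something like this is the shape I have in mind for the linear-matrix version; every name, payoff number, and the learning rule are placeholders I'm making up here, not settled parts of the model:

```python
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, N_ACTIONS = 4, 3  # e.g. actions: 0=share, 1=ignore, 2=take (labels made up)

class TinyAgent:
    """Stateful learner with a linear table: estimated reward of directing
    each action at each other agent. The learning rule is a placeholder."""
    def __init__(self, idx):
        self.idx = idx
        self.pos = rng.uniform(0, 10, size=2)          # 2d location
        self.value = np.zeros((N_AGENTS, N_ACTIONS))    # [target, action] -> estimated reward

    def act(self, eps=0.1):
        # epsilon-greedy choice of (target, action)
        if rng.random() < eps:
            return rng.integers(N_AGENTS), rng.integers(N_ACTIONS)
        t, a = np.unravel_index(self.value.argmax(), self.value.shape)
        return int(t), int(a)

    def learn(self, target, action, reward, lr=0.2):
        self.value[target, action] += lr * (reward - self.value[target, action])

def falloff(a, b, scale=5.0):
    # action effects decay with distance between sender and target
    return np.exp(-np.linalg.norm(a.pos - b.pos) / scale)

agents = [TinyAgent(i) for i in range(N_AGENTS)]
PAYOFF = np.array([1.0, 0.0, -1.0])  # how much each action type helps its target

for step in range(1000):
    for ag in agents:
        target_idx, action = ag.act()
        target = agents[target_idx]
        strength = falloff(ag, target)
        # typed message: (sender, action, strength); the "world" just turns it into rewards.
        # placeholder reward structure: the sender gets half of what the target gets.
        ag.learn(target_idx, action, reward=strength * PAYOFF[action] * 0.5)
        target.learn(ag.idx, action, reward=strength * PAYOFF[action])
```

The NN / MCMC variants would swap out the value table and the learning rule; the message-passing and distance-falloff scaffolding stays the same.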
I'm confused about precisely how to implement deservingness... it seems like deservingness is something like a minimum control target for others' reward, and retribution is a penalty that supersedes it? Maybe?
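One possible way to cash that out, if I'm reading my own confusion right (the combination rule here is a guess, not a claim):

```python
def reward_target_for_other(deservingness: float, retribution: float = 0.0) -> float:
    """Hypothetical rule: the level I will act to keep this other agent's
    reward at or above. Deservingness sets the floor; an active retribution
    penalty supersedes it and drags the target negative."""
    if retribution > 0.0:
        return -retribution          # actively drive their reward down
    return deservingness             # otherwise hold it at least this high

def control_error(their_reward: float, deservingness: float, retribution: float = 0.0) -> float:
    """Shortfall I would act to correct: how far their reward sits below the
    target. Negative values mean no action needed, under this sketch."""
    return reward_target_for_other(deservingness, retribution) - their_reward
```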
If using neural networks, implementing the power thing on level 3 is a fairly easy prediction task; using Bayesian MCMC or whatever, it's much harder. Maybe that's an OK place to use NNs? Trying to use NNs in a model like this feels like a bad idea unless the NNs are extremely regularized... also, the inference needed for level 4 is hard without NNs.
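For concreteness, "extremely regularized tiny NN for the level-3 power prediction" could be as small as this; the data-generating process is synthetic and made up, and sklearn is just a stand-in for whatever learner actually gets used:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the level-3 task: predict an agent's unobserved
# "power" from a summary of its observed actions.
n_agents, n_features = 200, 6
actions_summary = rng.normal(size=(n_agents, n_features))
true_power = actions_summary @ rng.normal(size=n_features) + 0.1 * rng.normal(size=n_agents)

# Deliberately tiny network with heavy L2 regularization (alpha), per the
# "only use NNs if they're extremely regularized" worry above.
model = MLPRegressor(hidden_layer_sizes=(4,), alpha=10.0, max_iter=5000, random_state=0)
model.fit(actions_summary[:150], true_power[:150])
print("held-out R^2:", model.score(actions_summary[150:], true_power[150:]))
```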
I haven't had time to fully load this up into my working memory to think it through, check implications, etc., but for now I wanted to say I very much appreciate the spirit in which the post is presented. (Specifically: it attempts to present a concrete model specific enough to be falsifiable and predictive.)
Yes. Strong upvote. I'm very excited to see hypothesized models that purport to give rise to high-level phenomena, and models that are on their way to being executable are even better.
> We're making an executable model of part of the brain, so I'm going to write it as a series of changes I'm going to make.
> 1. To start our brain thingy off, add direct preferences:
Haven't finished reading and apologies if this is cleared up later, but I wasn't clear what you meant by "changes you're going to make" – is this relative to a blank rock of nothingness, or to a crude learning algorithm, or something else?
0. Start with a blank file
1. Add a preference function
2. Add time
3. Add the existence of another agent
4. Add the existence of networks of other agents (a rough sketch of these steps is below)
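Purely as a sketch of what those steps might turn into as code, with every class and field a placeholder I'm inventing on the spot:

```python
from dataclasses import dataclass, field

# Step 1: a preference function over world states (placeholder signature and content).
def preference(world_state: dict) -> float:
    return world_state.get("snuggles", 0.0)  # arbitrary stand-in preference

# Step 2: add time -- the world becomes a sequence of states.
@dataclass
class World:
    t: int = 0
    state: dict = field(default_factory=dict)
    def step(self, actions: dict) -> None:
        self.t += 1
        self.state.update(actions)

# Step 3: add another agent that also acts on the world.
@dataclass
class Agent:
    name: str
    def act(self, world: World) -> dict:
        return {}  # policy left empty at this stage

# Step 4: a network of other agents is just a collection of Agents whose
# actions all feed into the same World.
agents = [Agent("a"), Agent("b"), Agent("c")]
world = World()
for _ in range(3):
    for a in agents:
        world.step(a.act(world))

print("current preference value:", preference(world.state))
```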
Something I realized bothers me about this model: I basically didn't include TAP-style reasoning, aka classical conditioning; I started from operant conditioning.
Also, this explanation fails miserably at the "tell a story of how you got there in order to convey the subtleties" thing that e.g. Ben Hoffman was talking about recently.
Yeahhhhhh, missing TAP-type reasoning is a really critical failure here. I think a lot of important stuff happens around signaling whether you'll be an agent that is level-1 valuable to be around, and I've thought before about how keeping your hidden TAP depth short, in ways that are recognizable to others, makes you more comfortable to be around because you're more predictable. Or something.
This would have to take the form of something like: first make the agent as a slightly-stateful pattern-response bot, maybe with a global "emotion" state thing that sets which pattern-response networks to use. Then try to predict the world in parts, unsupervised. Then have preferences, which can be about other agents' inferred mental states. Then pull those preferences back through time, reinforcement learned. Then add the retribution and deservingness things on top. Power would be inferred from representations of other agents, something like trying to predict the other agents' unobserved attributes.
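As stub code, that build order might look something like this; every method body is an empty placeholder and the interfaces are made up:

```python
class ProtoAgent:
    """Stub skeleton following the build order above; each component is a
    placeholder for something that doesn't exist yet."""

    def __init__(self):
        self.emotion = "neutral"        # global state selecting which pattern-response nets run
        self.pattern_response = {}      # emotion -> {observed pattern: response}
        self.world_model = None         # unsupervised predictor of the world, in parts
        self.preferences = []           # may refer to other agents' *inferred* mental states
        self.value_estimates = {}       # preferences pulled back through time via RL
        self.deservingness = {}         # per-agent floor on how low I'll let their reward go
        self.retribution = {}           # per-agent penalty that supersedes the floor
        self.power_estimates = {}       # inferred, unobserved attributes of other agents

    def react(self, observation):
        # 1. pattern-response layer, gated by the global emotion state
        table = self.pattern_response.get(self.emotion, {})
        return table.get(observation)

    def update_world_model(self, observation):
        # 2. unsupervised prediction of the world in parts (stub)
        pass

    def infer_other_minds(self, observation):
        # 3. preferences can target other agents' inferred mental states (stub)
        pass

    def reinforce(self, reward):
        # 4. pull preferences back through time (RL credit assignment, stub)
        pass

    def update_social_terms(self, other, outcome):
        # 5. deservingness floors and retribution overrides (stub)
        pass

    def infer_power(self, other, observed_actions):
        # 6. power = predicted unobserved attributes of the other agent (stub)
        pass
```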
Also, this doesn't treat level 4 as some super-high-level thing; it's just a natural result of running the world prediction for a while.
The better version of this model probably takes the form of a list of the most important built-in input-action mappings.
Not sure if you were already thinking along these lines or not (and I'm not entirely sure it's how my brain works, much less normal brains), but since you were borrowing from economics, how are your preferences balanced internally? Looking at some constrained max reward? Decision-making à la marginal rewards? Something else?
So I'm very interested in anything you feel you can say about how this doesn't work to describe your brain.
With respect to economics: I'm thinking about this mostly in terms of partially-model-based reinforcement learning / build-a-brain, and economics arises when you have enough of those in the same environment. The thing you're asking about is more on the build-a-brain end and is pretty open for discussion; the brain probably doesn't actually have a single scalar reward, but rather a thing that can dispatch rewards with different masks, or something.
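To gesture at the "rewards with different masks" thing a bit more concretely (the channel labels and mask shapes here are invented, not a claim about actual brain reward channels):

```python
import numpy as np

# Sketch of "not one scalar reward, but a dispatcher that applies different
# masks": the same event produces a reward *vector*, and each learned
# subsystem only sees the components its mask lets through.
N_CHANNELS = 4  # e.g. food, warmth, social, pain -- labels made up

masks = {
    "habit_learner":  np.array([1, 1, 0, 1]),
    "social_learner": np.array([0, 0, 1, 1]),
}

def dispatch(reward_vector: np.ndarray) -> dict:
    # each subsystem receives only its masked slice of the reward
    return {name: float(m @ reward_vector) for name, m in masks.items()}

print(dispatch(np.array([0.5, 0.0, 1.0, -0.25])))
# -> {'habit_learner': 0.25, 'social_learner': 0.75}
```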
That is very difficult for me to articulate. If we take the standard econ consumer-equilibrium condition of equating marginal utility per dollar, and then toss out the price element since we're purely comparing the utility (ignoring the whole subjective-versus-other-forms issue here), we don't need to normalize on cost (I think).
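In symbols, I think I mean something like this (the standard consumer-equilibrium condition, then the price-stripped comparison):

```latex
% standard consumer equilibrium: equalize marginal utility per dollar
\frac{MU_x}{p_x} = \frac{MU_y}{p_y}

% dropping prices, the claim is that utilities compare directly at the margin:
\text{choose } x \text{ over } y \iff MU_x > MU_y
```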
That implies that the preferences for the completely different actions/choices I make are directly comparable. In other words, it's a choice between differences of degree, not differences in kind.
However, when I really find myself in a position where I have a hard choice to make, it's never a problem of some simple mental calculation like the above; it feels entirely different. The challenge I face is that I'm not making that type of comparison, but something more along the lines of choosing between two alternatives that lack a common basis for comparison.
I was thinking a bit about this in a different context a while back. If economic decision theory, at least from a consumer perspective, is all about indifference curves, is that really a decision theory or merely a rule-following approach? The real decision arises in the setting where you are in a position of indifference between multiple alternatives, but economics cannot say anything about that -- the answer there is flip a coin / random selection, but is that really a rational thought process for choice?
But, as I said, I'm not entirely sure I think like other people.
> From the inside, this is an experience that in-the-moment is enjoyable/satisfying/juicy/fun/rewarding/attractive to you/thrilling/etc etc.
People's preferences change in different contexts, since they are implicitly always trying to comply with what they think is permissible/safe before trying to get what they want, up to some level of stake outweighing this, along many different axes of things one can have a stake in.
To see people's intrinsic preferences, we have to consider that people often aren't getting what they want and are tricked into wanting things that are suboptimal with respect to some of their long-suppressed wants, because of social itself.
This has to be really rigorous, because it's competing against anti-inductive memes.
This is really important to model: if we know anything about people's terminal preferences modulo social, then we know we are confused about social any time we can't explain why they aren't pursuing opportunities they should know about, or any time they are internally conflicted even though they know all the consequences of their actions relative to their real, ideal-to-them terminal preferences.
> Social sort of exists here, but only in the form that if an agent can give something you want, such as snuggles, then you want that interaction.
Is it social if a human wants another human to be smiling because perception of smiles is good?
> Is it social if a human wants another human to be smiling because perception of smiles is good?
I wouldn't say so, no.
Good point about lots of level 1 things being distorted or obscured by level 3. I think the model needs to be restructured to not have a privileged intrinsicness to level 1, but rather to initialize moment-to-moment preferences with one thing, then update that based on pressures from the other things.
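A rough sketch of that restructuring, with the additive combination rule and all the names being assumptions of mine rather than anything settled:

```python
# No level gets privileged "intrinsicness": a moment-to-moment preference
# starts from some initializer and is then nudged by pressure terms from
# the other levels.
def moment_to_moment_preference(init_prefs: dict, pressures: dict, weights: dict) -> dict:
    prefs = dict(init_prefs)
    for level, pressure in pressures.items():     # e.g. "level2", "level3", "level4"
        w = weights.get(level, 0.0)
        for option, push in pressure.items():
            prefs[option] = prefs.get(option, 0.0) + w * push
    return prefs

# e.g. start from a level-1-ish initializer, then let level-3 pressure reshape it
print(moment_to_moment_preference(
    init_prefs={"snuggles": 1.0, "status_move": 0.1},
    pressures={"level3": {"status_move": 0.8, "snuggles": -0.2}},
    weights={"level3": 0.5},
))
# -> {'snuggles': 0.9, 'status_move': 0.5}
```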
EDIT, 2022:
This post is still a reasonable starting point, but I need to post a revised version that emphasizes preventing dominance outside of play. All forms of dominance must be prevented if society is to heal from our errors. These days I speak of human communication primarily in terms of the network messaging protocols that amount to establishing the state of being in communication, modulated by permissions defined by willingness to demonstrate friendliness via action. I originally wrote this post to counter "status is all you need" type thinking, and in retrospect, I don't think I went anywhere near far enough in eliminating hierarchy and status from my thinking.
With that reasoning-error warning in mind, the original post continues:
Preface
(I can't be bothered to write a real Serious Post, so I'm just going to write this like a tumblr post. Y'all are tryhards with writing and it's boooooring, and also I have a lot of tangentially related stuff to say. Pls critique based on content. If something is unclear, quote it and ask for clarification.)
Alright, so: this is intended to be an explicit description that, hopefully, could be turned into an actual program that would generate the same low-level behavior as the way social stuff arises from brains. Any divergence is a mistake and should be called out and corrected. It is not intended to be a fake framework: it's either actually a description of the parts of the causal graph that are above a threshold level of impact, or it's wrong. It's hopefully also a good framework. I'm pretty sure it's wrong in important ways; I'd like to hear what people suggest to improve it.
Recommended knowledge: vague understanding of what's known about how the cortex sheet implements fast inference/how "system 1" works, how human reward works, etc, and/or how ANNs work, how reinforcement learning works, etc.
The hope is that the computational model would generate the social stuff we actually see as high-probability special cases. In semi-technical terms you can ignore if you want: I'm hopeful it's a good causal/generative model, aka that it allows compressing common social patterns with at least somewhat accurate causal graphs.
The thing
We're making an executable model of part of the brain, so I'm going to write it as a series of changes I'm going to make. (I'm uncomfortable with the structured-ness of this; if anyone has ideas for how to generalize it, that would be helpful.)
Things that seem like they're missing to me
Misc interesting consequences
Examples of things to analyze would be welcome, to exercise the model, whether the examples fit in it or not. I'll share some more at some point; I have a bunch of notes.