I have some fun semi-gears-level models of what's probably going on, based on some of the Leverage psychology research.[1] If correct, wow, the next bit of this ride is going to have some wild turns.
Read sections 7/8/9 especially. Leverage had bad effects on some people (and good or mixed on others), but this was strongly downstream of them doing a large-scale competent effort to understand minds which had fruits. The things they're pointing to work via text channels too, only somewhat attenuated, because minds decompress each other's states.
Yeah, I think this basically goes through. Though, even if we did have the ability to make rule-following AI, that doesn't mean we're now safe to go ahead. There are several other hurdles, like finding rules which make things good when superintelligent optimization is applied, and getting good enough goal-content integrity to not miss a step of self-modification, plus the various human-shaped challenges.
That seems false to me: conflicts between humans share a similar structure across different environments, and generalization is to be expected, so evidence in a mild domain is at least indicative of extreme domains. Also, as it happens, I have in fact had extensive interactions with one of the listed subgroups, and they do respond in ways that the reasonable generalizations would expect.
Reading your posts I form a story that you have a strong need to fight back against careless epistemics, which looks from my vantage point like it comes with some rigidity, unwillingness to incorporate forms of evidence that are not extremely well-founded into your models, and maybe a tinge of something that my system reads as contained hostility and superiority. It's not that extreme here, and my priors might well be coloured by watching clashes between you and other site regulars, but engaging with it brings up some discomfort and a sense that I might end up using bandwidth unproductively.
I don't super have an ask here, but I do think there's something here which might be useful to some of your future engagements. I imagine it's not super fun getting into lots of fights, and I think you can get the good you're seeking of challenging sloppy reasoning without the downsides with a few adjustments.
My strongest sources of evidence are first-hand: I've directly seen it dissolve conflicts many times, and heard similar reports from people whose judgement I trust. I've not gone looking for more formal assessments, similarly to how when I want to try a self-help technique I try it, sometimes on a recommendation, and see if it's a good fit rather than relying on studies.
If you'd like good evidence, I suggest trying the same? It's not super complex to learn and test.
This feels very flavoured like A Principled Cartoon Guide to NVC's view on avoiding communication failures!
Oh no, almost all possible 500-line Python programs are also bad.
There was I looking under Account Settings -> Site Customizations like a fool
Yeah! This makes me want LW darkmode.
Because in brain-like AGI, the reward function is written in Python (or whatever), not in natural language.
Yup. I'd bet some people will reply with something like "why not define the reward function in natural language, like constitutional AI". I think this fails because strong optimization finds the most convenient (for it, not us) settings of the free parameters left by fuzzy statistical things like words. And if you give it a chance to feed back into the definitions via training data, online learning, etc., it gets totally wrecked by semantic drift.
[set 200 years after a positive singularity at a Storyteller's convention]
If We Win Then...
My friends, my friends, good news I say
The anniversary’s today
A challenge faced, a future won
When almost came our world undone
We thought for years, with hopeful hearts
Past every one of the false starts
We found a way to make aligned
With us, the of wondrous mind
They say at first our child-god
It anew
To build itself both vast and
For so much work there was to do
Once it had learned enough to act
With the desired care and tact
It sent a call to all the people
On this fair Earth, both poor and regal
To let them know that it was here
And nevermore need they to fear
Not every wish was it to grant
For higher values might supplant
But it would help in many ways:
Technologies it built and raised
The smallest bots it could design
Made more and more in ways benign
And as they multiplied untold
It planned ahead, a move so bold
One planet and 6 hours of sun
Eternity it was to run
Countless probes to void disperse
Seed far reaches of universe
With thriving life, and beauty's play
Through endless night to endless day
Now back on Earth the plan continues
Of course, we shared with it our values
So it could learn from
What to create, what we want done
We chose, at first, to end the worst
Diseases, War, Starvation, Thirst
And climate change and fusion bomb
And once these things it did transform
We thought upon what we hold dear
And settled our most ancient fear
No more would any lives be stolen
Nor minds themselves forever broken
Now back to those far speeding probes
What should we make be their payloads?
Well, we are still considering
What to send them; that is our thing.
The sacred task of many aeons
What kinds of joy will fill the heavens?
And now we are at story's end
So come, be us, and let's ascend