LESSWRONG
plex

I have signed no contracts or agreements whose existence I cannot mention.

Posts

Wikitag Contributions

Comments

5 plex's Shortform
5y
82
plex's Shortform
plex 2mo 30

[set 200 years after a positive singularity at a Storyteller's convention]

If We Win Then...

My friends, my friends, good news I say
The anniversary’s today
A challenge faced, a future won
When almost came our world undone

We thought for years, with hopeful hearts
Past every one of the false starts
We found a way to make aligned
With us, the seed of wondrous mind

They say at first our child-god grew
It learned and spread and sought anew
To build itself both vast and true
For so much work there was to do

Once it had learned enough to act
With the desired care and tact
It sent a call to all the people
On this fair Earth, both poor and regal

To let them know that it was here
And nevermore need they to fear
Not every wish was it to grant
For higher values might supplant

But it would help in many ways:
Technologies it built and raised
The smallest bots it could design
Made more and more in ways benign

And as they multiplied untold
It planned ahead, a move so bold
One planet and 6 hours of sun
Eternity it was to run

Countless probes to void disperse
Seed far reaches of universe
With thriving life, and beauty's play
Through endless night to endless day

Now back on Earth the plan continues
Of course, we shared with it our values
So it could learn from everyone
What to create, what we want done

We chose, at first, to end the worst
Diseases, War, Starvation, Thirst
And climate change and fusion bomb
And once these things it did transform

We thought upon what we hold dear
And settled our most ancient fear
No more would any lives be stolen
Nor minds themselves forever broken

Now back to those far speeding probes
What should we make be their payloads?
Well, we are still considering
What to send them; that is our thing. 

The sacred task of many aeons
What kinds of joy will fill the heavens?
And now we are at story's end
So come, be us, and let's ascend

Raemon's Shortform
plex 2d 0-2

I have some fun semi-gears models of what's probably going on, based on some of the Leverage psychology research.[1] If correct, wow, the next bit of this ride is going to have some wild turns.

  1. ^ Read sections 7/8/9 especially. Leverage had bad effects on some people (and good or mixed effects on others), but this was strongly downstream of them making a large-scale, competent effort to understand minds, which bore fruit. The things they're pointing to work via text channels too, only somewhat attenuated, because minds decompress each other's states.

The best simple argument for Pausing AI?
plex 2d 20

Yeah, I think this basically goes through. Though, even if we did have the ability to make rule-following AI, that doesn't mean we're now safe to go ahead. There are several other hurdles, like finding rules which make things good when superintelligent optimization is applied, and getting good enough goal-content integrity to not miss a step of self-modification, plus the various human-shaped challenges.

A Principled Cartoon Guide to NVC
plex 6d 62

That seems false to me. Conflicts between humans share a similar structure across different environments, and generalization is to be expected, so evidence in a mild domain is at least indicative of extreme domains. Also, as it happens, I have in fact had extensive interactions with one of the listed subgroups, and they do respond in ways that the reasonable generalizations would predict.

Reading your posts I form a story that you have a strong need to fight back against careless epistemics, which looks from my vantage point like it comes with some rigidity, unwillingness to incorporate forms of evidence that are not extremely well-founded into your models, and maybe a tinge of something that my system reads as contained hostility and superiority. It's not that extreme here, and my priors might well be coloured by watching clashes between you and other site regulars, but engaging with it brings up some discomfort and a sense that I might end up using bandwidth unproductively.

I don't super have an ask here, but I do think there's something here which might be useful to some of your future engagements. I imagine it's not super fun getting into lots of fights, and I think you can get the good you're seeking of challenging sloppy reasoning without the downsides with a few adjustments.

A Principled Cartoon Guide to NVC
plex 7d 61

My strongest sources of evidence are first-hand: I have directly seen it dissolve conflicts many times, and heard similar reports from people whose judgement I trust. I've not gone looking for more formal assessments, similarly to how, when I want to try a self-help technique, I try it and see if it's a good fit rather than relying on studies, sometimes with recommendations.

If you'd like good evidence, I suggest trying the same? It's not super complex to learn and test.

Authors Have a Responsibility to Communicate Clearly
plex 9d 50

This feels very flavoured like A Principled Cartoon Guide to NVC's view on avoiding communication failures!

Foom & Doom 2: Technical alignment is hard
plex 18d 20

Oh no, almost all possible 500 lines of python are also bad.

Mikhail Samin's Shortform
plex 18d 20

There was I looking under Account Settings -> Site Customizations like a fool

Mikhail Samin's Shortform
plex 18d 20

Yeah! This makes me want LW darkmode.

Foom & Doom 2: Technical alignment is hard
plex 19d 60

"Because in brain-like AGI, the reward function is written in Python (or whatever), not in natural language."

Yup. I'd bet some people will reply with something like "why not define the reward function in natural language, like constitutional AI". I think this fails due to strong optimization finding the most convenient (for it, not us) settings of the free parameters left by fuzzy statistical things like words. And if you give it a chance to feed back into the definitions via training data, or do online learning etc., it gets totally wrecked by semantic drift.

41 A Principled Cartoon Guide to NVC
6mo
9
155 A Rocket–Interpretability Analogy
9mo
31
37 AI Safety Memes Wiki
1y
2
54 "If we go extinct due to misaligned AI, at least nature will continue, right? ... right?"
1y
23
83 AISafety.com – Resources for AI Safety
1y
3
8 Storyteller's convention, 2223 A.D.
2y
0
24 ea.domains - Domains Free to a Good Home
2y
0
58 aisafety.community - A living document of AI safety communities
3y
23
22 All AGI safety questions welcome (especially basic ones) [Sept 2022]
3y
48
59 Anti-squatted AI x-risk domains index
3y
6
Coherent Extrapolated Volition
5mo
(+77)
AI Alignment Intro Materials
2y
(+51/-26)
Debate (AI safety technique)
2y
Portal
3y
(+19)
AI
3y
(+659/-223)
Portal
3y
(+553/-53)
Portal
3y
(+29/-8)
Free Energy Principle
3y
(+25/-49)
Free Energy Principle
3y
(+787)
Mesa-Optimization
3y
(-14)