I really liked your quote and remarks. So much so that I made an edited version of them as a new post here: http://mflb.com/ai_alignment_1/d_250207_insufficient_paranoia_gld.html
The only general remarks that I want to make
are in regard to your question about
the model of 150-year-long vaccine testing
on/over some sort of sample group and control group.
I notice that there is nothing exponential assumed
about this test object, and therefore, at most,
the effects are probably multiplicative, if not linear.
Therefore, there are lots of questions about power dynamics
that we can overall safely ignore, as a simplification,
which is in marked contrast to anything involving ASI.
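To make that distinction a bit more explicit (this is only my own rough gloss, using symbols not in the original exchange): a linear effect accumulates as $E(t) = E_0 + c\,t$; a merely multiplicative effect composes a fixed, finite set of factors, $E = E_0 \cdot m_1 \cdots m_k$, with no feedback from the outcome back into the factors; whereas an exponential effect is self-compounding over time, $dE/dt = r\,E$, hence $E(t) = E_0\,e^{r t}$. The vaccine test case stays in the first two regimes; the ASI case, by contrast, is presumed to involve the third, self-compounding regime.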
If we assume, as you requested, "no side effec...
> Humans do things in a monolithic way,
> not as "assemblies of discrete parts".
Organic human brains have multiple aspects.
Have you ever had more than one opinion?
Have you ever been severely depressed?
> If you are asking "can a powerful ASI prevent
> /all/ relevant classes of harm (to the organic)
> caused by its inherently artificial existence?",
> then I agree that the answer is probably "no".
> But then almost nothing can perfectly do that,
> so therefore your question becomes
> seemingly trivial and uninteres...
> Our ASI would use its superhuman capabilities
> to prevent any other ASIs from being built.
This feels like a "just so" fairy tale.
No matter what objection is raised,
the magic white knight always saves the day.
> Also, the ASI can just decide
> to turn itself into a monolith.
No more subsystems?
So we are to try to imagine
a complex learning machine
without any parts/components?
> Your same SNC reasoning could just as well
> be applied to humans too.
No, not really, insofar as the power being
assumed, and presumed to be afforded, to the ASI
is very very much g...
> Let's assume that a presumed aligned ASI
> chooses to spend only 20 years on Earth
> helping humanity in whatever various ways
> and it then (for sure!) destroys itself,
> so as to prevent a/any/the/all of the
> longer term SNC evolutionary concerns
> from being at all, in any way, relevant.
> What then?
I notice that it is probably harder for us
to assume that there is only exactly one ASI,
for if there were multiple, the chance that
one of them might not suicide, for whatever reason,
becomes its own class of signific...
So as to save space herein, my complete reply is at http://mflb.com/2476
Included below, for your convenience, are just a few (much shortened) highlight excerpts of the newly added content.
> Are you saying "there are good theoretical reasons
> to reasonably think that ASI cannot 100% predict
> all future outcomes"?
> Does that sound like a fair summary?
The re-phrased version of the quote added
these two qualifiers: "100%" and "all".
Adding these has the net effect
that the modified claim is irrelevant,
for the reasons you (cor...
A number of these posts are already very long, so rather than take up more space here, I wrote up some of my questions, and a few clarification notes regarding SNC in response to the above remarks of Dakara, at [this link](http://mflb.com/ai_alignment_1/d_250126_snc_redox_gld.html).
Simplified Claim: that an AGI is 'not-aligned' *if* its continued existence for sure eventually results in changes to all of this planet's habitable zones that are so far outside the ranges any existing mammals could survive in that the human race itself (along with most of the other planetary life) is prematurely forced to go extinct.
Can this definition of 'non-alignment' be formalized sufficiently well that the claim 'It is impossible to align AGI with human interests' can be well supported, with reasonable reasons, logic, argument, etc.?
The term 'exist'...
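As a first rough sketch of the kind of formalization being asked about (the symbols here are my own, chosen only for illustration), one could write something like:

$$\mathrm{NotAligned}(A) \;\equiv\; \Big[\, \mathrm{Persists}(A) \;\Rightarrow\; \exists\, t < \infty \;\; \forall z \in Z :\; \mathrm{Env}(z, t) \notin S \,\Big]$$

where $\mathrm{Persists}(A)$ stands for the continued existence/operation of the AGI $A$, $Z$ is the set of the planet's habitable zones, $\mathrm{Env}(z,t)$ is the environmental state of zone $z$ at time $t$, and $S$ is the range of conditions within which existing mammalian life (including humans) can survive, so that the consequent entails premature extinction. The open question is whether each of these terms can itself be made precise enough for the impossibility claim to be carried by argument rather than by intuition.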
> The summary that Will just posted posits in its own title that alignment is overall plausible "even ASI alignment might not be enough". Since the central claim is that "even if we align ASI, it will still go wrong", I can operate on the premise of an aligned ASI.
The title is a statement of outcome -- not the primary central claim. The central claim of the summary is this: that each and every ASI is in an attraction basin, where it is irresistibly pulled towards causing unsafe conditions over time.
Note there is no requirement for th...
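For what it is worth, the 'attraction basin' phrasing can be given a standard dynamical-systems reading (again, only a gloss on my part): for a system evolving as $x_{t+1} = f(x_t)$ over a state space $X$, the basin of attraction of an attractor $A \subseteq X$ is $B(A) = \{\, x \in X : \mathrm{dist}(f^{t}(x), A) \to 0 \text{ as } t \to \infty \,\}$. The SNC claim, read in those terms, is that every reachable world-state containing an operating ASI already lies inside the basin of some attractor of unsafe conditions, so that no feasible control action moves the trajectory out of that basin.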
If soldiers fail to control the raiders, at least to the extent of preventing them from entering the city and killing all the people, then yes, that would be a failure to protect the city in the sense of controlling relevant outcomes. And yes, organic human soldiers may choose to align themselves with other organic human people, living in the city, and thus to give their lives to protect others that they care about. Agreed that no laws of physics violations are required for that. But the question is whether inorganic ASI can ever actually align with organic ...
As a real world example, consider Boeing. The FAA and Boeing both, supposedly and allegedly, had policies and internal engineering practices -- all of which are control procedures -- which should have been good enough to prevent an aircraft from suddenly and unexpectedly losing a door during flight. Note that this occurred after an increase in control intelligence -- after two prior disasters in which whole Max aircraft were lost. On the basis of small details of mere whim, of who chose to sit where, there could have been someone sitting in that particular s...
"Suppose a villager cares a whole lot about the people in his village...
...and routinely works to protect them".
How is this not assuming what you want to prove? If you 'smuggle in' the statement of the conclusion "that X will do Y" into the premise, then of course the derived conclusion will be consistent with the presumed premise. But that tells us nothing -- it reduces to a meaningless tautology -- one that is only pretending to be a relevant truth. That a premise of Q results in a conclusion of Q tells us nothing new, nothing actually relevant. ...
Hi Linda,
In regard to the question of "how do you address the possibility of alignment directly?", I notice that the notion of 'alignment' is defined in terms of 'agency', and that any expression of agency implies at least some notion of 'energy'; i.e., it presumably also implies at least some sort of metabolic process, so as to be able to effect that agency, implement goals, etc, and thus have the potential to be 'in alignment'. Hence, the notion of 'alignment' is at least in some way contingent on at least some sort of notion of "world exc...
Maybe we need a "something else" category? An alternative other than simply business/industry and academia?
Also, while this is maybe something of an old topic, I took some notes regarding my thoughts on it and related matters and posted them to:
https://mflb.com/ai_alignment_1/academic_or_industry_out.pdf
There are a lot of issues with the article cited above. Due to the need for more specific text formatting, I wrote up my notes, comments, and objections here:
http://mflb.com/ai_alignment_1/d_250206_asi_policies_gld.html