All of Mike S (StuffnStuff)'s Comments + Replies

A lot of it comes down to timescales and sequences of events, long term vs short term.

"I will incur a little suffering today, but my well-being will be much better tomorrow".

"We need to get rid of the <undesired national or social group> at cost of suffering, but our nation will have bright future as a result".

People can be tricked into doing unspeakable evil to themselves and others if they hold incorrect predictions of future well-being.

I see the biggest problem not on the technical side of things, but on the social side. The existing power balance within the population, and the fact that it discourages cooperation, is in my opinion a much bigger obstacle to alignment. Heck, it prevents alignment between human groups, let alone between humans and a future AGI. I don't see how the increased intelligence of a small, select group of humans can solve this problem. Well, maybe I am just not smart enough.

I propose a goal of perpetuating interesting information, rather than goals of maximizing "fun" or "complexity". In my opinion, such a goal solves both problems: the complex but bleak and desolate future, and the fun-maximizing drug-haze or Matrix future. Of course, a rigorous technical definition of "interesting" must be developed. At the least, "interesting" assumes there is an appreciating agent and continuous development.

the gears to ascension
I don't see yet how that prompt change makes the logical reasoning needed to identify the math any easier. Can you elaborate significantly?

I think we should start by asking what is meant by "flourishing civilizations". In the AI's view, a "flourishing civilization" may not necessarily mean "human civilization".

I generally agree with Stephen Fowler, specifically that "there is no evidence that alignment is a solvable problem."

But even if a solution can be found that provably works for AGI up to level N, what about level N+1? Sustainable alignment is just not possible. Our only hope is that there may be some limit on N, for example if N=10 requires more resources than the Universe can provide. But it is likely that our ability to prove alignment will stop well before any significant limit.

Nathan Helm-Burger
I agree: I don't think we're going to get any proofs of alignment, or guarantees that an N-level alignment system will work for level N+1. I think we also have reason to believe that N extends pretty far, further than we can hope to align given how little time there is to research. Thus, I believe our hope lies in using our aligned model to prevent anyone from building an N+1 model (including the aligned N-level model itself). If our model is both aligned and powerful, this should be possible.