If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the new Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
I notice that I am confused by not seeing discourse about using AI alignment solutions for human alignment. It seems like the world as we know it is badly threatened by humans behaving in ways I'd describe as poorly aligned, for an understanding of "alignment" formed mostly from context in AI discussions in this community.
I get that AI is different from people -- we assume it's much "smarter", for one thing. Yet every "AI" we've built so far has amplified traits of humanity that we consider flaws, as well as those we consider virtues. Do we expect that this would magically stop being the case if it passed a certain threshold?
And doesn't alignment, in the most general terms, get harder when it's applied to "smarter" entities? If that's the case, then it seems like the "less smart" entities of human leaders would be a perfect place to test strategies we think will generalize to "smarter" entities. Conversely, if we can't apply alignment findings to humans because alignment gets "easier" / more tractable when applied to "smarter" entities, doesn't that suggest a degenerate case of minimum alignment difficulty for a maximally "smart" AI?
Ah, what? (I'm reacting to the "every" qualifier here.)
I'd say it comes down to founder effects.
I wouldn't necessarily call it 'using AI alignment solutions for human alignment' though.
Perhaps a better starting point would be: how do we discern alignment? Are there predictable betrayals, and can that situation be improved?