I suggest taking a look at the Complexity of Value page on the LW wiki, not because "complexity of value" as defined there is exactly what I think you're missing (it isn't) but because several of the links there will take you to relevant stuff in (as you put it) the canons. The "Fake Utility Functions" post mentioned there and its predecessors are worth a read, for instance. Also "Value is Fragile" and "The Hidden Complexity of Wishes".
(All this talk of "canons"...
This topic comes up every once in a while. In fact, one of the more recent threads was started by me, though it may not be obvious to you at first how that thread is related to this topic.
I think it's actually fun to talk about the structure of an "ultra-stable metric" or even an algorithm by which some kind of "living metric" may be established and then evolved/curated as the state of scientific knowledge evolves.
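Just to make "living metric" a bit more concrete for myself, here is a minimal sketch of the kind of thing I have in mind, in Python. Everything in it (the criteria, the starting weights, the smoothing rule) is a placeholder of my own invention, not a worked-out proposal:

```python
from dataclasses import dataclass, field


@dataclass
class LivingMetric:
    """A valuation metric whose criterion weights get revised as evidence accumulates."""
    weights: dict[str, float] = field(default_factory=dict)
    learning_rate: float = 0.1  # how quickly new evidence shifts the weights

    def score(self, option: dict[str, float]) -> float:
        """Weighted sum of an option's measured attributes."""
        return sum(self.weights.get(k, 0.0) * v for k, v in option.items())

    def revise(self, criterion: str, observed_importance: float) -> None:
        """Nudge a criterion's weight toward its newly observed importance (exponential smoothing)."""
        old = self.weights.get(criterion, 0.0)
        self.weights[criterion] = (1 - self.learning_rate) * old + self.learning_rate * observed_importance


# Hypothetical usage: start with rough weights, then revise as the state of knowledge evolves.
metric = LivingMetric(weights={"safety": 0.5, "cost": 0.3, "yield": 0.2})
metric.revise("safety", 0.8)  # new evidence suggests safety matters more than assumed
print(metric.score({"safety": 0.9, "cost": 0.4, "yield": 0.6}))
```

The interesting (and hard) part is of course the curation process that decides when and how `revise` gets called, which this sketch deliberately leaves out.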
For a shared and stable value metric to function as a solution to the AI alignment problem, it would also need to be:
To illustrate the last requirement, let me give an example. Suppose a new AI is given the task of dividing some fund between the four existing prototypes of nuclear fusion plants. It will need to calculate the value of each prototype and of their very different supply chains. But it also needs to calculate the value of th...
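A toy sketch of that example, just to pin it down (the prototype names, the scores, and the proportional-allocation rule are all hypothetical; the point is only that a single metric has to price four very different things, supply chains included):

```python
# Hypothetical fund-division example: one metric, four very different prototypes.
FUND = 1_000_000  # hypothetical total fund to divide

# Hypothetical per-prototype valuations, each already folding in its supply chain.
prototype_values = {
    "tokamak": 42.0,
    "stellarator": 35.0,
    "laser_inertial": 18.0,
    "field_reversed": 25.0,
}

total = sum(prototype_values.values())
allocation = {name: FUND * value / total for name, value in prototype_values.items()}

for name, amount in allocation.items():
    print(f"{name}: {amount:,.0f}")
```

The arithmetic is trivial; everything difficult is hidden inside how those value numbers were produced in the first place.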
I think what you're missing is that metrics are difficult. I've written about that point in a number of contexts: www.ribbonfarm.com/2016/06/09/goodharts-law-and-why-measurement-is-hard/
There are also more specific metric/goal problems with AI. Eliezer wrote about this: https://intelligence.org/2016/12/28/ai-alignment-why-its-hard-and-where-to-start/ and Dario Amodei has been working on it as well: https://openai.com/blog/faulty-reward-functions/ and there is a lot more in this vein!
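For what it's worth, here is a toy Python sketch of the Goodhart-style failure those links discuss: optimize hard on a measurable proxy and it comes apart from what you actually care about. The functions are invented for illustration, not taken from any of the linked posts.

```python
import random

def true_value(x: float) -> float:
    # What we actually care about (not visible to the optimizer): best at x = 1.
    return -(x - 1.0) ** 2

def proxy_metric(x: float) -> float:
    # An imperfect, measurable stand-in: correlates with true_value for small x,
    # but rewards pushing x ever higher.
    return x

# Optimize the proxy over random candidates and see what happens to true value.
candidates = [random.uniform(-3, 3) for _ in range(10_000)]
best_by_proxy = max(candidates, key=proxy_metric)

print("chosen by proxy:", round(best_by_proxy, 2),
      "-> true value:", round(true_value(best_by_proxy), 2))
print("actually best: 1.0 -> true value: 0.0")
```

The proxy-optimal choice lands near x = 3, which is close to the worst option by the true measure, even though the two metrics agree reasonably well before any optimization pressure is applied.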
I am not entirely sure how this site works (although I skimmed the "tutorials"), and I am notorious for not understanding systems as easily and quickly as the general public might. At the same time, I suspect a place like this is for me, both for what I can offer and for what I can receive (i.e. I intend to (fully) traverse the various canons).
I also value compression and time in this sense, and so I think I can propose a subject that might serve as an "ideal introduction" (I have a precise meaning for this phrase that I won't introduce atm).
I've read a lot of posts/blogs/papers whose arguments are founded on certain difficulties, where the observation and admission of this difficulty leads the author and the reader (and perhaps the originator of the problem/solution outlines) to defer to some form of (relative to what will follow) long-winded solution.
I would like to suggest, as a blanket observation and proposal, that most of the difficult problems described, especially on a site like this, are easily solvable with the introduction of an objective and ultra-stable metric for valuation.
I think at first this will seem like an empty proposal. I also think some will see it as devilry (which I doubt anyone here thinks exists). And I think I will be accused of many of the fallacies and pitfalls that have already been warned about in the canons.
That latter point suggests I might learn well and fast from this post, as interested and helpful people can point me to specific articles, and I WILL read them with sincere intent to understand them (so far they are very well written, in the sense that I feel I understand them because they are simple enough), and I will ask questions.
But I also think it will ultimately be shown that my proposal and my understanding of it don't really fall into any of these traps, and that as I learn the canonical arguments I will be able to show how my proposal properly addresses them.