The reason I blog is to have the discipline of formulating the problem for others, and to get some immediate feedback. I would recommend it for anything that doesn't need to be kept secret, as simply writing down the problem for others helps to clarify it in your own mind.
You're talking about an ontological crisis, though Arbital has a slightly different term. Naturally, people have started to work on the problem, and MIRI believes it can be solved.
It also seems like the exact same issue arises with satisficing, and you've hidden this fact by talking about "some abstract notion of good and bad" without explaining how the AGI will relate this notion to the world (or distribution) that it thinks it exists in.
MIRI believes it can be solved.
If it can't be solved, how will MIRI know?
It also seems like the exact same issue arises with satisficing, and you've hidden this fact by talking about "some abstract notion of good and bad" without explaining how the AGI will relate this notion to the world (or distribution) that it thinks it exists in.
Satisficers can use various credit-assignment strategies without always falling into wireheading, because they are not trying to maximise credit. I'm interested in more flexible schemes that can paradigm-shift, but the basic idea from reinforcement learning seems sound.
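To make the contrast concrete, here is a minimal sketch of what I have in mind, under my own assumptions rather than any existing proposal: a maximiser always chases the highest estimated credit, so tampering with the credit signal always looks attractive, while a satisficer only needs to clear a threshold. The names and the threshold value (estimate_value, THRESHOLD, and so on) are illustrative, not part of any real system.

```python
import random

# Illustrative sketch only: contrasting a credit-maximising choice rule with a
# satisficing one over the same learned value estimates. All names and the
# threshold are assumptions made up for this example.

THRESHOLD = 0.8  # "good enough" level of estimated credit


def maximising_choice(actions, estimate_value):
    """Always take the action with the highest estimated credit.

    Any action that inflates the estimate itself (wireheading) dominates,
    because more estimated credit is always preferred.
    """
    return max(actions, key=estimate_value)


def satisficing_choice(actions, estimate_value):
    """Take any action whose estimated credit clears the threshold.

    Once the threshold is met there is no pressure to push the estimate
    higher, so tampering with the credit signal buys nothing extra.
    """
    acceptable = [a for a in actions if estimate_value(a) >= THRESHOLD]
    if acceptable:
        return random.choice(acceptable)
    # Fall back to the best available option when nothing is good enough.
    return maximising_choice(actions, estimate_value)
```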
If it can't be solved, how will MIRI know?
For one, they wouldn't find a single example of a solution. They wouldn't see any fscking human beings maintaining any goal not defined in terms of their own perceptions - e.g., making others happy, having a historical artifact, or visiting a place where some event actually happened - despite changing their understanding of our world's fundamental reality.
If I try to interpret the rest of your response charitably, it looks like you're saying the AGI can have goals wholly defined in terms of perception, because it can avoid wireheading via satisficing. That seems incompatible with what you said before, which again invoked "some abstract notion of good and bad" rather than sensory data. So I have to wonder if you understand anything I'm saying, or if you're conflating ontological crises with some less important "paradigm shift" - something, at least, that you have made no case for caring about.
For one, they wouldn't find a single example of a solution. They wouldn't see any fscking human beings maintaining any goal not defined in terms of their own perceptions - e.g., making others happy, having a historical artifact, or visiting a place where some event actually happened - despite changing their understanding of our world's fundamental reality.
Fscking humans aren't examples of maximisers whose coherent ontologies change in a way that guarantees their goals will still be followed. They're examples of systems in which multiple different languages for describing the world exist simultaneously. Those languages sometimes come into conflict, and people sometimes go insane. People are only roughly maximiser-ish in certain domains, and that maximisation is not constant over their lifetimes.
If I try to interpret the rest of your response charitably, it looks like you're saying the AGI can have goals wholly defined in terms of perception, because it can avoid wireheading via satisficing. That seems incompatible with what you said before, which again invoked "some abstract notion of good and bad" rather than sensory data
You can hard-code some sensory data to stand for an abstract notion of good or bad, if you know you have a helpful human around to supply that sensory data and to keep its meaning stable.
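As a toy illustration of what I mean by hard-coding, assuming nothing more than a single reserved feedback channel (the channel name and the encoding are made up for the example, not drawn from any actual design):

```python
# Toy illustration only: a reserved sensory channel whose value is hard-coded
# to mean "good" or "bad". The encoding is an assumption for this example; the
# mapping only keeps its meaning while a cooperative human supplies the signal.

def interpret_feedback(feedback_channel: float) -> str:
    """Map a human-supplied feedback reading onto an abstract label."""
    if feedback_channel > 0:
        return "good"
    if feedback_channel < 0:
        return "bad"
    return "neutral"


# Example: a helpful human presses +1 after a good outcome.
print(interpret_feedback(1.0))  # -> "good"
```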
ontological crises with some less important "paradigm shift"
Paradigm shifts are ontological crises limited to the language used to describe a particular domain. You can read about them on Wikipedia if you want and make up your own mind about whether they are important.
If you are right about the first points, it is basically impossible to have a general intelligence with an absolute but limited goal like "maximizing the number of paperclips." I agree that this is impossible in practice (even if it might be possible in principle, in some sense of "in principle"). I've argued that a few times here with somewhat similar reasoning.
So what do we do about it? With our understanding of intelligence as nebulous as it currently is, I think it makes sense to have one group of people working on "what if maximisers?" and another group working on "what if satisficers?". We currently lack the latter.
This means I've got to care about a whole bunch of other problems that the AI singleton people don't have to worry about.
I realize I should unpack all of this, but blogging is not the answer. To know what sort of satisficing system will work in the real world, with real people, we need to experiment (and maybe paradigm-shift ourselves a few times). Only then can we figure out how it will evolve over time.
Having a proof of concept will also focus people's attention more than writing a bunch of words will.
If you want to work with me, know someone who might want to, or want to point out some flaw in my reasoning such that there is a simpler way forward in the kind of world I think this is, I am contactable at wil (one l) . my surname @gmail.com. But I think I'm done with LW for now. Good luck with the revamp.