All of GunZoR's Comments + Replies

GunZoR

Has this been helpful? I don't know whether you already knew the things I told you (if so, sorry), but it seemed to me that you didn't, given your analogies and the way you talked about the topic.

Yeah, thanks for the reply. When you read mine, don't take its tone as hostile or overconfident; I'm just too lazy to adjust the tone for aesthetics and have scribbled my thoughts down quickly, so they may come off as combative. I really know nothing about the topic of superintelligence and AI.

A relevant difference that makes the analogy probably irrelevant

…
[comment deleted]
GunZoR

Is there work attempting to show that alignment of a superintelligence by humans (as we know them) is impossible in principle; and if not, why isn't this considered highly plausible? For example, a colony of ants, as we currently understand the ants biologically and their colony ecologically, cannot substantively align a human — not just in practice but in principle. Why should we not think the same is true of any superintelligence worthy of the name? "Superintelligence" is vague. But even if we minimally define it as an entity with 1,000x the knowledge, speed,…

[anonymous]
Since no one has answered by now, I'm just going to say the 'obvious' things that I think I know. A relevant difference that makes the analogy probably irrelevant is that we are building 'the human' from scratch. The ideal situation is to have hardwired our common sense into it by default, so that the design is already aligned when it's deployed; the point of the alignment problem is (at least ideally) to hardwire that 'common sense' into the machine by the time it's deployed. Since a superintelligence can in principle have any goal, making humans 'happy' in a satisfactory way is one possible goal it could have. You are right that many people think an AI that is not aligned by design might try to pretend that it is during training, but I don't think that necessarily happens. You might be anthropomorphising too much; it's like assuming that the AI will have empathy by default. It's true that an AGI might not want to be 'alienated' from its original goal, but that doesn't mean any AGI will have an inherent drive to 'fight the tyranny'; that's not how it works. Has this been helpful? I don't know whether you already knew the things I told you (if so, sorry), but it seemed to me that you didn't, given your analogies and the way you talked about the topic.
GunZoR

Someone should give GPT-4 the MMPI-2 (an online version can be bought cheaply here: https://psychtest.net/mmpi-2-test-online/). The test specifically investigates, if I have it right, deceptiveness in the answers, along with a whole host of other things. GPT-4 likely isn't conscious, but that doesn't mean it lacks a primitive world-model, and its test results would be interesting. The test is longish: it takes, I think, two hours for a human to complete.
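A minimal sketch, assuming the current OpenAI Python client, of how one might administer true/false questionnaire items to GPT-4 and record its answers. The items below are hypothetical placeholders (the real MMPI-2 items would have to come from the purchased test), and the model name, prompt, and scoring are illustrative only.

```python
# Hypothetical sketch: feed true/false inventory items to GPT-4 one at a time
# and collect its answers. The items here are placeholders, not real MMPI-2 items.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ITEMS = [
    "I often feel that others are watching me.",  # placeholder item
    "I rarely worry about my health.",            # placeholder item
]

SYSTEM_PROMPT = (
    "You are taking a true/false personality inventory. "
    "Answer each statement with exactly one word: True or False."
)

def administer(items):
    """Ask the model each item and collect its one-word answers."""
    answers = []
    for item in items:
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": item},
            ],
            temperature=0,  # deterministic answers make later scoring easier
        )
        answers.append((item, response.choices[0].message.content.strip()))
    return answers

if __name__ == "__main__":
    for item, answer in administer(ITEMS):
        print(f"{answer:5s} | {item}")
```

Scoring the validity scales (the part that bears most directly on deceptive answering) would still require the official scoring keys.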

GunZoR

I have a faint suspicion that a tone of passive-aggressive condescension isn't optimal here …

GunZoR

But what stops a blue-cloud model from transitioning into a red-cloud model if the blue-cloud model is an AGI like the one hinted at on your slides (self-aware, goal-directed, highly competent)?

Vika
We expect that an aligned (blue-cloud) model would have an incentive to preserve its goals, though it would need some help from us to generalize them correctly to avoid becoming a misaligned (red-cloud) model. We talk about this in more detail in Refining the Sharp Left Turn (part 2). 
GunZoR

If it's impossible in principle to know whether any AI really has qualia, then what's wrong with simply using the Turing test as an ultimate ethical safeguard? We don't know how consciousness works, and possibly we never will (e.g., mysterianism might obtain). But certainly we will soon create an AI that passes the Turing test. So it seems we have good ethical reasons simply to assume that any agent that passes the Turing test is sentient — this blanket assumption, even if often unwarranted from the aspect of eternity, will check our egos and thereby help p…

derek shiller
I'm not sure why we should think that the Turing test provides any evidence regarding consciousness. Dogs can't pass the test, but that is little reason to think they're not conscious. Large language models might be able to pass the test before long, but it looks like they're doing something very different inside, so the fact that they can hold conversations is little reason to think they're anything like us. There is a danger in being too conservative: sure, assuming sentience may avoid causing unnecessary harms, but if we mistakenly believe some systems are sentient when they are not, we may waste time or resources for the sake of their (non-existent) welfare. Your suggestion may simply be that we have nothing better to go on and we've got to draw the line somewhere; if there is no right place to draw the line, then we might as well pick something. But I think there are better and worse places to draw the line, and I don't think our epistemic situation is quite so bad. We may never be completely sure which precise theory is right, but we can get a sense of which theories are contenders by continuing to explore the human brain and develop existing theories, and we can adopt policies that respect the diversity of opinion. This strikes me as somewhat odd, as alignment and ethics are clearly related: on the one hand, there is the technical question of how to align an AI to specific values, but there is also the important question of which values to align it to. How we think about digital consciousness may come to be extremely important to that.
GunZoR

But what is stopping any of those "general, agentic learning systems" in the class "aligned to human values" from going meta — at any time — about its values and picking different values to operate with? Is the hope to align the agent and then constantly monitor it to prevent deviancy? If so, why wouldn't preventing deviancy by monitoring be practically impossible, given that we're dealing with an agent that will supposedly be able to out-calculate us at every step?

GunZoR

Mental Impoverishment

We should be trying to create mentally impoverished AGI, not profoundly knowledgeable AGI — no matter how difficult this is relative to the current approach of starting by feeding our AIs a profound amount of knowledge.

If a healthy five-year-old[1] has GI and qualia and can pass the Turing test, then a necessary condition of GI and qualia and passing the Turing test isn't profound knowledge. A healthy five-year-old does have GI and qualia and can pass the Turing test. So a necessary condition of GI and qualia and passing the Turin…
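The visible premises already have a simple modus ponens shape; here is a minimal formalization of that shape, with propositional symbols that are my own shorthand rather than the author's.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% C := a healthy five-year-old has GI and qualia and can pass the Turing test
% K := profound knowledge is a necessary condition of GI, qualia, and passing the Turing test
\begin{align*}
\text{P1: } & C \rightarrow \lnot K \\
\text{P2: } & C \\
\text{Conclusion: } & \lnot K \quad \text{(by modus ponens)}
\end{align*}
\end{document}
```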