I am saying pretty much exactly that. To clarify further, the words "deliberate", "conscious" and "wants" again belong to the level of emergent behavior: they can be used to describe the agent, not to explain it (what could not be explained by "the agent did X because it wanted to"?).
Let's instead make an attempt to explain. Complete control of an agent's own code, in the strict sense, contradicts Gödel's incompleteness theorem. Furthermore, information-theoretic considerations significantly limit the degree to which an agent can control its own code (I wonder whether anyone has ever done the math; I expect not, and I intend to look further into this). In information-theoretic terminology, the agent will be limited to typical manipulations of its own code, which form a strict (and presumably very small) subset of all possible manipulations.
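To make "very small subset" concrete, here is a toy counting argument of my own (the parameters are illustrative, not from the comment): treat the code as an n-bit string and suppose the agent can only reach versions within Hamming distance k of its current code. The reachable set is then a vanishing fraction of all possible codes.

```python
from math import comb

n = 1000  # code length in bits (toy choice, assumed for illustration)
k = 50    # maximum number of bit flips the agent can apply (also assumed)

# Number of n-bit strings within Hamming distance k of a fixed string:
reachable = sum(comb(n, i) for i in range(k + 1))

# All possible n-bit strings:
total = 2 ** n

fraction = reachable / total
print(fraction)  # astronomically small: far below 1e-100
```

The exact numbers don't matter; the point is that any bounded notion of "manipulation" carves out an exponentially tiny slice of the space of all possible programs.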
Can an agent be made more effective than humans in manipulating its own code? I have very little doubt that it can. Can it lead to agents qualitatively more intelligent than humans? Again, I believe so. But I don't see a reason to believe that the code-rewriting ability itself can be qualitatively different than a human's, only quantitatively so (although of course the engineering details can be much different; I'm referring to the algorithmic level here).
Generally GAIs are ascribed extreme powers around here
As you've probably figured out, I'm new here. I encountered this post while reading the sequences. Although I'm somewhat learned on the subject, I haven't yet reached the part (which I trust exists) where GAI is discussed here.
On my path there, I'm actively trying to avoid a certain degree of groupthink which I detect in some of the comments here. Please take no offense, but it's phrases like the above quote which worry me: is there really a consensus around here on such profound questions? Hopefully it's only the terminology that is agreed upon, in which case I will learn it in time. But please, let's make our terminology "pay rent".
In such a case, the median outcome across all agents will be improved if every agent with the option to do so takes that offer, even if they are assured that it is a once-in-a-lifetime offer (because presumably there is variance of more than 5 utils between agents).
But the median outcome is losing 5 utils?
Edit: Oh, wait! You mean the median total utility after some other stuff happens (with a variance of more than 5 utils)?
Suppose we have 200 agents, 100 of which start with 10 utils and the rest with 0. After taking this offer, we have 51 with -5, 51 with 5, 49 with 10000, and 49 with 10010. The median outcome is -5 for half the agents and 5 for the other half, but only the half that would lose could actually end up at that outcome...
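As a quick sanity check on the "median outcome of all agents" claim, here is the population median computed for the exact numbers above, before and after the offer (Python's statistics.median):

```python
from statistics import median

# 200 agents as described: 100 start with 10 utils, 100 with 0.
before = [0] * 100 + [10] * 100

# After everyone takes the offer: 51 at -5, 51 at 5, 49 at 10000, 49 at 10010.
after = [-5] * 51 + [5] * 51 + [10000] * 49 + [10010] * 49

print(median(before), median(after))  # 5.0 5.0
```

In this example the population median is 5 both before and after, so the offer does not improve it here, even though it improves each individual agent's expected utility.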
And what do you mean by "the possibility of getting tortured will manifest itself only very slightly at the 50th percentile"? I thought you were restricting yourself to median outcomes, not distributions? How do you determine the median distribution?