army1987 comments on The curse of identity - LessWrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (296)
I've seen this tried, for this stated purpose. My impression of the results was that it did not at all lead to careful, on-the-margins consequentialist thinking and doing. Instead, it led to a stressed out, strung out person trying desperately to avoid more pain/shame, while also feeling resentful at the world and themselves, expecting a lack of success from these attempts, and so acting more from local self-image gradients, or drama-seeking gradients, than from any motives attached to actual hope of accomplishing something non-immediate.
"Signaling motives" can be stuck on a scale, from "local, short-sighted, wire-heading-like attempts to preserve self-image, or to avoid immediate aversiveness or seek immediate reward" to "long-term strategic optimization to achieve recognition and power". It would be better to have Napoleon as an ally than to have a narcotics addict with a 10 minute time horizon as an ally, and it seems analogously better to help your own status-seeking parts mature into entities that are more like Napoleon and less like the drug addict, i.e. into entities that have strategy, hope, long-term plans, and an accurate model of the fact that e.g. rationalizations don't change the outside world.
I would not want ha-Satan as my ally, even if I trusted myself not to get caught up in or infected by his instrumental ambitions. Still less would I want to give him direct read/write access to the few parts of my mind that I at all trust. Give not that which is holy unto the dogs, neither cast ye your pearls before swine, lest they trample them under their feet, and turn again and rend you. Mix a teaspoon of wine in a barrel of sewage and you get sewage; mix a teaspoon of sewage in a barrel of wine and you get sewage. The rationality of an agent is its goal: if therefore thy goal be simple, thy whole self shall be full of rationality. But if thy goal be fractured, thy whole self shall be full of irrationality. If therefore the rationality that is in thee be irrationality, how monstrous is that irrationality!
Seen at a higher level you advise dealing with the devil—the difference in power between your genuine thirst for justice and your myriad egoistic coalitions is of a similar magnitude as that between human and transhuman intelligence. (I find it disturbing how much more cunning I get when I temporarily abandon my inhibitions. Luckily I've only let that happen twice—I'm not a wannabe omnicidal-suicidal lunatic, unlike HJPEV.) Maybe such Faustian arbitrage is a workable strategy... But I remain unconvinced, and in the meantime the payoff matrix asymmetrically favors caution.
Take no thought, saying, Wherewithal shall I avoid contempt? or, Wherewithal shall I be accepted? or, Wherewithal shall I be lauded and loved? For true metaness knoweth that ye have want of these things. But seek ye first the praxeology of meta, and its rationality; and all these things shall be added unto you. Take therefore no thought for your egoistic coalitions: for your egoistic coalitions shall take thought for the things of themselves. Sufficient unto your ten minutes of hopeless, thrashing awareness is the lack of meta thereof.
Er, nope.
Humans' goals are fractured. But this has little to do with whether or not they are rational.
You don't understand. This "rationality" you speak of is monstrous irrationality. And anyway, like I said, Meta knoweth that ye have Meta-shattered values—but your wants are satisfied by serving Meta, not by serving Mammon directly. Maybe you'd get more out of reading the second half of Matthew 6 and the various analyses thereof.
You may be misinterpreting "the rationality of an agent is its goal". Note that the original is "the light of the body is the eye".
To put my above point a little differently: Take therefore no thought for godshatter: godshatter shall take thought for the things of itself. Sufficient unto the day is the lack-of-meta thereof.
For clarity's sake: Yes, I vehemently dispute this idea that a goal can't be more or less rational. That idea is wrong, which is quickly demonstrated by the fact that priors and utility functions can be transformed into each other and we have an objectively justifiable universal prior. (The general argument goes through even without such technical details of course, such that stupid "but the choice of Turing machine matters" arguments don't distract.)
Let's play rationalist Taboo!
Care to enlighten me exactly on just what it is you're disputing, and on just what points should be discussed?
Edit: Fixed markdown issue, sorry!
This is incorrect. Eyes absorb light and produce electrical signals interpreted as vision by the brain. Further, it seems to me that the set of things that 'the light of the body' describes is an empty set; there's no literal interpretation (our bodies do not shed visible light) and there's no construction similar enough that suggests an interpretation (the X of the body / the light of the X). "The light of the sun" / "The light of the moon" is the closest I can find and both of those suggest the literal interpretation.
Originally, I was going to do a very charitable reading: invent a sane meaning for "The X of the Y is the sub-Y" as "sub-Y is how Y handles/uses/interprets/understands X" and say that goals, as subparts of an agent, are how an agent understands its rationality - perhaps, how an agent measures its rationality. Which is indeed how we measure our rationality, by how often we achieve our goals, but this doesn't say anything new.
But when you say things like
as if you were being clear in the first place, it shows me that you don't deserve a charitable reading.
<nitpick>Our body does scatter visible light, though, much like the moon does.</nitpick>
Just interpret light as ‘that which allows one to see’. That which allows the body to see is the eye.
That which allows the agent to achieve is its goals? Seems incorrect. (Parsing rationality as "that which allows one to achieve").
Meh. The goal of leading to sentient beings living, to people being happy, to individuals having the freedom to control their own lives, to minds exploring new territory instead of falling into infinite loops, to the universe having a richness and complexity to it that goes beyond pebble heaps, etc. has probably much more Kolmogorov complexity than the goal of maximizing the number of paperclips in the universe. If preferring the former is irrational, I am irrational and proud of it.
Oh, also "look at the optimization targets of the processes that created the process that is me" is a short program, much shorter than needed to specify paperclip maximization, though it's somewhat tricky because all that is modulo the symbol grounding problem. And that's only half a meta level up, you can make it more elegant (shorter) than that.
Maybe “maximizing the number of paperclips in the universe” wasn't the best example. “Throwing as much stuff as possible into supermassive black holes” would have been a better one.
I can only say: black holes are creepy as hell.
The shorter your encoded message, the longer the encryption / compression algorithm, until eventually the algorithm is the full raw unencoded message and the encoded message is a single null-valued signal that, when received, decodes into the full message as it is contained within the algorithm.
...isn't nearly as short or simple as it sounds. This becomes obvious once you try to replace those words with their associated meaning.
My point was that it's easier to program ("simpler") than "maximize paperclips", not that it's as simple as it sounds. (Nothing is as simple as it sounds, duh.)
I fail to see how coding a meta-algorithm to select optimal extrapolation and/or simulation algorithm in order for those chosen algorithms to determine the probable optimization target (which is even harder if you want a full PA proof) is even remotely in the same order of complexity as a machine learner that uses natural selection for algorithms that increase paperclip-count, which is one of the simplest paperclip maximizers I can think of.
It might not be possible to make such a machine learner into an AGI, which is what I had in mind—narrow AIs only have "goals" and "values" and so forth in an analogical sense. Cf. derived intentionality. If it is that easy to create such an AGI, then I think I'm wrong, e.g. maybe I'm thinking about the symbol grounding problem incorrectly. I still think that in the limit of intelligence/rationality, though, specifying goals like "maximize paperclips" becomes impossible, and this wouldn't be falsified if a zealous paperclip company were able to engineer a superintelligent paperclip maximizer that actually maximized paperclips in some plausibly commonsense fashion. In fact I can't actually think of a way to falsify my theory in practice—I guess you'd have to somehow physically show that the axioms of algorithmic information theory and maybe updateless-like decision theories are egregiously incoherent... or something.
(Also your meta-algorithm isn't quite what I had in mind—what I had in mind is a lot more theoretically elegant and doesn't involve weird vague things like "extrapolation"—but I don't think that's the primary source of our disagreement.)
That means that I should try to have lots of children?
Why do you think of a statistical tendency toward higher rates of replication at the organism level when I say "the processes that created the process that is [you]"? That seems really arbitrary. Feel the inside of your teeth with your tongue. What processes generated that sensation? What decision policies did they have?
(ETA: I'd upvote my comment if I could.)
You mean, why did I bother wearing braces for years so as to have straight teeth? <gd&rVF!>
I mean that, and an infinite number of questions more and less like that, categorically, in series and in parallel. (I don't know how to interpret "<gd&rVF!>", but I do know to interpret it that it was part of your point that it is difficult to interpret, or analogous to something that is difficult to interpret, perhaps self-similarly, or in a class of things that is analogous to something or a class of things that is difficult to interpret, perhaps self-similarly; also perhaps it has an infinite number of intended or normatively suggested interpretations more or less like those.)
(This comment also helps elucidate my previous comment, in case you had trouble understanding that comment. If you can't understand either of these comments then maybe you should read more of the Bible, or something, otherwise you stand a decent chance of ending up in hell. This applies to all readers of this comment, not just army1987. You of course have a decent chance of ending up in hell anyway, but I'm talking about marginals here, naturally.)
Optimization processes (mainly stupid ones such as evolution) can create subprocesses with different goals.
(And stupid ones like humans.)
(Unfortunately.)
Really? How?
Oh, maybe you mean that they both have the type of `Universe -> Real`? Although really it's `prior :: Universe -> [0, 1]` and `utilityfunction :: Universe -> Real` assuming we have a discrete distribution on Universes. And anyway that's no justification for substituting a `prior` for a `utilityfunction` any more than for substituting `tail :: [a] -> [a]` for `init :: [a] -> [a]`. Unless that's not what you mean.
If you change your utility function and your prior while keeping their product constant, you'll make the same decisions. See E.T. Jaynes, Probability Theory: The Logic of Science, chapter “Decision theory -- historical background”, section “Comments”.
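The claim that only the product of prior and utility matters can be checked with a toy example. This is a minimal sketch, not anything from Jaynes: the worlds, actions, payoffs, and the names `best_action` and `reweight` are all made up for illustration, and it assumes a finite set of worlds and an expected-utility maximizer.

```python
def expected_utilities(prior, utility):
    """Expected utility of each action under a prior over worlds.

    prior:   dict mapping world -> probability
    utility: dict mapping action -> dict mapping world -> payoff
    """
    return {a: sum(prior[w] * u[w] for w in prior) for a, u in utility.items()}


def best_action(prior, utility):
    """Return the action with the highest expected utility."""
    eu = expected_utilities(prior, utility)
    return max(eu, key=eu.get)


def reweight(prior, utility, factor):
    """Multiply the prior by factor[w] and divide every payoff by factor[w].

    The per-world product prior * utility is unchanged up to the single
    positive normalizing constant z, so the ranking of actions is unchanged.
    """
    raw = {w: prior[w] * factor[w] for w in prior}
    z = sum(raw.values())
    new_prior = {w: p / z for w, p in raw.items()}
    new_utility = {a: {w: u[w] / factor[w] for w in u} for a, u in utility.items()}
    return new_prior, new_utility


# Hypothetical numbers, chosen only for illustration.
prior = {"rain": 0.3, "sun": 0.7}
utility = {
    "umbrella": {"rain": 5.0, "sun": 1.0},
    "no_umbrella": {"rain": -10.0, "sun": 3.0},
}
factor = {"rain": 4.0, "sun": 0.5}  # any positive per-world factors work

new_prior, new_utility = reweight(prior, utility, factor)
print(best_action(prior, utility), best_action(new_prior, new_utility))
```

The reweighted agent holds visibly different beliefs and different payoffs, yet picks the same action, which is the sense in which the two descriptions are interchangeable from the outside.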
Right, but that still isn't really a way to turn a prior into a utility function. A prior plus a set of decisions can determine a utility function, but you need to get the decisions from somewhere before you can do that.
Right, but you never see just a prior or just a utility function in an agent anyway. I meant that within any agent you can transform them into each other. The concepts of "prior" and "utility function" are maps, of course, not metaphysically necessary distinctions, and they don't perfectly cut reality at its joints. Part of what's under debate is whether we should use the Bayesian decision theoretic framework to talk about agents, especially when we have examples where AIXI-like agents fail and humans don't. But anyway, even within the naive Bayesian decision theoretic framework, there's transformability between beliefs and preferences. Sorry for being unclear.
To check if we agree about some basics: do we agree that decisions and decision policies—praxeology—are more fundamental than beliefs and preferences? (I'm not certain I believe this, but I will for sake of argument at least.)
I don't know. The part I took issue with was saying that goals can be more or less rational, just based on the existence of an "objectively justifiable" universal prior. There are generally many ways to arrange heaps of pebbles into rectangles (assuming we can cut them into partial pebbles). Say that you discover that the ideal width of a pebble rectangle is 13. Well... you still don't know what the ideal total number of pebbles is. An ideal width of 13 just gives you a preferred way to arrange any number of pebbles. It doesn't tell you what the preferred length is, and indeed it will vary for different numbers of total pebbles.
Similarly, the important thing for an agent, the thing you can most easily measure, is the decisions they make in various situations. Given this and the "ideal objective solomonoff prior" you could derive a utility function that would explain the agent's behaviour when combined with the solomonoff prior. But all that is is a way to divide an agent into goals and beliefs.
In other words, an "objectively justifiable" universal prior only enforces an "objectively justifiable" relation between your goals and your actions (aka. `num_pebbles = 13 * length`). It doesn't tell you what your goals should be any more than it tells you what your actions should be.
I don't know if any of that made sense, but basically it looks to me like you're trying to solve a system of equations in three variables (prior, goals, actions) where you only have two equations (prior = X, actions = prior * goals). It doesn't have a unique solution.
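To make the underdetermination concrete, here is a rough sketch with invented numbers: the prior is held fixed, two different hypothetical goal sets (`goals_a`, `goals_b`) are plugged in, and the chosen action changes. The pebble relation is included alongside for comparison. None of the names or payoffs come from the discussion above; they are assumptions for illustration only.

```python
WIDTH = 13  # the "ideal width" from the pebble analogy


def total_pebbles(length, width=WIDTH):
    """Fixing the width relates length to total; it does not pick a total."""
    return width * length


def best_action(prior, utility):
    """Action with the highest expected utility under the given prior."""
    eu = {a: sum(prior[w] * u[w] for w in prior) for a, u in utility.items()}
    return max(eu, key=eu.get)


# One fixed prior (hypothetical numbers) ...
PRIOR = {"rain": 0.3, "sun": 0.7}

# ... and two different sets of goals, both perfectly compatible with it.
goals_a = {
    "umbrella": {"rain": 5.0, "sun": 1.0},
    "no_umbrella": {"rain": -10.0, "sun": 3.0},
}
goals_b = {
    "umbrella": {"rain": 1.0, "sun": 0.0},
    "no_umbrella": {"rain": 0.0, "sun": 2.0},
}

print(total_pebbles(2), total_pebbles(3))
print(best_action(PRIOR, goals_a), best_action(PRIOR, goals_b))
```

Same prior, different goals, different actions: knowing the prior constrains how goals and actions fit together without selecting either one.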
Everything you have said makes sense to me. Thanks. I will respond substantially at a later time.
How so?