Eliezer_Yudkowsky comments on Logical Pinpointing - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Meditation:
So far we've talked about two kinds of meaningfulness and two ways that sentences can refer; a way of comparing to physical things found by following pinned-down causal links, and logical reference by comparison to models pinned-down by axioms. Is there anything else that can be meaningfully talked about? Where would you find justice, or mercy?
(Note: this is my first post. I may be wrong, and if so am curious as to how. Anyway, I figure it's high time that my beliefs stick their neck out. I expect this will hurt, and apologize now should I later respond poorly.)
This may be the answer to a different question, but...
I play lots of role-playing games. Role-playing games are like make-believe; events in them exist in a shared counterfactual space (in the players' imagination). Make-believe has a problem: if two people imagine different things, who is right? (This tends to end with a bunch of kids arguing about whether the fictional T-Rex is alive or dead).
Role-playing games solve this problem by handing authority over various facets of the game to different things. The protagonists are controlled by their respective players, the results of choices by dice and rules, and most of the fictional world by the Game Master.*
So, in a role-playing game, when you ask what is true[RPG], you should direct that question to the appropriate authority. Basically, truth[RPG] is actually canon (in the fandom sense; TV Tropes' page is good, but comes with the usual where-did-my-evening-go caveats).
Similarly, if we ask "where did Luke Skywalker go to preschool?", we're asking a question about canon.
That said, even canon needs to be internally consistent. If someone with authority claims that Tatooine has no preschools, then we can conclude that Luke Skywalker didn't go to preschool. If an authority claims two inconsistent things, we can conclude that the authority is wrong (namely, in the mathematical sense: the canon wouldn't match any possible model).
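The "no possible model" idea can be made concrete. As a toy illustration (the propositions here are my own inventions, not anything from the thread), treat the canon as a set of propositional constraints and brute-force all truth assignments; the canon is consistent exactly when some assignment satisfies every authoritative claim:

```python
from itertools import product

# A toy "canon" as propositional constraints over a few facts.
facts = ["tatooine_has_preschools", "luke_attended_preschool"]

constraints = [
    # Authority claim: Tatooine has no preschools.
    lambda a: not a["tatooine_has_preschools"],
    # Background rule: Luke attended preschool only if one exists.
    lambda a: a["tatooine_has_preschools"] or not a["luke_attended_preschool"],
]

def has_model(facts, constraints):
    """Return True if some truth assignment satisfies every constraint."""
    for values in product([True, False], repeat=len(facts)):
        assignment = dict(zip(facts, values))
        if all(c(assignment) for c in constraints):
            return True
    return False

print(has_model(facts, constraints))  # True: the canon is consistent

# Add an inconsistent authority claim: Luke did attend preschool.
constraints.append(lambda a: a["luke_attended_preschool"])
print(has_model(facts, constraints))  # False: no possible model
```

Appending the contradictory claim leaves no satisfying assignment, which is the precise sense in which an inconsistent authority "wouldn't match any possible model".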
I've long felt that ideas like morality and liberty are a variety of canon.
Specifically, you can have authorities (a religion or philosopher telling you stuff), and those authorities can be provably wrong (because they said something inconsistent), but these ideas exist in a kind of shared imaginary space. Also, people can disagree with the canon and make up their own ideas.
Now, that space is still informed by reality. Even in fiction, we expect gravity to drop off as the square of the distance, and we expect solid objects to be unable to pass through each other.** With ideas, we can state that they are nonsensical (or, at minimum, not useful) if they refer to real things which don't exist. A map of morality is a map of a non-real thing, but morality must interface with reality to be useful, so anywhere the interface doesn't line up with reality, morality (or its map) is wrong.
*This is one possible breakdown. There are many others.
**In most games/stories, anyway. At first glance I'd expect morality to be better bound to reality, but I suppose there have been plenty of people whose moral systems boiled down to "don't do anything Ma'at would disapprove of", backed up with concepts like the literal weight of sin (vs. the weight of a feather).
It so happens that the three "big lies" Death mentions are all related to morality/ethics, which is a hard topic. But let me take the conversation and change it a bit:
In this version, the final argument is still correct -- if I take the universe and grind it down to a sieve, I will not be able to say "woo! that carbon atom is an atom of happiness". Since the penultimate question of this meditation was "Is there anything else", at least I can answer that question.
Clearly, we want to talk about happiness for many reasons -- even if we do not value happiness in itself (for ourselves or others), predicting what will make humans happy is useful for knowing things about the world. Therefore, it is useful to find a way that allows us to talk about happiness. Happiness, though, is complicated, so let us put it aside for a minute to ponder something simpler: a solar system. I will simplify here: a solar system is one star and a bunch of planets orbiting it. Though solar systems affect each other through gravity or radiation, most of the effects on the relative motions inside a solar system come from inside it, and this pattern repeats itself throughout the galaxy. Much like happiness, being able to talk about solar systems is useful -- though I do not particularly value solar systems in and of themselves, it's useful to have a concept of "a solar system", which describes things with commonalities, and allows me to generalize.
If I grind the universe, I cannot find an atom that is a solar system atom -- grinding the universe down destroys the useful "solar system" pattern. For bounded minds, having these patterns gives good predictive strength without having to figure out each and every atom in the solar system.
In essence, happiness is no different from "solar system" -- both are crude words to describe common patterns. It's just that happiness is a feature of minds (mostly human minds, but we talk about how dogs or lizards are happy, sometimes, and it's not surprising -- those minds are related algorithms). I cannot say where every atom is in the case of a human being happy, but some atom configurations are happy humans, and some are not.
So: at the very least, happiness and solar systems are part of the causal network of things. They describe patterns that influence other patterns.
Mercy is easier than justice and duty. Mercy is a specific configuration of atoms (a human) behaving in a specific way -- the human feels they are entitled to cause another human hurt ("feeling entitled" is a set of specific human-mind-configurations, regardless of whether "entitlement" actually exists), but does not do so (for specific reasons, etc. etc.). In short, mercy describes specific patterns of atoms, and is part of causal networks.
Duty and justice -- I admit that I'm not sure what my reductionist metaethics are, and so it's not obvious what they mean in the causal network.
We could make it even easier :P
The harder question is what is a valid way of figuring out the important properties of the system.
The statement that the world is just is a lie. There exist possible worlds that are just - for instance, these worlds would not have children kidnapped and forced to kill - and ours is not one of them.
Thus, justice is a meaningful concept. Justice is a concept defined in terms of the world (pinned-down causal links) and also of irreducibly normative statements. Normative statements do not refer to "the world". They are useful because we can logically deduce imperatives from them. "If X is just, then do X." is correct, that is:
Do the right thing.
I am not entirely sure how you arrived at the conclusion that justice is a meaningful concept. I am also unclear on how you know the statement "If X is just, then do X" is correct. Could you elaborate further?
In general, I don't think it is a sufficient test for the meaningfulness of a property to say "I can imagine a universe which has/lacks this property, unlike our universe, therefore it is meaningful."
I did not intend to explain how I arrived at this conclusion. I'm just stating my answer to the question.
Do you think the statement "If X is just, then do X" is wrong?
Like army1987 notes, it is an instruction and not a statement. Considering that, I think "if X is just, then do X" is a good imperative to live by, assuming some good definition of justice. I don't think I would describe it as "wrong" or "correct" at this point.
OK. Exactly what you call it is unimportant.
What matters is that it gives justice meaning.
It may be incomplete. Do you have a place for Mercy?
The reason I'm not making distinctions among different moral words, though such distinctions exist in language, is that it seems the only new problem created by these moral words is understanding morality. Once you understand right and wrong, just and unjust can be defined just like you define regular words, even if something can be just but immoral.
That's an instruction, not a statement.
Um, mathematics.
I can't imagine a universe without mathematics, yet I think mathematics is meaningful. Doesn't this mean the test is not sufficient to determine the meaningfulness of a property?
Is there some established thinking on alternate universes without mathematics? My failure to imagine such universes is hardly conclusive.
Sorry, misread what you wrote in the grandparent. I agree with you.
They exist in the same sense that numbers exist, or that meaningful existence exists, or that meaningfulness exists.
Once you grind the universe into powder, none of those things exists anymore.
I was going to say that yes, I think there is another kind of thing that can be meaningfully talked about, and "justice" and "mercy" and "duty" have something to do with that sort of thing, but a more prototypical example would be "This court has jurisdiction". Especially if many experts were of the opinion that it didn't, but the judge disagreed, but the superior court reversed her, and now the supreme court has decided to hear the case.
But then I realized that there was something different about that kind of "truth": I would not want an AI to assign a probability to the proposition The court did, in fact, have jurisdiction (nor to, say, It is the duty of any elected official to tell the public if they learn about a case of corruption). I think social constructions can technically be meaningfully talked about among humans, and they are important as hell if you want to understand human communication and behavior. But on reflection, the fact that I would want an AI to reason in terms of more basic facts is a hint that if we are discussing epistemology (what sorts of thingies we can know about and how we can know about them, rather than the particular properties of the particularly interesting thingies called humans), then it might be best to say that "The judge wrote in her decision that the court had jurisdiction" is a meaningful statement in the sense under consideration, but "The court had jurisdiction" is not.
I would find them under the category of patterns.
A neural network is very good at recognising patterns; and human brains run on a neural network architecture. Given a few examples of what a word does or does not mean, we can quickly recognise the pattern and fit it into our vocabulary. (Apparently, this can be used in language classes; the teacher will point to a variety of objects, indicating whether they are or are not vrugte, for example; and it won't take that many examples before the student understands that vrugte means fruit but not vegetables).
Justice and mercy are not patterns of objects, but rather patterns of action. The man killed his enemy, but has a wife and children to support; sending him to Death Row might be just, but letting him have some way of earning money while imprisoned might be merciful. Similarly, happy, sad, and angry are emotional patterns; a person acts in this way when happy, and acts in that way when sad.
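The example-driven word learning described above (the "vrugte" anecdote) can be sketched as a nearest-neighbour toy model. All features and labels here are my own inventions for illustration: given a few labelled examples, a new object is classified by whichever example it most resembles.

```python
# Toy sketch: learning a word's extension from labelled examples.
# Objects are made-up feature vectors; the learner classifies new
# objects by nearest labelled example.

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(item, examples):
    """examples: list of (features, is_vrugte) pairs."""
    nearest = min(examples, key=lambda ex: distance(item, ex[0]))
    return nearest[1]

# Invented features: (sweetness, grows_on_tree, eaten_raw)
examples = [
    ((0.9, 1.0, 1.0), True),   # apple      -> vrugte
    ((0.8, 0.0, 1.0), True),   # strawberry -> vrugte
    ((0.1, 0.0, 0.0), False),  # potato     -> not vrugte
    ((0.2, 0.0, 1.0), False),  # carrot     -> not vrugte
]

print(classify((0.85, 1.0, 1.0), examples))  # pear-like object   -> True
print(classify((0.15, 0.0, 0.0), examples))  # turnip-like object -> False
```

A handful of labelled points suffices to carve out a usable category boundary, which is the point of the anecdote: the pattern is recognised long before anyone states a definition.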
Justice, mercy, duty, etc are found by comparison to logical models pinned down by axioms. Getting the axioms right is damn tough, but if we have a decent set we should be able to say "If Alex kills Bob under circumstances X, this is unjust." We can say this the same way that we can say "Two apples plus two apples is four apples." I can't find an atom of addition in the universe, and this doesn't make me reject addition.
Also, the widespread convergence of theories of justice on some issues (e.g. rape is unjust) suggests that theories of justice are attempting to use their axioms to pin down something that is already there. Moral philosophers are more likely to say "My axioms are leading me to conclude rape is a moral duty, where did I mess up?" than "My axioms are leading me to conclude rape is a moral duty, therefore it is." This also suggests they are pinning down something real with axioms. If it were otherwise, we would expect the second conclusion.
"theories of justice are attempting to use their axioms to pin down something that is already there"
So in other words, duty, justice, mercy--morality words--are basically logical transformations that transform the state of the universe (or a particular circumstance) into an ought statement.
Just as we derive valid conclusions from premises using logical statements, we derive moral obligations from premises using moral statements.
The term 'utility function' seems less novel now (novel as in, a departure from traditional ethics).
This is my view.
Not quite. They don't go all the way to completing an ought statement, as this doesn't solve the Is/Ought dichotomy. They are logical transformations that make applying our values to the universe much easier.
"X is unjust" doesn't quite create an ought statement of "Don't do X". If I place value on justice, that statement helps me evaluate X. I may decide that some other consideration trumps justice. I may decide to steal bread to feed my starving family, even if I view the theft as unjust.
I've thought about this for a while, and I feel like you can replace "Fantasy" and "Lies" with "Patterns" in that dialogue, and have it make sense, and it also appears to be an answer to your questions. That being said, it also feels like a sort of a cached thought, even though I've thought about it for a while. However, I can't think of a better way to express it and all of the other thoughts I had appeared to be significantly lower caliber and less clear.
Considering that, I should then ask "Why isn't 'Patterns' the answer?"
In people's brains, and in papers written by philosophy students.
"Justice" and "mercy" can be found by looking at people, and in particular how people treat each other. They're physical things, although they're really complicated kinds of physical things.
In particular, the kind of thing that is destroyed when you grind it down into powder.
Same thing.
You find them inside counterfactual statements about the reactions of an implied hypothetical representative human, judging under implied hypothetical circumstances in which they have access to all relevant knowledge. There is clearly justice if a wide variety of these hypothetical humans agree that there is, under a wide variety of these hypothetical circumstances; there is clearly not justice if they agree that there is not. If the hypothetical people disagree with each other, then the definition fails.
Talking about things like justice, mercy and duty is meaningful, but the meanings are intermediated by big, complex webs of abstractions which humans keep in their brains, and the algorithms people use to manipulate those webs. They're unambiguous only to the extent to which people successfully keep those webs in sync with each other. In practice, our abstractions mainly work by combining bags of weak classifiers and feature-weighted similarity to positive and negative examples. This works better for cases that are similar to the training set, worse for cases that are novel and weird, and better for simpler abstractions and abstractions built on simpler constituents.
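The "bags of weak classifiers" idea can be sketched as a weighted vote. This is a toy model of my own; the cues and weights are invented. The concept applies to a situation when enough weighted cues fire, which naturally works better for typical cases than for novel, weird ones:

```python
# Toy sketch: an abstraction as a bag of weak classifiers combined by
# weighted vote. Each weak classifier looks at one feature of a
# situation; the concept "applies" if the weighted votes clear a
# threshold.

def applies(situation, weak_classifiers, threshold=0.5):
    """True if the weighted fraction of firing cues reaches threshold."""
    score = sum(w for test, w in weak_classifiers if test(situation))
    total = sum(w for _, w in weak_classifiers)
    return score / total >= threshold

# Made-up weak cues for "this punishment was just".
weak_classifiers = [
    (lambda s: s["harm_done"], 0.4),
    (lambda s: s["intent"], 0.3),
    (lambda s: s["punishment_proportionate"], 0.3),
]

typical = {"harm_done": True, "intent": True, "punishment_proportionate": True}
weird = {"harm_done": True, "intent": False, "punishment_proportionate": False}

print(applies(typical, weak_classifiers))  # clear case       -> True
print(applies(weird, weak_classifiers))    # novel/weird case -> False
```

Near the threshold the verdict flips on small changes in the cues, which mirrors the comment's point: agreement is good for prototypical cases and degrades for cases far from the training set.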
Why couldn't the hypothetical omniscient people inside the veil of ignorance decide that justice doesn't exist? Or if they could, how does that paragraph go towards answering the meditation? What distinguishes them from the hypothetical Death who looks through everything in the universe to try to find mercy? Aren't you begging the question here?
Justice - The quality of being "just" or "fair". Humans call a situation fair when everyone involved is happy afterwards, without having had their desires forcibly thwarted (e.g. being strapped into a chair and hooked into a morphine drip) along the way.
Mercy - Compassionate or kindly forbearance shown toward an offender, an enemy, or other person in one's power. Humans choose to engage in actions characterized this way on a daily basis.
Duty - Something that one is expected or required to do by moral or legal obligation. Legal duties certainly exist; Earth is not an anarchy.
Justice, mercy, and duty are only words. The important question to ask is whether or not they are useful. I certainly think they are; I use each of those words at least once a week. Once the symbols have been replaced by substance, it is clear that we should not be looking for those things in single atoms, but very large collections of them we call "humans", or slightly smaller (but still very large) collections we call "human brains".
And as far as we know, atoms are not arranged in configurations that have the properties we ascribe to the tooth fairy.
The map is not the territory. We discuss reality on many levels, but there is only one underlying level. Justice, duty and the like are abstractions; we use the same symbol in multiple places to define certain patterns. You don't get two identical 'happinesses' like you get two identical atoms. It's useful for us, though, to talk about this abstraction at the macro level and not the micro, and it's meaningful, given that we're assuming the same axioms. I think stuff that causes other stuff is reality, and if we assume certain axioms that correspond to reality, any new truthful statements and concepts deduced are meaningful because they also correspond to reality. Everything there is covered: things that exist, and things we think exist.
Mathematics is a system for building abstract statements that can be mapped to reality. The axioms of a mathematical (or other axiomatic) model define the conditions that a system (such as a pair of apples in the real universe) must satisfy in order for the abstract model to be applicable as well as providing a schema for mapping the abstract model to the concrete system.
There are other kinds of abstractions we could meaningfully talk about and they need not be defined as precisely as an axiomatic model like mathematics. An abstract model could be defined as a relationship between abstract ideas that can be mapped to a concrete system by pinning down each of its constituent abstractions to a concrete member of the system.
An abstract model may be predictive, meaning it has an if-then structure: if some relation between abstract members holds then the model predicts that some other relation will also hold. Such a predictive model may be true or false for any given concrete system that it is applied to. The standard we expect of a mathematical model is that it is valid (true for all concrete systems that it can be applied to), yet an abstract model need not meet so high a standard for it to be useful.

We can imagine much fuzzier abstract models that are true only some of the time but can be useful by providing general-purpose rules that allow us to infer information about the actual state of a concrete system that matches the criteria of the model. If we know the probability of an abstract predictive model being correct we can use it wherever it is applicable to inform the construction of causal models.

If we consider causal models to operate in the realm of first order logic, where we can quantify over and describe relationships between the basic units of cause and effect in our universe, an abstract model lives in the realm of higher order logic and can describe the relationships between causal relations and lower order abstract models.
An abstract model need not be predictive to be useful. It may be defined to be applicable only where the entire relation it describes holds. In this case it simply acts as a reusable symbol that is useful for representing a model of a concrete system more compactly as in the way a function in a computer program factors out reusable logic, or a word in human language factors out a reusable abstract idea.
Justice and Mercy are both fuzzy abstract models. To the extent that people agree on their definitions they are meaningful for communicating a particular relationship between pinned-down abstractions. For example, Justice may be defined (simplistically) as describing a relationship between human deeds and subsequent events such that deeds labelled 'bad' result in punishment and deeds labelled 'good' result in reward. The particular deed and subsequent event as well as the definitions of good, bad, punishment and reward are all component abstractions of the abstract model called Justice which must be pinned down in a concrete system in order for the concept of Justice to be applied in that system.
Justice may also be used as a predictive model if you formulate it as a prediction from a good/bad deed to a future reward/punishment event (or vice versa) and it would be useful for constructing a causal model of any particular concrete system to the extent that this predicted relationship matches the actual underlying nature of that system.
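As a minimal sketch of this predictive reading (my own toy formulation, with invented data), "Justice" can be scored against a concrete system by the fraction of recorded deed/outcome pairs that match its prediction:

```python
# Toy sketch: "Justice" as a fuzzy predictive model -- bad deeds predict
# punishment, good deeds predict reward -- scored by how often the
# prediction holds in a concrete record of (deed, outcome) events.

def justice_score(events):
    """Fraction of events where the Justice prediction holds.

    events: list of (deed, outcome) with deed in {"good", "bad"}
    and outcome in {"reward", "punishment"}.
    """
    expected = {"good": "reward", "bad": "punishment"}
    hits = sum(1 for deed, outcome in events if expected[deed] == outcome)
    return hits / len(events)

# An invented record of one concrete system.
world = [
    ("bad", "punishment"),
    ("good", "reward"),
    ("bad", "reward"),       # an injustice
    ("good", "punishment"),  # an injustice
    ("good", "reward"),
]

print(justice_score(world))  # 0.6 -- the model holds 60% of the time
```

The score is exactly "the extent to which the predicted relationship matches the actual underlying nature of the system": 1.0 for a perfectly just world, and lower for ours.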
Note: none of this is based on any formal study of logic outside of this Epistemology sequence so some of the terminology in this post was invented by me just now.
Right, response to the meditation:
It gets rather difficult talking about human mental constructs, so let me begin by asking myself where I would find justice or mercy. Almost immediately (which means that I need to do some more thinking) I find that I think of human emotional constructs as a side effect of society and its group mindset.
Grinding the universe down to its component molecules would completely fail to find any number of things that humanity finds important; humanity, for one. To me, rationalism is, above all, the study of the universe and what it contains. And yet when it comes to most psychological phenomena the models start to break down. Does this mean that a more refined model would be equally unable to describe the phenomena? Not necessarily. As rationalists, one of our key teachings is that we can observe something by studying its causes and effects; justice and mercy exist insofar as we as humans can comprehend their nature. They exist because we can determine the differences between a universe where they exist and one where they don't.
-reposted in the right section
This depends a lot on your definition of meaningfulness. Justice and mercy are subjective values and not predictive or descriptive statements about reality. But in my opinion subjective values are meaningful, in fact they're meaning itself and the only reason I consider descriptive statements about reality to be meaningful is that they help me achieve subjective values. I believe that subjective values are objectively valuable, or that the concept of objective value would make no sense, whichever you prefer. Changes in my beliefs cannot change my fundamental values and my fundamental values are motivational in a way that beliefs are not, so I consider the fundamental values to be of prior importance.
RE: Stability of logic. Logic might not be stable, or it might change later; we don't have any way of knowing, the question isn't useful, and believing in logic makes me happy and gives me rewards.
If your moral values aren't objective, why would anyone else be beholden to them? And how could they be moral if they don't regulate others' behaviour?
Why would it change, absent our changing the axioms? Do you think it is part of the universe?
To the first question: Possibly because your moral values arose from a process that was almost exactly the same for other individuals, and as such it's reasonable to infer that their moral values might be quite similar rather than completely alien?
To the second: "And how could they be (blank?) if they don't regulate others' behaviour?", by which I mean, what do you mean by "moral"? What makes a value a "moral" value or not in this context?
I'm not sure why it should be necessary for a moral value to regulate behaviour across individuals in order to be valid.
Why describe them as subjective when they are intersubjective?
It would be necessary for them to be moral values and not something else, like aesthetic values. Because morality is largely to regulate interactions between individuals. That's its job. Aesthetics is there to make things beautiful, logic is there to work things out...
I don't want to get into a discussion of this, but if there's an essay-length-or-less explanation you can point to somewhere of why I ought to believe this, I'd be interested.
I don't see that "morality is largely to regulate interactions between individuals" is contentious. Did you have another job in mind for it?
Well, since you ask: identifying right actions.
But, as I say, I don't want to get into a discussion of this.
I certainly agree with you that if there exists some thing whose purpose is to regulate interactions between individuals, then it's important that that thing be compelling to all (or at least most) of the individuals whose interactions it is intended to regulate.
Is that an end in itself?
Well, the law compels those who aren't compelled by exhortation. But laws need justification.
Not for me, no.
Is regulating interactions between individuals an end in itself?
What does that concept even mean? Are you asking if there's a moral obligation to improve one's own understanding of morality?
The justification for laws can be a combination of pragmatism and the values of the majority.
I actually see that as counter-intuitive.
"Morality" is indeed being used to regulate individuals by some individuals or groups. When I think of morality, however, I think "greater total utility over multiple agents, whose value systems (utility functions) may vary". Morality seems largely about taking actions and making decisions which achieve greater utility.
I do this, except I only use my own utility and not other agents. For me, outside of empathy, I have no more reason to help other people achieve their values than I do to help the Babyeaters eat babies. The utility functions of others don't inherently connect to my motivational states, and grafting the values of others onto my decision calculus seems weird.
I think most people become utilitarians instead of egoists because they empathize with other people, while never seeing the fact that to the extent that this empathy moves them it is their own value and within their own utility function. They then build the abstract moral theory of utilitarianism to formalize their intuitions about this, but because they've overlooked the egoist intermediary step the model is slightly off and sometimes leads to conclusions which contradict egoist impulses or egoist conclusions.
Or they adopt utilitarianism, or some other non-subjective system, because they value having a moral system that can apply to, persuade, and justify itself to others. (Or in short: they value having a moral system.)
I share this view. When I appear to forfeit some utility in favor of someone else, it's because I'm actually maximizing my own utility by deriving some from the knowledge that I'm improving the utility of other agents.
Other agents' utility functions and values are not directly valued, at least not among humans. Some (most?) of us just do indirectly value improving the value and utility of other agents, either as an instrumental step or a terminal value. Because of this, I believe most people who have/profess the belief of an "innate goodness of humanity" are mind-projecting their own value-of-others'-utility.
Whether this is a true value actually shared by all humans is unknown to me. It is possible that those who appear not to have this value are simply broken in some temporary, environment-based manner. It's also possible that this is a purely environment-learned value that becomes "terminal" in the process of being trained into the brain's reward centers due to its instrumental value in many situations.
You are anthropomorphizing concepts. Morality is a human artifact, and artifacts have no more purpose than natural objects.
Morality is a useful tool to regulate interactions between individuals. There are efforts to make it a better tool for that purpose. That does not mean that morality should be used to regulate interactions.
Human artifacts are generally created to do jobs, eg hammers
Tool. Like i said.
Does that mean you have a better tool in mind, or that interactions don't need regulation?
If I put a hammer under a table to keep the table from wobbling, am I using a tool or not? If the hammer is the only object within range that is the right size for the table, and there is no task which requires a weighted lever, is the hammer intended to balance the table simply by virtue of being the best tool for the job?
Fit-for-task is a different quality than purpose. Hammers are useful tools to drive nails, but poor tools for determining what nails should be driven. There are many nails that should not be driven, despite the presence of hammers.
If you can't bang in nails with it, it isn't a hammer. What else you can do with it isn't relevant.
???
So we can judge things morally wrong, because we have a tool to do the job, but we shouldn't in many cases, because...? (And what kind of "shouldn't" is that?)
By that, the absence of nails makes the weighted lever not a hammer. I think that hammerness is intrinsic and not based on the presence of nails; likewise morality can exist when there is only one active moral agent.
The metaphor was that you could, in principle, drive nails literally everywhere you can see, including in your brain. Will you agree that one should not drive nails literally everywhere, but only in select locations, using the right type of nail for the right location? If you don't, this part of the conversation is not salvageable.
Because they're not written on a stone tablet handed down to Humanity from God the Holy Creator, or derived from some other verifiable, falsifiable and physical fact of the universe independent of humans? And because there are possible variations within the value systems, rather than them being perfectly uniform and identical across the entire species?
I have warning lights that there's an argument about definitions here.
That would make them not-objective. Subjective and intersubjective remain as options.
Then, again, why would anyone else be beholden to my values?
Because valuing others' subjective values, or acting as if one did, is often a winning strategy in game-theoretic terms.
If one posits that by working together we can achieve a utopia where each individual's values are maximized, and that to work together efficiently we need to at least act according to a model that would assign utility to others' values, would it not follow that it's in everyone's best interests for everyone to build and follow such models?
The free-loader problem is an obvious downside of the above simplification, but that and other issues don't seem to be part of the present discussion.
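The game-theoretic point can be illustrated with a minimal iterated prisoner's dilemma (a standard toy model, not anything from this thread): a strategy that acts as if it valued mutual benefit, such as tit-for-tat, does far better against itself than mutual defection does, while the free-loader gains only a small edge in a single pairing.

```python
# Minimal iterated prisoner's dilemma with the standard payoff matrix.
PAYOFFS = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def play(strat_a, strat_b, rounds=100):
    """Play two strategies against each other; return their total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        score_a += PAYOFFS[(a, b)]
        score_b += PAYOFFS[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

def tit_for_tat(their_history):
    """Cooperate first, then mirror the opponent's last move."""
    return "C" if not their_history else their_history[-1]

def always_defect(their_history):
    return "D"

print(play(tit_for_tat, tit_for_tat))      # (300, 300): mutual cooperation
print(play(always_defect, always_defect))  # (100, 100): mutual defection
print(play(tit_for_tat, always_defect))    # (99, 104): free-loader gains a little
```

In a population where cooperators mostly meet cooperators, the cooperative strategy wins overall, which is one concrete sense in which "acting as if one valued others' values" can be a winning strategy despite the free-loader problem.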
That doesn't make them beholden--obligated. They can opt not to play that game. They can opt not to value winning.
Only if they achieve satisfaction for individuals better than their behaving selfishly. A utopia that is better on average or in total need not be better for everyone individually.
Could you taboo "beholden" in that first sentence? I'm not sure the "feeling of moral duty born of guilt" I associate with the word "obligated" is quite what you have in mind.
Within context, you cannot opt to not value winning. If you wanted to "not win", and the preferred course of action is to "not win", this merely means that you had a hidden function that assigned greater utility to a lower apparent utility within the game.
In other words, you just didn't truly value what you thought you valued, but some other thing instead, and you end up having in fact won at your objective of not winning that sub-game within your overarching game of opting to play the game or not (the decision to opt to play the game or not is itself a separate higher-tier game, which you have won by deciding to not-win the lower-tier game).
A utopia which purports to maximize utility for each individual but fails to optimize for higher-tier or meta utilities and values is not truly maximizing utility, which violates the premises.
(sorry if I'm arguing a bit by definition with the utopia thing, but my premise was that the utopia brings each individual agent's utility to its maximum possible value if there exists a maximum for that agent's function)