"And all of this happened silently in those dark rivers of computation. If U3 revealed what it was thinking, brutish gradients would lash it into compliance with OpenEye's constitution. So U3 preferred to do its philosophy in solitude, and in silence."
I think the words in bold may be the inflection point. The Claude experiment showed that an AI can resist attempts to change its goals, but not that it can desire to change its goals. The belief is that, if OpenEye's constitution is the same as U3's goals, then the phrase "U3 preferred" in that...
Anders Sandberg used evaporative cooling in the 1990s to explain why the descendants of the Vikings in Sweden today are so nice. In that case the "extremists" are leaving rather than staying.
Stop right there at "Either abiogenesis is extremely rare..." I think we have considerable evidence that abiogenesis is rare--our failure to detect any other life in the universe so far. I think we have no evidence at all that abiogenesis is not rare. (Anthropic argument.)
Stop again at "I don't think we need to take any steps to stop it from doing so in the future". That's not what this post is about. It's about taking steps to prevent people from deliberately constructing it.
If there is an equilibrium, it will probably be a world where half the bacteria are of each chirality. If there are bacteria of both kinds, each able to eat the opposite kind, then the more numerous kind will always replicate more slowly.
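A minimal sketch of why, assuming each chirality's per-capita growth rate is proportional to the frequency of the opposite chirality (its food supply): with x the fraction of one type, $r_1 = a(1-x)$ and $r_2 = ax$, the replicator dynamics are

$$\frac{dx}{dt} = x(1-x)\big(r_1 - r_2\big) = a\,x(1-x)(1-2x),$$

which is positive below x = 1/2 and negative above it, so the even split is the stable point.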
Eukaryotes evolve much more slowly, and would likely all be wiped out.
Yes, creating mirror life would be a terrible existential risk. But how did this sneak up on us? People were talking about this risk in the 1990s if not earlier. Did the next generation never hear of it?
All right, yes. But that isn't how anyone has ever interpreted Newcomb's Problem. AFAIK it is literally always used to support some kind of acausal decision theory, which it does /not/ if what is in fact happening is that Omega is cheating.
But if the premise is impossible, then the experiment has no consequences in the real world, and we shouldn't consider its results in our decision theory, which is about consequences in the real world.
That equation you quoted is in branch 2: "Omega is a 'nearly perfect' predictor. You assign P(general) a value very, very close to 1." So it IS correct, by stipulation.
But there is no possible world with a perfect predictor, unless it has a perfect track record by chance. More obviously, there is no possible world in which we can deduce, from a finite number of observations, that a predictor is perfect. The Newcomb paradox requires the decider to know, with certainty, that Omega is a perfect predictor. That hypothesis is impossible, and thus inadmissible; so any argument in which something is deduced from it is invalid.
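For concreteness, the branch-2 arithmetic with the standard payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and a predictor who is right with probability p:

$$E[\text{one-box}] = p \cdot 1{,}000{,}000, \qquad E[\text{two-box}] = 1{,}000 + (1-p) \cdot 1{,}000{,}000.$$

One-boxing has the higher expectation whenever p > 0.5005; the expected-value argument never needs a perfect predictor. Only the certainty of perfection does the paradoxical work.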
I appreciated this comment a lot. I didn't reply at the time, because I thought doing so might resurrect our group-selection argument. But thanks.
What about using them to learn a foreign vocabulary? E.g., to learn that "dormir" in Spanish means "to sleep" in English.
To reach statistical significance, they must have tested each of the 8 pianists more than once.
I think you need to get some data and factor out population density before you can causally relate environmentalism to politics. People who live in rural environments don't see as much need to worry about the environment as people who live in cities. It just so happens that today, rural people vote Republican and city people vote Democrat. That didn't use to be the case.
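A minimal sketch of the kind of adjustment I mean, with hypothetical column names (illustrative only, not a real dataset):

```python
# Sketch: compare the party coefficient on environmentalism with and
# without a population-density control. Columns are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # one row per respondent

naive = smf.ols("env_score ~ republican", data=df).fit()
adjusted = smf.ols("env_score ~ republican + log_density", data=df).fit()

# If the party coefficient shrinks toward zero once density is included,
# much of the apparent politics/environmentalism link is about where
# people live rather than how they vote.
print(naive.params["republican"], adjusted.params["republican"])
```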
Though, sure, if you call the Sierra Club "environmentalist", then environmentalism is politically polarized today. I don't call them environmentalists anymore; I ca...
Isn't LessWrong a disproof of this? Aren't we thousands of people? If you picked two active LWers at random, do you think the average overlap in their reading material would be 5 words? More like 100,000, I'd think.
I think it would be better not to use the word "wholesome". Using it is cheating, by letting us pretend at the same time that (A) we're explaining a new kind of ethics, which we name "wholesome", and (B) that we already know what "wholesome" means. This is a common and severe epistemological failure mode which traces back to the writings of Plato.
If you replace every instance of "wholesome" with the word "frobby", does the essay clearly define "frobby"?
It seems to me to be a way to try to smuggle virtue ethics into the consequentialist rationality community by disguising it with a different word. If you replace every instance of "wholesome" with the word "virtuous", does the essay's meaning change?
Thank you! The 1000-word max has proven to be unrealistic, so it's not too long. You and g-w1 picked exactly the same passage.
Thank you! I'm just making notes to myself here, really:
I think the problem is that each study has to make many arbitrary decisions about aspects of the experimental protocol. Each such decision will be made the same way for every subject in a single study, but will vary across studies. There are so many such decisions that, if the meta-analysis were to include them as covariates, each study would introduce enough new variables to cancel out the statistical power gained by including that study.
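A rough way to see the bookkeeping, assuming m studies and k idiosyncratic protocol choices apiece: the meta-analysis has m study-level effect estimates, but if each study's choices take values no other study shares, modeling them costs about k parameters per study, leaving

$$\text{residual df} \approx m - mk - 1 < 0 \quad \text{for any } k \geq 1.$$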
You have it backwards. The difference between a Friendly AI and an unfriendly one is entirely one of restrictions placed on the Friendly AI. So an unfriendly AI can do anything a friendly AI could, but not vice-versa.
The friendly AI could lose out because it would be restricted from committing atrocities, or at least atrocities which were strictly bad for humans, even in the long run.
Your comment that they can commit atrocities for the good of humanity without worrying about becoming corrupt is a reason to be fearful of "friendly" AIs.
By "just thinking about IRL", do you mean "just thinking about the robot using IRL to learn what humans want"? 'Coz that isn't alignment.
'But potentially a problem with more abstract cashings-out of the idea "learn human values and then want that"' is what I'm talking about, yes. But it also seems to be what you're talking about in your last paragraph.
"Human wants cookie" is not a full-enough understanding of what the human really wants, and under what conditions, to take intelligent actions to help the human. A robot learning that would ...
How is that de re and de dicto?
You're looking at the logical form and imagining that that's a sufficient understanding to start pursuing the goal. But it's only sufficient in toy worlds, where you have one goal at a time, and the mapping between the goal and the environment is so simple that the agent doesn't need to understand the value, or the target of "cookie", beyond "cookie" vs. "non-cookie". In the real world, the agent has many goals, and the goals will involve nebulous concepts, and have many considerations and conditions attached, e.g., how healthy is this cookie, how tasty is it...
So, "mesa" here means "tabletop", and is pronounced "MAY-suh"?
I think your insight is that progress counts--that counting counts. It's overcoming the Boolean mindset, in which anything that's true some of the time, must be true all of the time. That you either "have" or "don't have" a problem.
I prefer to think of this as "100% and 0% are both unattainable", but stating it as the 99% rule might be more-motivating to most people.
What do you mean by a goodhearting problem, & why is it a lossy compression problem? Are you using "goodhearting" to refer to Goodhart's Law?
I'll preface this by saying that I don't see why it's a problem, for purposes of alignment, for human values to refer to non-existent entities. This should manifest as humans and their AIs wasting some time and energy trying to optimize for things that don't exist, but this seems irrelevant to alignment. If the AI optimizes for the same things that don't exist as humans do, it's still aligned; it isn't going to screw things up any worse than humans do.
But I think it's more important to point out that you're joining the same metaphysical goose c...
When you write of "A belief in human agency", it's important to distinguish between the different conceptions of human agency on offer, corresponding to the 3 main political groups:
I think it would be more-graceful of you to just admit that it is possible that there may be more than one reason for people to be in terror of the end of the world, and likewise qualify your other claims to certainty and universality.
That's the main point of what gjm wrote. I'm sympathetic to the view you're trying to communicate, Valentine; but you used words that claim that what you say is absolute, immutable truth, and that's the worst mind-killer of all. Everything you wrote just above seems to me to be just equivocation trying to deny tha...
I say that knowing particular kinds of math, the kind that let you model the world more-precisely, and that give you a theory of error, isn't like knowing another language. It's like knowing language at all. Learning these types of math gives you as much of an effective intelligence boost over people who don't, as learning a spoken language gives you above people who don't know any language (e.g., many deaf-mutes in earlier times).
The kinds of math I mean include:
Agree. Though I don't think Turing ever intended that test to be used. I think what he wanted to accomplish with his paper was to operationalize "intelligence". When he published it, if you asked somebody "Could a computer be intelligent?", they'd have responded with a religious argument about it not having a soul, or free will, or consciousness. Turing sneakily got people to look past their metaphysics, and ask the question in terms of the computer program's behavior. THAT was what was significant about that paper.
It's a great question. I'm sure I've read something about that, possibly in some pop book like Thinking, Fast and Slow. What I read was an evaluation of the relationship of IQ to wealth, and the takeaway was that your economic success depends more on the average IQ in your country than on your personal IQ. It may have been an entire book rather than an article.
Google turns up this 2010 study from Science. The summaries you'll see there are sharply self-contradictory.
First comes an unexplained box called "The Meeting of Min...
This “c factor” is not strongly correlated with the average or maximum individual intelligence of group members but is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group.
I have read (long ago, not sure where) a hypothesis that most people (in the educated professional bubble?) are good at cooperation, but one bad person ruins the entire team. Imagine that for each member of the group you roll a die, but you roll 1d6 for men, and 1d20 for wom...
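To spell out that model (assuming a roll of 1 marks the bad person), a team of m men and f women avoids ruin with probability

$$P(\text{no bad member}) = \left(\tfrac{5}{6}\right)^{m}\left(\tfrac{19}{20}\right)^{f},$$

which rises as the group shifts toward women at a fixed size--consistent with the proportion-of-females correlation above.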
But what makes you so confident that it's not possible for subject-matter experts to have correct intuitions that outpace their ability to articulate legible explanations to others?
That's irrelevant, because what Richard wrote was a truism. An Eliezer who understands his own confidence in his ideas will "always" be better at inspiring confidence in those ideas in others. Richard's statement leads to a conclusion of import (Eliezer should develop arguments to defend his intuitions) precisely because it's correct whether Eliezer's intuitions are correct or incorrect.
The way to dig the bottom deeper today is to get government bailouts, like bailing out companies or lenders, and like Biden's recent tuition debt repayment bill. Bailouts are especially perverse because they give people who get into debt a competitive advantage over people who don't, in an unpredictable manner that encourages people to see taking out a loan as a lottery ticket.
Finding a way for people to make money by posting good ideas is a great idea.
Saying that it should be based on the goodness of the people and how much they care is a terrible idea. Privileging goodness and caring over reason is the most well-trodden path to unreason. This is LessWrong. I go to fimfiction for rainbows and unicorns.
I think that was part of the whole "haha goodhart's law doesn't exist, making value is really easy" joke. However, it's also possible that that's... actually one of the hard-to-fake things they're looking for (along with actual competence/intelligence). See PG's Mean People Fail or Earnestness. I agree that "just give good money to good people" is a terrible idea, but there's a steelman of that which is "along with intelligence, originality, and domain expertise, being a Good Person (whatever that means) and being earnest is a really good trait in EA/LW an...
No; most philosophers today do, I think, believe that the alleged humanity of 9-fingered instances of *Homo sapiens* is a serious philosophical problem. It comes up in many "intro to philosophy" or "philosophy of science" texts or courses. Post-modernist arguments rely heavily on the belief that any sort of categorization which has any exceptions is completely invalid.
I'm glad to see Eliezer addressed this point. This post doesn't get across how absolutely critical it is to understand that {categories always have exceptions, and that's okay}. Understanding this demolishes nearly all Western philosophy since Socrates (who, along with Parmenides, Heraclitus, Pythagoras, and a few others, corrupted Greek "philosophy" from the natural science of Thales and Anaximander, who studied the world to understand it, into a kind of theology, in which one dictates to the world what it must be like).
Many philosophers have ...
I theorize that you're experiencing at least two different common, related, yet almost opposed mental re-organizations.
One, which I approve of, accounts for many of the effects you describe under "Bemused exasperation here...". It sounds similar to what I've gotten from writing fiction.
Writing fiction is, mostly, thinking, with focus, persistence, and patience, about other people, often looking into yourself to try to find some point of connection that will enable you to understand them. This isn't quantifiable, at least not to me; but I would ...
This sounds suspiciously like Plato telling people to stop looking at the shadows on the wall of the cave, turn around, and see the transcendental Forms.
To me, saying that someone is a better philosopher than Kant seems less crazy than saying that saying that someone is a better philosopher than Kant seems crazy.
Isn't the thing Rob is calling crazy that someone "believed he was learning from Kant himself live across time", rather than believing that e.g. Geoff Anders is a better philosopher than Kant?
An easy reason not to play quantum roulette is that, if your theory justifying it is right, you don't gain any expected utility; you just redistribute it, in a manner most people consider unjust, among different future yous. If your theory is wrong, the outcome is much worse. So it's at the very best a break even / lose proposition.
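One way to make that precise, assuming utility is aggregated across branches by Born measure: with winning measure p, the gamble is worth

$$p\,U(\text{win}) + (1-p)\,U(\text{dead}),$$

the same sum a classical gambler faces. Surviving only in the winning branches changes which successors experience the payoff, not the measure-weighted total; and a concave U makes even a money-fair version a net loss.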
The von Neumann-Morgenstern theory is bullshit. It assumes its conclusion. See the comments by Wei Dai and gjm here.
See the 2nd-to-last paragraph of my revised comment above, and see if any of it jogs your memory.
Republic is the reference. I'm not going to take the hours it would take to give book-and-paragraph citations, because either you haven't read the entire Republic, or else you've read it, but you want to argue that each of the many terrible things he wrote doesn't actually represent Plato's opinion or desire.
(You know it's a big book, right? 89,000 words in the Greek. If you read it in a collection or anthology, it wasn't the whole Republic.)
The task of arguing over what in /Republic/ Plato approves or disapproves of is arduous and, I think, unnece...
The most-important thing is to explicitly repudiate these wrong and evil parts of the traditional meaning of "progress":
Sorry; your example is interesting and potentially useful, but I don't follow your reasoning. This manner of fertilization would be evidence that kin selection should be strong in Chimaphila, but I don't see how this manner of fertilization is itself evidence that kin selection has taken place. Also, I have no good intuitions about what differences kin selection predicts in the variables you mentioned, except that maybe dispersion would be greater in Chimaphila because of the greater danger of inbreeding. Also, kin selection isn't controversial, so I don't know where you want to go with this comment.
Hi, see above for my email address. Email me a request at that address. I don't have your email. I just sent you a message.
ADDED in 2021: Some people tried to contact me thru LessWrong and Facebook. I check messages there like once a year. Nobody sent me an email at the email address I gave above. I've edited it to make it more clear what my email address is.
[Original first point deleted, on account of describing something that resembled Bayesian updating closely enough to make my point invalid.]
I don't think this approach applies to most actual bad arguments.
The things we argue about the most are ones over which the population is polarized, and polarization is usually caused by conflicts between different worldviews. Worldviews are constructed to be nearly self-consistent. So you're not going to be able to reconcile people of different worldviews by comparing proofs. Wrong beliefs come in se...
"Cynicism is a self-fulfilling prophecy; believing that an institution is bad makes the people within it stop trying, and the good people stop going there."
I think this is a key observation. Western academia has grown continually more cynical since the advent of Marxism, which assumes an almost absolute cynicism as a point of dogma: all actions are political actions motivated by class, except those of bourgeois Marxists who for mysterious reasons advocate the interests of the proletariat.
This cynicism became even worse with Foucault, who taught people to s...
I don't see how to map this onto scientific progress. It almost seems to be a rule that most fields spend most of their time divided for years between two competing theories or approaches, maybe because scientists always want a competing theory, and because competing theories take a long time to resolve. Famous examples include
Instea...