At least, at a first naive view. Hence a search for reasons that might overcome that argument.
A wise man proportions his belief to the evidence.
David Hume
Roughly true, but downvoted for being basic (by LW standards) to the point of being an applause light. Good Rationality Quotes are ones we can learn from, not just agree with.
For starters, if she can prove she's friendly, then she can operate openly without causing nearly as much justified concern - which, in the early stages, will be helpful. Whatever her purposes are, if the restrictions of being friendly don't interfere as much as they help, that's a win.
If her current utility function is even a little bit different from Friendliness, and she expects she has the capacity to self-modify unto superintelligence, then I'd be very surprised if she actually modified her utility function to be closer to Friendliness; that would constitute a huge opportunity cost from her perspective. If she understands Friendliness well enough to know how to actually adjust closer to it, then she knows a whole lot about humans, probably well enough to give her much better options (persuasion, trickery, blackmail, hypnosis, etc.) than sacrificing a gigantic portion of her potential future utility.
New batch of ponies up. I had to come up with names for all of these because nobody named their ponies ahead of time; I will change names accordingly if people don't like their assigned pony names.
Calamus, a unicorn pony of Jaime Astorga.
Equilibria, an earth pony of badger.
Pa'li, a My Little Na'vi of Eliza.
Pianissimo, a unicorn pony of Emil.
Pomme, an earth pony of MixedNuts.
Query, a unicorn pony of ata.
Seafoam, a hippocampus pony of my friend Gwen.
Strange Loop, a unicorn pony of Douglas Hofstadter.
Tempus, a unicorn pony of Leonhart.
Awesome, thank you!!
Could I use that as my Facebook profile picture?
Downvoted for not taking a few moments to explain how to answer these questions with a bit of research.
I already checked InterNIC, and lesswrong.com is registered through eNom, a domain name registrar. I don't know how to find out who really owns it.
lesswrong.{com,net,org} are registered to Trike, and I seem to recall that they manage its hosting and technical administrative aspects as well.
"No. You have just fallen prey to the meta-Dunning Kruger effect, where you talk about how awesome you are for recognizing how bad you are."
— Horatio__Caine on reddit
I don't think people really understood what I was talking about in that thread. I would have to write a sequence about
- the difference between first-order and second-order logic
- why the Löwenheim–Skolem theorems show that you can talk about the integers or reals in higher-order logic but not in first-order logic
- why third-order logic isn't qualitatively different from second-order logic in the same way that second-order logic is qualitatively above first-order logic
- the generalization of Solomonoff induction to anthropic reasoning about agents resembling yourself who appear embedded in models of second-order theories, with more compact axiom sets being more probable a priori
- how that addresses some points Wei Dai has made about hypercomputation not being conceivable to agents using Solomonoff induction on computable Cartesian environments, as well as formalizing some of the questions we argue about in anthropic theory
- why seeing apparently infinite time and apparently continuous space suggests, to an agent using second-order anthropic induction, that we might be living within a model of axioms that imply infinity and continuity
- why believing that things like a first uncountable ordinal can contain reality-fluid in the same way as the wavefunction, or even be uniquely specified by second-order axioms that pin down a single model up to isomorphism the way that second-order axioms can pin down integerness and realness, is something we have rather less evidence for, on the surface of things, than we have evidence favoring the physical existability of models of infinity and continuity, or the mathematical sensibility of talking about the integers or real numbers.
I would like very very much to read that sequence. Might it be written at some point?
In short, I find this trope to be a fallacy. I'd expect an advanced civilisation to have a greater, not lesser, understanding of how intelligence works, its limitations, and failure modes in general.
Have you never looked at something someone does and asked yourself, "How can they be so stupid?"
It's not as though you literally cannot conceive of such limitations; just that you cannot empathize with them.
It's anthropomorphism to assume that it would occur to advanced aliens to try to understand us empathetically rather than causally/technically in the first place, though.
P(M) > 1
Typo?
If observing a dead cat causes the wavefunction to collapse such that the cat is dead, then P(D) = P(D) + P(M)(1 − P(D)), which requires P(M)(1 − P(D)) = 0. This is possible only if P(D) = 1 (or, trivially, P(M) = 0).
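Spelling out the algebra (reading D as "the cat is dead" and M as a mistaken dead-observation, which is how I understand the labels): for the equality to hold, the extra probability mass that collapse would add has to vanish. A quick sketch:

```python
# If collapse-on-observation added probability mass, we would need
#   P(D) = P(D) + P(M) * (1 - P(D)),
# which holds exactly when the extra term P(M) * (1 - P(D)) is zero.

def extra_term(p_d, p_m):
    """Probability mass that collapse would have to add."""
    return p_m * (1 - p_d)

assert extra_term(1.0, 0.7) == 0.0  # P(D) = 1: works for any P(M)
assert extra_term(0.3, 0.0) == 0.0  # P(M) = 0: the degenerate case
assert extra_term(0.3, 0.7) > 0.0   # otherwise the equality fails
```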
Sorry if I'm missing something, but are you implying that the Copenhagen interpretation entails that wavefunction collapse happens so as to retroactively make the cat dead if Schrödinger would have mistaken the cat for dead? Why would the model that forms in Schrödinger's brain after the fact control what did in fact happen, even under the Copenhagen interpretation? (I didn't think it was quite that silly.)
Arguments are arguments. She's welcome to search for opposite arguments.
A well-designed optimization agent probably isn't going to have a verbal argument processor separate from its general evidence processor. There's no rule that says she either has to accept or refute humans' arguments explicitly; as Professor Quirrell put it, "The import of an act lies not in what that act resembles on the surface, but in the states of mind which make that act more or less probable." If she knows the causal structure behind a human's argument, and she knows that it doesn't bottom out in the kind of epistemology that would be necessary to entangle it with the information it claims to provide, then she can just ignore it, and she'd be correct to do so. If she wants to kill all humans, then the bug is in her utility function, not in the part that fails to be fooled into changing her utility function by humans' clever arguments. That part is a feature.