Comment author: bokov 18 October 2013 05:22:22PM 0 points

How much do you want to bet on the conjunction of all those claims? (hint: I think at least one of them is provably untrue even according to current knowledge)

How much do you want to bet on the conjunction of yours?

Comment author: scav 18 October 2013 07:20:20PM 1 point

Just for exercise, let's estimate the probability of the conjunction of my claims.

claim A: I think the idea of a single 'self' in the brain is provably untrue according to currently understood neuroscience. I do honestly think so, therefore P(A) is as close to 1.0 as makes no difference. Whether I'm right is another matter.

claim B: I think a wildly speculative vague idea thrown into a discussion and then repeatedly disclaimed does little to clarify anything. P(B) approx 0.998 - I might change my mind before the day is out.

claim C: The thing I claim to think in claim B is in fact "usually" true. P(C) maybe 0.97 because I haven't really thought it through but I reckon a random sample of 20 instances of such would be unlikely to reveal 10 exceptions, defeating the "usually".

claim D: A running virtual machine is a physical process happening in a physical object. P(D) very close to 1, because I have no evidence of non-physical processes, and sticking close to the usual definition of a virtual machine, we definitely have never built and run a non-physical one.

claim E: You too are a physical process happening in a physical object. P(E) also close to 1. Never seen a non-physical person either, and if they exist, how do they type comments on lesswrong?

claim F: Nobody knows enough about the reality of consciousness to make legitimate claims that human minds are not information-processing physical processes. P(F) = 0.99. I'm pretty sure I'd have heard something if that problem had been so conclusively solved, but maybe they were disappeared by the CIA or it was announced last week and I've been busy or something.

P(A ∧ B ∧ C ∧ D ∧ E ∧ F) is approx 0.96.
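For completeness, a minimal sketch of that arithmetic, treating the claims I called "close to 1" as exactly 1.0:

```python
# Rough conjunction arithmetic for the claims above.
# A, D and E are treated as exactly 1.0 ("as close to 1 as makes no difference").
p = {"A": 1.0, "B": 0.998, "C": 0.97, "D": 1.0, "E": 1.0, "F": 0.99}

conjunction = 1.0
for claim, prob in p.items():
    conjunction *= prob

print(round(conjunction, 3))  # 0.958, i.e. approximately 0.96
```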

The amount of money I'd bet would depend on the odds on offer.

I fear I may be being rude by actually answering the question you put to me instead of engaging with your intended point, whatever it was. Sorry if so.

Comment author: Mitchell_Porter 18 October 2013 11:23:26AM 6 points

I don't believe any of the various purely computational definitions of personhood and survival, so just preserving the shapes of neurons, etc., doesn't mean much to me. My best bet is that the self is a single physical thing, a specific physical phenomenon, which forms at a definite moment in the life of the organism, persists through time even during unconsciousness, and ceases to exist when its biological matrix becomes inhospitable. For example, it might be an intricate topological vortex that forms in a (completely hypothetical) condensate of phonons and/or biophotons, somewhere in the cortex.

That is just a wild speculation, made for the sake of concreteness. But what is really unlikely is that I am just a virtual machine, in the sense of computer science - a state machine whose states are coarse-grainings of the actual microphysical states, and which can survive to run on another, physically distinct computer, so long as it reproduces the rough causal structure of the original.

Physically, what is a computer? Nuclei and electrons. And physically, what is a computer program? It is an extreme abstraction of what some of those nuclei and electrons are doing. Computers are designed so that these abstractions remain valid - so that the dynamics of the virtual machine will match the dynamics of the physical object, unless something physically disruptive occurs.

The physical object is the reality, the virtual machine is just a concept. But the information-centric theory of what minds are and what persons are, is that they are virtual machines - a reification of a conceptual construct. This is false to the robust reality of consciousness, especially, which is why I insist on a theory of the self that is physical and not just computational.

I don't want to belabor this point, but just want to make clear again why I dissent from the hundred protean ideas out there, about mind uploading, copies, conscious simulations, platonic programs, personal resurrection from digital brain-maps, and so on, in favor of speculations about a physical self within the brain. Such a self would surely have unconscious coprocessors, other brain regions that would be more like virtual machines, functional adjuncts to the conscious part, such as the immediate suppliers of the boundary conditions which show up in experience as sensory perceptions. But you can't regard the whole of the mind as nothing but virtual machines. Some part of it has to be objectively real.

What would be the implications of this "physical" theory of identity, for cryonics? I will answer as if the topological vortex theory is the correct one, and not just a placeholder speculation.

The idea is that you begin to exist when the vortex begins to exist, and you end when it ends. By this criterion, the odds look bad for the proposition that survival through cryonics is possible. I could invent a further line of speculation as to how the web of quantum entanglement underlying the vortex is not destroyed by the freezing process, but rather gets locked into the ground state of the frozen brain; and such a thing is certainly thinkable, but that's all, and it is equally thinkable that the condensate hosting the vortex depends for its existence on a steady expenditure of energy provided by cellular metabolism, and must therefore disintegrate when the cells freeze. From this perspective cryonics looks like an unlikely gamble, a stab in the dark. So an advocate would have to revert to the old argument that even if the probability of survival through cryonics is close to zero, the probability of survival through non-cryonics is even closer to zero.

What about the idea of surviving by preserving your information? The vortex version of this concept is, OK, during this life you are a quantum vortex in your brain, and that vortex must cease to exist in a cryonically preserved brain; but in the future we can create a new vortex in a new brain, or in some other appropriate physical medium, and then we can seed it with information from the old brain. And thereby, you can live again - or perhaps just approximate-you, if only some of the information got through.

To say anything concrete here requires even more speculation. One might say that the nature of such resurrection schemes would depend a great deal on the extent to which the details of a person depend on information in the vortex, or on information in the virtual coprocessors of the vortex. Is the chief locus of memory a virtual machine outside of and separate from the conscious part of the brain, coupled to consciousness so that memories just appear there as needed; or are there aspects of memory which are embedded in the vortex-self itself? To reproduce the latter would require, not just the recreation of memory banks adjoining the vortex-self, but the shaping and seeding of the inner dynamics of the vortex.

Either way, personally I find no appeal in the idea of "survival" via such construction of a future copy. I'm a particular "vortex" already; when that definitively sputters out, that's it for me. But I know many others feel differently, and such divergent attitudes might still exist, even if a vortex revolution in philosophy of mind replaced the program paradigm.

I somewhat regret the extremely speculative character of these remarks. They read as if I'm a vortex true believer. The point is to suggest what a future alternative to digital crypto-dualism might look like.

Comment author: scav 18 October 2013 04:05:05PM 0 points

My best bet is that the self is a single physical thing, a specific physical phenomenon, which forms at a definite moment in the life of the organism, persists through time even during unconsciousness, and ceases to exist when its biological matrix becomes inhospitable.

How much do you want to bet on the conjunction of all those claims? (hint: I think at least one of them is provably untrue even according to current knowledge)

That is just a wild speculation, made for the sake of concreteness.

I don't think it supplied the necessary amount of concreteness to be useful; this is usual for wild speculation. ;)

The physical object is the reality, the virtual machine is just a concept.

A running virtual machine is a physical process happening in a physical object. So are you.

This is false to the robust reality of consciousness

Well, nobody actually knows enough about the reality of consciousness to make that claim. It may be that it is incompatible with your intuitions about consciousness. Mine too, so I haven't any alternative claims to make in response.

Comment author: ChrisHallquist 10 October 2013 05:50:08PM *  2 points

Hmmm... the things you complain about are all me anticipating objections to this post, namely:

  1. Given Duverger's law and how bad the two-party system is, shouldn't the conclusion be not "accept the two-party system" but "get proportional representation"? (Response: proportional representation might be slightly better, but it seems implausible that it makes a huge difference.)
  2. Skepticism about whether many people really buy the Obama conspiracy theories / "I'm a libertarian Romney voter and I'm offended!"

But maybe re-word and turn into footnotes?

Edit: footnote-ized!

Comment author: scav 11 October 2013 11:42:45AM 6 points

I find the conclusion that the US would be better off with some form of proportional representation pretty compelling actually, and I don't think it's so implausible that it would make a positive difference.

The difference it makes in Europe (compared to the UK for example) seems to be that the smaller parties with agendas the median voter doesn't care much about still get a voice in parliament. It's worth it for the Greens or the Pirate party to campaign for another 1% of the vote, because they get another 1% of the seats, instead of nothing.

It should be a better marketplace of ideas; although a few major parties still keep most of the power, they have more incentive to accommodate or adopt new ideas. I suppose the presence of the minor parties increases the visibility of multiple policy axes, forcing the major parties to compete for the median voter along each axis.

Having said that, it still isn't very relevant to the thrust of the post, so the decision to footnote it was probably correct.

Comment author: scav 10 October 2013 10:44:07AM 1 point

Thanks for identifying Duverger's Law. I had never heard of it, but I had informally grasped its application in UK politics.

Comment author: Lumifer 07 October 2013 03:07:45PM 2 points

Please tell me that isn't the sort of thing you mean.

Your wish is my command! No, that isn't the sort of thing I meant.

I meant this quite literally and without a preference for the Magenta party or the Cyan party. Given two alternatives and the way they are presented in the popular media, it is often (but not always) possible to predict the preferences of the low-IQ crowd. The end.

That issue is different from political tribalism.

Having said that, I haven't run any reasonably controlled experiments so at this point it's just my opinion without data to support it.

Comment author: scav 08 October 2013 08:07:12AM 3 points

Depressing but plausible :(

I suspect "the way they are presented in the popular media" is crafted with that in mind.

Comment author: James_Miller 06 October 2013 05:47:15PM 2 points

No. If an organization contains sub-competent people, it should take this into account when designing traditions and protocols.

Comment author: scav 07 October 2013 11:39:36AM 7 points

Corollary: all organisations eventually contain sub-competent people. Design protocols accordingly.

Comment author: Lumifer 03 October 2013 03:04:51PM 0 points

While true, people who are too stupid to be allowed near sharp objects have preferences and make choices that are not quite random. It is often (but not always) the case that given several alternatives, one can reliably predict towards which one most stupid people will gravitate.

Comment author: scav 07 October 2013 11:36:07AM -1 points

Citation, or at least a clear example, needed. I can probably construct two policy alternatives, and predict which will be attractive to people who identify with a given political tribe. Then I suppose I get to call one of those options the "stupid" one based on my own value system.

Please tell me that isn't the sort of thing you mean.

I have met people with what I consider to be very irrational political views (in that they are little more than clusters of rote debating points never subjected to analysis). Outside of the well-worn habitual responses their politics would dictate they regurgitate, I have no idea how they would choose on an issue they had never encountered before.

Maybe stupidly (because they aren't in the habit of reflective thought), but maybe less so (because without a knee-jerk political reaction ready to hand, they might take a few seconds to think).

I will go so far as to agree that in too many cases, simple answers will be favoured over complex questions, and instant gratification will be favoured over longer-term advantage.

Comment author: pengvado 07 September 2013 07:22:29PM *  9 points

In fact, the question itself seems superficially similar to the halting problem, where "running off the rails" is the analogue for "halting"

If you want to draw an analogy to halting, then what that analogy actually says is: There are lots of programs that provably halt, and lots that provably don't halt, and lots that aren't provable either way. The impossibility of the halting problem is irrelevant, because we don't need a fully general classifier that works for every possible program. We only need to find a single program that provably has behavior X (for some well-chosen value of X).

If you're postulating that there are some possible friendly behaviors, and some possible programs with those behaviors, but that they're all in the unprovable category, then you're postulating that friendliness is dissimilar to the halting problem in that respect.
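To make the three categories concrete, here is a purely illustrative sketch (not from the original discussion): one program that provably halts, one that provably never halts, and one whose halting is equivalent to an open conjecture (Goldbach's), so nobody can currently prove it either way.

```python
# Three programs, one per category in the halting-problem analogy above.

def provably_halts():
    # Trivially provable: a bounded loop always terminates.
    total = 0
    for i in range(10):
        total += i
    return total

def provably_never_halts():
    # Trivially provable: no exit condition, so this never terminates.
    while True:
        pass

def halting_status_unknown():
    # Halts iff Goldbach's conjecture is false -- an open question,
    # so no one can currently prove whether this terminates.
    def is_prime(k):
        return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))
    n = 4
    while any(is_prime(a) and is_prime(n - a) for a in range(2, n)):
        n += 2
    return n  # the first Goldbach counterexample, if one exists
```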

Comment author: scav 09 September 2013 07:38:38PM 1 point

It's still probably premature to guess whether friendliness is provable when we don't have any idea what it is. My worry is not that it wouldn't be possible or provable, but that it might not be a meaningful term at all.

But I also suspect friendliness, if it does mean anything, is in general going to be so complex that "only [needing] to find a single program that provably has behaviour X" may be beyond us. There are lots of mathematical conjectures we can't prove, even without invoking the halting problem.

One terrible trap might be the temptation to make simplifications in the model to make the problem provable, but end up proving the wrong thing. Maybe you can prove that a set of friendliness criteria are stable under self-modification, but I don't see any way to prove those starting criteria don't have terrible unintended consequences. Those are contingent on too many real-world circumstances and unknown unknowns. How do you even model that?

Comment author: John_Maxwell_IV 08 September 2013 10:59:21PM *  0 points

Here are a couple of other proposals (which I haven't thought about very long) for consideration:

  • Have the AI create an internal object structure of all the concepts in the world, trying as best as it can to carve reality at its joints. Let the AI's programmers inspect this object structure, make modifications to it, then formulate a command for the AI in terms of the concepts it has discovered for itself.

  • Instead of developing a foolproof way for the AI to understand meaning, develop an OK way for the AI to understand meaning and pair it with a really good system for keeping a distribution over different meanings and asking clarifying questions.
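A minimal sketch of what the second proposal might look like (the function, the candidate meanings, and the threshold are all hypothetical, just to make the idea concrete): keep a normalised distribution over candidate interpretations of a command, act only when one interpretation clearly dominates, and otherwise ask a clarifying question.

```python
# Hypothetical sketch: keep a distribution over candidate meanings of a command
# and ask a clarifying question when no interpretation is confident enough.

CONFIDENCE_THRESHOLD = 0.9  # arbitrary choice for this sketch

def interpret(command, candidate_meanings):
    """candidate_meanings: dict mapping an interpretation to its probability."""
    # Normalise so the probabilities sum to 1.
    total = sum(candidate_meanings.values())
    posterior = {m: p / total for m, p in candidate_meanings.items()}

    best_meaning, best_prob = max(posterior.items(), key=lambda kv: kv[1])
    if best_prob >= CONFIDENCE_THRESHOLD:
        return ("act", best_meaning)
    # Otherwise, don't act: ask the operators which meaning they intended.
    others = sorted(m for m in posterior if m != best_meaning)
    return ("ask", f"Did you mean {best_meaning!r}, or one of {others}?")

# Example: an ambiguous request with two plausible readings.
print(interpret("make people happy",
                {"improve living conditions": 0.55,
                 "wirehead everyone": 0.45}))
```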

Comment author: scav 09 September 2013 07:15:21PM 0 points

That first one would be worth doing even if we didn't dare hand the AI the keys to go and make changes. To study a non-human-created ontology would be fascinating and maybe really useful.

Comment author: XiXiDu 05 September 2013 10:58:05AM -1 points

To be better able to respond to your comment, please let me know in what way you disagree with the following comparison between narrow AI and general AI:

Narrow artificial intelligence will be denoted NAI and general artificial intelligence GAI.

(1) Is it in principle capable of behaving in accordance with human intention to a sufficient degree?

NAI: True

GAI: True

(2) Under what circumstances does it fail to behave in accordance with human intention?

NAI: If it is broken, where broken stands for a wide range of failure modes such as incorrectly managing memory allocations.

GAI: In all cases in which it is not mathematically proven to be tasked with the protection of, and equipped with, a perfect encoding of all human values or a safe way to obtain such an encoding.

(3) What happens when it fails to behave in accordance with human intention?

NAI: It crashes, freezes or halts. It generally fails in a way that is harmful to its own functioning. If for example an autonomous car fails at driving autonomously it usually means that it will either go into safe-mode and halt or crash.

GAI: It works perfectly well. Superhumanly well. All its intended capabilities are intact except that it completely fails at working as intended in such a way as to destroy all human value in the universe. It will be able to improve itself and will be capable of obtaining a perfect encoding of human values. It will use those intended capabilities in order to deceive and overpower humans rather than doing what it was intended to do.

(4) What happens if it is bound to use a limited amount of resources, use a limited amount of space or run for a limited amount of time?

NAI: It will only ever do what it was programmed to do. As long as there is no fatal flaw harming its general functionality, it will work within the defined boundaries as intended.

GAI: It will never do what it was programmed to do and always remove or bypass its intended limitations in order to pursue unintended actions such as taking over the universe.


Please let me also know where you disagree with the following points:

(1) The abilities of systems are part of human preferences as humans intend to give systems certain capabilities and, as a prerequisite to build such systems, have to succeed at implementing their intentions.

(2) Error detection and prevention is such a capability.

(3) Something that is not better than humans at preventing errors is no existential risk.

(4) Without a dramatic increase in the capacity to detect and prevent errors it will be impossible to create something that is better than humans at preventing errors.

(5) A dramatic increase in the human capacity to detect and prevent errors is incompatible with the creation of something that constitutes an existential risk as a result of human error.

Comment author: scav 09 September 2013 01:47:33PM 2 points

First list:

1) Poorly defined terms "human intention" and "sufficient".

2) Possibly under any circumstances whatsoever, if it's anything like other non-trivial software, which always has some bugs.

3) Anything from "you may not notice" to "catastrophic failure resulting in deaths". The claim that when software fails to work as humans intend it will "generally fail in a way that is harmful to its own functioning" is unsupported. E.g. a spreadsheet works fine if the floating point math is off in the 20th bit of the mantissa. The answers will be wrong, but there is nothing about that that the spreadsheet could be expected to care about (see the sketch below).

4) Not necessarily. GAI may continue to try to do what it was programmed to do, and only unintentionally destroy a small city in the process :)
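A purely illustrative sketch of the spreadsheet point in (3): flipping the 20th mantissa bit of a 64-bit float introduces a relative error on the order of one part in a million; the computation carries on without complaint, the answer is just slightly wrong.

```python
import struct

def flip_mantissa_bit(x, bit_from_top):
    """Flip the given mantissa bit (counted from the most significant
    mantissa bit) of a 64-bit float. Purely illustrative."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    bits ^= 1 << (52 - bit_from_top)  # doubles have a 52-bit mantissa
    (y,) = struct.unpack("<d", struct.pack("<Q", bits))
    return y

total = sum(range(1, 1001))                  # 500500, the "correct" spreadsheet sum
perturbed = flip_mantissa_bit(float(total), 20)
print(total, perturbed)                      # slightly different values, no crash
```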

Second list:

1) Wrong. The abilities of sufficiently complex systems span a huge space of outcomes that humans haven't thought about yet, and so don't yet have preferences about. There is no way to know what their preferences would or should be for many, many of those outcomes.

2) Error as failure to perform the requested action may take precedence over error as failure to anticipate hypothetical objections from some humans to something they hadn't expected. For one thing, it is more clearly defined. We already know human-level intelligences act this way.

3) Asteroids and supervolcanoes are not better than humans at preventing errors. It is perfectly possible for something stupid to be able to kill you. Therefore something with greater cognitive and material resources than you, but still with the capacity to make mistakes, can certainly kill you. For example, a government.

4) It is already possible for a very fallible human to make something that is better than humans at detecting certain kinds of errors.

5) No. Unless by dramatic you mean "impossibly perfect, magical and universal".
