Comment author: gRR 18 February 2015 06:08:09AM 4 points [-]

I am confused about how Philosopher's stone could help with reviving Hermione. Does QQ mean to permanently transfigure her dead body into a living Hermione? But then, would it not mean that Harry could do it now, albeit temporarily? And, he wouldn't even need a body. He could then just temporarily transfigure any object into a living Hermione. Also, now that I think of it, he could transfigure himself a Feynman and a couple of Einsteins...

Comment author: Houshalter 10 May 2014 09:49:24AM *  0 points [-]

It may very well be possible to build such an AI. However there are several issues with it:

  • The AI can be adapted for other, less restricted, domains if knowledge of how it works spreads. There would be a large incentive to do so, since such an oracle would be of only limited utility.

  • The AI adds code that will evolve into another AI into its output. This is remotely possible, depending on what kind of problems you have it working on. If you were using it to design more efficient algorithms, in some cases an AI of some form might be the optimal solution.

    Even if you 100% trust the AI to provide the optimal output, you can't trust that the optimal output to the problem you've specified is what you actually want.

  • The AI could self-modify incorrectly and result in unfriendly AI. In order to be provably friendly/restricted, it would have to be 100% certain of any modification. That's a very tall order, especially in AI, where everything has to be approximate or probabilistic.

  • It might not be as safe as you think it is. The AI runs some code and gets an unexpected result, possibly because of a bug in the environment itself. Look up how difficult it is to sandbox untrusted code and you will get some appreciation for how a superintelligence could figure a way out of its box.

    But it can't do anything with any exploits it finds, because it is restricted to hard-coded axioms? Well, maybe. If it's using probabilities and some form of machine learning, it might be able to learn that "executing this code gives me this result" and then learn to take advantage of that. I don't believe that a system can work only in formal proofs. However, I might be completely wrong about this one; it's just a thought.

Comment author: gRR 10 May 2014 10:53:46AM *  0 points [-]

The AI can be adapted for other, less restricted, domains

That the ideas from a safe AI can be used to build an unsafe AI is a general argument against working on (or even talking about) any kind of AI whatsoever.

The AI adds code that will evolve into another AI into it's output

The output is to contain only proofs of theorems. Specifically, a proof (or refutation) of the theorem in the input. The state of the system is to be reset after each run so as to not accumulate information.

The AI could self-modify incorrectly and result in unfriendly AI

Any correct or incorrect self-modification is still restricted to the math domain, and so cannot result in an unsafe AI.

bug in the environment itself

Guarding against software bugs is easy in this case. You design an abstract virtual machine environment for the AI, then design the software that implements this environment, then formally prove that the software is correct. Guarding against errors caused by cosmic rays is also easy. You estimate the probability of such an error, and then add redundancy until the probability is so low that it won't happen until the end of the universe.
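The redundancy estimate in the last step can be sketched as a toy calculation. It assumes fully independent faults (a strong assumption), and the function name and all numbers below are illustrative, not taken from the comment:

```python
def replicas_needed(p_single, ops_total, target):
    """Smallest number k of independent replicas such that the total
    probability of an undetected error -- ops_total operations, each
    failing only if all k replicas fail simultaneously (probability
    p_single**k) -- drops below `target`."""
    k = 1
    while ops_total * p_single ** k > target:
        k += 1
    return k

# E.g., with a per-operation fault probability of 0.5 per replica,
# 10^6 operations, and a 10^-9 total error budget:
print(replicas_needed(0.5, 1e6, 1e-9))  # 50
```

In reality faults are not fully independent (a single event can affect several replicas at once), so a real design would need correlated-fault modeling; the point is only that redundancy drives the failure probability down geometrically.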

Look up how difficult it is to sandbox untrusted code

Sandboxing untrusted code is easy. The difficult thing is sandboxing it while making it think that it runs normally. This is irrelevant here.

I don't believe that a system can work only in formal proofs

It is quite probable that a "pure math Oracle" system cannot work. The point was, it can be made safe to try.

Comment author: Mitchell_Porter 08 May 2014 02:33:26PM 9 points [-]

I look at the abstracts of new papers on the quant-ph archive every day. This is a type of paper which, based on the abstract, I would almost certainly not bother to look at. Namely, it proposes to explain where quantum theory comes from, in terms which obviously seem like they will not be enough. I read the promise in the title and abstract and think, "Where is the uncertainty principle going to come from - the minimum combined uncertainty for complementary observables? How will the use of complex numbers arise?"

I did scroll through the paper and notice lots of rigorous-looking probability formalism. I was particularly waiting to see how complex numbers entered the picture. They show up a little after equation 47, when two real-valued functions are combined into one complex-valued function... I also noticed that the authors were talking about "Fisher information". This was unsurprising: there are other people who want to "derive physics from Fisher information", so clearly this paper is part of that dubious trend.

At a guess - without having worked through the paper - I would say that the authors' main sin will turn out to be that they do not do anything at all like deriving quantum theory - that instead their framework is something much, much looser and less specific - but that they then give their article a title implying they can derive the whole of QM from their loose framework. Not only do they thereby falsely create the impression that they have answered a basic question about reality, but their fake answer is a bland one, thereby dulling further interest, and it is presented with an appearance of rigor, making it look authoritative. I would also expect that, when they get to the stage of trying to derive actual QM, they have to compound their major sin with the minor one of handwaving in support of a preordained conclusion - that they will have to do something like join their two real-valued functions together in a way which is really motivated only by their knowing what QM looks like, but for which they will have to invent some independent excuse, since they are supposedly deriving QM.

All the foregoing may be regarded as a type of prediction. They are the dodgy misrepresentations I would expect to find happening in the paper, if I actually sat down and scrutinized it in detail. I really don't want to do that since time is precious, but I also didn't want to let this post go unremarked. Is it too much to hope that some coalition of Less Wrong readers, knowing about both probability and physics, will have the time and the will to look more closely, and identify specific leaps of logic, and just what is actually going on in the paper? It may also be worth looking for existing criticisms of the "physics from Fisher information" school of thought - maybe someone out there has already written the ideal explanation of its shortcomings.

Comment author: gRR 08 May 2014 09:16:30PM 2 points [-]

Well, I liked the paper, but I'm not knowledgeable enough to judge its true merits. It deals heavily with Bayesian-related questions, somewhat in Jaynes's style, so I thought it could be relevant to this forum.

At least one of the authors is a well-known theoretical physicist with an awe-inspiring Hirsch index (h-index), so presumably the paper would not be trivially worthless. I think it merits a more careful read.

Comment author: gRR 26 July 2013 07:58:23PM 1 point [-]

Regarding the "he's here... he is the end of the world" prophecy, in view of the recent events, it seems like it can become literally true without it being a bad thing. After all, it does not specify a time frame. So Harry may become immortal and then tear apart the very stars in heaven, some time during a long career.

Comment author: DanArmak 26 May 2012 07:12:23PM 0 points [-]

Naturally, F is monotonically increasing in R and decreasing in Ropp

You're treating resources as one single kind, whereas really there are many kinds, with possible trades between teams. Here you're ignoring a factor that might actually be crucial to encouraging cooperation (I'm not saying I've shown this formally :-)).

Assume there are just two teams

But my point was exactly that there would be many teams who could form many different alliances. Assuming only two is unrealistic and just ignores what I was saying. I don't even care much if given two teams the correct choice is to cooperate, because I set very low probability on there being exactly two teams and no other independent players being able to contribute anything (money, people, etc) to one of the teams.

This is my position

You still haven't given good evidence for holding this position regarding the relation between the different Uxxx utilities. Except for the fact CEV is not really specified, so it could be built so that that would be true. But equally it could be built so that that would be false. There's no point in arguing over which possibility "CEV" really refers to (although if everyone agreed on something that would clear up a lot of debates); the important questions are what do we want a FAI to do if we build one, and what we anticipate others to tell their FAIs to do.

Comment author: gRR 26 May 2012 08:25:46PM 0 points [-]

You're treating resources as one single kind, where really there are many kinds with possible trades between teams

I think this is reasonably realistic. Let R signify money. Then R can buy other necessary resources.

But my point was exactly that there would be many teams who could form many different alliances. Assuming only two is unrealistic and just ignores what I was saying.

We can model N teams by letting them play two-player games in succession. For example, any two teams with nearly matched resources would cooperate with each other, producing a single combined team, etc... This may be an interesting problem to solve, analytically or by computer modeling.
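A minimal sketch of this succession-of-pairwise-games idea, under the assumed rule that two teams cooperate (merge) whenever their resources are within some parity threshold. The merge rule and the `threshold` value are placeholders for whatever a real analysis would derive:

```python
def simulate_mergers(resources, threshold=0.8):
    """Scan teams in descending resource order and merge the first
    adjacent pair whose resource ratio (smaller/larger) is at least
    `threshold`; repeat until no such pair remains.  Resources must
    be positive numbers."""
    teams = sorted(resources, reverse=True)
    merged = True
    while merged and len(teams) > 1:
        merged = False
        teams.sort(reverse=True)
        for i in range(len(teams) - 1):
            a, b = teams[i], teams[i + 1]
            if b / a >= threshold:         # nearly matched -> cooperate
                teams[i:i + 2] = [a + b]   # combine into a single team
                merged = True
                break
    return teams

print(simulate_mergers([1, 1, 1, 1]))  # [4] -- a single global alliance
print(simulate_mergers([10, 1]))       # [10, 1] -- no near-match, no merger
```

Under this toy rule, a field of evenly matched teams snowballs into one alliance, while a dominant player stays alone, which at least matches the intuition in the comment.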

You still haven't given good evidence for holding this position regarding the relation between the different Uxxx utilities.

You're right. Initially, I thought that the actual values of Uxxx-s will not be important for the decision, as long as their relative preference order is as stated. But this turned out to be incorrect. There are regions of cooperation and defection.

Comment author: [deleted] 26 May 2012 04:11:44PM 1 point [-]

These all have the property that you only need so much of them.

All of those resources are fungible and can be exchanged for time. There might be no limit to the amount of time people desire, even very enlightened posthuman people.

Comment author: gRR 26 May 2012 04:53:13PM *  0 points [-]

I don't think you can get an everywhere-positive exchange rate. There are diminishing returns, and a threshold after which exchanging more resources won't get you any more time. There's only 30 hours in a day, after all :)

Comment author: DanArmak 26 May 2012 08:40:05AM 1 point [-]

I have trouble thinking of a resource that would make even one person's CEV, let alone 80%, want to kill people, in order to just have more of it.

shrug Space (land or whatever is being used). Mass and energy. Natural resources. Computing power. Finite-supply money and luxuries if such exist.

Or are you making an assumption that CEVs are automatically more altruistic or nice than non-extrapolated human volitions are?

This is easy, and does not need any special hardcoding. If someone is so insane that their beliefs are totally closed and impossible to move by knowledge and intelligence, then their CEV is undefined. Thus, they are automatically excluded.

Well it does need hardcoding: you need to tell the CEV to exclude people whose EVs are too similar to their current values despite learning contrary facts. Or even all those whose belief-updating process differs too much from perfect Bayesian (and how much is too much?) This is something you'd hardcode in, because you could also write ("hardcode") a CEV that does include them, allowing them to keep the EVs close to their current values.

Not that I'm opposed to this decision (if you must have CEV at all).

We are talking about people building FAI-s. Surely they are intelligent enough to notice the symmetry between themselves.

There's a symmetry, but "first person to complete AI wins, everyone 'defects'" is also a symmetrical situation. Single-iteration PD is symmetrical, but everyone defects. Mere symmetry is not sufficient for TDT-style "decide for everyone", you need similarity that includes similarly valuing the same outcomes. Here everyone values the outcome "have the AI obey ME!", which is not the same.

If you say that logic and rationality makes you decide to 'defect' (=try to build FAI on your own, bomb everyone else), then logic and rationality would make everyone decide to defect. So everybody bombs everybody else, no FAI gets built, everybody loses.

Or someone is stronger than everyone else, wins the bombing contest, and builds the only AI. Or someone succeeds in building an AI in secret, avoiding being bombed. Or there's a player or alliance that's strong enough to deter bombing due to the threat of retaliation, and so completes their AI which doesn't care about everyone else much. There are many possible and plausible outcomes besides "everybody loses".

Instead you can 'cooperate' (=precommit to build FAI<everybody's CEV> and to bomb everyone that did not make the same precommitment). This gets us a single global alliance.

Or while the alliance is still being built, a second alliance or very strong player bombs them to get the military advantages of a first strike. Again, there are other possible outcomes besides what you suggest.

Comment author: gRR 26 May 2012 04:04:05PM *  0 points [-]

Space (land or whatever is being used). Mass and energy. Natural resources. Computing power. Finite-supply money and luxuries if such exist. Or are you making an assumption that CEVs are automatically more altruistic or nice than non-extrapolated human volitions are?

These all have the property that you only need so much of them. If there is a sufficient amount for everybody, then there is no point in killing in order to get more. I expect CEV-s to not be greedy just for the sake of greed. It's people's CEV-s we're talking about, not paperclip maximizers'.

Well it does need hardcoding: you need to tell the CEV to exclude people whose EVs are too similar to their current values despite learning contrary facts. Or even all those whose belief-updating process differs too much from perfect Bayesian (and how much is too much?) This is something you'd hardcode in, because you could also write ("hardcode") a CEV that does include them, allowing them to keep the EVs close to their current values.

Hmm, we are starting to argue about exact details of extrapolation process...

There are many possible and plausible outcomes besides "everybody loses".

Let's formalize the problem. Let F(R, Ropp) be the probability of a team successfully building a FAI first, given R resources and opposition with Ropp resources. Let Uself, Ueverybody, and Uother be the rewards for being first to build FAI<self>, FAI<everybody>, and FAI<other>, respectively. Naturally, F is monotonically increasing in R and decreasing in Ropp, and Uother < Ueverybody < Uself.

Assume there are just two teams, with resources R1 and R2, and each can perform one of two actions: "cooperate" or "defect". Let's compute the expected utilities for the first team:

We cooperate, opponent team cooperates: EU("CC") = Ueverybody * F(R1+R2, 0)
We cooperate, opponent team defects: EU("CD") = Ueverybody * F(R1, R2) + Uother * F(R2, R1)
We defect, opponent team cooperates: EU("DC") = Uself * F(R1, R2) + Ueverybody * F(R2, R1)
We defect, opponent team defects: EU("DD") = Uself * F(R1, R2) + Uother * F(R2, R1)

Then, EU("CD") < EU("DD") < EU("DC"), which gives us most of the structure of a PD problem. The rest, however, depends on the finer details. Let A = F(R1,R2)/F(R1+R2,0) and B = F(R2,R1)/F(R1+R2,0). Then:

  1. If Ueverybody <= Uself*A + Uother*B, then EU("CC") < EU("DD"), and there is no point in cooperating. This is your position: Ueverybody is much less than Uself, or Uother is not much less than Ueverybody, and/or your team has far more resources than the other.

  2. If Uself*A + Uother*B < Ueverybody < Uself*A/(1-B), this is a true Prisoner's dilemma.

  3. If Ueverybody >= Uself*A/(1-B), then EU("CC") >= EU("DC"), and "cooperate" is the obviously correct decision. This is my position: Ueverybody is not much less than Uself, and/or the teams are more evenly matched.
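The three regimes above can be checked mechanically. The sketch below follows the comment's formulas, but the specific form of F and the example numbers are illustrative assumptions (any F increasing in R and decreasing in Ropp would do):

```python
def F(R, R_opp):
    """Illustrative success probability: increasing in own
    resources R, decreasing in opposition resources R_opp."""
    return R / (R + R_opp + 1.0)

def expected_utilities(R1, R2, U_self, U_everybody, U_other):
    """Team 1's expected utility in each of the four outcomes."""
    return {
        "CC": U_everybody * F(R1 + R2, 0),
        "CD": U_everybody * F(R1, R2) + U_other * F(R2, R1),
        "DC": U_self * F(R1, R2) + U_everybody * F(R2, R1),
        "DD": U_self * F(R1, R2) + U_other * F(R2, R1),
    }

def classify(R1, R2, U_self, U_everybody, U_other):
    """Which of the three regimes (points 1-3 above) applies."""
    A = F(R1, R2) / F(R1 + R2, 0)
    B = F(R2, R1) / F(R1 + R2, 0)
    if U_everybody <= U_self * A + U_other * B:
        return "defect dominates"           # point 1
    if U_everybody < U_self * A / (1 - B):
        return "true Prisoner's Dilemma"    # point 2
    return "cooperate dominates"            # point 3

# Evenly matched teams, cooperation payoff close to the selfish one:
print(classify(1.0, 1.0, 10.0, 9.0, 1.0))  # true Prisoner's Dilemma
# Same teams, but a much smaller payoff for FAI<everybody>:
print(classify(1.0, 1.0, 10.0, 2.0, 1.0))  # defect dominates
```

One caveat of this particular F: it gives A + B = 1 for any resource split, so point 3 (cooperate dominates) is unreachable with it; reaching that regime requires an F where pooling resources helps superadditively, i.e. F(R1+R2, 0) > F(R1, R2) + F(R2, R1).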

Comment author: DanArmak 25 May 2012 08:50:24PM 1 point [-]

The resources are not scarce, yet the CEV-s want to kill? Why?

Sorry for the confusion. Let's taboo "scarce" and start from scratch.

I'm talking about a scenario where - to simplify only slightly from the real world - there exist some finite (even if growing) resources such that almost everyone, no matter how much they already have, wants more of them. A coalition of 80% of the population forms, which would like to kill the other 20% in order to get their resources. Would the AI prevent this, although there is no consensus against the killing?

If you still want to ask whether the resource is "scarce", please specify what that means exactly. Maybe any finite and highly desirable resource, with returns diminishing weakly or not at all, can be considered "scarce".

It would do so only if everybody's CEV-s agree that updating these people's beliefs is a good thing.

People that would still have false factual beliefs no matter how much evidence and how much intelligence they have? They would be incurably insane. Yes, I would agree to ignore their volition, no matter how many they are.

As I said - this is fine by me insofar as I expect the CEV not to choose to ignore me. (Which means it's not fine through the Rawlsian veil of ignorance, but I don't care and presumably neither do you.)

The question of definition - who is to be included in the CEV? who is considered sane? - becomes of paramount importance. Since it is not itself decided by the CEV, it is presumably hardcoded into the AI design (or evolves within that design as the AI self-modifies, but that's very dangerous without formal proofs that it won't evolve to include the "wrong" people.) The simplest way to hardcode it is to directly specify the people to be included, but you prefer testing on qualifications.

However this is realized, it would give people even more incentive to influence or stop your AI building process or to start their own to compete, since they would be afraid of not being included in the CEV used by your AI.

The PD reasoning to cooperate only applies in case of iterated PD

Err. What about arguments of Douglas Hofstadter and EY, and decision theories like TDT?

TDT applies where agents are "similar enough". I doubt I am similar enough to e.g. the people you labelled insane.

Which arguments of Hofstadter and Yudkowsky do you mean?

Cooperating in this game would mean there is exactly one global research alliance.

Why? What prevents several competing alliances (or single players) from forming, competing for the cooperation of the smaller players?

Comment author: gRR 26 May 2012 03:15:22AM 0 points [-]

A coalition of 80% of the population forms, which would like to kill the other 20% in order to get their resources

I have trouble thinking of a resource that would make even one person's CEV, let alone 80%, want to kill people, in order to just have more of it.

The question of definition - who is to be included in the CEV? who is considered sane?

This is easy, and does not need any special hardcoding. If someone is so insane that their beliefs are totally closed and impossible to move by knowledge and intelligence, then their CEV is undefined. Thus, they are automatically excluded.

TDT applies where agents are "similar enough". I doubt I am similar enough to e.g. the people you labelled insane.

We are talking about people building FAI-s. Surely they are intelligent enough to notice the symmetry between themselves. If you say that logic and rationality makes you decide to 'defect' (=try to build FAI on your own, bomb everyone else), then logic and rationality would make everyone decide to defect. So everybody bombs everybody else, no FAI gets built, everybody loses. Instead you can 'cooperate' (=precommit to build FAI<everybody's CEV> and to bomb everyone that did not make the same precommitment). This gets us a single global alliance.

Comment author: DanArmak 24 May 2012 07:59:15PM 0 points [-]

If the resources are so scarce that dividing them is so important that even CEV-s agree on the necessity of killing, then again, I prefer humans to decide who gets them.

The resources are not scarce at all. But, there's no consensus of CEVs. The CEVs of 80% want to kill the rest. The CEVs of 20% obviously don't want to be killed. Because there's no consensus, your version of CEV would not interfere, and the 80% would be free to kill the 20%.

No. CEV does not update anyone's beliefs. It is calculated by extrapolating values in the presence of full knowledge and sufficient intelligence.

I meant that the AI that implements your version of CEV would forcibly update people's actual beliefs to match what it CEV-extrapolated for them. Sorry for the confusion.

As I said elsewhere, if a person's beliefs are THAT incompatible with truth, I'm ok with ignoring their volition. Note, that their CEV is undefined in this case. But I don't believe there exist such people (excluding totally insane).

A case could be made that many millions of religious "true believers" have un-updatable 0/1 probabilities. And so on.

Your solution is to not give them a voice in the CEV at all. Which is great for the rest of us - our CEV will include some presumably reduced term for their welfare, but they don't get to vote on things. This is something I would certainly support in a FAI (regardless of CEV), just as I would support using CEV<few people + me>, or even CEV<few people like me in crucial respects>, in preference to CEV<everyone>.

The only difference between us then is that I estimate there to be many such people. If you believed there were many such people, would you modify your solution, or is ignoring them however many they are fine by you?

PD reasoning says you should cooperate (assuming cooperation is precommittable).

As I said before, this reasoning is inapplicable, because this situation is nothing like a PD.

  1. The PD reasoning to cooperate only applies in case of iterated PD, whereas creating a singleton AI is a single game.
  2. Unlike PD, the payoffs are different between players, and players are not sure of each other's payoffs in each scenario. (E.g., minor/weak players are more likely to cooperate than big ones that are more likely to succeed if they defect.)
  3. The game is not instantaneous, so players can change their strategy based on how other players play. When they do so they can transfer value gained by themselves or by other players (e.g. join research alliance 1, learn its research secrets, then defect and sell the secrets to alliance 2).
  4. It is possible to form alliances, which gain by "defecting" as a group. In PD, players cannot discuss alliances or trade other values to form them before choosing how to play.
  5. There are other games going on between players, so they already have knowledge and opinions and prejudices about each other, and desires to cooperate with certain players and not others. Certain alliances will form naturally, others won't.

adoption of total transparency for everybody of all governmental and military matters.

This counts as very weak evidence because it proves it's at least possible to achieve this, yes. (If all players very intensively inspect all other players to make sure a secret project isn't being hidden anywhere - they'd have to recruit a big chunk of the workforce just to watch over all the rest.)

But the probability of this happening in the real world, between all players, as they scramble to be the first to build an apocalyptic new weapon, is so small it's not even worth discussion time. (Unless you disagree, of course.) I'm not convinced by this that it's an easier problem to solve than that of building AGI or FAI or CEV.

Comment author: gRR 24 May 2012 09:51:55PM 1 point [-]

The resources are not scarce at all. But, there's no consensus of CEVs. The CEVs of 80% want to kill the rest.

The resources are not scarce, yet the CEV-s want to kill? Why?

I meant that the AI that implements your version of CEV would forcibly update people's actual beliefs to match what it CEV-extrapolated for them.

It would do so only if everybody's CEV-s agree that updating these people's beliefs is a good thing.

If you believed there were many such people, would you modify your solution, or is ignoring them however many they are fine by you?

People that would still have false factual beliefs no matter how much evidence and how much intelligence they have? They would be incurably insane. Yes, I would agree to ignore their volition, no matter how many they are.

The PD reasoning to cooperate only applies in case of iterated PD

Err. What about arguments of Douglas Hofstadter and EY, and decision theories like TDT?

Unlike PD, the payoffs are different between players, and players are not sure of each other's payoffs in each scenario

This doesn't really matter for a broad range of possible payoff matrices.

join research alliance 1, learn its research secrets, then defect and sell the secrets to alliance 2

Cooperating in this game would mean there is exactly one global research alliance. A cooperating move is a precommitment to abide by its rules. Enforcing such precommitment is a separate problem. Let's assume it's solved.

I'm not convinced by this that it's an easier problem to solve than that of building AGI or FAI or CEV.

Maybe you're right. But IMHO it's a less interesting problem :)

Comment author: DanArmak 24 May 2012 06:39:05PM *  0 points [-]

The majority may wish to kill the minority for wrong reasons - based on false beliefs or insufficient intelligence. In which case their CEV-s won't endorse it, and the FAI will interfere

So you're OK with the FAI not interfering if they want to kill them for the "right" reasons? Such as "if we kill them, we will benefit by dividing their resources among ourselves"?

But you said it would only do things that are approved by a strong human consensus.

Strong consensus of their CEV-s.

So you're saying your version of CEV will forcibly update everyone's beliefs and values to be "factual" and disallow people to believe in anything not supported by appropriate Bayesian evidence? Even if it has to modify those people by force, the result is unlike the original in many respects that they and many other people value and see as identity-forming, etc.? And it will do this not because it's backed by a strong consensus of actual desires, but because post-modification there will be a strong consensus of people happy that the modification was made?

If your answer is "yes, it will do that", then I would not call your AI a Friendly one at all.

Extrapolated volition is based on objective truth, by definition.

My understanding of the CEV doc differs from yours. It's not a precise or complete spec, and it looks like both readings can be justified.

The doc doesn't (on my reading) say that the extrapolated volition can totally conform to objective truth. The EV is based on an extrapolation of our existing volition, not of objective truth itself. One of the ways it extrapolates is by adding facts the original person was not aware of. But that doesn't mean it removes all non-truth or all beliefs that "aren't even wrong" from the original volition. If the original person effectively assigns 0 or 1 "non-updateable probability" to some belief, or honestly doesn't believe in objective reality, or believes in "subjective truth" of some kind, CEV is not necessarily going to "cure" them of it - especially not by force.

But as long as we're discussing your vision of CEV, I can only repeat what I said above - if it's going to modify people by force like this, I think it's unFriendly and if it were up to me, would not launch such an AI.

I meant four independent clauses: each of the agents does not endorse CEV<other>, but endorses CEV<both>.

Understood. But I don't see how this partial ordering changes what I had described.

Let's say I'm A1 and you're A2. We would both prefer a mutual CEV than a CEV of the other only. But each of us would prefer even more a CEV of himself only. So each of us might try to bomb the other first if he expected to get away without retaliation. That there exists a possible compromise that is better than total defeat doesn't mean total victory wouldn't be much better than any compromise.

How can a state or military precommit to not having a supersecret project to develop a private AGI?

That's a separate problem. I think it is easier to solve than extrapolating volition or building AI.

If you think so you must have evidence relating to how to actually solve this problem. Otherwise they'd both look equally mysterious. So, what's your idea?

Comment author: gRR 24 May 2012 07:35:19PM 0 points [-]

So you're OK with the FAI not interfering if they want to kill them for the "right" reasons?

I wouldn't like it. But if the alternative is, for example, to have FAI directly enforce the values of the minority on the majority (or vice versa) - the values that would make them kill in order to satisfy/prevent - then I prefer FAI not interfering.

"if we kill them, we will benefit by dividing their resources among ourselves"

If the resources are so scarce that dividing them is so important that even CEV-s agree on the necessity of killing, then again, I prefer humans to decide who gets them.

So you're saying your version of CEV will forcibly update everyone's beliefs

No. CEV does not update anyone's beliefs. It is calculated by extrapolating values in the presence of full knowledge and sufficient intelligence.

If the original person effectively assigns 0 or 1 "non-updateable probability" to some belief, or honestly doesn't believe in objective reality, or believes in "subjective truth" of some kind, CEV is not necessarily going to "cure" them of it - especially not by force.

As I said elsewhere, if a person's beliefs are THAT incompatible with truth, I'm ok with ignoring their volition. Note, that their CEV is undefined in this case. But I don't believe there exist such people (excluding totally insane).

That there exists a possible compromise that is better than total defeat doesn't mean total victory wouldn't be much better than any compromise.

But the total loss would be correspondingly worse. PD reasoning says you should cooperate (assuming cooperation is precommittable).

If you think so you must have evidence relating to how to actually solve this problem. Otherwise they'd both look equally mysterious. So, what's your idea?

Off the top of my head, adoption of total transparency for everybody of all governmental and military matters.
