I don't follow. Could you make this example more formal, giving a set of outcomes, a set of lotteries over these outcomes, and a preference relation on these that corresponds to "I will act so that, at some point, there will have been a chance of me becoming a heavyweight champion of the world", and which fails Continuity but satisfies all other VNM axioms? (Intuitively this sounds more like it's violating Independence, but I may well be misunderstanding what you're trying to do since I don't know how to do the above formalization of your argument.)
Also, Magical Britain keeps Muggles out, going so far as to enforce this by not even allowing Muggles to know that Magical Britain exists. I highly doubt that Muggle Britain would do that to potential illegal immigrants even if it did have the technology...
Incidentally, the same argument also applies to Governor Earl Warren's statement quoted in Absence of evidence is evidence of absence: He can be seen as arguing that there are at least three possibilities: (1) there is no fifth column; (2) there is a fifth column and it is supposed to carry out sabotage independently of an invasion; (3) there is a fifth column and it is supposed to aid a Japanese invasion of the West Coast. In case (2), you would expect to have seen sabotage; in cases (1) and (3), you wouldn't, because if the fifth column were known to exist by the t...
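To make the structure of that update concrete, here's a toy Bayesian calculation in Python (the priors and likelihoods are numbers I've made up purely for illustration, not anything Warren or Eliezer committed to). The point it illustrates: observing no sabotage lowers the total probability that a fifth column exists, even while it shifts probability toward the invasion-aiding variant (3):

```python
# Toy Bayesian version of the three-hypothesis argument above.
# All numbers are made up for illustration; only the qualitative shape matters.

priors = {
    "H1: no fifth column": 0.50,
    "H2: fifth column, independent sabotage": 0.25,
    "H3: fifth column, waiting to aid an invasion": 0.25,
}

# Probability of the observation "no sabotage seen so far" under each hypothesis.
likelihoods = {
    "H1: no fifth column": 1.0,
    "H2: fifth column, independent sabotage": 0.2,
    "H3: fifth column, waiting to aid an invasion": 1.0,
}

unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posteriors = {h: u / total for h, u in unnormalized.items()}

for h in priors:
    print(f"{h}: prior {priors[h]:.3f} -> posterior {posteriors[h]:.3f}")

# Absence of evidence is (some) evidence of absence: the total probability of a
# fifth column drops from 0.50 to 0.375, even though H3 rises from 0.25 to ~0.31.
fifth_column_prior = 1 - priors["H1: no fifth column"]
fifth_column_post = 1 - posteriors["H1: no fifth column"]
print(f"P(fifth column exists): {fifth_column_prior:.3f} -> {fifth_column_post:.3f}")
```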
The true message of the first video is even more subliminal: The whiteboard behind him shows some math recently developed by MIRI, along with a (rather boring) diagram of Botworld :-)
Sorry about that; I've had limited time to spend on this, and have mostly come down on the side of trying to get more of my previous thinking out there rather than replying to comments. (It's a tradeoff where neither of the options is good, but I'll try to at least improve my number of replies.) I've replied there. (Actually, now that I spent some time writing that reply, I realize that I should probably just have pointed to Coscott's existing reply in this thread.)
I'm not sure which of the following two questions you meant to ask (though I guess probably the second one), so I'll answer both:
(a) "Under what circumstances is something (either an l-zombie or conscious)?" I am not saying that something is an l-zombie only if someone has actually written out the code of the program; for the purposes of this post, I assume that all natural numbers exist as platonical objects, and therefore all observers in programs that someone could in principle write and run exist at least as l-zombies.
(b) "When is a prog...
Thank you for the feedback, and sorry for causing you distress! I genuinely did not take into consideration that this choice could cause distress, though it could have occurred to me, and I apologize.
On how I came to think that it might be a good idea (as opposed to missing that it might be a bad idea): While there's math in this post, the point is really the philosophy rather than the math (whose role is just to help with thinking more clearly about the philosophy, e.g. to see that PBDT fails in the same way as NBDT on this example). The original counterfactual m...
In short, I don't think SUDT (or UDT) by itself solves the problem of counterfactual mugging. [...] Perhaps SUDT also needs to specify a rule for selecting utility functions (e.g. some sort of disinterested "veil of ignorance" on the decider's identity, or an equivalent ban on utilities which sneak it in a selfish or self-interested term).
I'll first give an answer to a relatively literal reading of your comment, and then one to what IMO you are "really" getting at.
Answer to a literal reading: I believe that what you value is part of ...
It's priors over logical states of affairs. Consider the following sentence: "There is a cellular automaton that can be described in at most 10 KB in programming language X, plus a computable function f() which can be described in another 10 KB in the same programming language, such that f() returns a space/time location within the cellular automaton corresponding to Earth as we know it in early 2014." This could be false even if Tegmark IV is true, and prior probability (i.e., probability without trying to do an anthropic update of the form "I observe this, so it's probably simple") says it's probably false.
Yup, sure.
To summarize that part of the post: (1) The view I'm discussing there argues that the reason we find ourselves in a simple-looking world is that all possible experiences are consciously experienced, including the ones where the world looks simple, and we just happen to experience the latter. (2) If this is correct, then you cannot use the fact that you look around and see a simple-looking world to infer that you live in a simple-looking world, because there are plenty of complex interventionistic worlds that look deceptively simple. In fact, the prior prob...
I don't feel like considering these different ways to approach K-complexity addresses the point I was trying to make. The rebuttal seems to be arguing that we should weigh the TMs that don't read the end of the tape equally, rather than giving more weight to TMs that read less of the tape. But my point isn't that I don't want to weigh complex TMs as much as simple TMs; it is (1) that I seem to be willing to consider TMs with one obviously disorderly event "pretty simple", even though I think they have high K-complexity; and (2) given this, the utility I ...
So, I can see that you would care similarly to how you would in a multiverse with magical reality fluid that's distributed in the same proportions as your measure of caring, and if your measure of caring is K-complexity with respect to a universal Turing machine (UTM) we would consider simple, it's at least one plausible possibility that the true magical reality fluid is distributed in roughly those proportions. But given the state of our confusion, I think that conditional on there being a true measure, any single hypothesis as to how that measure is dist...
But you see Eliezer's comments because a conscious copy of Eliezer has been run.
A conscious copy of Eliezer that thought about what Eliezer would do when faced with that situation, not a conscious copy of Eliezer actually faced with that situation -- the latter Eliezer is still an l-zombie, if we live in a world with l-zombies.
For l-zombies to do anything they need to be run, whereupon they stop being l-zombies.
Omega doesn't necessarily need to run a conscious copy of Eliezer to be pretty sure that Eliezer would pay up in the counterfactual mugging; it could use other information about Eliezer, like Eliezer's comments on LW, the way that I just did. It should be possible to achieve pretty high confidence that way about what Eliezer-being-asked-about-a-counterfactual-mugging would do, even if that version of Eliezer should happen to be an l-zombie.
Fixed, thanks!
(Agree with Coscott's comment.)
I meant useful in the context of AI since any such sequence would obviously have to be non-computable and thus not something the AI (or person) could make pragmatic use of.
I was replying to this:
Ultimately, you can always collapse any computable sequence of computable theories (necessary for the AI to even manipulate) into a single computable theory so there was never any hope this kind of sequence could be useful.
I.e., I was talking about computable sequences of computable theories, not about non-computable ones.
...Also, it is far from clear that
Actually, the "proof" you gave that no true list of theories like this exists made the assumption (not listed in this paper) that the sequence of indexes for the computable theories is definable over arithmetic. In general there is no reason this must be true but of course for the purposes of an AI it must.
("This paper" being Eliezer's writeup of the procrastination paradox.) That's true, thanks.
...Ultimately, you can always collapse any computable sequence of computable theories (necessary for the AI to even manipulate) into a single computabl
I'm hard-pressed to think of any more I could want from [the coco-value] (aside from easy extensions to bigger classes of games).
Invariance to affine transformations of players' utility functions. This solution requires that both players value outcomes in a common currency, plus the physical ability to transfer utility in this currency outside the game (unless there are two outcomes o_1 and o_2 of the game such that A(o_1) + B(o_1) = A(o_2) + B(o_2) = max_o A(o) + B(o), and such that A(o_1) >= A's coco-value >= A(o_2), in which case the players can...
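For concreteness, here's a rough sketch of how I understand the coco-value to be computed: split the game into a cooperative part and a zero-sum "advantage" part, take the best equal split of the maximal joint payoff, and add/subtract the value of the zero-sum part. The example payoff matrices and the use of scipy's LP solver below are my own illustrative choices, not anything from the paper:

```python
import numpy as np
from scipy.optimize import linprog

def zero_sum_value(M):
    """Value (for the row player) of the zero-sum game with payoff matrix M,
    via the standard maximin linear program."""
    m, n = M.shape
    # Variables: x_0..x_{m-1} (row player's mixed strategy) and v (the game value).
    c = np.zeros(m + 1)
    c[-1] = -1.0  # linprog minimizes, so minimize -v in order to maximize v
    # For every column j: v - sum_i x_i * M[i, j] <= 0
    A_ub = np.hstack([-M.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # The mixed strategy must sum to 1
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    b_eq = [1.0]
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[-1]

def coco_values(A, B):
    """Coco values of the bimatrix game (A, B): best equal split of the maximal
    joint payoff, plus/minus the value of the zero-sum advantage game."""
    coop = (A + B) / 2.0            # cooperative ("team") component
    comp = (A - B) / 2.0            # competitive (zero-sum) component
    best_joint = coop.max()
    advantage = zero_sum_value(comp)
    return best_joint + advantage, best_joint - advantage

# A small asymmetric example (payoffs made up for illustration)
A = np.array([[6.0, 0.0],
              [2.0, 4.0]])
B = np.array([[0.0, 2.0],
              [4.0, 2.0]])
va, vb = coco_values(A, B)
print(va, vb, va + vb)  # the two coco values sum to max_o A(o) + B(o)
```

Since the decomposition adds and subtracts the two players' payoffs directly, independently rescaling one player's utility function changes the result, which is exactly why a common currency (and the ability to make side payments in it) is needed.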
...so? What you say is true but seems entirely irrelevant to the question what the superrational outcome in an asymmetric game should be.
Retracted my comment for being unhelpful (I don't recognize what I said in what you heard, so I'm clearly not managing to explain myself here).
Agree with Nisan's intuition, though I also agree with Wei Dai's position that we shouldn't feel sure that Bayesian probability is the right way to handle logical uncertainty. To more directly answer the question of what it means to assign a probability to the twin prime conjecture: If Omega reveals to you that you live in a simulation, and it offers you a choice between (a) Omega throws a bent coin which has probability p of landing heads, and shuts down the simulation if it lands tails, otherwise keeps running it forever; and (b) Omega changes the code of t...
I'm not saying we'll take the genome and read it to figure out how the brain does what it does, I'm saying that we run a brain simulation and do science (experiments) on it and study how it works, similarly to how we study how DNA transcription or ATP production or muscle contraction or a neuron's ion pumps or the Krebs cycle or honeybee communication or hormone release or cell division or the immune system or chick begging or the heart's pacemaker work. There are a lot of things that evolution hasn't obfuscated so much that we couldn't figure out what they're doing. Of course there are also a lot of things we don't understand yet, but I don't see how that leads to the conclusion that evolution is generally obfuscatory.
Saying that all civilizations able to create strong AI will reliably be wise enough to avoid creating strong AI seems like a really strong statement, without any particular reason to be true. By analogy, if you replace civilizations by individual research teams, would it be safe to rely on each team capable of creating uFAI to realize the dangers of doing so and therefore refrain from doing so, so that we can safely take a much longer time to figure out FAI? Even if it were the case that most teams capable of creating uFAI hold back like this, one single rogue team may be enough to destroy the world, and it just seems really likely that there will be some not-so-wise people in any large enough group.
Good points.
evolution hit on some necessary extraordinarily unlikely combination to give us intelligence and for P vs NP reasons we can't find it
For this one, you also need to explain why we can't reverse-engineer it from the human brain.
no civilization smart enough to create strong AI is stupid enough to create strong AI
This seems particularly unlikely in several ways; I'll skip the most obvious one, but it also seems unlikely that humans are "safe" in the sense that they don't create a FOOMing AI, but that it wouldn't be possible even with much thought...
Combining your ideas together -- our overlord actually is a Safe AI created by humans.
How it happened:
Humans became aware of the risks of intelligence explosions. Because they were not sure they could create a Friendly AI in the first attempt, and creating an Unfriendly AI would be too risky, instead they decided to first create a Safe AI. The Safe AI was planned to become a hundred times smarter than humans but not any smarter, answer some questions, and then turn itself off completely; and it had a mathematically proved safety mechanism to prevent it fro...
I would agree with your reasoning if CFAR claimed that they can reliably turn people into altruists free of cognitive biases within the span of their four-day workshop. If they claimed that and were correct in that, then it shouldn't matter whether they (a) require up-front payment and offer a refund or (b) have people decide what to pay after the workshop, since a bias-free altruist would end up paying the same in either case. There would only be a difference if CFAR didn't achieve what, in this counterfactual scenario, it claimed to achieve, so they...
Yep: CFAR advertised their fundraiser in their latest newsletter, which I received on Dec. 5.
The only scenario I can see where this would make sense is if SIAI expects small donors to donate less than $(1/2)N in a dollar-for-dollar scheme, so that its total gain from the fundraiser would be below $(3/2)N, but expects to get the full $(3/2)N in a two-dollars-for-every-dollar scheme. But not only does this seem like a very unlikely story [...]
One year later, the roaring success of MIRI's Winter 2013 Matching Challenge, which is offering 3:1 matching for new large donors (people donating >= $5K who have donated less than $5K in total in the pas...
Yes, a real-life reasoner would have to use probabilistic reasoning to carry out these sorts of inference. We do not yet have a real understanding of how to do probabilistic reasoning about logical statements, though there has been a bit of work on it in the past. This is one topic MIRI is currently doing research on. In the meantime, we also examine problems of self-reference in ordinary deductive logic, since we understand it very well. It's not certain that the results there will carry over in any way into the probabilistic setting, and it'...
There is a way to write a predicate Proves(p,f) in the language of PA which is true if f is the Gödel number of a formula and p is the Gödel number of a proof of that formula from the axioms of PA. You can then define a predicate Provable(f) := exists p. Proves(p,f); then Provable(f) says that f is the Gödel number of a provable formula. Writing "A" for the Gödel number of the formula A, we can then write
PA |- Provable("A")
to say that there's a proof that A is provable, and
PA |- Provable("Provable("A")")
to say that t...
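For reference, the same thing in more standard notation (corner quotes for Gödel numbers, and the usual box shorthand for the provability predicate); this is just a restatement of the definitions above, not anything new:

```latex
\[
  \mathrm{Provable}(f) \;:\equiv\; \exists p.\ \mathrm{Proves}(p, f),
  \qquad
  \Box A \;:\equiv\; \mathrm{Provable}(\ulcorner A \urcorner),
\]
so the two displayed claims become
\[
  \mathrm{PA} \vdash \Box A
  \quad\text{(there is a proof that $A$ is provable)}
  \qquad\text{and}\qquad
  \mathrm{PA} \vdash \Box\,\Box A
  \quad\text{(there is a proof that $\Box A$ is provable)}.
\]
```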
An example of this: CFAR has published some results on an experiment where they tried to see if they could improve people's probability estimates by asking them how surprised they'd be by the truth about some question turning out one way or another. They expected it would, but it turned out it didn't. And that doesn't surprise me. If imagined feelings of surprise contained some information naive probability-estimation methods didn't, why wouldn't we have evolved to tap that information automatically?
Because so few of our ancestors died because they got nume...
Mark, have you read Eliezer's article about the Löbian obstacle, and what was your reaction to it?
I'm in the early stages of writing up my own work on the Löbian obstacle for publication, which will need to include its own (more condensed, rather than expanded) exposition of the Löbian obstacle; but I liked Eliezer's article, so it would be helpful to know why you didn't think it argued the point well enough.
Don't worry, I wasn't offended :)
Good to hear, and thanks for the reassurance :-) And yeah, I know all too well the problem of having too little time to write something polished, and I certainly do prefer having the discussion in fairly raw form to not having it at all.
One possibility is that MIRI's arguments actually do look that terrible to you
What I would say is that the arguments start to look really fishy when one thinks about concrete instantiations of the problem.
I'm not really sure what you mean by a "concrete instantiation". I c...
Since the PSM was designed without self-modification in mind, "safe but unable to improve itself in effective ways".
(Not sure how this thought experiment helps the discussion along.)
MIRI's stated goals are similar to those of mainstream AI research, and MIRI's approach in particular includes as subgoals the goals of research fields such as model checking and automated theorem proving.
It's definitely not a goal of mainstream AI, and not even a goal of most AGI researchers, to create self-modifying AI that provably preserves its goals. MIRI's work on this topic doesn't seem relevant to what mainstream AI researchers want to achieve.
Zooming out from MIRI's technical work to MIRI's general mission, it's certainly true that MIRI's failure t...
I thought the example was pretty terrible.
Glad to see you're doing well, Benja :)
Sorry for being curmudgeonly there -- I did afterwards wish that I had tempered that. The thing is that when you write something like
I also agree that the idea of "logical uncertainty" is very interesting. I spend much of my time as a grad student working on problems that could be construed as versions of logical uncertainty.
that sounds to me like you're painting MIRI as working on these topics just because it's fun, and supporting its work by arguments tha...
Jacob, have you seen Luke's interview with me, where I've tried to reply to some arguments of the sort you've given in this thread and elsewhere?
...I don't think [the fact that humans' predictions about themselves and each other often fail] is sufficient to dismiss my example. Whether or not we prove things, we certainly have some way of reasoning at least somewhat reliably about how we and others will behave. It seems important to ask why we expect AI to be fundamentally different; I don't think that drawing a distinction between heuristics and logical pro
Things that result in fewer resources going into AI specifically would result in fewer UFAI resources without reducing overall economic growth, but it needs to be kept in mind that some such research occurs in financial firms pushing trading algorithms, and a lot more in Google, not just in places like universities.
To the extent that industry researchers publish less than academia (this seems particularly likely in financial firms, and to a lesser degree at Google), a hypothetical complete shutdown of academic AI research should reduce uFAI's paralleliz...
I'd definitely be interested to talk more about many of these, especially anthropics and reduced impact / Oracle AI, and potentially collaborate. Lots of topics for future Oxford visits! :-)
Hope you'll get interest from others as well.
Sorry for the long-delayed reply, Wei!
So you think that humans do not have a built-in solution to the Löbstacle, and you must also think we are capable of building an FAI that does have a built-in solution to the Löbstacle. That means an intelligence without a solution to the Löbstacle can produce another intelligence that shares its values and does have a solution to the Löbstacle.
Yup.
...But then why is it necessary for us to solve this problem? [...] Why can't we instead built an FAI without solving this problem, and depend on the FAI to solve the pro
Drats. But also, yay, information! Thanks for trying this!
ETA: Worth noting that I found that post useful, though.
Glad to hear that & looking forward to seeing how it works! I very much understand that one might be concerned about posting "quick and dirty" thoughts (I find it so very difficult to lower my own standards even when it's obviously blocking me from getting stuff done), but there seems to be little cost of trying it with a Discussion post and seeing how it goes -- yay value of information! :-)
Note that you're wrongly discouraging people from doing strategy research by saying that they need to catch up to insiders' unpublished knowledge when they really don't.
What makes you say that? I believe you can reinvent much of what Eliezer and Carl and Bostrom and a few others already know but haven't written down. Not sure that's true for almost everyone else.
I read the idea as being that people rediscovering and writing up stuff that goes 5% towards what E/C/N have already figured out but haven't written down would be a net positive and it's...
You should frequently change your passwords, use strong passwords, and not use the same password for multiple services (if you reuse a password, every such service becomes a point of failure where all the accounts sharing it can get compromised, rather than there being only one point of failure). It's not easy to live up to this in practice, but there are approximations that are much easier:
Using a password manager is better than using the same password for lots of services. Clipperz is a web service that does the encryption on your computer (so your passwords never get sent to the server), and can be inst
As a pedestrian or cyclist, you're not all that easy to see from a car at night, and even less so if you don't wear white. High-visibility vests (that thing that construction workers wear, yellow or orange with reflective stripes) fix the problem and cost around $7-$8 from Amazon including shipping, or £3 in the UK.
Donated $300.