The true point of no return must indeed be much later than we currently believe it to be.
Who is "we", and what do "we" believe about the point of no return? Surely you're not talking about ordinary doctors pronouncing medical death, because that's just irrelevant (pronouncements of medical death are assertions about what current medicine can repair, not about information-theoretic death). But I don't know what other consensus you could be referring to.
I think your answer is in The Domain of Your Utility Function. That post isn't specifically about cryonics, but is about how you can care about possible futures in which you will be dead. If you understand both of the perspectives therein and are still confused, then I can elaborate.
Why would a self-improving agent not improve its own decision-theory to reach an optimum without human intervention, given a "comfortable" utility function in the first place?
A self-improving agent does improve its own decision theory, but it uses its current decision theory to predict which self-modifications would be improvements, and broken decision theories can be wrong about that. Not all starting points converge to the same answer.
That strategy is optimal if and only if the probability of success was reasonably high after all. On the other hand, if you put an unconditional extortioner in an environment mostly populated by decision theories that refuse extortion, then the extortioner will start a war and end up on the losing side.
Jbay didn't specify that the drug has to leave people able to answer questions about their own emotional state. And in fact there are some people who can't do that, even though they're otherwise functional.
There are many such operators, and different ones give different answers when presented with the same agent. Only a human utility function distinguishes the right way of interpreting a human mind as having a utility function from all of the wrong ways of interpreting a human mind as having a utility function. So you need to get a bunch of Friendliness Theory right before you can bootstrap.
I donated $40,000.00
I think it's unlikely that pengvado is lying -- but if anyone from CFAR is reading this and can confirm this donation, I think that would be a Good Thing.
Holy crap, dude. Thanks for helping to save the world.
If you can encode microstate s in n bits, that implies that you have a prior that assigns P(s)=2^-n. The set of all possible microstates is countably infinite. There is no such thing as a uniform distribution over a countably infinite set. Therefore, even the ignorance prior can't assign equal length bitstrings to all microstates.
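To spell out the middle step: if a prior assigned the same probability c to each of countably infinitely many microstates, then
\[
\sum_{i=1}^{\infty} P(s_i) \;=\; \sum_{i=1}^{\infty} c \;=\;
\begin{cases}
0 & \text{if } c = 0\\
\infty & \text{if } c > 0,
\end{cases}
\]
so the total can never equal 1.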
Yes, that's the usual application, but it's the wrong level of generality to make them synonyms. "Fully general counterargument" is one particular absurdity that you can reduce things to. Even after you've specified that you're performing a reductio ad absurdum against the proposition "argument X is sound", you still need to say what the absurd conclusion is, so you still need a term for "fully general counterargument".
Why should you not have preferences about something just because you can't observe it? Do you also not care whether an intergalactic colony-ship survives its journey, if the colony will be beyond the cosmological horizon?
Here's a citation for the claim of DRAM persisting with >99% accuracy for seconds at operating temperature or hours at LN2. (The latest hardware tested there is from 2007. Did something drastically change in the last 6 years?)
What relevance does personal identity have to TDT? TDT doesn't depend on whether the other instances of TDT are in copies of you, or in other people who merely use the same decision theory as you.
That works with caveats: You can't just publish the seed in advance, because that would allow the player to generate the coin in advance. You can't just publish the seed in retrospect, because the seed is an ordinary random number, and if it's unknown then you're just dealing with an ordinary coin, not a logical one. So publish in advance the first k bits of the pseudorandom stream, where k > seed length, thus making it information-theoretically possible but computationally intractable to derive the seed; use the k+1st bit as the coin; and then publish the seed in retrospect so that anyone can verify both the commitment and the coin.
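A minimal sketch of that commit-then-reveal scheme, with SHA-256 in counter mode standing in for the pseudorandom stream (the hash and the parameter choices are illustrative, not part of the proposal above):

import hashlib

def stream_bits(seed: bytes, nbits: int) -> str:
    """First nbits of the pseudorandom stream keyed by seed."""
    out, counter = "", 0
    while len(out) < nbits:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out += "".join(f"{byte:08b}" for byte in block)
        counter += 1
    return out[:nbits]

seed = b"my secret seed"                 # 112 bits
k = 256                                  # k > seed length in bits
commitment = stream_bits(seed, k)        # publish this in advance
coin = int(stream_bits(seed, k + 1)[k])  # the (k+1)st bit is the coin
# afterwards, publish the seed itself so anyone can recheck both values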
In fact, the question itself seems superficially similar to the halting problem, where "running off the rails" is the analogue for "halting"
If you want to draw an analogy to halting, then what that analogy actually says is: There are lots of programs that provably halt, and lots that provably don't halt, and lots that aren't provable either way. The impossibility of the halting problem is irrelevant, because we don't need a fully general classifier that works for every possible program. We only need to find a single program that provably stays on the rails.
Then what you should be asking is "which problems are in BQP?" (if you just want a summary of the high level capabilities that have been proved so far), or "how do quantum circuits work?" (if you want to know what role individual qubits play). I don't think there's any meaningful answer to "a qubit's specs" short of a tutorial in the aforementioned topics. Here is one such tutorial I recommend.
"do not affect anything outside of this volume of space"
Suppose you, standing outside the specified volume, observe the end result of the AI's work: Oops, that's an example of the AI affecting you. Therefore, the AI isn't allowed to do anything at all. Suppose the AI does nothing: Oops, you can see that too, so that's also forbidden. More generally, the AI is made of matter, which will have gravitational effects on everything in its future lightcone.
Suppose I say "I prefer state X to Z, and don't express a preference between X and Y, or between Y and Z." I am not saying that X and Y are equivalent; I am merely refusing to judge.
If the result of that partial preference is that you start with Z and then decline the sequence of trades Z->Y->X, then you got dutch booked.
On the other hand, maybe you want to accept the sequence Z->Y->X if you expect both trades to be offered, but decline each in isolation? But then your decision procedure is dynamically inconsistent: standing at Z and expecting both offers, you plan to accept Z->Y, but once you actually hold Y, the remaining trade Y->X is exactly the isolated offer you said you'd decline.
I interpret Daniel_Burfoot's idea as: "import java.util.*" makes subsequent mentions of List longer, since there are more symbols in scope that it has to be distinguished from.
But I don't think that idea actually works. You can decompose the probability of a conjunction into a product of conditional probabilities, and you get the same number regardless of the order of said decomposition. Whatever probability (and corresponding total compressed size) you assign to a certain sequence of imports and symbols, you could just as well record the symbols first and the imports afterwards, and the total size would come out the same.
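In symbols, with I the imports and S the symbols, the order-independence is just the chain rule:
\[
-\log_2 P(I,S) \;=\; -\log_2 P(I) - \log_2 P(S \mid I) \;=\; -\log_2 P(S) - \log_2 P(I \mid S),
\]
so "imports first, then symbols conditioned on them" and the reverse order compress to the same total length.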
Eliezer's proposal was a different notation, not an actual change in the strength of Solomonoff Induction. The usual form of SI with deterministic hypotheses is already equivalent to one with probabilistic hypotheses. Because a single hypothesis with prior probability P that assigns uniform probability to each of 2^N different bitstrings, makes the same predictions as an ensemble of 2^N deterministic hypotheses each of which has prior probability P*2^-N and predicts one of the bitstrings with certainty; and a Bayesian update in the former case is equivalent to the corresponding update on the ensemble in the latter.
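To make the equivalence explicit: for any one of the 2^N bitstrings s, writing [·] for an indicator,
\[
\Pr(s) \;=\; P \cdot 2^{-N} \;=\; \sum_{i=1}^{2^N} \bigl(P \cdot 2^{-N}\bigr)\,[\,s_i = s\,],
\]
and conditioning on any observed prefix rescales both representations identically, so all subsequent predictions match too.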
Is there a benefit from doing that server-side rather than client-side? I've long since configured my web browser to always use my favorite font rather than whatever is suggested by any website.
"Oh," said Professor Quirrell, "don't worry about a little rough handling. You could toss this diary in a fireplace and it would emerge unscathed.
That isn't necessarily the same level of indestructibility as a horcrux. It could just be a standard charm placed on rare books.
If I already know "I am EDT", then "I saw myself doing X" does imply "EDT outputs X as the optimal action". Logical omniscience doesn't preclude imagining counterfactual worlds, but imagining counterfactual worlds is a different operation than performing Bayesian updates. CDT constructs counterfactuals by severing some of the edges in its causal graph and then assuming certain values for the nodes that no longer have any causes. TDT does too, except with a different graph and a different choice of edges to sever.
The way EDT operates is to perform the following three steps for each possible action in turn (sketched in code after the list):
- Assume that I saw myself doing X.
- Perform a Bayesian update on this new evidence.
- Calculate and record my utility.
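A minimal sketch of that loop, assuming a toy world model given as an explicit joint distribution over (world, action) pairs; all names here are illustrative:

def edt_choose(actions, joint, utility):
    """joint: dict mapping (world, action) pairs to probabilities."""
    best_action, best_eu = None, float("-inf")
    for a in actions:
        # Steps 1-2: assume "I saw myself doing a" and update on it.
        p_a = sum(p for (w, act), p in joint.items() if act == a)
        if p_a == 0:
            continue  # can't condition on a probability-zero action
        posterior = {w: p / p_a for (w, act), p in joint.items() if act == a}
        # Step 3: calculate and record expected utility.
        eu = sum(p * utility(w) for w, p in posterior.items())
        if eu > best_eu:
            best_action, best_eu = a, eu
    return best_action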
Ideal Bayesian updates assume logical omniscience, right? Including knowledge of the logical fact of what EDT would do for any given input. If you know that you are an EDT agent, and condition on all of your past observations and also on the fact that you do X, but X is not in fact what EDT does given those inputs, then as an ideal Bayesian you will have conditioned on an event of probability zero, and the update is undefined.
Yes. (At least that's the general consensus among complexity theorists, though it hasn't been proved.) This doesn't contradict anything Eliezer said in the grandparent. The following are all consensus-but-not-proved:
P⊂BQP⊂EXP
P⊂NP⊂EXP
BQP≠NP (Neither is confidently predicted to be a subset of the other, though BQP⊆NP is at least plausible, while NP⊆BQP is not.)
If you don't measure any distinctions finer than P vs EXP, then you're using a ridiculously coarse scale. There are lots of complexity classes strictly between P and EXP, defined by limiting resources other than time-on-a-classical-computer. Some of them are tractable under our physics and some aren't.
If instead the simulator can read the real probability on an infinite tape... obviously it can't read the whole tape before producing an output. So it has to read, then output, then read, then output. It seems intuitive that with this strategy, it can place an absolute limit on the advantage that any attacker can achieve, but I don't have a proof of that yet.
In this model, a simulator can exactly match the desired probability in O(1) expected time per sample. (The distribution of possible running times extends to arbitrarily large values, but its L1-norm, i.e. the expected running time, is finite.)
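Here's a sketch of the read-then-output strategy in the exact-sampling case (a standard construction; the iterator below stands in for reading the tape digit by digit):

import random

def biased_coin(p_bits):
    """Return 1 with probability p, given the binary digits of p.

    Compares a stream of fair bits against p's expansion and stops at
    the first disagreement, so it reads 2 digits in expectation even
    though the worst case is unbounded.
    """
    for b in p_bits:
        r = random.getrandbits(1)
        if r != b:
            return 1 if r < b else 0  # fair stream fell below/above p
        # digits agree so far; keep reading
    return 0  # p had a finite expansion and the stream matched it exactly

coin = biased_coin(iter([0, 1] * 32))  # e.g. p = 1/3 = 0.010101... (truncated)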
Answering "how will this protein most likely fold?" is computationally much easier (as far as we can tell) than answering "what protein will fold like this?"
Got a reference for that? It's not obvious to me (CS background, not bio).
What if you have an algorithm that attempts to solve the "how will this protein most likely fold?" problem, but is only tractable on 1% of possible inputs, and just gives up on the other 99%? As long as the 1% contains enough interesting structures, it'll still work as a subroutine for the "what protein will fold like this?" search.
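As a sketch of that argument (illustrative pseudocode, not real chemistry):

def inverse_fold(target, candidates, fold, close_enough):
    """Search for a protein that folds like target, given only a
    forward folder that may give up (return None) on most inputs."""
    for protein in candidates:
        structure = fold(protein)   # tractable on, say, 1% of inputs
        if structure is None:
            continue                # the folder gave up; try the next
        if close_enough(structure, target):
            return protein
    return None

As long as the tractable 1% is rich enough to contain a match, the search succeeds without ever solving the general forward problem.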
The selection effect you mention only applies to offering bets, not accepting them. If Alice announces her betting odds and then Bob decides which side of the bet to take, Alice might be doing something irrational there (if she didn't have a bid-ask spread), but we can still talk about dutch books from Bob's perspective. If you want to eliminate the effect whereby Bob updates on the existence of Alice's offer before making his decision, then replace Alice with an automated market maker (set up by someone who expects to lose money in exchange for outsourcing the probability estimation to the market).
I had in mind an automated wrapper generator for the "passed own sourcecode" version of the contest:
(define CliqueBot
  (lambda (self opponent)
    (if (eq? self opponent) 'C 'D)))

(define Wrapper
  (lambda (agent)
    (lambda (self opponent)
      (agent agent opponent))))

(define WrappedCliqueBot
  (Wrapper CliqueBot))
Note that for all values of X and Y, (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y), and there's no possible code you could add to CliqueBot that would break this identity. Now I just realized that the very fact that WrappedCliqueBot defects against WrappedCliqueBot, whereas CliqueBot cooperates with CliqueBot, means the wrapper doesn't quite preserve behavior after all.
How does that help? A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input; thus emulating the "passed its own sourcecode" format. You could mess with that if you're smart enough to locate and delete the "verify the signature" step, but then you could do that in the real "passed its own sourcecode" format too.
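A toy version of that emulation, with a hash standing in for the cryptographic signature and a deliberately trivial payload (every name here is illustrative):

import hashlib

# The real payload lives in a string; the outer program verifies its
# digest, evals it, and calls it with the string itself as an argument,
# reconstructing the "passed its own sourcecode" calling convention.
PAYLOAD = "lambda own_source, opponent: 'C' if own_source == opponent else 'D'"
DIGEST = hashlib.sha256(PAYLOAD.encode()).hexdigest()

def play(opponent_source):
    assert hashlib.sha256(PAYLOAD.encode()).hexdigest() == DIGEST
    agent = eval(PAYLOAD)
    return agent(PAYLOAD, opponent_source)

print(play(PAYLOAD))  # 'C': it recognizes its own payload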
Conversely, even if the tournament program itself is honest, contestants can lie to their simulations of their opponents about what sourcecode the simulation is of.
Assume you have noisy measurements X1, X2, X3 of physical quantities Y1, Y2, Y3 respectively; variables 1, 2, and 3 are independent; X2 is much noisier than the others; and you want a point-estimate of Y = Y1+Y2+Y3. Then you shouldn't use either X1+X2+X3 or X1+X3. You should use E[Y1|X1] + E[Y2|X2] + E[Y3|X3]. Regression to the mean is involved in computing each of the conditional expectations. Lots of noise (relative to the width of your prior) in X2 means that E[Y2|X2] will tend to be close to the prior E[Y2] even for extreme values of X2, but E[Y2|X2] is still a better estimate of that portion of the sum than E[Y2] is.
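A sketch under the simplest concrete assumptions (mine, not the parent's): independent Gaussian priors Y_i ~ N(mu_i, tau_i^2) and noise X_i | Y_i ~ N(Y_i, sigma_i^2), for which E[Y_i | X_i] is the standard shrinkage formula.

def posterior_mean(x, mu, tau2, sigma2):
    """E[Y | X=x] for Y ~ N(mu, tau2) and X | Y ~ N(Y, sigma2)."""
    w = tau2 / (tau2 + sigma2)   # near zero when the measurement is noisy
    return w * x + (1 - w) * mu  # regression toward the prior mean

def estimate_sum(xs, mus, tau2s, sigma2s):
    return sum(posterior_mean(x, mu, t2, s2)
               for x, mu, t2, s2 in zip(xs, mus, tau2s, sigma2s))

With a very noisy X2, its weight is near zero, so E[Y2|X2] stays near the prior mean E[Y2]; but the shrunken term still beats both using X2 raw and dropping it entirely.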
unless you are allowed to pose infinitely many problems
Or one selected at random from an infinite class of problems.
Also, if the universe is spatially infinite, it can solve the halting problem in a deeply silly way, namely there could be an infinite string of bits somewhere, each a fixed distance from the next, that just hardcodes the solution to the halting problem.
That's why both computability theory and complexity theory require algorithms to have finite sized sourcecode.
Novikov consistency is synonymous with Stable Time Loop, where all time travelers observe the same events as they remember from their subjectively-previous iteration. This is as opposed to MWI-based time travel, where the no paradox rule merely requires that the overall distribution of time travelers arriving at t0 is equal to the overall distribution of people departing in time machines at t1.
Yes, Novikov talked about QM. He used the sum-over-histories formulation, restricted to the subset of histories that each singlehandedly form a classical stable time loop.
The US government made Tor? Awesome. I wonder which part of the government did it.
You can certainly postulate a physics that's both MWI and contains something sorta like Time-Turners except without the Novikov property. The problem with that isn't paradox; it just doesn't reproduce the fictional experimental evidence we're trying to explain. What's impossible is MWI with something exactly like Time-Turners, including Novikov.
More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding's inequality).
If success of a fermi estimate is defined to be "within a factor of 10 of the correct answer", then that's a constant bound on the allowed error of the logarithm. No "compared to the expected value of the logarithm" involved. Besides, I wouldn't expect the value of the logarithm of the final estimate to depend on how many pieces the estimate was decomposed into; it's pinned down by the quantity being estimated.
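To make the arithmetic explicit (my numbers: n independent pieces whose log-errors each have variance sigma^2):
\[
\operatorname{Var}\!\left(\log \hat{Y}\right) \;=\; \sum_{i=1}^{n} \operatorname{Var}\!\left(\log \hat{y}_i\right) \;=\; n\sigma^2,
\]
so the standard deviation of the log grows like \(\sigma\sqrt{n}\), while the success criterion \(|\log_{10}\hat{Y} - \log_{10}Y| \le 1\) is a band of fixed width that does not widen with n.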
What if, in building a non-Löb-compliant AI, you've already failed to give it part of your inference ability / trust-in-math / whatever-you-call-it? Even if the AI figures out how to not lose any more, that doesn't mean it's going to get back the part you missed.
Possibly related question: Why try to solve decision theory, rather than just using CDT and let it figure out what the right decision theory is? Because CDT uses its own impoverished notion of "consequences" when deriving what the consequence of switching decision theories is.
"50:4" in the post refers to "P(V=1|A=100)*1 : P(V=100|A=100)*100", not "EV(A=1) : EV(A=100)". EV(A=1) is irrelevant, since we know that A is in fact 100.
IIUC this addresses the ontology problem in AIXI by assuming that the domain of our utility function already covers every possible computable ontology, so that whichever one turns out to be correct, we already know what to do with it. If I take the AIXI formalism to be a literal description of the universe (i.e. not just dualism, but also that the AI is running on a hypercomputer, the environment is running on a turing computer, and the utility function cares only about the environment, not the AI's internals), then I think the proposal works.
But under the...
As far as I can tell from wikipedia's description of admissibility, it makes the same assumptions as CDT: That the outcome depends only on your action and the state of the environment, and not on any other properties of your algorithm. This assumption fails in multi-player games.
So your quote actually means: If you're going to use CDT then Bayes is the optimal way to derive your probabilities.
The intuition: For a high dimensional ball, most of the volume is near the surface, and most of the surface is near the equator (for any given choice of equator). The extremity of "most" and "near" increases with number of dimensions. The intersection of two equal-size balls is a ball minus a slice through the equator, and thus missing most of its volume even if it's a pretty thin slice.
The calculation:
Let \(v(n,r) = 2\int_{y=0}^{r} v(n-1,\sqrt{r^2-y^2})\,dy = \frac{2\pi^{n/2}\,r^n}{n\,\Gamma(n/2)}\), which is the volume of the n-dimensional ball of radius r.
First consider a 10000-dimensional unit ball. If we shift this ball by two units in one of the dimensions, it would no longer intersect at all with the original volume. But if we were to shift it by 1/1000 units in each of the 10000 dimensions, the shifted ball would still mostly overlap with the original ball even though we've shifted it by a total of 10 units (because the distance between the centers of the balls is only sqrt(10000)/1000 = 0.1).
Actually no, it doesn't mostly overlap. If we consider a hypercube of radius 1 (displaced along the diagonal by 1/1000 units in each dimension), the overlapping fraction of the volume is (1 − 1/2000)^10000 ≈ e^−5 ≈ 0.7%, and the ball's overlap is smaller still.
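A quick Monte Carlo check of the ball case (sample sizes mine, chosen for speed; the exact overlap fraction is on the order of 1e-7):

import numpy as np

rng = np.random.default_rng(0)
n, trials, shift = 10_000, 1_000, 1e-3

# Uniform samples from the n-dimensional unit ball: Gaussians
# normalized onto the sphere, radii rescaled by U^(1/n).
g = rng.standard_normal((trials, n))
points = g / np.linalg.norm(g, axis=1, keepdims=True)
points *= rng.random((trials, 1)) ** (1.0 / n)

# How much of the ball also lies in the copy shifted by 1/1000 along
# every axis (center distance 0.1)? Prints 0.0 at these sample sizes.
inside_shifted = np.linalg.norm(points - shift, axis=1) <= 1.0
print(inside_shifted.mean())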
you're too young (and didn't have much income before anyway) to have significant savings.
Err, I haven't yet earned as much from the lazy entrepreneur route as I would have if I had taken a standard programming job for the past 7 years (though I'll pass that point within a few months at the current rate). So don't go blaming my cohort's age if they haven't saved and/or donated as much as me. I'm with Rain in spluttering at how people can have an income and not have money.
I value my free time far too much to work for a living. So your model is correct on that count. I had planned to be mostly unemployed with occasional freelance programming jobs, and generally keep costs down.
But then a couple years ago my hobby accidentally turned into a business, and it's doing well. "Accidentally" because it started with companies contacting me and saying "We know you're giving it away for free, but free isn't good enough for us. We want to buy a bunch of copies." And because my co-founder took charge of the negotiations.
I donated $20,000 now, in addition to $110,000 earlier this year.
Holy pickled waffles on a pogo stick. Thanks, dude.
Is there anything you're willing to say about how you acquired that dough? My model of you has earned less in a lifetime.
Thanks very much!!
On your account, how do you learn causal models from observing someone else perform an experiment? That doesn't involve any interventions or counterfactuals. You only see what actually happens, in a system that includes a scientist.
The idea that "it wouldn't be you" isn't something I thought would be a problem
It probably doesn't help that Celestia implies "it wouldn't be you" when explaining why Hanna uploaded. If the shut-down authority was tied to her biological body, then Celestia fails to say so, and talks instead about identity. If it was tied to her name, then conflating that with the uploading is misleading. If the point of uploading was to protect her against coercion, then that would be sufficient even without any change in authority, and "Hanna n...
Now, were their experiences real? Did we make them real by marking them with a 1 - by applying the logical filter using a causal computer?
You can apply the brute-force/postselection method to CGoL without timetravel too... But in that case verifying that a proposed history obeys the laws of CGoL involves all the same arithmetic ops as simulating forwards from the initial state. (The ops can, but don't have to, be in the same order.) Likewise if there are any linear-time subregions of CGoL+timetravel. So I might guess that the execution of such a filter instantiates the experiences just as a forward simulation would.
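Here's that verify-versus-simulate point in ordinary CGoL (a sketch; boolean arrays with wraparound edges assumed for simplicity):

import numpy as np

def step(board):
    """One CGoL update on a boolean array (toroidal edges)."""
    n = sum(np.roll(np.roll(board, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    return (n == 3) | (board & (n == 2))

def simulate(initial, t):
    history = [initial]
    for _ in range(t):
        history.append(step(history[-1]))
    return history

def verify(history):
    # The same per-cell arithmetic as simulate, compared instead of stored.
    return all(np.array_equal(step(a), b)
               for a, b in zip(history, history[1:]))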
Use a prefix-free encoding for the hypotheses. There's not 2^n hypotheses of length n: Some of the length-n bitstrings are incomplete and you'd need to add more bits in order to get a hypothesis; others are actually a length <n hypothesis plus some gibberish on the end.
Then the sum of the probabilities of all programs of all lengths combined is at most 1 (exactly 1 if the encoding is complete, by the Kraft inequality). After excluding the programs that don't halt, the normalization constant is Chaitin's Omega.
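In symbols, writing |p| for the length of program p in the prefix-free encoding:
\[
\Omega \;=\; \sum_{p\,:\,p\ \text{halts}} 2^{-|p|} \;\le\; \sum_{p} 2^{-|p|} \;\le\; 1,
\]
where the final inequality is the Kraft inequality for prefix-free codes.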