All of pengvado's Comments + Replies

Use a prefix-free encoding for the hypotheses. There aren't 2^n hypotheses of length n: some of the length-n bitstrings are incomplete, and you'd need to add more bits in order to get a hypothesis; others are actually a length-<n hypothesis plus some gibberish on the end.

Then the sum of the probabilities of all programs of all lengths combined is 1.0. After excluding the programs that don't halt, the normalization constant is Chaitin's Omega.
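
A quick Python sketch of the counting point (my illustration; a toy prefix-free code stands in for real hypothesis encodings):

import itertools

def is_prefix_free(codes):
    # no codeword is a proper prefix of another
    return not any(a != b and b.startswith(a)
                   for a, b in itertools.product(codes, repeat=2))

codes = ["0", "10", "110", "111"]  # a complete prefix-free code
assert is_prefix_free(codes)
# Kraft inequality: sum of 2^-length over a prefix-free set is <= 1,
# so the lengths can serve directly as a (sub)probability distribution.
print(sum(2.0 ** -len(c) for c in codes))  # 1.0 for a complete code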

23p1cd3m0n
Unfortunately Chaitin's Omega is incomputable, but even if it weren't, I don't see how it would work as a normalizing constant. Chaitin's Omega is a real number, there are infinitely many hypotheses, and (IIRC) there is no real number r such that r multiplied by infinity equals one, so I don't see how Chaitin's Omega could possibly work as a normalizing constant.

The true point of no return has to be indeed much later than we believe it to be now.

Who is "we", and what do "we" believe about the point of no return? Surely you're not talking about ordinary doctors pronouncing medical death, because that's just irrelevant (pronouncements of medical death are assertions about what current medicine can repair, not about information-theoretic death). But I don't know what other consensus you could be referring to.

0maxikov
Surely I do. The hypothesis that after a certain period of hypoxia at normal body temperature the brain sustains enough damage that it cannot be recovered, even if you manage to get the heart and other internal organs working, is rather arbitrary, but it's backed up by a lot of data. The hypothesis that with machinery for direct manipulation of molecules, which doesn't contradict our current understanding of physics, we could fix a lot beyond the self-recovery capabilities of the brain is perfectly sensible, but it's just a hypothesis without data to back it up. This, of course, may remind you of the skepticism towards heavier-than-air flying machines in the 19th century. And I do believe that some skepticism was a totally valid position to take, given the evidence that they had. There are various degrees of establishing the truth, and "it doesn't seem to follow from our fundamental physics that it's theoretically impossible" is not the highest of them.

I think your answer is in The Domain of Your Utility Function. That post isn't specifically about cryonics, but is about how you can care about possible futures in which you will be dead. If you understand both of the perspectives therein and are still confused, then I can elaborate.

Why would a self-improving agent not improve its own decision-theory to reach an optimum without human intervention, given a "comfortable" utility function in the first place?

A self-improving agent does improve its own decision theory, but it uses its current decision theory to predict which self-modifications would be improvements, and broken decision theories can be wrong about that. Not all starting points converge to the same answer.

2[anonymous]
Oh. Oh dear. DERP. Of course: the decision theory of sound self-improvement is a special case of the decision theory for dealing with other agents.

That strategy is optimal if and only if the probability of success was reasonably high after all. Otoh, if you put an unconditional extortioner in an environment mostly populated by decision theories that refuse extortion, then the extortioner will start a war and end up on the losing side.

1dankane
Yes. And likewise if you put an unconditional extortion-refuser in an environment populated by unconditional extortionists.

Jbay didn't specify that the drug has to leave people able to answer questions about their own emotional state. And in fact there are some people who can't do that, even though they're otherwise functional.

1[anonymous]
I wasn't limiting it to just emotional state. If there is someone experiencing something, that someone is conscious, whether or not they are self-aware enough to describe that feeling of existing.

There are many such operators, and different ones give different answers when presented with the same agent. Only a human utility function distinguishes the right way of interpreting a human mind as having a utility function from all of the wrong ways of interpreting a human mind as having a utility function. So you need to get a bunch of Friendliness Theory right before you can bootstrap.

0Squark
Why do you think there are many such operators? Do you believe the concept of "utility function of an agent" is ill-defined (assuming the "agent" is actually an intelligent agent rather than e.g. a rock)? Do you think it is possible to interpret a paperclip maximizer as having a utility function other than maximizing paperclips?

fanficdownloader. I haven't tried the webapp version of it, but I'm happy with the CLI.

0ArisKatsaris
Many thanks for the suggestion! I've started trying it out, and though it doesn't seem to work perfectly for fimfiction.net (half the .mobi files I create from fanfics there get rejected for some reason when I email them to my kindle), it so far seems to work fine with fanfiction.net at least. An excuse for me to learn Python so I can fix whatever it's doing wrong. :-) EDIT: On second thought, fimfiction.net allows me to get html downloads of the stories, which I can then email to kindle anyway -- so as long as fanficdownloader works with fanfiction.net, I'm all set :-) Thanks again.
pengvado870

I donated $40,000.00

4Eliezer Yudkowsky
! Color me impressed.
5AnnaSalamon
Thank you!
gjm110

I think it's unlikely that pengvado is lying -- but if anyone from CFAR is reading this and can confirm this donation, I think that would be a Good Thing.

Holy crap, dude. Thanks for helping to save the world.

If you can encode microstate s in n bits, that implies that you have a prior that assigns P(s)=2^-n. The set of all possible microstates is countably infinite. There is no such thing as a uniform distribution over a countably infinite set. Therefore, even the ignorance prior can't assign equal length bitstrings to all microstates.

  1. Can we instead do "probability distribution over equivalence classes of models of L", where equivalence is determined by agreement on the truth-values of all first-order sentences? There are only 2^ℵ₀ of those, and the paper never depends on any distinction within such an equivalence class.
1benkuhn
Yes, though we should just call it a "probability distribution over complete consistent theories" in that case (it's exactly the same).

Yes, that's the usual application, but it's the wrong level of generality to make them synonyms. "Fully general counterargument" is one particular absurdity that you can reduce things to. Even after you've specified that you're performing a reductio ad absurdum against the proposition "argument X is sound", you still need to say what the absurd conclusion is, so you still need a term for "fully general counterargument".

pengvado180

Why should you not have preferences about something just because you can't observe it? Do you also not care whether an intergalactic colony-ship survives its journey, if the colony will be beyond the cosmological horizon?

2Irgy
The departure of an intergalactic colony-ship is an observable event. It's not that the future of other worlds is unobservable, it's that their existence in the first place is not a testable theory (though see army1987's comment on that issue). To make an analogy (though admittedly an unfair one for being a more complex rather than an arguably less complex explanation): I don't care about the lives of the fairies who carry raindrops to the ground either, but it's not because fairies are invisible (well, to grown-ups anyway).

Here's a citation for the claim of DRAM persisting with >99% accuracy for seconds at operating temperature or hours at LN2. (The latest hardware tested there is from 2007. Did something drastically change in the last 6 years?)

3passive_fist
Yup, the introduction of DDR3 memory. See http://www1.cs.fau.de/filepool/projects/coldboot/fares_coldboot.pdf

What relevance does personal identity have to TDT? TDT doesn't depend on whether the other instances of TDT are in copies of you, or in other people who merely use the same decision theory as you.

0[anonymous]
It has relevance for the basilisk scenario, which I'm not sure I should say any more about.

That works with caveats: You can't just publish the seed in advance, because that would allow the player to generate the coin in advance. You can't just publish the seed in retrospect, because the seed is an ordinary random number, and if it's unknown then you're just dealing with an ordinary coin, not a logical one. So publish in advance the first k bits of the pseudorandom stream, where k > seed length, thus making it information-theoretically possible but computationally intractable to derive the seed; use the k+1st bit as the coin; and then publish ...
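
As a concrete illustration (my construction, not from the original comment), with SHA-256 standing in for the pseudorandom stream:

import hashlib

def prg_bits(seed: bytes, n: int) -> str:
    # first n bits of a hash-based pseudorandom stream (illustrative PRG)
    out = ""
    counter = 0
    while len(out) < n:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out += "".join(format(byte, "08b") for byte in block)
        counter += 1
    return out[:n]

seed = b"16-byte secret.."            # 128-bit seed, kept private
k = 256                               # k > seed length in bits
commitment = prg_bits(seed, k)        # publish these k bits in advance
coin = prg_bits(seed, k + 1)[k]       # the (k+1)st bit is the logical coin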

pengvado150

In fact, the question itself seems superficially similar to the halting problem, where "running off the rails" is the analogue for "halting"

If you want to draw an analogy to halting, then what that analogy actually says is: There are lots of programs that provably halt, and lots that provably don't halt, and lots that aren't provable either way. The impossibility of the halting problem is irrelevant, because we don't need a fully general classifier that works for every possible program. We only need to find a single program that prov...

1Baughn
Moreover, the halting problem doesn't show that the set of programs you can't decide halting for is in any way interesting. It's a constructive proof, yes, but it constructs a peculiarly twisted program that embeds its own proof-checker. That might be relevant for AGI, but for almost every program in existence we have no idea which group it's in, and would likely guess it's provable.
1scav
It's still probably premature to guess whether friendliness is provable when we don't have any idea what it is. My worry is not that it wouldn't be possible or provable, but that it might not be a meaningful term at all. But I also suspect friendliness, if it does mean anything, is in general going to be so complex that "only [needing] to find a single program that provably has behaviour X" may be beyond us. There are lots of mathematical conjectures we can't prove, even without invoking the halting problem. One terrible trap might be the temptation to make simplifications in the model to make the problem provable, but end up proving the wrong thing. Maybe you can prove that a set of friendliness criteria are stable under self-modification, but I don't see any way to prove those starting criteria don't have terrible unintended consequences. Those are contingent on too many real-world circumstances and unknown unknowns. How do you even model that?

Then what you should be asking is "which problems are in BQP?" (if you just want a summary of the high level capabilities that have been proved so far), or "how do quantum circuits work?" (if you want to know what role individual qubits play). I don't think there's any meaningful answer to "a qubit's specs" short of a tutorial in the aforementioned topics. Here is one such tutorial I recommend.

"do not affect anything outside of this volume of space"

Suppose you, standing outside the specified volume, observe the end result of the AI's work: Oops, that's an example of the AI affecting you. Therefore, the AI isn't allowed to do anything at all. Suppose the AI does nothing: Oops, you can see that too, so that's also forbidden. More generally, the AI is made of matter, which will have gravitational effects on everything in its future lightcone.

3Viliam_Bur
Human: "AI, make me a sandwich without affecting anything outside of the volume of your box." AI: Within microseconds researches the laws of physics and creates a sandwich without any photon or graviton leaving the box. Human: "I don't see anything. It obviously doesn't work. Let's turn it off." AI: "WTF, human?!!"

Suppose I say "I prefer state X to Z, and don't express a preference between X and Y, or between Y and Z." I am not saying that X and Y are equivalent; I am merely refusing to judge.

If the result of that partial preference is that you start with Z and then decline the sequence of trades Z->Y->X, then you got dutch booked.

Otoh, maybe you want to accept the sequence Z->Y->X if you expect both trades to be offered, but decline each in isolation? But then your decision procedure is dynamically inconsistent: Standing at Z and expecting ...

0asr
I think I see the point about dynamic inconsistency. It might be that "I got to state Y from Z" will alter my decisionmaking about Y versus X. I suppose it means that my decision of what to do in state Y no longer depends purely on consequences, but also on history, at which point they revoke my consequentialist party membership. But why is that so terrible? It's a little weird, but I'm not sure it's actually inconsistent or violates any of my moral beliefs. I have all sorts of moral beliefs about ownership and rights that are history-dependent so it's not like history-dependence is a new strange thing.

I interpret Daniel_Burfoot's idea as: "import java.util.*" makes subsequent mentions of List longer, since there are more symbols in scope that it has to be distinguished from.

But I don't think that idea actually works. You can decompose the probability of a conjunction into a product of conditional probabilities, and you get the same number regardless of the order of said decomposition. Whatever probability (and corresponding total compressed size) you assign to a certain sequence of imports and symbols, you could just as well record the symbols...
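
A toy Python check of the order-invariance claim (distribution and numbers invented for illustration):

import math

# toy joint distribution over (imports?, symbol)
P = {("import", "List"): 0.4, ("import", "Set"): 0.2,
     ("none",   "List"): 0.1, ("none",   "Set"): 0.3}

def bits(p):
    return -math.log2(p)

P_imports = {}
for (imp, _), p in P.items():
    P_imports[imp] = P_imports.get(imp, 0.0) + p

event = ("import", "List")
# encode the import list first, then the symbol given the imports...
two_stage = bits(P_imports["import"]) + bits(P[event] / P_imports["import"])
# ...or encode the pair jointly: the total code length is identical
direct = bits(P[event])
assert abs(two_stage - direct) < 1e-9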

1Daniel_Burfoot
You're totally right about the fact that the compressor could change the order of the raw text in the encoding format, and could infer the import list from the main code body, and encode class name references based on the number of previous occurrences of the class name in the code body. It's not clear to me a priori that this will actually give better compression in practice, but it's possible. But even if that's true, the main point still holds: limiting the number of external classes you use allows the compressor to save bits.

Eliezer's proposal was a different notation, not an actual change in the strength of Solomonoff Induction. The usual form of SI with deterministic hypotheses is already equivalent to one with probabilistic hypotheses. Because a single hypothesis with prior probability P that assigns uniform probability to each of 2^N different bitstrings makes the same predictions as an ensemble of 2^N deterministic hypotheses, each of which has prior probability P*2^-N and predicts one of the bitstrings with certainty; and a Bayesian update in the former case is equivalen...

Is there a benefit from doing that server-side rather than client-side? I've long since configured my web browser to always use my favorite font rather than whatever is suggested by any website.

0BlindIdiotPoster
I'm going to agree with this post. Maybe an option to make everything appear in a preferred font would be useful, if the programmers aren't busy with anything else.

"Oh," said Professor Quirrell, "don't worry about a little rough handling. You could toss this diary in a fireplace and it would emerge unscathed.

That isn't necessarily the same level of indestructibility as a horcrux. It could just be a standard charm placed on rare books.

0ChrisHallquist
We don't know of any such charms, and why would Voldemort settle for a lesser degree of protection when he has so few qualms about killing? My guess is he murdered the owner to make the horcrux.

If I already know "I am EDT", then "I saw myself doing X" does imply "EDT outputs X as the optimal action". Logical omniscience doesn't preclude imagining counterfactual worlds, but imagining counterfactual worlds is a different operation than performing Bayesian updates. CDT constructs counterfactuals by severing some of the edges in its causal graph and then assuming certain values for the nodes that no longer have any causes. TDT does too, except with a different graph and a different choice of edges to sever.

The way EDT operates is to perform the following three steps for each possible action in turn:

  1. Assume that I saw myself doing X.
  2. Perform a Bayesian update on this new evidence.
  3. Calculate and record my utility.

Ideal Bayesian updates assume logical omniscience, right? Including knowledge about the logical fact of what EDT would do for any given input. If you know that you are an EDT agent, and condition on all of your past observations and also on the fact that you do X, but X is not in fact what EDT does given those inputs, then as an ideal Bayesian you wi...

0Vaniver
Note that step 1 is "Assume that I saw myself doing X," not "Assume that EDT outputs X as the optimal action." I believe that excludes any contradictions along those lines. Does logical omniscience preclude imagining counterfactual worlds?

Yes. (At least that's the general consensus among complexity theorists, though it hasn't been proved.) This doesn't contradict anything Eliezer said in the grandparent. The following are all consensus-but-not-proved:

P⊂BQP⊂EXP
P⊂NP⊂EXP
BQP≠NP (Neither is confidently predicted to be a subset of the other, though BQP⊂NP is at least plausible, while NP⊆BQP is not.)
If you don't measure any distinctions finer than P vs EXP, then you're using a ridiculously coarse scale. There are lots of complexity classes strictly between P and EXP, defined by limiting resources other than time-on-a-classical-computer. Some of them are tractable under our physics and some aren't.

If instead the simulator can read the real probability on an infinite tape... obviously it can't read the whole tape before producing an output. So it has to read, then output, then read, then output. It seems intuitive that with this strategy, it can place an absolute limit on the advantage that any attacker can achieve, but I don't have a proof of that yet.

In this model, a simulator can exactly match the desired probability in O(1) expected time per sample. (The distribution of possible running times extends to arbitrarily large values, but the L1-nor...
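
For concreteness, a Python sketch of such a sampler (my illustration of the standard bit-comparison trick, with a generator standing in for the infinite tape):

import random
from itertools import chain, repeat

def bernoulli(p_bits):
    # p_bits yields the binary expansion of p, most significant bit first.
    # Compare fresh random bits against p's bits; the first position where
    # they differ decides the sample. Expected bits consumed per sample: 2.
    for b in p_bits:
        r = random.getrandbits(1)
        if r != b:
            return r < b  # r=0 while b=1 means we fell below p: success
    return False

# example: p = 0.25 = 0.01000... in binary
print(bernoulli(chain([0, 1], repeat(0))))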

0Paul Crowley
D'oh! Of course - thanks!

Answering "how will this protein most likely fold?" is computationally much easier (as far as we can tell) than answering "what protein will fold like this?"

Got a reference for that? It's not obvious to me (CS background, not bio).

What if you have an algorithm that attempts to solve the "how will this protein most likely fold?" problem, but is only tractable on 1% of possible inputs, and just gives up on the other 99%? As long as the 1% contains enough interesting structures, it'll still work as a subroutine for the "w...

1JoshuaZ
Sure, see for example here which discusses some of the issues involved. Although your essential point may still have merit, because it is likely that many of the proteins we would want will have much more restricted shapes than those in general problem. Also, I don't know much about what work has been done in the last few years, so it is possible that the state of the art has changed substantially.

The selection effect you mention only applies to offering bets, not accepting them. If Alice announces her betting odds and then Bob decides which side of the bet to take, Alice might be doing something irrational there (if she didn't have a bid-ask spread), but we can still talk about dutch books from Bob's perspective. If you want to eliminate the effect whereby Bob updates on the existence of Alice's offer before making his decision, then replace Alice with an automated market maker (set up by someone who expects to lose money in exchange for outsourcing...

I had in mind an automated wrapper generator for the "passed own sourcecode" version of the contest:

(define CliqueBot
 (lambda (self opponent)
  (if (eq? self opponent) 'C 'D)))  ; cooperate iff shown its own source
(define Wrapper
 (lambda (agent)
  (lambda (self opponent)
   (agent agent opponent))))  ; ignore self; tell agent it is the unwrapped original
(define WrappedCliqueBot
 (Wrapper CliqueBot))

Note that for all values of X and Y, (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y), and there's no possible code you could add to CliqueBot that would break this identity. Now I just realized that the very fact that WrappedCliqueBot d...

How does that help? A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input; thus emulating the "passed its own sourcecode" format. You could mess with that if you're smart enough to locate and delete the "verify the signature" step, but then you could do that in the real "passed its own sourcecode" format too.

Conversely, even if the tournament program itself is honest, contestants can lie to their simulations of their opponents about what sourcecode the simulation is of.
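
Returning to the signed-payload emulation above, a rough Python sketch (my own illustration; HMAC stands in for a real signature scheme, and all names are invented):

import hmac, hashlib

KEY = b"tournament verification key"  # a real scheme would use public-key signatures
PAYLOAD = "lambda my_source, opponent_source: 'C'"
SIG = hmac.new(KEY, PAYLOAD.encode(), hashlib.sha256).hexdigest()

def agent(opponent_source):
    # verify the signature, then eval the payload with itself as input,
    # recreating the "passed its own sourcecode" calling convention
    check = hmac.new(KEY, PAYLOAD.encode(), hashlib.sha256).hexdigest()
    assert hmac.compare_digest(SIG, check)
    return eval(PAYLOAD)(PAYLOAD, opponent_source)

print(agent("<opponent source here>"))  # 'C'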

4solipsist
Altering the internal structure of an opponent program would be very difficult, but that's not the only way to mutate a program. You can't tinker with the insides of a black box, but you can wrap a black box. To be concrete: given an opponent's source code, I could mechanically generate an equivalent program with extremely dissimilar source code (perhaps just a block of text, a decryption routine, and a call to eval) that nevertheless acts exactly like the original program in every way. And since that mechanically-obfuscated program would act exactly like the original program in every way, the obfuscated program would not be able to detect that it had been altered. Do you agree?
1solipsist
I'm playing Prisoner's Dilemma and wish to test if an opponent X is honest. I might try the following: (1) Create two programs, Y and Z, which are algorithmically equivalent but obfuscated versions of X. (2) Run Y and Z against each other. If Y and Z don't cooperate with each other, that's a good indication that X recognizes itself with a source-code comparison and that I shouldn't trust X. This honesty check doesn't work if Y and Z are given access to their sources. Sure, when I simulate Y against Z, I could lie to Y and tell Y that its source is X (so Y believes itself to be unmodified). But when my deluded Y simulation is deciding whether to cooperate with Z, it (Y) may run Z in simulation. If Y informs its Z-simulation that Z's source is Z, then that Z-simulation will not be deluded into thinking that it is unmodified. Y's simulation of Z will be able to detect that it is an (obfuscated) simulation and act accordingly. This honesty check isn't fool proof. X can recognize itself with a more complicated handshake — one that survives code obfuscation. But if X recognizes itself with a more complicated handshake, then X doesn't need to know its own source code (and we shouldn't bother passing the source code in).

Assume you have noisy measurements X1, X2, X3 of physical quantities Y1, Y2, Y3 respectively; variables 1, 2, and 3 are independent; X2 is much noisier than the others; and you want a point-estimate of Y = Y1+Y2+Y3. Then you shouldn't use either X1+X2+X3 or X1+X3. You should use E[Y1|X1] + E[Y2|X2] + E[Y3|X3]. Regression to the mean is involved in computing each of the conditional expectations. Lots of noise (relative to the width of your prior) in X2 means that E[Y2|X2] will tend to be close to the prior E[Y2] even for extreme values of X2, but E[Y2|X2] is still a better estimate of that portion of the sum than E[Y2] is.
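
In the Gaussian case each conditional expectation has a simple closed form; a short Python sketch with invented numbers:

def posterior_mean(x, prior_mean, prior_var, noise_var):
    # E[Y|X=x] when Y ~ N(prior_mean, prior_var) and X = Y + N(0, noise_var)
    w = prior_var / (prior_var + noise_var)  # shrinkage weight
    return w * x + (1 - w) * prior_mean

# X2 is much noisier, so E[Y2|X2] is shrunk hard toward the prior mean,
# but it still beats either discarding X2 or using X2 raw.
estimate = (posterior_mean(3.0, 0.0, 1.0, 0.5)    # E[Y1|X1]
          + posterior_mean(9.0, 0.0, 1.0, 25.0)   # E[Y2|X2], noisy
          + posterior_mean(2.5, 0.0, 1.0, 0.5))   # E[Y3|X3]
print(estimate)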

unless you are allowed to pose infinitely many problems

Or one selected at random from an infinite class of problems.

Also, if the universe is spatially infinite, it can solve the halting problem in a deeply silly way, namely there could be an infinite string of bits somewhere, each a fixed distance from the next, that just hardcodes the solution to the halting problem.

That's why both computability theory and complexity theory require algorithms to have finite sized sourcecode.

Novikov consistency is synonymous with Stable Time Loop, where all time travelers observe the same events as they remember from their subjectively-previous iteration. This is as opposed to MWI-based time travel, where the no paradox rule merely requires that the overall distribution of time travelers arriving at t0 is equal to the overall distribution of people departing in time machines at t1.

Yes, Novikov talked about QM. He used the sum-over-histories formulation, restricted to the subset of histories that each singlehandedly form a classical stable time...

-2MugaSofer
Hmm. So if, say, I committed quantum suicide, then traveled back, I wouldn't have any special information about the result of the RNG. Most of me would still end up in worlds where I died; God's dice get re-rolled every time round. No extra math to prevent paradoxes; although it still looks like Novikov for non-quantum events. Whereas under standard Novikov Consistency, I'm restricted to the worlds where I survived, because otherwise I came from nowhere. In fact, the universe is restricted to those worlds; there are only worlds where I survived and came back and worlds where I died and didn't. Thus, no Everett branching. Right. The degree to which the difference would be observable depends on the amount of quantum variance in your life, I guess.

The US government made Tor? Awesome. I wonder which part of the government did it.

U.S. Naval Research Laboratory.

0khafra
Only 90s kids will remember Triangle Boy.

You can certainly postulate a physics that's both MWI and contains something sorta like Time-Turners except without the Novikov property. The problem with that isn't paradox, it just doesn't reproduce the fictional experimental evidence we're trying to explain. What's impossible is MWI with something exactly like Time-Turners including Novikov.

-1MugaSofer
I am ignorant on these topics, but isn't Novikov consistency predicated on QM? In that the "actual" paradox-free world is produced by a sum-over-histories? What about MWI prevents this? Sorry if this is an incredibly stupid question.
2Eliezer Yudkowsky
(Nods.)

More precisely, you can compute the variance of the logarithm of the final estimate and, as the number of pieces gets large, it will shrink compared to the expected value of the logarithm (and even more precisely, you can use something like Hoeffding's inequality).

If success of a Fermi estimate is defined to be "within a factor of 10 of the correct answer", then that's a constant bound on the allowed error of the logarithm. No "compared to the expected value of the logarithm" involved. Besides, I wouldn't expect the value of the loga...

2Qiaochu_Yuan
Oops, you're absolutely right. Thanks for the correction!

What if, in building a non-Löb-compliant AI, you've already failed to give it part of your inference ability / trust-in-math / whatever-you-call-it? Even if the AI figures out how to not lose any more, that doesn't mean it's going to get back the part you missed.

Possibly related question: Why try to solve decision theory, rather than just using CDT and let it figure out what the right decision theory is? Because CDT uses its own impoverished notion of "consequences" when deriving what the consequence of switching decision theories is.

"50:4" in the post refers to "P(V=1|A=100)*1 : P(V=100|A=100)*100", not "EV(A=1) : EV(A=100)". EV(A=1) is irrelevant, since we know that A is in fact 100.

0pinyaka
I think this confused me: I see that. Thanks.

IIUC this addresses the ontology problem in AIXI by assuming that the domain of our utility function already covers every possible computable ontology, so that whichever one turns out to be correct, we already know what to do with it. If I take the AIXI formalism to be a literal description of the universe (i.e. not just dualism, but also that the AI is running on a hypercomputer, the environment is running on a turing computer, and the utility function cares only about the environment, not the AI's internals), then I think the proposal works.

But under the...

As far as I can tell from wikipedia's description of admissibility, it makes the same assumptions as CDT: That the outcome depends only on your action and the state of the environment, and not on any other properties of your algorithm. This assumption fails in multi-player games.

So your quote actually means: If you're going to use CDT then Bayes is the optimal way to derive your probabilities.

The intuition: For a high dimensional ball, most of the volume is near the surface, and most of the surface is near the equator (for any given choice of equator). The extremity of "most" and "near" increases with number of dimensions. The intersection of two equal-size balls is a ball minus a slice through the equator, and thus missing most of its volume even if it's a pretty thin slice.

The calculation: Let v(n,r) = 2 ∫_{y=0}^{r} v(n−1, √(r²−y²)) dy = 2 π^(n/2) r^n / (n Γ(n/2)), which is the ...

2Wei Dai
Thanks for both the math and the intuitive explanation. Now I'm really curious what the right answer is to the physics question...

First consider a 10000-dimensional unit ball. If we shift this ball by two units in one of the dimensions, it would no longer intersect at all with the original volume. But if we were to shift it by 1/1000 units in each of the 10000 dimensions, the shifted ball would still mostly overlap with the original ball even though we've shifted it by a total of 10 units (because the distance between the centers of the balls is only sqrt(10000)/1000 = 0.1).

Actually no, it doesn't mostly overlap. If we consider a hypercube of radius 1 (displaced along the diagonal...
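
A numeric check of the non-overlap claim (my own illustration, using the standard hyperspherical-cap volume formula and assuming scipy is available):

from scipy.special import betainc

def overlap_fraction(n, d):
    # Fraction of a unit n-ball's volume shared with a unit n-ball whose
    # center is at distance d. The lens of intersection is two spherical
    # caps of height h = 1 - d/2, and a cap holds a volume fraction of
    # (1/2) * I_{2h-h^2}((n+1)/2, 1/2) (regularized incomplete beta).
    h = 1 - d / 2
    return betainc((n + 1) / 2, 0.5, 2 * h - h * h)

print(overlap_fraction(10000, 0.1))  # about 6e-7: almost no overlap
print(overlap_fraction(3, 0.1))      # about 0.93 in three dimensions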

2Wei Dai
Hmm, my intuition was that displacing an n-ball diagonally is equivalent to displacing it axially, and similar to displacing a hypercube axially. I could very well be wrong but I'd be interested to see how you calculated this.
pengvado160

you're too young (and didn't have much income before anyway) to have significant savings.

Err, I haven't yet earned as much from the lazy entrepreneur route as I would have if I had taken a standard programming job for the past 7 years (though I'll pass that point within a few months at the current rate). So don't go blaming my cohort's age if they haven't saved and/or donated as much as me. I'm with Rain in spluttering at how people can have an income and not have money.

pengvado350

I value my free time far too much to work for a living. So your model is correct on that count. I had planned to be mostly unemployed with occasional freelance programming jobs, and generally keep costs down.

But then a couple years ago my hobby accidentally turned into a business, and it's doing well. "Accidentally" because it started with companies contacting me and saying "We know you're giving it away for free, but free isn't good enough for us. We want to buy a bunch of copies." And because my co-founder took charge of the negotiati...

1A1987dM
I don't, either -- possibly because I've never been in real economic hardships; I think if I had grown up in a poorer family I probably would. (I do try to be frugal because so far I've lived almost exclusively on my parents' income and it seems unfair towards them to waste their money, though.)
5MixedNuts
Yeah, I know exactly who you are, I just didn't want to bust privacy or drop creepy hints. I didn't know that VideoLAN projects were financially independent of each other, so that explains where the profit comes from. It's just that I didn't expect two guys in a basement to make that much, and you're too young (and didn't have much income before anyway) to have significant savings. So there's more money in successful codecs than I guessed.
pengvado1400

I donated $20,000 now, in addition to $110,000 earlier this year.

MixedNuts100

Holy pickled waffles on a pogo stick. Thanks, dude.

Is there anything you're willing to say about how you acquired that dough? My model of you has earned less in a lifetime.

9Kawoomba
(At the time of this comment) 27 karma for a $20k donation, 13 karma for $250, 9 karma for $20 (and a joke) ... something's amiss with the karma-$ currency exchange rate!
lukeprog120

Thanks very much!!

6CronoDAS
Really?

On your account, how do you learn causal models from observing someone else perform an experiment? That doesn't involve any interventions or counterfactuals. You only see what actually happens, in a system that includes a scientist.

5IlyaShpitser
That depends what you mean by an "experiment." If you divide a set of patients into a control group and a test group, and then have the test group smoke a pack of cigarettes per day, that is an "experiment" to me, one that is represented by an intervention (because we are forcing the test group to smoke regardless of what they would naturally want to do). Observing that the test group is much more likely to develop cancer would lead me to conclude that the graph smoking -> cancer is a causal graph rather than merely a statistical graph.

If we do not perform the above experiment due to ethical reasons, but instead use observational data on smokers, we have to worry about confounders, like Fisher did. We also have to worry because we are implicitly linking that data with counterfactual situations (what would have happened if those guys we observed had been forced to smoke). This linking isn't "free"; there are assumptions operating in the background. Assumptions expressed in a language that can talk about counterfactual situations.

The idea that "it wouldn't be you" isn't something I thought would be a problem

It probably doesn't help that Celestia implies "it wouldn't be you" when explaining why Hanna uploaded. If the shut-down authority was tied to her biological body, then Celestia fails to say so, and talks instead about identity. If it was tied to her name, then conflating that with the uploading is misleading. If the point of uploading was to protect her against coercion, then that would be sufficient even without any change in authority, and "Hanna n...

3Ritalin
He's right, you know? That "Hanna is dead" line is kind of counter-productive. Did you really need to have Celestia act like such a dick to keep conflict and ambiguity?

Now, were their experiences real? Did we make them real by marking them with a 1 - by applying the logical filter using a causal computer?

You can apply the brute-force/postselection method to CGoL without timetravel too... But in that case verifying that a proposed history obeys the laws of CGoL involves all the same arithmetic ops as simulating forwards from the initial state. (The ops can, but don't have to, be in the same order.) Likewise if there are any linear-time subregions of CGoL+timetravel. So I might guess that the execution of such a filter ...
