I would suggest changing this system by defining to mean that no is the Gödel number of a proof of an inconsistency in ZFC (instead of just asserting that isn't). The purpose of this is to make it so that if ZFC were inconsistent, then we only end up talking about a finite number of levels of truth predicate. More specifically, I'd define to be PA plus the axiom schema
Then, it seems that Jacob Hilton's proof that the waterfalls are consistent goes through for this waterfall:
Work in ZFC and assume that ZFC is i
...Hm; we could add an uninterpreted predicate symbol to the language of arithmetic, and let and . Then, it seems like the only barrier to recursive enumerability of is that 's opinions about aren't computable; this seems worrying in practice, since it seems certain that we would like logical uncertainty to be able to reason about the values of computations that use more resources than we use to compute our own probability estimates. But on the other hand, all of this makes this sound like an issue of self-reference,
...The other direction follows from the fact that the algorithm is bounded, and PA can simply show the execution trace of in steps.
Unimportant technical point: I think the length of the PA proof grows faster than this. (More precisely, the length in symbols, rather than the length in number of statements; we're almost always interested in the former, since it determines how quickly a proof can be checked or be found by exhaustive search.) The obvious way of showing in PA is to successively show for higher and higher that "after ticks, the T
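(To make the growth rate concrete, here is a rough back-of-the-envelope estimate of my own, under the assumption that the proof writes out the machine's full configuration after each step: the configuration after k steps can take on the order of k symbols to write down, since the machine may have visited that many tape cells by then, so a trace-style proof of a T-step run can need something like

$$\sum_{k \le T} O(k) \;=\; O(T^2)$$

symbols, even though it uses only O(T) proof lines. The exact exponent depends on how configurations are encoded, but the symbol count grows faster than the statement count.)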
...Next, we consider the case that PA is consistent and work through the agent’s decision. PA can’t prove , since we used the chicken rule; so, since the sentence is easily provable, the sentence (i.e., the first sentence that the agent checks for proofs of) must be unprovable.
It seems like this argument needs soundness of PA, not just consistency of PA. Do you see a way to prove in PA that if , then PA is inconsistent?
[edited to add:] However, your idea reminds me of my post on the odd counterfactuals of playing chicken
...If you replace the inner "" by "", then the literal thing you wrote follows from the reflection principle: Suppose that the outer probability is . Then
Now, implies , which by the converse of the outer reflection principle yields , whence . Now, by the forward direction of the outer reflection principle, we have
which, by the outer reflection principle again, impl
...This is very interesting!
My main question, which I didn't see an answer to on my first read, is: If the agent proves that action leads to the goal being achieved, are there any conditions under which this implies that the goal actually is achieved? The procrastination paradox shows that this isn't true in general. Is there a class of sentences (e.g., all sentences, though I don't expect that to be true in this case) such that if then ? In other words, do we have some guarantee that we do better at actually achieving than an agent which us
...Categorization is hard! :-) I wanted to break it up because long lists are annoying to read, but there was certainly some arbitrariness in dividing it up. I've moved "resource gathering agent" to the odds & ends.
Want to echo Nate's points!
One particular thing that I wanted to emphasize, which I think you can see as a thread running through this forum (in particular, the modal UDT work is relevant), is that it's useful to make formal toy models where the math is fully specified, so that you can prove theorems about what exactly an agent would do (or, sometimes, write a program that figures it out for you). When you write things out that explicitly, then, for example, it becomes clearer that you need to assume that a decision problem is "fair" (extensional) to get certain result
...Some additions about how the initial system is going to work:
Non-member contributions (comments and links) are going to become publicly visible when they have received 2 likes (from members---only members can like things).
However, members will be able to reply to a contribution as soon as it has received 1 like. This means that if you think someone's made a useful contribution, and you want to reply to them about it, you don't have to wait for a second person to like it before you can write that reply. (It won't be publicly visible until the contribution has two likes, though.)
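A minimal sketch of these rules as code (my own illustration of the thresholds described above, not the forum's actual implementation):

```python
# Visibility rules as described above (illustrative only).

def publicly_visible(member_likes: int) -> bool:
    # Non-member contributions become publicly visible at 2 likes from members.
    return member_likes >= 2

def members_can_reply(member_likes: int) -> bool:
    # Members can already reply once a contribution has 1 like,
    # even though it is not yet publicly visible.
    return member_likes >= 1
```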
Yeah, that sounds good! Of course, by the Kripke levels argument, it's sufficient to consider 's of the form . And we might want to have a separate notion of " leads to at level ", which we can actually implement in a finite modal formula. This seems to suggest a version of modal UDT that tries to prove things in PA, then if that has ambiguous counterfactuals (i.e., it can't prove for any ) we try PA+1 and so on up to some finite ; then we can hope these versions of UDT approximate optimality according to your revised version of " leads t
...Fixed---thanks, Patrick!
Regarding the example, I earlier defined "action leads to outcome " to mean the conjunction of and ; i.e., we check for spurious counterfactuals before believing that tells us something about what action leads to, and we only consider ourselves "fully informed" in this sense if we have non-spurious information for each . (Of course, my follow-up post is about how that's still unsatisfactory; the reason to define this notion of "fully informative" so explicitly was really to be able to say more clearly i
...(Thanks for the feedback!)
In our chat about this, Eliezer said that, aside from the difficulty of making a human-level mathematical philosophy engine aligned with our goals, another significant disagreement with Paul is that Eliezer thinks it's likely that we'll use low-level self-improvement on the road to human-level AI; he used the analogies of programmers using compilers instead of writing machine code directly, and of EURISKO helping Lenat. (Again, hoping I'm not misrepresenting.)
This seems like a plausible scenario to me, but I'm not convinced it argues for the
...though: while I'm guessing the variant of the reflective oracle discussed in the comment thread may be approximable, it seems less likely that an approximable version of AIXI can be defined based on it.
Sorry for the delayed reply! I like this post, and agree with much of what you're saying. I guess I disagree with the particular line you draw, though; I think there's an interesting line, but it's at "is there a Turing machine which will halt and give the answer to this problem" rather than "is there a Turing machine which will spit out the correct answer for a large enough input (but will spit out wrong answers for smaller inputs, and you don't know what number is large enough)". The latter of these doesn't seem that qualitatively different to me from "i
...I think the main disagreement is about whether it's possible to get an initial system which is powerful in the ways needed for your proposal and which is knowably aligned with our goals; some more about this in my reply to your post, which I've finally posted, though there I mostly discuss my own position rather than Eliezer's.
Actually, drnickbone's original LessWrong post introducing evil problems also gives an extension to the case you are considering: The evil decision problem gives the agent three or more options, and rewards the one that the "victim" decision theory assigns the least probability to (breaking ties lexicographically). Then, no decision theory can put probability on the action that is rewarded in its probabilistic evil problem.
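For concreteness, here is the pigeonhole bound behind this (in my notation, with n ≥ 3 options and the victim playing the mixed strategy (p_1, ..., p_n)): the rewarded action is one minimizing p_i (lexicographic tie-break), and

$$\min_i p_i \;\le\; \frac{1}{n} \sum_{i=1}^{n} p_i \;=\; \frac{1}{n} \;\le\; \frac{1}{3},$$

so the victim never puts more than 1/3 of its probability on the action that its own evil problem rewards.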
True. This looks to me like an effect of EDT not being stable under self-modification, although here the issue is handicapping itself through external means rather than self-modification---like, if you offer a CDT agent a potion that will make it unable to lift more than one box before it enters Newcomb's problem (i.e., before Omega makes its observation of the agent), then it'll cheerfully take it and pay you for the privilege.
Thread for proofs of results claimed above (not limited to stuff not found in Lindström); contributions appreciated. Stuff not in Lindström includes the uniqueness of arithmetic fixed points in PA (found in the modal agents paper), and I think the version of the fixed point theorem with more than one (the latter should be provable by iterated application of the fixed point theorem for a single ).
Thanks for expanding on your construction! I hadn't thought of the recursive construction, that's really neat.
I'm not that worried about the application to AIXI: unless I'm missing something, we can just additionally give our machines access to a double halting oracle (for ordinary Turing machines), and recover the same power. I considered doing that and didn't go with it because the stuff in my post seemed slightly more elegant, but if there's a reason to prefer the other version, it seems fine to use it.
I'm not clear on what you mean by "
...Thanks! I didn't really think at all about whether or not "money-pump" was the appropriate word (I'm not sure what the exact definition is); have now changed "way to money-pump EDT agents" into "way to get EDT agents to pay you for managing the news for them".
About determining whether something's a query: A formulation of the statement we want to test is: (A) "for every (infinite) input on the oracle tape, and every , there is a such that with probability , the program halts in steps."
I think this is equivalent to: (B) "For every , there is a such that, for any input on the oracle tape, the program halts within timesteps with probability ."
Reason why I think this is equivalent: Clearly (B) implies (A). Suppose that (B) is false; then
...In the other direction, suppose that is a primitive recursive predicate, and consider the following probabilistic Turing machine: First, randomly choose an (placing positive probability on every natural number). Then, search exhaustively for an such that is true. If you find one, halt and output . This machine is a query if and only if . Together with the parent comment, this shows that having an oracle that tells you whether or not something is a query is equivalent to having an oracle for statements (i.e., a double h
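Here is a runnable sketch of that machine (the predicate `phi` below is a placeholder I'm supplying so the sketch runs; any primitive recursive predicate can be substituted):

```python
import random

def phi(n: int, m: int) -> bool:
    # Placeholder for the primitive recursive predicate in question;
    # this particular choice is arbitrary and only here for illustration.
    return m >= n * n

def machine() -> int:
    # Randomly choose n, putting positive probability on every natural number
    # (here via a geometric distribution).
    n = 0
    while random.random() < 0.5:
        n += 1
    # Search exhaustively for an m with phi(n, m); halt and output it if found.
    # (The exact output doesn't matter for whether the machine halts.)
    m = 0
    while not phi(n, m):
        m += 1
    return m

# The machine halts with probability 1 just in case every n has some m with
# phi(n, m); if some n has no such m, that n is chosen with positive
# probability and the search loops forever. So asking "is this a query?"
# amounts to asking whether the "for all n there exists m" statement holds.
```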
...Aagh... I'm conflicted. I really like the simplicity of having players correspond 1:1 to calls to the oracle, and their mixed strategies to the probability that the oracle returns "true", but the implication that the oracle must return the same answer if it's called twice on the same arguments during the execution of a single program interacts very badly with the intuitive picture of the applications I have in mind.
The intuitive picture is that the agents under consideration are living in a world whose laws of physics are probabilistic and allow for the co
...Fixed the typo, thanks!
I considered describing the probabilities of the oracle returning "true" by a different function , but it seemed too pedantic to have a different letter. Maybe that's wrong, but it still feels too pedantic. If I do things that way I probably shouldn't be writing " returns 'true' if...", though...
Thanks! I've been also thinking about making a finite version; my main motivation was that I've been hoping to take my still-incompletely-specified AIXI variant and find an analog variant of AIXItl, the computable version of AIXI that considers only hypotheses which terminate in time and whose source length is .
I think that would in fact be a pretty reasonable finite set of programs to use in what you're suggesting, since "we want it to terminate in time " provides a motivation for why we would require programs to terminate surely, as opposed to al
...Hm, I think it would be helpful in expectation to explain things more, but it would also take more time, and aiming much lower would take much more time still. (That's part of why it's taking me so long to turn these things into research papers; there, the length limits make the constraint even harder to satisfy.) At the moment, I'm trying to get the things I'm working on out there at all, especially with something like this, where I haven't yet worked through all the details of whether I can actually do my intended application based on it. After all
...A model of UDT with a halting oracle searches only for one utility value for each action. I'm guessing the other formulation just wasn't obvious at the time? (I don't remember realizing the possibility of playing chicken implicitly before Will Sawin advertised it to me, though I think he attributed it to you.)
Thanks! :-)
The reflective assignments correspond very directly to Nash equilibria (albeit in an -to- manner, because given any finite game, a reflective assignment contains information about that game but also information about many other things). So I wouldn't quite say that they correspond to methods of equilibrium selection---e.g., if two methods of equilibrium selection give you the same Nash equilibrium, they're not distinguishable on the level of reflective assignments. But yeah, the question of what equilibrium gets played gets outsourced to the o
...However, the approach is designed so that these "spurious" logical implications are unprovable, so they don’t interfere with decision-making. The proof of that is left as an easy exercise.
I don't think this is technically true as stated; it seems to be possible that the agent proves some spurious counterfactuals as long as the outcome it does in fact obtain is the best possible one. (This is of course harmless!) Say the agent has two possible actions, and , leading to outcomes and , respectively. The latter is preferred, and thes
...I know this is supposed to be just introductory, but I actually think that the complete reformulation of UDT-with-a-halting-oracle in terms of modal logic is really interesting! For starters, it allows us to compare UDT and modal agents in the same framework (with the right 's, we can see this version of UDT as a modal agent). It would also be really neat if we could write an "interpreter" that allows us to write UDT as a program calling a halting oracle, and then evaluate what it does by way of modal logic.
But also, it allows us to give a nice definition
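Relatedly, here is a minimal sketch of the "evaluate it by way of modal logic" idea for the special case of letterless (closed) formulas of GL, using the standard fact that such a formula is a GL theorem exactly when it is true at every world of the linear Kripke frame 0 < 1 < 2 < ... (where box A holds at world k iff A holds at every world below k), and that checking worlds up to the formula's modal depth suffices. The formula encoding is my own ad-hoc choice, and this is only the evaluation step, not the compilation of UDT into such formulas:

```python
from typing import Tuple, Union

# Letterless modal formulas: "bot", ("not", A), ("and", A, B), ("box", A).
Formula = Union[str, Tuple]

def depth(f: Formula) -> int:
    # Modal nesting depth of the formula.
    if f == "bot":
        return 0
    if f[0] == "not":
        return depth(f[1])
    if f[0] == "and":
        return max(depth(f[1]), depth(f[2]))
    if f[0] == "box":
        return 1 + depth(f[1])
    raise ValueError(f)

def true_at(f: Formula, world: int) -> bool:
    # Truth at a world of the linear frame; box(A) holds iff A holds at all lower worlds.
    if f == "bot":
        return False
    if f[0] == "not":
        return not true_at(f[1], world)
    if f[0] == "and":
        return true_at(f[1], world) and true_at(f[2], world)
    if f[0] == "box":
        return all(true_at(f[1], w) for w in range(world))
    raise ValueError(f)

def gl_proves(f: Formula) -> bool:
    # GL proves a letterless formula iff it holds at worlds 0 .. depth(f).
    return all(true_at(f, w) for w in range(depth(f) + 1))

# Example: GL proves box(bot) -> box(box(bot)), written here as
# not(box(bot) and not(box(box(bot)))) ...
assert gl_proves(("not", ("and", ("box", "bot"), ("not", ("box", ("box", "bot"))))))
# ... but GL does not prove box(box(bot)):
assert not gl_proves(("box", ("box", "bot")))
```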
...Ask yourself if you would want to revive someone frozen 100 years ago. Yes. They don't deserve to die. Kthx next.
I wish that this were on Less Wrong, so that I could vote this up.
Garrett, since Anonymous's reply was a little implicit: the point is that infants have a larger chance of dying before reproducing than young adults do, so expected number of future offspring increases during childhood (counting, at each age, only children who are still alive).
Aron, almost; it's because they get older, and only future children are relevant. Whether they've had children won't change the value except insofar as it changes the chance of future children.
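A toy numerical illustration of the point (every number here is invented): conditional on being alive at a given age, expected future offspring rises through childhood simply because more of the pre-reproductive mortality risk is already behind you.

```python
# Hypothetical survival probabilities from each listed age to the next, and a
# hypothetical expected number of offspring once maturity (age 20) is reached.
survival_to_next_age = {0: 0.80, 5: 0.95, 10: 0.98, 15: 0.99}
expected_offspring_at_maturity = 3.0
ages = [0, 5, 10, 15, 20]

def expected_future_offspring(age: int) -> float:
    # Probability of surviving from `age` to maturity, times expected
    # offspring at maturity.
    p_survive = 1.0
    for a in ages[:-1]:
        if a >= age:
            p_survive *= survival_to_next_age[a]
    return p_survive * expected_offspring_at_maturity

for age in ages:
    print(age, round(expected_future_offspring(age), 2))
# Rises with age: 0 -> 2.21, 5 -> 2.77, 10 -> 2.91, 15 -> 2.97, 20 -> 3.0
```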
Me: ...so IIUC, we expect a large influence of random variation in the sample.
Bzzzt! Wrong.
Upon m...
Eliezer, right, thanks. And I hadn't noticed about the correlations of the subcategories...
Might we get an even higher correlation if we tried to take into account the reproductive opportunity cost of raising a child of age X to independent maturity, while discarding all sunk costs to raise a child to age X?
I haven't done the math, but my intuition says that upon observing the highest! correlation! ever!, surely our subjective probability must shift towards both a high true underlying correlation and our having picked a sample with a particularly high correlation? (Conditioning on the paper not being wrong due to human error or being fake, of course -- I don't ...
Any chance you'd consider installing jsMath? (A client-side library that renders LaTeX math. Formatting math in plain HTML won't kill you, but there are other things you can do with the same amount of effort that will make you stronger still :-))
We might even cooperate in the Prisoner's Dilemma. But we would never be friends with them. They would never see us as anything but means to an end. They would never shed a tear for us, nor smile for our joys. And the others of their own kind would receive no different consideration, nor have any sense that they were missing something important thereby.
...but beware of using that as a reason to think of them as humans in chitin exoskeletons :-)
Robin, I suspect that despite how it may look from a high level, the lives of most of the people you refer to probably do differ enough from year to year that they will in fact have new experiences and learn something new, and that they would in fact find it unbearable if their world were so static as to come even a little close to being video game repetitive.
That said, I would agree that many people seem not to act day-to-day as if they put a premium on Eliezer-style novelty, but that seems like it could be better explained by Eliezer's boredom being a FAR value than by the concept being specific to Eliezer :-)
thinks -- Okay, so if I understand you correctly now, the essential thing I was missing that you meant to imply was that the utility of living forever must necessarily be equal to (cannot be larger than) the limit of the utilities of living a finite number of years. Then, if u(live forever) is finite, p times the difference between u(live forever) and u(live n years) must become arbitrarily small, and thus, eventually smaller than q times the difference between u(live n years) and u(live googolplex years). You then arrive at a contradiction, from which you...
Given how many times Eliezer has linked to it, it's a little surprising that nobody seems to have picked up on this yet, but the paragraph about the utility function not being up for grabs seems to have a pretty serious technical flaw:
There is no finite amount of life lived N where I would prefer a 80.0001% probability of living N years to an 0.0001% chance of living a googolplex years and an 80% chance of living forever. This is a sufficient condition to imply that my utility function is unbounded.
Let p = 80% and let q be one in a million. I'm pretty...
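To spell out the kind of bounded counterexample this points at (my own numbers, and only a sketch): take a bounded utility function whose value for living forever is strictly above the limit of its values for finite lifespans, say

$$u(\text{live } N \text{ years}) = 1 - 2^{-N}, \qquad u(\text{live forever}) = 2, \qquad u(\text{death}) = 0.$$

Then with p = 80% and q = one in a million, for every finite N,

$$q\,u(\text{googolplex years}) + p\,u(\text{live forever}) \;\ge\; 0.8 \cdot 2 \;=\; 1.6 \;>\; 0.800001 \;\ge\; (p+q)\,u(\text{live } N \text{ years}),$$

so the quoted preference holds even though u is bounded; the argument only forces unboundedness if one additionally assumes that u(live forever) cannot exceed the limit of the finite values.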
(To be clear, the previous comment was meant as a joke, not as a serious addition to the list -- at least not as it stands :-))
You give an absolute train wreck of a purported definition, then do your best to relive the crash over and over. Intelligence is not merely objective reason, but includes nonlinear subconscious processing, intuition, and emotional intelligence. Therefore, AGI needs quantum computing.
On a whimsical note, it is reminiscent of the unpredictability of the Infinite Improbability Drive :-)
Bogdan's presented almost exactly the argument that I too came up with while reading this thread. I would choose the specks in that argument and also in the original scenario (as long as I am not committing to the same choice being repeated an arbitrary number of times, and I am not causing more people to crash their cars than I cause not to crash their cars; the latter seems like an unlikely assumption, but thought experiments are allowed to make unlikely assumptions, and I'm interested in the moral question posed when we accept the assumption). Based on ...
Gawk! "even if they had to deal with a terrorist attack on all of these branches, say" was supposed to come after "Surely everybody here would find an outcome undesirable where all of their future Everett branches wink out of existence." (The bane of computers. On a typewriter, this would not have happened.)
Richard, I am going to assume ... that you assign an Everett branch in which you painlessly wink out of existence a value of zero (neither desirable nor undesirable)
I'd rather say that people who find quantum suicide desirable have a utility function that does not decompose into a linear combination of individual utility functions for their individual Everett branches-- even if they had to deal with a terrorist attack on all of these branches, say. Surely everybody here would find an outcome undesirable where all of their future Everett branches wink out of e...
Two points I'd like to comment on.
Re: The second scientist had more information
I don't think this is relevant if-- as I understood from the description-- the first scientist's theory predicted experiments 11..20 with high accuracy. In this scenario, I don't think the first scientist should have learned anything that would make them reject their previous view. This seems like an important point. (I think I understood this from Tyrrell's comment.)
Re: Theories screen off theorists
I agree-- we should pick the simpler theory-- if we're able to judge them for sim...
Tyrrell, um. If "the ball will be visible" is a better theory, then "we will observe some experimental result" would be an even better theory?
Solomonoff induction, the induction method based on Kolmogorov complexity, requires the theory (program) to output the precise experimental results of all experiments so far, and in the future. So your T3 would not be a single program; rather, it would be a set of programs, each encoding specifically one experimental outcome consistent with "the ball is visible." (Which gets rid of the problem that "we will observe some experimental result" is the best possible theory :))
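A toy illustration of that last point (not actual Solomonoff induction; the "universal machine" below is a stand-in I made up so the example runs): each bit-string "program" predicts one definite outcome sequence and gets weight 2^(-length), and a vague statement like "the ball is visible" picks out a set of programs whose total weight we sum.

```python
from itertools import product

def toy_universal_machine(program: str) -> str:
    # Stand-in for a universal machine: each bit-string program deterministically
    # predicts a sequence of outcomes ('V' = ball visible, 'H' = ball hidden).
    return "".join("V" if b == "1" else "H" for b in program)

def weight(program: str) -> float:
    # Shorter programs get exponentially more prior weight.
    return 2.0 ** (-len(program))

max_len = 6
programs = ["".join(bits)
            for n in range(1, max_len + 1)
            for bits in product("01", repeat=n)]

total = sum(weight(p) for p in programs)
# The vague "theory" that the first observation is 'visible' is the set of all
# programs whose predictions are consistent with that, not a single program.
vague = sum(weight(p) for p in programs
            if toy_universal_machine(p).startswith("V"))
print(vague / total)  # 0.5 in this toy setup
```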
We should be more careful, though, about what we mean by saying that φ(x) only depends on Tr_m for m > n, since this cannot be a purely syntactic criterion if we allow quantification over the subscript (as I did here). I'm pretty sure that something can be worked out, but I'll leave it for the moment.