The game theory textbook "A Course in Microeconomic Theory" (Kreps) addresses this situation. Quoting from page 516:
...We will give an exact analysis of this problem momentarily (in smaller type), but you should have no difficulty seeing the basic trade-off; too little punishment, triggered only rarely, will give your opponent the incentive to try to get away with the noncooperative strategy. You have to punish often enough and harshly enough so that your opponent is motivated to play [cooperate] instead of [defect]. But the more often/more harsh
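To make the quoted trade-off concrete, here is a minimal Monte Carlo sketch (my own toy illustration, not Kreps's analysis): you face an opponent who cooperates but, after observing what looks like a defection, defects for k rounds; your cooperative moves are misread as defections with probability eps. The payoff numbers, the noise rate, and the strategy details are arbitrary choices.

    import random

    # Prisoner's dilemma payoffs for you (row player); illustrative numbers.
    PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

    def average_payoff(my_move, k, eps=0.05, rounds=200000, seed=0):
        """Your average per-round payoff for always playing my_move ('C' or 'D')
        against an opponent who punishes observed defections for k rounds."""
        rng = random.Random(seed)
        punish = 0                      # opponent's remaining punishment rounds
        total = 0
        for _ in range(rounds):
            their_move = 'D' if punish > 0 else 'C'
            total += PAYOFF[(my_move, their_move)]
            # your cooperative move is misread as a defection with probability eps
            observed = 'D' if my_move == 'D' or rng.random() < eps else 'C'
            punish = punish - 1 if punish > 0 else (k if observed == 'D' else 0)
        return total / rounds

    for k in (1, 3, 10):
        print(k, average_payoff('C', k), average_payoff('D', k))
    # k = 1:  defecting pays at least as well, so the punishment does not deter.
    # k = 3:  cooperating pays more; the punishment deters defection.
    # k = 10: still deters, but noise-triggered punishments eat into the
    #         cooperative payoff, which is the other side of the trade-off.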
Back when Eliezer was writing his metaethics sequence, it would have been great to know where he was going, i.e., if he had posted ahead of time a one-paragraph technical summary of the position he set out to explain. Can you post such a summary of your position now?
Hmmmm. What do other people think of this idea?
I suspect one reason Eliezer did not do this is that when you make a long list of claims without any justification, it sounds silly and people don't pay attention to the rest of the sequence. But if you had first stepped them through the entire argument, they would have found no place at which they could really disagree. That's a concern, anyway.
Now, citing axioms and theorems to justify a step in a proof is not a mere social convention to make mathematicians happy. It is a useful constraint on your cognition, allowing you to make only inferences that are actually valid.
When you are trying to build up a new argument, temporarily accepting steps of uncertain correctness can be helpful (if mentally tagged as such). This strategy can move you out of local optima by prompting you to think about what further assumptions would be required to make the steps correct.
Techniques based on this kind of r...
As you wish: Drag the link on this page to your browser's bookmark bar. Clicking it on any page will turn all links black and remove the underlines, making links distinguishable from black plain text only through changes in mouse pointer style. Click again to get the original style back.
See also: A Universal Approach to Self-Referential Paradoxes, Incompleteness and Fixed Points, which treats the Liar's paradox as an instance of a generalization of Cantor's theorem (no onto mapping from N->2^N).
The best part of this unified scheme is that it shows that there are really no paradoxes. There are limitations. Paradoxes are ways of showing that if you permit one to violate a limitation, then you will get an inconsistent system. The Liar paradox shows that if you permit natural language to talk about its own truthfulness (as it - of course - does), then we will have inconsistencies in natural languages.
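For reference, the diagonal argument the paper generalizes is short (standard material, not a quote from the paper):

    There is no onto map $f : \mathbb{N} \to 2^{\mathbb{N}}$. Given any such $f$,
    define the diagonal set
    \[
      D = \{\, n \in \mathbb{N} : n \notin f(n) \,\}.
    \]
    If $f$ were onto, then $D = f(d)$ for some $d$, and
    \[
      d \in D \iff d \notin f(d) = D,
    \]
    a contradiction. The Liar sentence plays the role of $D$: it asserts its own
    falsehood, so a language containing an unrestricted truth predicate for itself
    runs into the same diagonal obstruction.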
Please stop commenting on this topic until you have understood more of what has been written about it on LW and elsewhere. Unsubstantiated proposals harm LW as a community. LW deals with some topics that look crazy on surface examination; you don't want people who dig deeper to stumble on comments like this and find actual crazy.
Similarly, inference (conditioning) is incomputable in general, even if your prior is computable. However, if you assume that observations are corrupted by independent, absolutely continuous noise, conditioning becomes computable.
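As a toy illustration of why absolutely continuous noise helps (my example, with arbitrary prior and noise level): the noise has a density we can evaluate, so conditioning on a noisy observation reduces to weighting prior samples by that density instead of conditioning on a measure-zero event.

    import math, random

    def sample_prior(rng):
        """Any computable prior sampler; here a two-component Gaussian mixture."""
        return rng.gauss(0.0, 1.0) if rng.random() < 0.5 else rng.gauss(3.0, 1.0)

    def noise_density(y, theta, sigma=0.5):
        """Density of the observation y = theta + Gaussian(0, sigma) noise."""
        return math.exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    def posterior_mean(y_obs, n=100000, seed=0):
        """Likelihood weighting: weight prior samples by the noise density."""
        rng = random.Random(seed)
        samples = [sample_prior(rng) for _ in range(n)]
        weights = [noise_density(y_obs, t) for t in samples]
        return sum(w * t for w, t in zip(weights, samples)) / sum(weights)

    print(posterior_mean(2.0))   # estimate of E[theta | noisy observation 2.0]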
Consider marginal utility. Many people are working on AI, machine learning, computational psychology, and related fields. Nobody is working on preference theory, the formal understanding of our goals under reflection. If you want to do interesting research and have the background to advance either of those fields, do you think the world will be better off with you on the one side or on the other?
Now suppose you are playing against another timeless decision theory agent. Clearly, the best strategy is to be that actor which defects no matter what. If both agents do this, the worst possible result for both of them occurs.
Which shows that defection was not the best strategy in this situation.
I was comparing the two choices faced by people who want to do inference in nontrivial models. You can either write the model in an existing probabilistic programming language and get inefficient inference for free, or you can write model+inference in something like Matlab. In the latter case, you may be able to use libraries if your model is similar enough to existing models, but for many interesting models this is not the case.
Current universal inference methods are very limited, so the main advantages of using probabilistic programming languages are (1) the conceptual clarity you get by separating generative model and inference and (2) the ability to write down complex nonparametric models and immediately be able to do inference, even if it's inefficient. Writing a full model+inference implementation in Matlab, say, takes much longer and is more confusing and less flexible.
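To illustrate (1): in a toy stand-in for this style (a sketch of the idea, not how any real system is implemented), the model is an ordinary program and a single generic procedure does the conditioning, inefficiently but with no model-specific inference code.

    import random

    def flip(p=0.5):
        return random.random() < p

    def model():
        """An arbitrary generative program; returns (latent, observable)."""
        rain = flip(0.2)
        sprinkler = flip(0.1)
        wet_grass = flip(0.99) if (rain or sprinkler) else flip(0.05)
        return rain, wet_grass

    def rejection_query(model, condition, n=100000):
        """Generic, model-agnostic conditioning by rejection sampling."""
        accepted = [latent for latent, obs in (model() for _ in range(n)) if condition(obs)]
        return sum(accepted) / len(accepted)

    # P(rain | grass is wet), with the model and the inference method fully separated
    print(rejection_query(model, condition=lambda wet: wet))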
That said, some techniques that were developed for particular classes of problems have a useful analog in the setting of programs. The gradient-based methods you mention have been generalized to work on any probabilistic program with continuous parameters.
Probabilistic inference in general is NP-hard, but it is not clear that (1) this property holds for the kinds of problems people are interested in and, even if it does, that (2) approximate probabilistic inference is hard for this class of problems. For example, if you believe this paper, probabilistic inference without extreme conditional probabilities is easy.
Combine this with speech-to-text transcription software and you get a searchable archive of your recorded interactions!
ETA: In theory. In practice, dictation software algorithms are probably not up to the task of turning noisy speech from different people into text with any reasonable accuracy.
The key idea behind Church and similar languages is that they allow us to express and formally reason about a large class of probabilistic models, many of which cannot be formalized in any concise way as Bayes nets.
Bayes nets express generative models, i.e. processes that generate data. To infer the states of hidden variables from observations, you condition the Bayes net and compute a distribution on the hidden variable settings using Bayesian inference or some approximation thereof. A particularly popular class of approximations is the class of sampling ...
The notion of abstract state machines may be useful for a formalization of operational equivalence of computations.
Your argument leaves out necessary steps. It is not a careful analysis and does not consider ways in which it might be mistaken; rather, it gives the impression that you wanted to reach your conclusion as quickly as possible.
There is, necessarily, absolutely no way to determine - given an algorithm - whether it is conscious or not. It is not even a formally undecidable statement!
It is unclear how this follows from anything you wrote.
consciousness refuses to be phrased formally (it is subjective, and computation is objective)
Consider tabooing words ...
From the document:
I suggest a synthesis between the approaches of Yudkowsky and de Garis.
Later, elaborating:
...Yudkowsky's emphasis on pristine best scenarios will probably fail to survive the real world precisely because evolution often proceeds by upsetting such scenarios. Yudkowsky's dismissal of random mutations or evolutionary engineering could thus become the source of the downfall of his approach. Yet de Garis's overemphasis on evolutionary unpredictability fails to account for the extent to which human intelligence itself is model for learning f
90% of spreadsheets contain errors.
Source (scroll down to the last line of the first spreadsheet)
Ask yourself: If the LW consensus on some question was wrong, how would you notice? How do you distinguish good arguments from bad arguments? Do your criteria for good arguments depend on social context in the sense that they might change if your social context changes?
Next, consider what you believe and why you think you believe it, applying the methods you just named. According to your criteria, are the arguments in favor of your beliefs strong, and the arguments against weak? Or do your criteria not discriminate between them? Do you have difficulty expl...
Comments on HN and LW result in immediate reward through upvoting and replies, whereas writing a book is a more solitary experience. If you identify this difference as a likely cause of your behavior and if you believe that the difference in value to you is as large as you say, then you should test this hypothesis by turning book-writing into a more interactive, immediately rewarding process. Blogging and sending pieces to friends as soon as they are written come to mind.
More generally, consider structuring your social environment such that social expectations and rewards line up with activities you consider valuable. I have found this to be a powerful way to change my behavior.
Meanwhile, there's something on-hand I could do that'd have 300 times the impact. For sure, almost certainly 300 times the impact, because I see some proven success in the 300x area, and the frittering-away-time area is almost certainly not going to be valuable.
Your post includes a "silly" and a business-scale example, but not a personal one. In order to answer the questions about causes that you ask, it seems necessary to look at specific situations. Is there a real-life situation that you can talk about where you have two options, one almost certainly hundreds of times as good as the other, and you choose the option that is worse?
I feel like a lot of us have those opportunities - we see that a place we're putting a small amount of effort is accounting for most of our success, but we don't say - "Okay, that area that I'm giving a little attention that's producing massive results? All attention goes there now."
If you are giving some area a little attention, this does not imply that more attention would get you proportionally better results; you may run into diminishing returns quickly. Of course, for any given situation, it is worth understanding whether this is the case or not.
The question is what causes this sensation that cryonics is a threat? What does it specifically threaten?
It doesn't threaten the notion that we will all die eventually. Accident, homicide, and war will remain possibilities unless we can defeat them, and suicide will always remain an option.
Even if cryonics does not in fact threaten the notion of eventual death, it might still cause the sensation that it poses this threat.
I use the word "prior" in the sense of priors as mathematical objects, meaning all of your starting information plus the way you learn from experience.
Nothing much happens to intelligent agents - because an intelligent agents' original priors mostly get left behind shortly after they are born - and get replaced by evidence-based probability estimates of events happening.
Your prior determines how evidence informs your estimates and what things you can consider at all. In order to "replace priors with evidence-based probability estimates of events", you need a notion of event, and that notion is determined by your prior.
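A toy example of what this means in practice (hypotheses and numbers are mine): updating only reweights hypotheses the prior already contains, so a hypothesis with prior probability zero can never be resurrected by evidence.

    # Prior over three coin hypotheses; 'biased_tails' gets prior probability 0.
    prior = {'fair': 0.5, 'biased_heads': 0.5, 'biased_tails': 0.0}
    p_heads = {'fair': 0.5, 'biased_heads': 0.9, 'biased_tails': 0.1}

    posterior = dict(prior)
    for outcome in ['T'] * 20:                      # twenty tails in a row
        for h in posterior:
            posterior[h] *= p_heads[h] if outcome == 'H' else 1 - p_heads[h]
        z = sum(posterior.values())
        posterior = {h: p / z for h, p in posterior.items()}

    print(posterior)   # 'biased_tails' is still at 0, despite strong evidence for it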
Intuitively, the notion of updating a map of a fixed reality makes sense, but in the context of decision-making, a formalization in full generality has so far proved elusive, and perhaps even unnecessary.
By making a choice, you control the truth value of certain statements—statements about your decision-making algorithm and about mathematical objects depending on your algorithm. Only some of these mathematical objects are part of the "real world". Observations affect what choices you make ("updating is about following a plan"), but you must have decided b...
In my experience, academics often cannot distinguish between SIAI and Kurzweil-related activities such as the Singularity University. With its 25k tuition for two months, SU is viewed as some sort of scam, and Kurzweilian ideas of exponential change are seen as naive. People hear about Kurzweil, SU, the Singularity Summit, and the Singularity Institute, and assume that the latter is behind all those crazy singularity things.
We need to make it easier to distinguish the preference and decision theory research program as an attempt to solve a hard problem fr...
Fodor's arguments for a "language of thought" make sense (see his book of the same name). In a nutshell, thought seems to be productive (out of given concepts, we can always construct new ones, e.g. arbitrary nestings of "the mother of the mother of ..."), systematic (knowing certain concepts automatically brings the ability to construct others; knowing the concepts "child" and "wild", I can also represent "wild child"), and compositional (e.g. the meaning of "wild child" is a function of the meanings of "wild" and "child").
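As a rough computational gloss of these properties (my toy rendering, not Fodor's own formalism):

    # Productivity: given concepts compose into ever-new concepts by nesting.
    def mother_of(x):
        return f"the mother of {x}"

    concept = "Ann"
    for _ in range(3):
        concept = mother_of(concept)
    print(concept)          # the mother of the mother of the mother of Ann

    # Compositionality (and, with it, systematicity): the meaning of "wild child"
    # is a function of the meanings of "wild" and "child", crudely modeled here
    # as predicate conjunction.
    def modify(adj, noun):
        return lambda x: adj(x) and noun(x)

    wild = lambda x: x in {"Max"}
    child = lambda x: x in {"Max", "Lily"}
    wild_child = modify(wild, child)
    print(wild_child("Max"), wild_child("Lily"))   # True False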
Since I never described a way of extracting preference from a human (and hence defining it for a FAI), I'm not sure where do you see the regress in the process of defining preference.
Reading your previous post in this thread, I felt like I was missing something and I could have asked the question Wei Dai asked ("Once we implement this kind of FAI, how will we be better off than we are today?"). You did not explicitly describe a way of extracting preference from a human, but phrases like "if you manage to represent your preference in terms...
There is also Shades, which lets you set a tint color and which provides a slider so you can move gradually between standard and tinted mode.
My conclusion from this discussion is that our disagreement lies in how probable we each think it is that uploads can be safely applied to FAI rather than generating additional existential risk. I do not see how to resolve this disagreement right now. I agree with your statement that we need to make sure that those involved in running uploads understand the problem of preserving human preference.
People have very feeble understanding of their own goals. Understanding is not required. Goals can't be given "from the outside", goals are what system does.
Even if we have little insight into our goals, it seems plausible that we frequently do things that are not conducive to our goals. If this is true, then in what sense can it be said that a system's goals are what it does? Is the explanation that you distinguish between preference (goals the system would want to have) and goals that it actually optimizes for, and that you were talking about the latter?
The first option tries to capture our best current guess as to our fundamental preference. It then updates the agent (us) based on that guess.
This guess may be awful. The process of emulation and attempts to increase the intelligence of the emulations may introduce subtle psychological changes that could affect the preferences of the persons involved.
For subsequent changes based on "trying to evolve towards what the agent thinks is its exact preference" I see two options: Either they are like the first change, open to the possibility of being ...
It's not clear to me that this is the only way to evaluate my claim, or that it is even a reasonable way. My understanding of FAI is that arriving at such a resolution of human preferences is a central ingredient in building an FAI; hence, using your method to evaluate my claim would require more progress on FAI.
If your statement ("The route of WBE simply takes the guess work out") were a comparison between two routes similar in approach, e.g. WBE and neuroenhancement, then you could argue that a better formal understanding of preference would ...
—Mike Sinnett, Boeing's 787 chief project engineer