The game theory textbook "A Course in Microeconomic Theory" (Kreps) addresses this situation. Quoting from page 516:
...We will give an exact analysis of this problem momentarily (in smaller type), but you should have no difficulty seeing the basic trade-off; too little punishment, triggered only rarely, will give your opponent the incentive to try to get away with the noncooperative strategy. You have to punish often enough and harshly enough so that your opponent is motivated to play [cooperate] instead of [defect]. But the more often/more harsh
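To make the quoted trade-off concrete, here is a minimal Monte Carlo sketch (my own toy illustration, not Kreps's analysis): you face an opponent who cooperates but, after observing what looks like a defection, defects for k rounds; your cooperative moves are misread as defections with probability eps. The payoff numbers, the noise rate, and the strategy details are arbitrary choices.

    import random

    # Prisoner's dilemma payoffs for you (row player); illustrative numbers.
    PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

    def average_payoff(my_move, k, eps=0.05, rounds=200000, seed=0):
        """Your average per-round payoff for always playing my_move ('C' or 'D')
        against an opponent who punishes observed defections for k rounds."""
        rng = random.Random(seed)
        punish = 0                      # opponent's remaining punishment rounds
        total = 0
        for _ in range(rounds):
            their_move = 'D' if punish > 0 else 'C'
            total += PAYOFF[(my_move, their_move)]
            # your cooperative move is misread as a defection with probability eps
            observed = 'D' if my_move == 'D' or rng.random() < eps else 'C'
            punish = punish - 1 if punish > 0 else (k if observed == 'D' else 0)
        return total / rounds

    for k in (1, 3, 10):
        print(k, average_payoff('C', k), average_payoff('D', k))
    # k = 1:  defecting pays at least as well, so the punishment does not deter.
    # k = 3:  cooperating pays more; the punishment deters defection.
    # k = 10: still deters, but noise-triggered punishments eat into the
    #         cooperative payoff, which is the other side of the trade-off.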
Back when Eliezer was writing his metaethics sequence, it would have been great to know where he was going, i.e., if he had posted ahead of time a one-paragraph technical summary of the position he set out to explain. Can you post such a summary of your position now?
Hmmmm. What do other people think of this idea?
I suspect one reason Eliezer did not do this is that when you make a long list of claims without any justification, it sounds silly and people don't pay attention to the rest of the sequence. But if you had first stepped them through the entire argument, they would have found no place at which they could really disagree. That's a concern, anyway.
Now, citing axioms and theorems to justify a step in a proof is not a mere social convention to make mathematicians happy. It is a useful constraint on your cognition, allowing you to make only inferences that are actually valid.
When you are trying to build up a new argument, temporarily accepting steps of uncertain correctness can be helpful (if mentally tagged as such). This strategy can move you out of local optima by prompting you to think about what further assumptions would be required to make the steps correct.
Techniques based on this kind of r...
As you wish: Drag the link on this page to your browser's bookmark bar. Clicking it on any page will turn all links black and remove the underlines, making links distinguishable from black plain text only through changes in mouse pointer style. Click again to get the original style back.
See also: A Universal Approach to Self-Referential Paradoxes, Incompleteness and Fixed Points, which treats the Liar's paradox as an instance of a generalization of Cantor's theorem (no onto mapping from N->2^N).
The best part of this unified scheme is that it shows that there are really no paradoxes. There are limitations. Paradoxes are ways of showing that if you permit one to violate a limitation, then you will get an inconsistent system. The Liar paradox shows that if you permit natural language to talk about its own truthfulness (as it - of course - does), then we will have inconsistencies in natural languages.
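For reference, the diagonal argument the paper generalizes is short (standard material, not a quote from the paper):

    There is no onto map $f : \mathbb{N} \to 2^{\mathbb{N}}$. Given any such $f$,
    define the diagonal set
    \[
      D = \{\, n \in \mathbb{N} : n \notin f(n) \,\}.
    \]
    If $f$ were onto, then $D = f(d)$ for some $d$, and
    \[
      d \in D \iff d \notin f(d) = D,
    \]
    a contradiction. The Liar sentence plays the role of $D$: it asserts its own
    falsehood, so a language containing an unrestricted truth predicate for itself
    runs into the same diagonal obstruction.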
Please stop commenting on this topic until you have understood more of what has been written about it on LW and elsewhere. Unsubstantiated proposals harm LW as a community. LW deals with some topics that look crazy on surface examination; you don't want people who dig deeper to stumble on comments like this and find actual crazy.
Similarly, inference (conditioning) is incomputable in general, even if your prior is computable. However, if you assume that observations are corrupted by independent, absolutely continuous noise, conditioning becomes computable.
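As a toy illustration of why absolutely continuous noise helps (my example, with arbitrary prior and noise level): the noise has a density we can evaluate, so conditioning on a noisy observation reduces to weighting prior samples by that density instead of conditioning on a measure-zero event.

    import math, random

    def sample_prior(rng):
        """Any computable prior sampler; here a two-component Gaussian mixture."""
        return rng.gauss(0.0, 1.0) if rng.random() < 0.5 else rng.gauss(3.0, 1.0)

    def noise_density(y, theta, sigma=0.5):
        """Density of the observation y = theta + Gaussian(0, sigma) noise."""
        return math.exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    def posterior_mean(y_obs, n=100000, seed=0):
        """Likelihood weighting: weight prior samples by the noise density."""
        rng = random.Random(seed)
        samples = [sample_prior(rng) for _ in range(n)]
        weights = [noise_density(y_obs, t) for t in samples]
        return sum(w * t for w, t in zip(weights, samples)) / sum(weights)

    print(posterior_mean(2.0))   # estimate of E[theta | noisy observation 2.0]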
Consider marginal utility. Many people are working on AI, machine learning, computational psychology, and related fields. Nobody is working on preference theory, the formal understanding of our goals under reflection. If you want to do interesting research and have the background to advance either of those fields, do you think the world will be better off with you on the one side or on the other?
Now suppose you are playing against another timeless decision theory agent. Clearly, the best strategy is to be that actor which defects no matter what. If both agents do this, the worst possible result for both of them occurs.
Which shows that defection was not the best strategy in this situation.
I was comparing the two choices faced by people who want to do inference in nontrivial models. You can either write the model in an existing probabilistic programming language and get inefficient inference for free, or you can write model+inference in something like Matlab. In the latter case, you may be able to use libraries if your model is similar enough to existing models, but for many interesting models this is not the case.
Current universal inference methods are very limited, so the main advantages of using probabilistic programming languages are (1) the conceptual clarity you get by separating generative model and inference and (2) the ability to write down complex nonparametric models and immediately be able to do inference, even if it's inefficient. Writing a full model+inference implementation in Matlab, say, takes much longer and is more confusing and less flexible.
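To illustrate (1): in a toy stand-in for this style (a sketch of the idea, not how any real system is implemented), the model is an ordinary program and a single generic procedure does the conditioning, inefficiently but with no model-specific inference code.

    import random

    def flip(p=0.5):
        return random.random() < p

    def model():
        """An arbitrary generative program; returns (latent, observable)."""
        rain = flip(0.2)
        sprinkler = flip(0.1)
        wet_grass = flip(0.99) if (rain or sprinkler) else flip(0.05)
        return rain, wet_grass

    def rejection_query(model, condition, n=100000):
        """Generic, model-agnostic conditioning by rejection sampling."""
        accepted = [latent for latent, obs in (model() for _ in range(n)) if condition(obs)]
        return sum(accepted) / len(accepted)

    # P(rain | grass is wet), with the model and the inference method fully separated
    print(rejection_query(model, condition=lambda wet: wet))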
That said, some techniques that were developed for particular classes of problems have a useful analog in the setting of programs. The gradient-based methods you mention have been generalized to work on any probabilistic program with continuous parameters.
Probabilistic inference in general is NP-hard, but it is not clear that (1) this property holds for the kinds of problems people are interested in and, even if it does, that (2) approximate probabilistic inference is hard for this class of problems. For example, if you believe this paper, probabilistic inference without extreme conditional probabilities is easy.
Combine this with speech-to-text transcription software and you get a searchable archive of your recorded interactions!
ETA: In theory. In practice, dictation software algorithms are probably not up to the task of turning noisy speech from different people into text with any reasonable accuracy.
The key idea behind Church and similar languages is that they allow us to express and formally reason about a large class of probabilistic models, many of which cannot be formalized in any concise way as Bayes nets.
Bayes nets express generative models, i.e. processes that generate data. To infer the states of hidden variables from observations, you condition the Bayes net and compute a distribution on the hidden variable settings using Bayesian inference or some approximation thereof. A particularly popular class of approximations is the class of sampling ...
The notion of abstract state machines may be useful for a formalization of operational equivalence of computations.
Your argument leaves out necessary steps. It is not a careful analysis and does not consider ways in which it might be mistaken; rather, it gives the impression that you wanted to reach your conclusion as quickly as possible.
There is, necessarily, absolutely no way to determine - given an algorithm - whether it is conscious or not. It is not even a formally undecidable statement!
It is unclear how this follows from anything you wrote.
consciousness refuses to be phrased formally (it is subjective, and computation is objective)
Consider tabooing words ...
From the document:
I suggest a synthesis between the approaches of Yudkowsky and de Garis.
Later, elaborating:
...Yudkowsky's emphasis on pristine best scenarios will probably fail to survive the real world precisely because evolution often proceeds by upsetting such scenarios. Yudkowsky's dismissal of random mutations or evolutionary engineering could thus become the source of the downfall of his approach. Yet de Garis's overemphasis on evolutionary unpredictability fails to account for the extent to which human intelligence itself is model for learning f
90% of spreadsheets contain errors.
Source (scroll down to the last line of the first spreadsheet)
Ask yourself: If the LW consensus on some question was wrong, how would you notice? How do you distinguish good arguments from bad arguments? Do your criteria for good arguments depend on social context in the sense that they might change if your social context changes?
Next, consider what you believe and why you think you believe it, applying the methods you just named. According to your criteria, are the arguments in favor of your beliefs strong, and the arguments against weak? Or do your criteria not discriminate between them? Do you have difficulty expl...
Comments on HN and LW result in immediate reward through upvoting and replies, whereas writing a book is a more solitary experience. If you identify this difference as a likely cause of your behavior and if you believe that the difference in value to you is as large as you say, then you should test this hypothesis by turning book-writing into a more interactive, immediately rewarding process. Blogging and sending pieces to friends as soon as they are written come to mind.
More generally, consider structuring your social environment such that social expectations and rewards line up with activities you consider valuable. I have found this to be a powerful way to change my behavior.
Meanwhile, there's something on-hand I could do that'd have 300 times the impact. For sure, almost certainly 300 times the impact, because I see some proven success in the 300x area, and the frittering-away-time area is almost certainly not going to be valuable.
Your post includes a "silly" and a business-scale example, but not a personal one. In order to answer the questions about causes that you ask, it seems necessary to look at specific situations. Is there a real-life situation that you can talk about where you have two options, one almost certainly hundreds of times as good as the other, and you choose the option that is worse?
I feel like a lot of us have those opportunities - we see that a place we're putting a small amount of effort is accounting for most of our success, but we don't say - "Okay, that area that I'm giving a little attention that's producing massive results? All attention goes there now."
If you are giving some area a little attention, this does not imply that more attention would get you proportionally better results; you may run into diminishing returns quickly. Of course, for any given situation, it is worth understanding whether this is the case or not.
The question is what causes this sensation that cryonics is a threat? What does it specifically threaten?
It doesn't threaten the notion that we will all die eventually. Accident, homicide, and war will remain possibilities unless we can defeat them, and suicide will always remain an option.
Even if cryonics does not in fact threaten the notion of eventual death, it might still cause the sensation that it poses this threat.
I use the word "prior" in the sense of priors as mathematical objects, meaning all of your starting information plus the way you learn from experience.
Nothing much happens to intelligent agents - because an intelligent agents' original priors mostly get left behind shortly after they are born - and get replaced by evidence-based probability estimates of events happening.
Your prior determines how evidence informs your estimates and what things you can consider at all. In order to "replace priors with evidence-based probability estimates of events", you need a notion of event, and that notion is determined by your prior.
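A toy example of what this means in practice (hypotheses and numbers are mine): updating only reweights hypotheses the prior already contains, so a hypothesis with prior probability zero can never be resurrected by evidence.

    # Prior over three coin hypotheses; 'biased_tails' gets prior probability 0.
    prior = {'fair': 0.5, 'biased_heads': 0.5, 'biased_tails': 0.0}
    p_heads = {'fair': 0.5, 'biased_heads': 0.9, 'biased_tails': 0.1}

    posterior = dict(prior)
    for outcome in ['T'] * 20:                      # twenty tails in a row
        for h in posterior:
            posterior[h] *= p_heads[h] if outcome == 'H' else 1 - p_heads[h]
        z = sum(posterior.values())
        posterior = {h: p / z for h, p in posterior.items()}

    print(posterior)   # 'biased_tails' is still at 0, despite strong evidence for it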
Intuitively, the notion of updating a map of a fixed reality makes sense, but in the context of decision-making, a formalization in full generality has so far proved elusive, and perhaps even unnecessary.
By making a choice, you control the truth value of certain statements—statements about your decision-making algorithm and about mathematical objects depending on your algorithm. Only some of these mathematical objects are part of the "real world". Observations affect what choices you make ("updating is about following a plan"), but you must have decided b...
In my experience, academics often cannot distinguish between SIAI and Kurzweil-related activities such as the Singularity University. With its 25k tuition for two months, SU is viewed as some sort of scam, and Kurzweilian ideas of exponential change are seen as naive. People hear about Kurzweil, SU, the Singularity Summit, and the Singularity Institute, and assume that the latter is behind all those crazy singularity things.
We need to make it easier to distinguish the preference and decision theory research program as an attempt to solve a hard problem fr...
Fodor's arguments for a "language of thought" make sense (see his book of the same name). In a nutshell, thought seems to be productive (out of given concepts, we can always construct new ones, e.g. arbitrary nestings of "the mother of the mother of ..."), systematic (knowing certain concepts automatically brings the ability to construct others; knowing the concepts "child" and "wild", I can also represent "wild child"), and compositional (e.g. the meaning of "wild child" is a function of the meanings of "wild" and "child").
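As a rough computational gloss of these properties (my toy rendering, not Fodor's own formalism):

    # Productivity: given concepts compose into ever-new concepts by nesting.
    def mother_of(x):
        return f"the mother of {x}"

    concept = "Ann"
    for _ in range(3):
        concept = mother_of(concept)
    print(concept)          # the mother of the mother of the mother of Ann

    # Compositionality (and, with it, systematicity): the meaning of "wild child"
    # is a function of the meanings of "wild" and "child", crudely modeled here
    # as predicate conjunction.
    def modify(adj, noun):
        return lambda x: adj(x) and noun(x)

    wild = lambda x: x in {"Max"}
    child = lambda x: x in {"Max", "Lily"}
    wild_child = modify(wild, child)
    print(wild_child("Max"), wild_child("Lily"))   # True False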
Since I never described a way of extracting preference from a human (and hence defining it for a FAI), I'm not sure where do you see the regress in the process of defining preference.
Reading your previous post in this thread, I felt like I was missing something and I could have asked the question Wei Dai asked ("Once we implement this kind of FAI, how will we be better off than we are today?"). You did not explicitly describe a way of extracting preference from a human, but phrases like "if you manage to represent your preference in terms...
There is also Shades, which lets you set a tint color and which provides a slider so you can move gradually between standard and tinted mode.
My conclusion from this discussion is that our disagreement lies in how probable we each think it is that uploads can be safely applied to FAI rather than generating additional existential risk. I do not see how to resolve this disagreement right now. I agree with your statement that we need to make sure that those involved in running uploads understand the problem of preserving human preference.
People have very feeble understanding of their own goals. Understanding is not required. Goals can't be given "from the outside", goals are what system does.
Even if we have little insight into our goals, it seems plausible that we frequently do things that are not conducive to our goals. If this is true, then in what sense can it be said that a system's goals are what it does? Is the explanation that you distinguish between preference (goals the system would want to have) and goals that it actually optimizes for, and that you were talking about the latter?
The first option tries to capture our best current guess as to our fundamental preference. It then updates the agent (us) based on that guess.
This guess may be awful. The process of emulation and attempts to increase the intelligence of the emulations may introduce subtle psychological changes that could affect the preferences of the persons involved.
For subsequent changes based on "trying to evolve towards what the agent thinks is its exact preference" I see two options: Either they are like the first change, open to the possibility of being ...
It's not clear to me that this is the only way to evaluate my claim, or that it is even a reasonable way. My understanding of FAI is that arriving at such a resolution of human preferences is a central ingredient in building an FAI; hence, using your method to evaluate my claim would require more progress on FAI.
If your statement ("The route of WBE simply takes the guess work out") were a comparison between two routes similar in approach, e.g. WBE and neuroenhancement, then you could argue that a better formal understanding of preference would ...
—Mike Sinnett, Boeing's 787 chief project engineer