All of Drahflow's Comments + Replies

Quoting https://en.wikipedia.org/wiki/Kleene%27s_T_predicate:

The ternary relation T1(e,i,x) takes three natural numbers as arguments. The triples of numbers (e,i,x) that belong to the relation (the ones for which T1(e,i,x) is true) are defined to be exactly the triples in which x encodes a computation history of the computable function with index e when run with input i, and the program halts as the last step of this computation history.

In other words: If someone gives you an encoding of a program, an encoding of its input and a trace of its run, you c... (read more)

0MrMind
Oh! This point had evaded me: I thought x encoded the program and the input, not just the entire history. So U, instead of executing, just locates the last thing written on tape according to x and repeats it. Well, I'm disappointed... at U and at myself.

A counterexample to your claim: Ackermann(m,m) is a computable function, hence computable by a universal Turing machine. Yet it is designed not to be primitive recursive.

And indeed Kleene's normal form theorem requires one application of the μ-operator, which introduces unbounded search.

0MrMind
Yes, but the U() and the T() are primitive recursive. Unbounded search is necessary to get the encoding of the program, but not to execute it, that's why I said "if an angel gives you the encoding". The normal form theorem indeed says that any partial recursive function is equivalent to two primitive recursive functions / relations, namely U and T, and one application of unbounded search.
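To make that division of labour concrete, here is a toy sketch (my own illustrative stand-in, not Kleene's arithmetical encoding): a tiny register machine in which checking a proposed halting history and reading its result off the end are bounded loops over the history, while producing a valid history in the first place needs an unbounded search.

```python
import itertools

# Toy register machine standing in for Goedel-numbered programs; a "history"
# (list of configurations) plays the role of Kleene's x.  Illustrative only.

def step(program, config):
    """One machine step: config is (program_counter, registers)."""
    pc, regs = config
    op = program[pc]
    regs = list(regs)
    if op[0] == "inc":
        regs[op[1]] += 1
        pc += 1
    elif op[0] == "dec":
        regs[op[1]] = max(0, regs[op[1]] - 1)
        pc += 1
    elif op[0] == "jnz":                        # jump to op[2] if register op[1] != 0
        pc = op[2] if regs[op[1]] != 0 else pc + 1
    return (pc, tuple(regs))

def T(program, inp, history):
    """Analogue of T(e, i, x): is `history` a halting run of `program` on `inp`?
    Only bounded loops over the given history -- no search."""
    if not history or history[0] != (0, (inp, 0)):
        return False
    for a, b in zip(history, history[1:]):
        if program[a[0]][0] == "halt" or step(program, a) != b:
            return False
    return program[history[-1][0]][0] == "halt"

def U(history):
    """Analogue of U(x): read the output (register 0) off the last configuration."""
    return history[-1][1][0]

def mu_search(program, inp):
    """The one application of the mu-operator: unbounded search for a history
    that T accepts.  Never returns if the program diverges."""
    for bound in itertools.count(1):
        config, history = (0, (inp, 0)), [(0, (inp, 0))]
        for _ in range(bound):
            if program[config[0]][0] == "halt":
                break
            config = step(program, config)
            history.append(config)
        if T(program, inp, history):
            return U(history)

add_two = [("inc", 0), ("inc", 0), ("halt",)]
print(mu_search(add_two, 3))   # 5
```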

I don't buy your first argument against time-travel. Even under the model of the universe as a static mathematical object connected by wave-function consistency constraints, there is still a consistent interpretation of the intuitive notion of "time travel":

The "passage" of time is the continuous measurement of the environment by a subsystem (which incidentally believes itself to be an 'observer') and the resulting entanglement with farther away parts of the system as "time goes on" (i.e. further towards positive time). Then ... (read more)

2lisper
No, what you say is correct, but you don't even need to bring entanglement into it at all: moving faster than light is the same thing as moving into the past (in some reference frame). This is why information can't propagate faster than light. The kind of time travel that I'm talking about here is not merely sending information into the past but sending yourself into the past, that is, sending your body into the past. But that's not possible because your body is on the most fundamental level made of entanglements, and entanglements define the arrow of time.

Here is my attempt to convince you also of 1 (in your numbering):

I disagree with your claim: "From a preference utilitarian Perspective, only a self-conscious being can have preferences for the future, therefore you can only violate the preferences of a self-conscious being by killing it."

To the contrary, every agent which follows an optimization goal exhibits some preference (even if it does not itself understand it). Namely, that its optimization goal shall be reached. The ability to understand one's own optimization goal is not necessary for a preferen... (read more)

I, for one, like my moral assumptions and cached thoughts challenged regularly. This works well with repugnant conclusions. Hence I upvoted this post (to -21).

I find two interesting questions here:

  1. How to reconcile opposing interests in subgroups of a population of entities whose interests we would like to include in our utility function. An obvious answer is facilitating trade between all interested parties to increase utility. But: How do we react to subgroups whose utility function values trade itself negatively?

  2. Given that mate selection is a huge driver o

... (read more)
2Viliam
Seems to me that the interests are often not literally opposed, such that one group literally has "X" as a terminal value, and the other group has "not X". More often, the goals are simply anticorrelated in practice, thus wanting "the opposite of what the other group wants" becomes a good heuristic. This is why calmly debating and exploring all options, including unusual ones, can be a good approach. For example, in this specific situation: (1) legalize prostitution, and create safe conditions so that the prostitutes are not exploited; (2) create good cheap sexbots, or maybe rent them.
-5ChristianKl

Interestingly, it appears (at least in my local cultural circle) that being attended by human caretakers when incapacitated by age is supposed to be a basic right. Hence there must be some other reason - beyond the problem of rights having to be fulfilled by other persons - why the particular example assumed to underlie the parable is reprehensible to many people.

0OrphanWilde
There is another reason. In social-standing friendly language, "Sex is sacred". For the less socially-friendly approach... sex is clearly not sacred, and the issue isn't the idea of sex being a right, as one can readily see by looking at people who can complain about involuntary celibacy without much social risk, and do so. I'm not going to name the ugliness, both because it's broad and ill-defined - a patch of area defined more by what a set of ideologies fail to say, than what they explicitly name - but also because it's something you have to see for yourself to believe.

To disagree with this statement is to say that a scanned living brain, cloned, remade and started will contain the exact same consciousness, not similar, the exact same thing itself, that simultaneously exists in the still-living original. If consciousness has an anatomical location, and therefore is tied to matter, then it would follow that this matter here is the exact same matter as that separate matter there. This is an absurd proposition.

You conclude that consciousness in your scenario cannot have 1 location(s).

If consciousness does not have an anatom

... (read more)
4Usul
Thanks for the reply. "You conclude that consciousness in your scenario cannot have 1 location(s)." I'm not sure if this is a typo or a misunderstanding. I am very much saying that a single consciousness has a single location, no more, no less. It is located in those brain structures which produce it. One consciousness in one specific set of matter. A starting-state-identical consciousness may exist in a separate set of matter. This is a separate consciousness. If they are the same, then the set of matter itself is the same set of matter. The exact same particles/wave-particles/strings/what-have-you. This is an absurdity. Therefore to say 2 consciousnesses are the same consciousness is an absurdity.

"It is indeed the same program in the same state but in 2 locations." It is not. They are (plural pronoun use) identical (congruent?) programs in identical states in 2 locations. You may choose to equally value both but they are not the same thing in two places.

My consciousness is the awareness of all input and activity of my mind, not the memory. I believe it is, barring brain damage, unchanged in any meaningful way by experience. It is the same consciousness today as next week, regardless of changes in personality, memory, conditioned response imprinting. I care about tomorrow-me because I will experience what he experiences. I care no more about copy-me than I do the general public (with some exceptions if we must interact in the future) because I (the point of passive awareness that is the best definition of "I") will not experience what he experiences.

I posed a question to entirelyuseless above: basically, does anything given to a copy of you induce you to take a bullet to the head?

Regarding auras. I am not sure if I observed the same phenomenon, but if I sit still and keep my eyes fixed on the same spot for a while (in a still scene), my eyes will -- after a while -- get accustomed to the exact light pattern incoming and everything kind-of fades to gray. But very slight movements will then generate colorful borders on edges (like a Gaussian edge detector).

-1MrMind
Or you could possibly study Eckman hard and then train your synesthesia on people!

  • Best way to fix climate change: "Renewables / Nuclear"
  • Secret Services are necessary to fight terrorism / Secret Services must be abolished
  • GPL / BSD-Licences

  • Install a smoke detector

  • Do martial arts training until you get the falling more or less right. While this might be helpful against muggers, the main benefit is the reduced probability of injury in various unfortunate situations.

0Gondolinian
As someone with ~3 years of aikido experience, I second this.

The Metamath project was started by a person who also wanted to understand math by coding it: http://metamath.org/

Generally speaking, machine-checked proofs are ridiculously detailed. But being able to create such detailed proofs did boost my mathematical understanding a lot. I found it worthwhile.

Install a smoke detector (and reduce mortality by 0.3% if I'm reading the statistics right - not to mention the property damage prevented).

I use multiple passwords consisting of 12 elements of a..z, A..Z, 0..9, and ~20 symbol characters, generated randomly. Total entropy of these is around 76 bits.

10 decimal digits is actually more like 33 bits of entropy.

1SeekingEternity
Yeah, I (roughly) computed for 20 digits (what Alex said) and then wrote the wrong thing, because... derp? Also, yes, it's up to 66.4 bits, not 70. My bad
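For reference, the arithmetic behind the figures in this thread is just entropy = length × log2(alphabet size); a quick check (the 82-character alphabet assumes exactly 20 symbol characters for the "~20" above):

```python
from math import log2

# entropy = length * log2(alphabet size)
print(12 * log2(26 + 26 + 10 + 20))   # ~76.3 bits for the random 12-character passwords
print(10 * log2(10))                  # ~33.2 bits for 10 decimal digits
print(20 * log2(10))                  # ~66.4 bits for 20 decimal digits
```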

"small enough to be masked by confounders" - There are an extremely large number of companies. Unrelated effects should average out.

Regarding statistics: http://thinkprogress.org/economy/2014/07/08/3457859/women-ceos-beat-stock-market/ links to quite a few.

Given identical money payoffs between two options (even when adjusting for non-linear utility of money), choosing the non-ambiguous one has the added advantage of giving a limited-rationality agent fewer possible futures to spend computing resources on while the process of generating utility runs.

Consider two options: a) You wait one year and get 1 million dollars. b) You wait one year and get 3 million dollars with 0.5 probability (decided after this year).

If you take option b), depending on the size of your "utils", all planning for after the year must essentially be done twice, once for the case with 3 million dollars available and once for the case without.
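A toy illustration of the point, with made-up numbers (the planning cost per possible future is an arbitrary assumption): a bounded agent pays that cost once per distinct outcome it must prepare for, so option b) pays it twice.

```python
PLANNING_COST_PER_FUTURE = 50_000   # assumed cost, in dollar-equivalents, of making one full plan

def net_value(option):
    """Expected payoff minus the cost of planning for every possible outcome."""
    expected_payoff = sum(p * payoff for p, payoff in option)
    return expected_payoff - PLANNING_COST_PER_FUTURE * len(option)

option_a = [(1.0, 1_000_000)]               # certain 1 million
option_b = [(0.5, 3_000_000), (0.5, 0)]     # 3 million with probability 0.5

print(net_value(option_a))   # 950000  -- one plan to make
print(net_value(option_b))   # 1400000 -- higher expectation, but two plans to make
```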

Drahflow130

I usually take the minutes of the German Pirate Party assemblies. It is non-trivial to transcribe two days of speech alone (and I don't know steno). A better solution is a collaborative editor and multiple people typing while listening to the audio with increasing delay, i.e. one person gets live audio, the next one a 20 second delay, etc... There is EtherPad, but the web client cannot really handle the 250kB files a full day transcript needs; also, two of the persons interested in taking minutes (me included) strongly prefer Vim over a glorified text field.

H... (read more)

5[anonymous]
...so you should have almost no confidence in your implementation, or accept that you're dealing with an orders-of-magnitude easier version of the problem than Google is.

while corporations have a variety of mechanisms for trying to provide their employees with the proper incentives, anyone who's worked for a big company knows that the employees tend to follow their own interests, even when they conflict with those of the company. It's certainly nothing like the situation with a cell, where the survival of each cell organ depends on the survival of the whole cell. If the cell dies, the cell organs die; if the company fails, the employees can just get a new job.

These observations might not hold for uploads running on ha... (read more)

0timtyler
Or maybe governments - if they get their act together. Dividing your country into competing companies hardly seems very efficient.
Drahflow120

There should be a step 9, where every potential author is sent the final article and has the option of refusing formal authorship (if she doesn't agree with the final article). Convention in academic literature is that each author individually endorses all claims made in an article, hence this final check.

So... how would I design an exercise to teach Checking Consequentialism?

Divide the group into pairs. One is the decider, the other is the environment. Let them play some game repeatedly; prisoner's dilemma might be appropriate, but maybe it should be a little bit more complex. The algorithm of the environment is predetermined by the teacher and known to both of the players.

The decider tries to maximize utility over the repeated rounds, the environment tries to minimise the winnings of the decider, by using social interaction between the evaluated game round... (read more)
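A minimal sketch of such a setup; the payoff matrix, the tit-for-tat environment and the round count are illustrative assumptions, not part of the proposal above:

```python
# (decider move, environment move) -> decider's payoff
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def environment(history):
    """A fixed, known environment algorithm: tit-for-tat."""
    return "C" if not history else history[-1][0]

def play(decider, rounds=10):
    history, total = [], 0
    for _ in range(rounds):
        env_move = environment(history)
        my_move = decider(history)
        total += PAYOFFS[(my_move, env_move)]
        history.append((my_move, env_move))
    return total

# Checking consequences over the whole repeated game, not just one round:
print(play(lambda history: "C"))   # always cooperate: 30
print(play(lambda history: "D"))   # always defect: 5 + 9*1 = 14
```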

The described effect seems strongly related to the concept of opportunity cost.

I.e. while a bet of yours is still open, the resources spent paying for entering the bet cannot be used again to enter a (better) bet.

The AGI would have to acquire new resources slowly, as it couldn’t just self-improve to come up with faster and more efficient solutions. In other words, self-improvement would demand resources. The AGI could not profit from its ability to self-improve regarding the necessary acquisition of resources to be able to self-improve in the first place.

If the AGI creates a sufficiently convincing business plan / fake company front, it might well be able to command a significant share of the world's resources on credit and either repay after improving or grab power and leave it at that.

Small scale fusion power.

Research challenges: How to get hydrogen to fuse into helium using only 500kg of machinery and less energy than will be produced.

Urgent Tasks: (In-)Validate the results of the fusor people, scale up / down as necessary.

Reasons: Enormous amounts of energy go into everything. If energy costs drop significantly, I expect sustained, fast and profound economic growth, in this case without too much ecological impact. Also, a lot of high-energy technology will become way more feasible, e.g. space missions.

Risk mitigation groups would gain some credibility by publishing concrete probability estimates of "the world will be destroyed by X before 2020" (and similar for other years). As many of the risks are rather short events (think nuclear war / asteroid strike / singularity), the world will be destroyed by a single cause (if at all) and the respective probabilities can be summed. I would not be surprised if the total probability comes out well above 1. Has anybody ever compiled a list of separate estimates?

On a related note, how much of the SIAI is financed o... (read more)

My classical example for algorithms applicable to real life: Merge sort for sorting stacks of paper.
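A minimal sketch of the procedure, sorting numbers instead of physical sheets: split the stack in half, sort each half, then repeatedly take the smaller of the two top sheets.

```python
def merge_sort(stack):
    """Merge sort as you would do it with two half-stacks of paper."""
    if len(stack) <= 1:
        return stack
    mid = len(stack) // 2
    left, right = merge_sort(stack[:mid]), merge_sort(stack[mid:])
    merged = []
    while left and right:                 # take the smaller of the two top sheets
        merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return merged + left + right

print(merge_sort([42, 7, 19, 3, 23, 11]))   # [3, 7, 11, 19, 23, 42]
```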

But it's hard for me to be properly outraged about this, because the conclusion that the LHC will not destroy the world is correct.

What is your argument for claiming that the LHC will not destroy the world?

That the world still exists despite ongoing experiments is easily explained by the fact that we are necessarily living in those branches of the universe where the LHC didn't destroy the world. (On a related side note: Has the great filter been found yet?)

0Scott Alexander
Good point. I've changed this to "since the LHC did not destroy the world", which is true regardless of whether it destroyed other branches.

It appears slow. In particular, I seem to think more things per unit of time, sometimes noticing significant delays between thought and action. However, according to the scores, the performance improvement is only marginal (but existent). The effect wears off after 10 to 15 minutes, according to my experience.

I usually play Quake 3 (just in case anybody wants to compare effects between games).

Same goes for videos (Yay action movies at 2x).

Bonus points (for fun only): Play action games afterwards. Time sensation is a weird thing.

1Leafy
Interestingly I have noticed a similar "time slowing" effect in rapid reaction computer games following extreme bursts of adrenaline for whatever reason - I wonder if action movies at 2x give you an adrenaline boost?
0[anonymous]
Doesn't the helium-voice effect completely kill the mood? Or, if your film player automatically compensates, which is it?
0katydee
What's this like?
0xamdam
What software are you using? I find audio parts of sped-up videos pretty difficult, other than on the ipod line of products (even in quicktime). There is some info here: http://www.catonmat.net/blog/how-to-save-time-by-watching-videos-at-higher-playback-speeds/

Video game authors probably put a lot of effort into optimizing video games for human pleasure.

Workplace design, user interfaces, etc. could all be improved if more ideas were copied from video games.

Games often fall into the trap of optimizing for addictiveness, which is not quite the same thing as pleasure. Jonathan Blow has talked about this and I think there is a lot of merit in his arguments:

He clarified, "I’m not saying [rewards are] bad, I’m saying you can divide them into two categories – some are like foods that are naturally beneficial and can increase your life, but some are like drugs."

Continued Blow, "As game designers, we don’t know how to make food, so we resort to drugs all the time. It shows in the discontent at the sta

... (read more)

The only difference I can see between "an agent which knows the world program it's working with" and "agent('source of world')" is that the latter agent can be more general.

0Will_Sawin
A prior distribution about possible states of the world, which is what you'd want to pass outside of toy-universe examples, is rather clearly part of the agent rather than a parameter.
0Vladimir_Nesov
Yes, in a sense. (Although technically, the agent could know facts about the world program that can't be algorithmically or before-timeout inferred just from the program, and ditto for agent's own program, but that's a fine point.)

If agent() is actually agent('source of world'), as the classical Newcomb problem has it, I fail to see what is wrong with simply enumerating the possible actions and simulating the 'source of world' with the constant call of agent('source of world') replaced by the current action candidate. And then returning the action with maximum payoff, obviously.
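A sketch of that procedure on a toy Newcomb-like world program (the names ACTIONS, world and naive_agent are mine, for illustration only): simulate the world once per candidate action, with the agent's call replaced by that action, and return the argmax.

```python
ACTIONS = ["one_box", "two_box"]

def world(action):
    """Toy Newcomb world: the predictor fills the opaque box iff the agent one-boxes."""
    opaque_box = 1_000_000 if action == "one_box" else 0
    transparent_box = 1_000
    return opaque_box if action == "one_box" else opaque_box + transparent_box

def naive_agent():
    # Enumerate candidate actions, simulate the world with each one substituted
    # for the agent's own choice, and return the action with maximum payoff.
    return max(ACTIONS, key=world)

print(naive_agent(), world(naive_agent()))   # one_box 1000000
```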

0Vladimir_Nesov
See world2(). Also, the agent takes no parameters, it just knows the world program it's working with.

Loyalty to petrified opinion has already kept chains from being closed and souls from being trapped.

But some thoughts are both so complex and so irrelevant that a correct analysis of the thought would cost more than an infrequent error about thoughts of this class (costs of necessary meta-analysis included).

3wedrifid
Most of what we do here, for example.

What is the difference between non-nested and modular? (Or between non-modular and nested?)

The pictures seem to be rotated by 180 degrees essentially.

1magfrump
Given a finite graph you can define a characteristic of that graph that for our purposes is called "modularity." For all integers N, consider all of the ways that you can define a partition of the graph into N subsets. For each partition, divide the size of the smallest partition by the number of connections between the different components. Find the partition where this number is maximal, then this maximal ratio is the "N-modularity" of the graph. That is, if the number is very high, there is a partition into N blocks which are dense and have few connections between each other; we can call these modules. Not sure about how to define nested, but I imagine it has to do with isomorphic sub-graphs; so if each of the N modules of a graph had the same structure, the graph would be nested as well as modular. But I'm less confident about the nested definition.
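A brute-force sketch of that definition (the function name and graph representation are mine; it enumerates all N^|V| labellings, so it is only usable on tiny graphs):

```python
from itertools import product

def n_modularity(vertices, edges, n):
    """Over every partition of the vertices into n non-empty blocks, take
    (size of smallest block) / (number of edges between different blocks);
    return the maximum of this ratio, as described in the parent comment."""
    best = 0.0
    for labels in product(range(n), repeat=len(vertices)):
        blocks = [[] for _ in range(n)]
        for v, lab in zip(vertices, labels):
            blocks[lab].append(v)
        if any(not b for b in blocks):          # require all n blocks non-empty
            continue
        label_of = dict(zip(vertices, labels))
        cross = sum(1 for u, v in edges if label_of[u] != label_of[v])
        smallest = min(len(b) for b in blocks)
        ratio = float("inf") if cross == 0 else smallest / cross
        best = max(best, ratio)
    return best

# Two triangles joined by a single edge: very 2-modular.
verts = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(n_modularity(verts, edges, 2))   # 3 / 1 = 3.0
```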
0PhilGoetz
Modular and nested are not opposites. "Nested", they say, means a sharing of relationships; they're not any more specific than that. Don't know what you mean about being rotated by 180 degrees. Consider the lower-left picture: It shows modularity in mutualistic (cooperative) relationships. All the points are below the line y=x because the initial measure of modularity was larger than the equilibrium measure of modularity.

The decreasing frequency of surprising technology advancements is caused by faster and more frequent reporting of scientific advancements to the general public.

If the rate of news consumed grows faster than the rate of innovations produced, the perceived magnitude of innovation per news item will go down.

If you are out for the warm fuzzies: according to my experience, fuzzies / $ is optimized by giving a little, often.

Microfinancing might be an option, as the same capital can be lent multiple times, generating some fuzzies each time.

Then again, GiveWell seems not too decided on the concept: http://www.givewell.org/international-giving-marketplaces

I fail to understand the sentence about overthinking. Mind to explain?

As for the condition of removing all energy and mass in a part of space not being sufficient to destroy all agents therein, I cannot see the error. Do you have an example of an agent which would continue to exist in those circumstances?

That the condition is not necessary is true: I can shoot you, you die. No need to remove much mass or energy from the part of space you occupy. However we don't need a necessary condition, only a sufficient one.

2Jack
Well yes we don't need a necessary condition for your idea but presumably if we want to make even a passing attempt at friendliness we're going to want the AI to know not to burn live humans for fuel. If we can't do better an AI is too dangerous, with this back-up in place or not. Well you could remove the agents and the mass surrounding them to some other location, intact.

Not having heard your argument against "Describing ..." yet, but assuming you believe some to exist, I estimate the chance of me still believing it after your argument at 0.6.

Now for guessing the two problems:

The first possible problem will be describing "mass" and "energy" to a system which basically only has sensor readings. However, if we can describe concepts like "human" or "freedom", I expect descriptions of matter and energy to be simpler (even though 10,000 years ago, telling somebody about "hu... (read more)

0RobinZ
I apologize - that was, in fact, my intent.
1Jack
I don't know if I'm thinking about what Robin's after but the statement at issue strikes me as giving neither necessary nor sufficient conditions for destroying agents in any given part of space. If I'm on the same page as him you're overthinking it.

The claim is relevant to the question of whether giving an action description for the red wire which will fit all of human future is not harder than constructing a real moral system. That the claim is trivial is a good reason to use "certainly".

2RobinZ
You're right about that. My objection was ill-posed - what I was talking about was the thought habits that produced, well: Why did you say this? Do you expect to stand by this if I explain the problems I have with it? I apologize for being circuitous - I recognize that it's condescending - but I'm trying to make the point that none of this is "easy" in a way which cannot be easily mistaken. If you want me to be direct, I will be.

I meant certainly as in "I have an argument for it, so I am certain."

Claim: Describing some part of space to "contain a human" and its destruction is never harder than describing a goal which will ensure every part of space which "contains a human" is treated in manner X for a non-trivial X (where X will usually be "morally correct", whatever that means). (Non-trivial X means: Some known action A of the AI exists which will not treat a space volume in manner X).

The assumption that the action A is known is reasonably ... (read more)

1RobinZ
This claim is trivially true, but also irrelevant. Proving P ≠ NP is never harder than proving P ≠ NP and then flushing the toilet. ...is that your final answer? I say this because there are at least two problems with this single statement, and I would prefer that you identify them yourself.

How so? The AI lives in a universe where people are planning to fuse AIs in the way described here. Given this website, and the knowledge that one believes that the red wire is magic, there is a high probability that the red wire is fake, and some very small probability that the wire is real. But it is also known for certain that the wire is real. There is not even a contradiction here.

Giving a wrong prior is not the same as walking up to the AI and telling it a lie (which should never raise probability to 1).

-1PhilGoetz
If you can design an AI that can be given beliefs that it cannot modify, doubt, or work around, then that would be true. Most conceptions of friendly AI probably require such an AI design, so it's not an unreasonable supposition (on LW, anyway).

It cannot fix bugs in its priors; as for any other part of the system, e.g. sensor drivers, the AI can fix the hell out of itself. Anything which can be fixed is not a true prior, though. If we allow the AI to change its prior completely, then it is effectively acting upon a prior which does not include any probability-1 entries.

There is no reason to fix the red wire belief if you are certain that it is true. All the evidence is against it, but the red wire does magic with probability 1, hence something must be wrong with the evidence (e.g. sensor errors).
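A minimal numeric illustration of why a probability-1 entry behaves this way under Bayes' rule: strong evidence moves a prior of 0.999, but a prior of exactly 1 never moves.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: P(H|E) = P(E|H)P(H) / (P(E|H)P(H) + P(E|~H)P(~H))."""
    numerator = p_evidence_given_h * prior
    return numerator / (numerator + p_evidence_given_not_h * (1 - prior))

# 1000:1 evidence against the "magic red wire" hypothesis:
print(posterior(0.999, 0.001, 0.999))   # ~0.5 -- a near-certain belief still budges
print(posterior(1.0,   0.001, 0.999))   # 1.0  -- a true prior of 1 never moves
```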

0NancyLebovitz
Isn't being able to fix bugs in your priors a large part of the point of Bayesianism?

I agree. The AI + Fuse System is a deliberately broken AI. In general such an AI will perform suboptimally compared to the AI alone.

If the AI under consideration has a problematic goal though, we actually want the AI to act suboptimally with regard to its goals.

3JamesAndrix
I think it's broken worse than that. A false belief with certainty will allow for something like the explosion principle. http://en.wikipedia.org/wiki/Principle_of_explosion As implications of magic collide with observations indicating an ordinary wire, the AI may infer things that are insanely skewed. Where in the belief network these collisions happen could depend on the particulars of the algorithm involved and the shape of the belief network; it would probably be very unpredictable.
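For reference, the principle in its simplest form, as a one-line Lean proof (any proposition Q follows from P together with ¬P):

```lean
-- Principle of explosion: a contradiction proves anything.
theorem explosion (P Q : Prop) (hP : P) (hnP : ¬P) : Q :=
  absurd hP hnP
```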

This is indeed a point I did not consider.

In particular, it might be impossible to construct a simple action description which will fit all of human future. However, it is certainly not harder than constructing a real moral system.

One might get pretty far by eliminating every volume in space (AI excluded) which can learn (some fixed pattern for example) within a certain bounded time, instead of converting DNA into fluorine. It is not clear to me whether this would be possible to describe or not though.

The other option would be to disable the fuse after som... (read more)

2bogdanb
* it is certainly not harder...: This at least seems correct. (Reasoning: if you have a real moral system (I presume you also imply “correct” in the FAI sense), then not killing everyone is a consequence; once you solve the former, the latter is also solved, so it can’t be harder.) I’m obviously not sure of all consequences of a correct moral system, hence the “seems”.

But my real objection is different: For any wrong & unchangeable belief you impose, there’s also the risk of unwanted consequences: suppose you use an, eg, fluorine-turns-to-carbon “watchdog-belief” for a (really correct) FAI. The FAI uploads everyone (willingly; it’s smart enough to convince everyone that it’s really better to do it) inside its computing framework. Then it decides that turning fluorine to carbon would be a very useful action (because “free” transmutation is a potentially infinite energy source, and the fluorine is not useful anymore for DNA). Then everybody dies.

Scenarios like this could be constructed for many kinds of “watchdog beliefs”; I conjecture that the more “false” the belief is, the more likely it is that it’ll be used, because it would imply large effects that can’t be obtained by physics (since the belief is false), thus are potentially useful. I’m not sure exactly if this undermines the “seems” in the first sentence.

But there’s another problem: suppose that “find a good watchdog” is just as hard (or even a bit easier, but still very hard) a problem as “make the AI friendly”. Then working on the first would take precious resources from solving the second.

----------------------------------------

A minor point: is English your first language? I’m having a bit of trouble parsing some of your comments (including some below). English is not my first language either, but I don’t have this kind of trouble with most everyone else around here, including Clippy. You might want to try formulating your comments more clearly.
5RobinZ
"Certainly" is a lullaby word (hat-tip to Morendil for the term), and a dangerous one at that. In this case, your "certainly" denies that anyone can make the precise objection that everyone has been making. FAI theory talks a lot about this kind of thinking - for example, I believe The Hidden Complexity of Wishes was specifically written to describe the problem with the kind of thinking that comes up with this idea.
Drahflow-10

There is no hand coded goal in my proposal. I propose to craft the prior, i.e. restrict the worlds the AI can consider possible.

This is the reason both why the procedure is comparatively simple (in comparison with friendly AI) and why the resulting AIs are less powerful.

1JamesAndrix
Hand coded goals are what you're trying to patch over. Don't think about it this way. This is not a path to a solution.

It might be the case that adding the red wire belief will cripple the AI to a point of total unusability. Whether that is the case can be found out by experiment however.

Adding a fuse as proposed turns an AI which might be friendly or unfriendly into an AI that might be friendly, might spontaneously combust or be stupid.

I prefer the latter kind of AI (even though they need rebuilding more often).

War-mongering humans are also not particularly useful. In particular, they are burning energy like there is no tomorrow for things definitely not paperclippy at all. And you have to spend significant energy resources on stopping them from destroying you.

A paperclip optimizer would at some point turn against humans directly, because humans will turn against the paperclip optimizer if it is too ruthless.

2JGWeissman
Humans are useful initially as easily manipulated arms and legs, and will not even notice that the paperclipper has taken over before it harvests their component atoms.

Because broken != totally nonfunctional.

If we have an AI which we believe to be friendly, but cannot verify to be so, we add the fuse I described, then start it. As long as the AI does not try to kill humanity or to understand the red wire too well, it should operate pretty much like an unmodified AI.

From time to time, however, it will conclude the wrong things. For example, it might waste significant resources on the production of red wires, to conduct various experiments on them. Thus the modified AI is not optimal in our universe, and it contains one known bug. Hence I think it justified to call it broken.

1bogdanb
The problem with this idea is that it prevents us from creating an AI that is (even in principle) able to find and fix bugs in itself. Given the size of the problem, I wouldn’t trust humans to produce a bug-free program (plus the hardware!) even after decades of code audits. So I’d very much like the AI to be capable of noticing that it has a bug. And I’m pretty sure that an AI that can figure out that it has a silly belief caused by a flipped bit somewhere will figure out why that red wire “can” transmute at a distance. If we even manage to make a hyper-intelligent machine with this kind of “superstition”, I shudder to think what might be its opinion on the fact that the humans who built it apparently created the red wire in an attempt to manipulate the AI, thus (hyper-ironically!) sealing their fate... (it will certainly be able to deduce all that from historical observations, recordings, etc.)

If the AI is able to question the fact that the red wire is magical, then the prior was less than 1.

It should still be able to reason about hypothetical worlds where the red wire is just a usual copper thingy, but it will always know that those hypothetical worlds are not our world. Because in our world, the red wire is magical.

As long as superstitious knowledge is very specialized, like about the specific red wire, I would hope that the AI can act quite reasonably as long as the specific red wire is not somehow part of the situation.

2JamesAndrix
If the AI ever treats anything as probability 1, it is broken. Even the results of addition. An AI ought to assume a nonzero probability that data gets corrupted moving from one part of its brain to another.

I think every AI will need to learn from its environment. Thus it will need to update its current beliefs based upon new information from sensors.

It might conduct an experiment to check whether transmutation at a distance is possible - and find that transmutation at a distance could never be produced.

As the probability of transmutation of human DNA into fluorine is 1, this leaves some other options, like

  • the sensor readings are wrong
  • the experimental setup is wrong
  • it only works in the special case of the red wire

After sufficiently many experiments,... (read more)

0NancyLebovitz
I'm not sure whether it's Bayes or some other aspect of rationality, but wouldn't a reasonably capable AI be checking on the sources of its beliefs?