
Comment author: Viliam 28 June 2017 10:17:14AM *  0 points [-]

I think Eliezer once wrote something about things becoming clearer when you think about how you would program a computer to do it, as opposed to e.g. just throwing some applause lights to a human. So, how specifically would you implement this kind of belief in a computer?

Also, should we go meta and say: "'Rationality gives us a better understanding of the world, except when it does not' is a good ideology, except when it is worse" et cetera?

What exactly would that actually mean? (Other than verbally shielding yourself from criticism by endless "but I said 'except when not'".) Suppose a person A believes "there is an 80% probability it will rain tomorrow", but a person B believes "there is an 80% probability it will rain tomorrow, except if it is some different probability". I have an idea about how A would bet about tomorrow's weather, but how would B?
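To make A's side concrete, here is a minimal sketch of an expected-value bettor holding P(rain) = 0.8 (my own illustration; the payoff numbers are made up). I still don't see what extra line of code B's "except if it is some different probability" would correspond to.

```python
# Minimal sketch (illustration only): an agent that holds P(rain) = 0.8 and
# accepts a bet exactly when its expected value is positive.

def accept_bet(p_rain: float, payout_if_rain: float, cost: float) -> bool:
    """Accept the bet iff its expected value is positive."""
    return p_rain * payout_if_rain - cost > 0

# Person A pays up to $0.80 for a ticket that pays $1 if it rains tomorrow.
print(accept_bet(0.8, payout_if_rain=1.0, cost=0.75))  # True:  EV = +0.05
print(accept_bet(0.8, payout_if_rain=1.0, cost=0.85))  # False: EV = -0.05
```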

Comment author: Viliam 28 June 2017 09:59:35AM *  0 points [-]

In the future, could cryptocurrencies become an important contributor to global warming?

An important part of the commonly used mechanisms is something called "proof of work", which roughly means "this number is valuable, because someone provably burned at least X resources to compute it". This is how "majority" is calculated in anonymous distributed systems: you can easily create 10 sockpuppets, but can you also burn 10 times more resources? So it's a majority of burned resources that decides the outcome.
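For concreteness, a toy sketch of the mechanism (my illustration; real Bitcoin uses double SHA-256 over block headers and a vastly higher difficulty, and the `difficulty` knob below is just a count of leading zero hex digits):

```python
# Toy proof of work: finding a valid nonce takes many hash attempts on
# average, but verifying a claimed nonce takes a single hash.

import hashlib

def proof_of_work(data: bytes, difficulty: int) -> int:
    """Search for a nonce such that SHA-256(data + nonce) starts with `difficulty` zero hex digits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

def verify(data: bytes, nonce: int, difficulty: int) -> bool:
    """Check a claimed proof of work with one hash."""
    digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
    return digest.startswith("0" * difficulty)

nonce = proof_of_work(b"block contents", difficulty=4)  # ~16^4 = 65536 attempts on average
print(nonce, verify(b"block contents", nonce, difficulty=4))
```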

I can imagine some bad consequences as a result of this. Generally, if cryptocurrencies became more popular (in the extreme case, if they became the primary form of money used across the planet), it would create pressure to burn insane amounts of resources... simply because if we collectively decide to burn only a moderate amount of resources, then a rogue actor could burn slightly more than a moderate amount of resources and take over the whole planet's economy.

And in the long term, the universe will probably get tiled with specialized bitcoin-mining hardware.

Comment author: Viliam 28 June 2017 09:40:29AM *  0 points [-]

Oh, now I understand the moral dilemma. Something like an Ineffective Friendly AI, which uses sqrt(x) or even log(x) resources for doing actually Friendly things, while the rest is wasted on doing something that is not really harmful, just completely useless; with no prospect of ever becoming more effective.

Would you turn that off? And perhaps risk that the next AI will turn out not to be Friendly, or that it will be Friendly but even more wasteful than the old one, though better at defending itself. Or would you let it run and accept that the price is turning most of the universe into bullshitronium?

I guess for a story it is a good thing when both sides can be morally defended.

Comment author: tut 28 June 2017 08:24:08AM 0 points [-]

Whereas I would take it at 50/50. Scope insensitivity looks to me like it would hit both sides (both are all humanity forever), and so it is not clear which side it favors.

Comment author: halcyon 28 June 2017 07:41:41AM *  0 points [-]

I can't shake the idea that maps should be represented classically and territories should be represented intuitionistically. I'm looking for logical but critical comments on this idea. Here's my argument:

Territories have entities that are not compared to anything else. If an entity exists in the territory, then it is what it is. Territorial entities, as long as they are consistently defined, are never wrong by definition. By comparison, maps can represent any entity. Being parts of a map, these mapped entities are intended to be compared to the territory of which the map is a map. If the territory does not have a corresponding entity, then that mapped entity is false insofar as it is intended as a map.

This means that territories are repositories of pure truth with no speck of falsehood lurking in any corner, whereas maps represent entities that can be true or false depending on the state of the territory. This corresponds to the notion that intuitionism captures the concept of truth. If you add the concept of falsehood or contradiction, then you end up with classical logic or mathematics respectively. First source I can think of: https://www.youtube.com/playlist?list=PLt7hcIEdZLAlY0oUz4VCQnF14C6VPtewG
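For reference, one standard way to make the classical/intuitionistic relationship precise (my gloss; it may or may not line up exactly with the "adding falsehood" framing above) is that classical propositional logic is intuitionistic propositional logic plus either of the following schemata:

```latex
% Adding either schema to intuitionistic propositional logic yields classical logic.
\text{(LEM)}\quad P \lor \lnot P
\qquad\qquad
\text{(DNE)}\quad \lnot\lnot P \to P
```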

Furthermore, the distinction between maps and territories seems to be a transcendental one in the Kantian sense of being a synthetic a priori. That is to say, it is an idea that must be universally imposed on the world by any mind that seeks to understand it. Intuitionism has been associated with Kantian philosophy since its inception. If The Map is included in The Territory in some ultimate sense, that neatly dovetails with the idea of intuitionists who argue that classical mathematics is a proper subset of intuitionistic mathematics.

In summary, my thesis states that classical logic is the logic of making a map accurate by comparing it to a territory, which is why the concept of falsehood becomes an integral part of the formal system. In contrast, intuitionistic logic is the logic of describing a territory without seeking to compare it to something else. Intuitionistic type theory turns up type errors, for example, when such a description turns out to be inconsistent in itself.

Where did I take a wrong turn?

Comment author: MrMind 28 June 2017 07:40:34AM *  0 points [-]

If a goal made sense, then I could pursue it with instrumental rationality in the present moment, without procrastination as a means of resistance.

Yeah, that's exactly how motivation works <snark/>.

Comment author: Evan_Gaensbauer 28 June 2017 07:39:31AM 0 points [-]

There's an existing EA Discord server. Someone posted about it in the 'Effective Altruism' Facebook group, and it was the first mention I'd seen of an EA Discord anywhere, so it's probably the only/primary one existing. There's nothing "official" about the EA Discord, but it's the biggest and best you'll find if that's what you're looking for. I can send you an invite if you want.

Comment author: cousin_it 28 June 2017 06:52:36AM *  0 points [-]

Weighted sum, with weights determined by bargaining.
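A minimal sketch of what that could look like (my illustration; in practice the hard parts are the bargaining that picks the weight and the fact that each utility function is only defined up to a positive affine transformation):

```python
# Combined utility as a weighted sum (sketch only). `w` is whatever weight
# the bargaining process settles on; the normalization of u_a and u_b
# matters just as much as w, since utilities have no canonical scale.

def combined_utility(outcome, u_a, u_b, w):
    return w * u_a(outcome) + (1 - w) * u_b(outcome)

# Hypothetical example: a "friendly" utility and a paperclip-counting one.
u_friendly = lambda o: o.get("human_flourishing", 0.0)
u_clips    = lambda o: o.get("paperclips", 0.0)

outcome = {"human_flourishing": 10.0, "paperclips": 3.0}
print(combined_utility(outcome, u_friendly, u_clips, w=0.7))  # 0.7*10 + 0.3*3 = 7.9
```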

Comment author: username2 28 June 2017 06:44:15AM 0 points [-]

Project management training materials.

Comment author: MaryCh 28 June 2017 06:22:37AM 0 points [-]

...and then accuse them of standing by when the dragon was being investigated and slain. Truly, what did they even do, fight some goblins and withhold intelligence from the elves?..

Comment author: MaryCh 28 June 2017 06:17:29AM 0 points [-]

So it causes disaster. Disaster on this scale was already likely, even warned against. Somehow, the elves inviting themselves along - out of pure greed, no magic involved - is portrayed as less of a fault.

Comment author: MaryCh 28 June 2017 06:01:41AM 0 points [-]

But they don't get the flak for stealing it. They get the flak for claiming it afterwards "because of the curse". It's not avarice that is the main problem - it's the being enthralled. And I can't quite get it. Why not just say "boo greedy dwarves" instead of going the whole way to "boo greedy dwarves who up and got themselves enchanted"? What does the enchanted bit do?

Comment author: lmn 28 June 2017 03:57:02AM 0 points [-]

A: "I would have an advantage in war so I demand a bigger share now" B: "Prove it" A: "Giving you the info would squander my advantage" B: "Let's agree on a procedure to check the info, and I precommit to giving you a bigger share if the check succeeds" A: "Cool"

Simply by telling B about the existence of an advantage, A is giving B info that could weaken it. Also, what if the advantage is a way to partially cheat in precommitments?

Comment author: lmn 28 June 2017 03:50:08AM 0 points [-]

Even if A is FAI and B is a paperclipper, as long as both use correct decision theory, they will instantly merge into a new SI with a combined utility function.

What combined utility function? There is no way to combine utility functions.

Comment author: lmn 28 June 2017 03:47:58AM 0 points [-]

Maybe you can't think of a way to set up such trade, because emails can be faked etc, but I believe that superintelligences will find a way to achieve their mutual interest.

They'll also find ways of faking whatever communication methods are being used.

In response to comment by gjm on Any Christians Here?
Comment author: lmn 28 June 2017 03:42:54AM 0 points [-]

Empirically, people who believe in the Christian hell don't behave dramatically better than people who don't.

Hasn't quite been my experience, but whatever.

The doctrine of hell whose (de)merits we're discussing doesn't actually say that people are only eligible for hell if they have never stopped believing in it.

Of course, otherwise it would be completely useless as it would simply motivate people to stop believing in it.

Comment author: korin43 28 June 2017 03:13:45AM 0 points [-]

You might like the book "The End of Time" by Julian Barbour. It's about an alternative view of physics where you rearrange all of the equations to not include time. The book describes the result sort of similarly to what you're suggesting, where the system is defined as the relationship between things and the evolution of those relationships and not precise locations and times.

Comment author: erratio 28 June 2017 03:11:26AM 0 points [-]

Request: A friend of mine would like to get better at breaking down big vague goals into more actionable subgoals, preferably in a work/programming context. Does anyone know where I could find a source of practice problems and/or help generate some scenarios to practice on? Alternatively, any ideas on a better way to train that skill?

Comment author: knb 28 June 2017 02:21:27AM *  0 points [-]

What did you mean by "fanon dwarves"? Is that just a fan interpretation or do you think Tolkien intended it? In Tolkien's idealized world, all economic motivations are marginal and deprecated. The dwarves are motivated partially by a desire for gold, but mostly by loyalty to their king and a desire to see their ancestral homeland restored to them. To the extent the treasure itself motivates Thorin & co., it causes disaster (for example his unwillingness to share the loot almost causes a battle against local men & elves.)

Comment author: Alicorn 28 June 2017 02:14:24AM 0 points [-]

Animals are not zero important, but people are more important. I am a pescetarian because that is the threshold at which I can still enjoy an excellent quality of life, but I don't need to eat chicken fingers and salami to reach that point. Vegans are (hopefully) at a different point on this tradeoff curve than I am and meat-eaters (also hopefully) are at a different point in the other direction.

Comment author: entirelyuseless 27 June 2017 10:45:52PM 0 points [-]

The problem seems to be that you think that you need to choose a goal. Your goal is what you are tending towards. It is a question of fact, not a choice.

Comment author: bogus 27 June 2017 09:33:45PM 0 points [-]

I think you need verifiable pre-commitment, not just communication - in a free-market economy, enforced property rights basically function as such a pre-commitment mechanism. Where pre-commitment (including property right enforcement) is imperfect, only a constrained optimum can be reached, since any counterparty has to assume ex ante that the agent will exploit the lack of precommitment. Imperfect information disclosure has similar effects; however, in that case one has to "assume the worst" about what information the agent has, and the deal must be altered accordingly, which generally comes at a cost in efficiency.

Comment author: SamDeere 27 June 2017 08:42:09PM 0 points [-]

Thought I had, turns out you need to verify separately for the wiki and the forum. Thanks Julia for posting.

Comment author: deluks917 27 June 2017 08:38:04PM 0 points [-]

I would attend the OKC presentation.

Comment author: turchin 27 June 2017 08:33:33PM *  1 point [-]

Some back-of-the-envelope calculations about superintelligence timing and the computing power of the Bitcoin network.

The total computing power of the Bitcoin network is now about 5 exahashes per second (5 × 10^18): https://bitcoin.sipa.be/

It is growing exponentially with a doubling time of approximately one year, and it accelerated in 2017. Other cryptocurrencies combined probably have about the same computing power.

One hash is very roughly 3800 flops (or maybe 12000), but the nature of the computation is different. A large part is done on specialized hardware, but part runs on general-purpose graphics cards, which could be used to compute neural nets.
https://www.reddit.com/r/Bitcoin/comments/5kfuxk/how_powerful_is_the_bitcoin_network/

So the total power of the Bitcoin network is about 2 × 10^22 classical processor operations per second. This is approximately 200,000 times more than the most powerful existing supercomputer.

Markram expected that a human brain simulation would require 1 exaflop. That means the current blockchain network is computationally equivalent to about 20,000 human brains, which is probably enough to run a superintelligence. But most of that hardware can't do neural-net calculations. However, if the same monetary incentives appeared, specialized hardware could be produced that does exactly the operations needed for future neural nets. If a first superintelligence appeared, it would try to use this computing power, and we could see that in changes in Bitcoin network usage.
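A quick check of the arithmetic (the flops-per-hash, supercomputer, and brain figures are the rough assumptions quoted above, not precise measurements):

```python
# Order-of-magnitude check of the numbers above.

hashes_per_second   = 5e18    # ~5 exahashes/s for the Bitcoin network
flops_per_hash      = 3800    # very rough equivalence from the linked Reddit thread
supercomputer_flops = 1e17    # top supercomputer in 2017 (~93 petaflops), rounded
brain_flops         = 1e18    # Markram's ~1 exaflop estimate for a brain simulation

network_flops = hashes_per_second * flops_per_hash
print(f"{network_flops:.1e} flop/s")        # ~1.9e+22
print(network_flops / supercomputer_flops)  # ~190,000 supercomputer-equivalents
print(network_flops / brain_flops)          # ~19,000 brain-equivalents
```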

TL;DR: There is no hardware problem for creating a superintelligence now.

Comment author: ChristianKl 27 June 2017 08:11:54PM 0 points [-]

It's hard to know from the outside which problems are tractable enough to write a bachelor thesis on them.

Comment author: Screwtape 27 June 2017 07:45:19PM 0 points [-]

My terminal goal is that I exist and be happy.

Lots of things make me happy, from new books to walks in the woods to eating a ripe pear to handing my brother an ice cream cone.

Sometimes trying to achieve the terminal goal involves trading off which things I like more against each other, or even doing something I don't like in order to be able to do something I like a lot in the future. Sometimes it means trying new things in order to figure out if there's anything I need to add to the list of things I like. Sometimes it means trying to improve my general ability to chart a path of actions that lead to me being happy.

One goal that is arguably selfish but also includes others' values as input, and that gets followed regardless of the situation. Does that make more sense?

Comment author: blankcanvas 27 June 2017 07:40:11PM 0 points [-]

Posting on behalf of my coworker Sam Deere (who didn't have enough karma to post):

I registered this account today and couldn't post, so I figured I had to verify an email associated with this account and now it works. :)

Comment author: juliawise 27 June 2017 07:32:05PM 1 point [-]

Posting on behalf of my coworker Sam Deere (who didn't have enough karma to post):

"Thanks for the feedback. It's good to know that this is something people are thinking about — we think a lot about how to make EA's online presence best serve the needs of the community too.

For context, I'm head of tech at CEA, which runs EffectiveAltruism.org. (I have less to do with the content and structure of the site these days, but had a hand in putting it up, and am involved in a lot of decision making about which projects to prioritise.)

There seem to be a few concerns, one about functionality, one about discoverability, and one about content. That is, EA needs better discussion spaces, the ones it has are too hard to find, and the easiest-to-find content doesn't represent the breadth of EA really well.

In general we agree that EA needs good discussion spaces, and that the current ones could be improved (e.g. by separating concerns of content discovery and content creation etc). This is something that's in CEA's longer-term tech projects roadmap, but we don't have the capacity to prioritise this right now. This is doubly true when there are fairly good discussion spaces available, in particular the EA Forum. However, we're working on building out more features, on top of the EffectiveAltruism.org webapp (which at the moment is functionally just EA Funds).

Individual projects will have their own coordination needs so at this point it hasn't made sense to try to build a be-all/end-all platform that encompasses all of them. You've suggested a number of tools that such a platform could draw inspiration from — in many cases people do just use these tools to coordinate on projects. The EA Forum serves a useful role to announce project ideas and seek collaborators, and this isn't the only place in the community where projects/collaboration happens — EA Grants and the .impact Hackpads were already mentioned. Another example is Effective Altruism Global, which allows people to discuss these projects and ideas in person, which is much higher bandwidth.

(It's also important to get the balance right between shiny new things that work better and continuity — there's always a new platform, a new tool that we could use that will be an improvement on existing processes. But if it doesn't complement existing tools and processes people use, then it risks either not gaining adoption, or splitting the user base. Developer time and energy is a scarce resource, and like everything, needs to be prioritised. Many projects of this scope fall into disuse.)

Regarding discoverability, as others have suggested, it's not clear that the solution is to make things more discoverable. Online communities are very hard to get right — there's a constant tension between preserving the culture and norms that make the culture great, while keeping it open and accessible to newer members who want to get involved. Newer members have less context for certain discussions (which makes people feel they can't be as open for fear of alienating newcomers), newer members may ask lots of basic questions etc (see the Eternal September effect). The solution is never perfect, but it's important to have ways for people to get involved with the community incrementally, so that they can acquire that context as they go — this necessitates having some more introductory content on places like EffectiveAltruism.org, and the selection effects of the effort required to learn a bit more about the community are likely a feature, not a bug.

In general we observe that people start reading introductory content, then those that are hooked do a deep dive and discover the rest of the community in the process. However, it's a useful data point to know that, as someone who was already potentially on board, you felt the introductory stuff was off-putting, and we'll keep that in mind as we're considering what other content needs to be on EffectiveAltruism.org.

Regarding content breadth, CEA is currently working on a project to make the content covered on EffectiveAltruism.org more comprehensive and representative of the broader spectrum of ideas that get discussed within the community (partly building on the existing Effective Altruism Concepts project, and also drawing inspiration from things like the LW sequences — more details will be announced in time).

As with everything, we're massively constrained by staff and volunteer time. At the moment we're hiring for a number of roles that should speed up the development of some of these features (hint hint...). As someone noted, it would perhaps have been worthwhile to post this on the EA Forum to see if there are more ideas in this vein, or if others in the community are working on something like this."

Comment author: Lumifer 27 June 2017 07:00:37PM 0 points [-]

any goal I make up seems wrong and does not motivate me in the present moment to take action

You are not supposed to "make up" goals, you're supposed to discover them and make them explicit. By and large your consciousness doesn't create terminal goals, only instrumental ones. The terminal ones are big dark shadows swimming in your subconscious.

Besides, it's much more likely that your motivational system is somewhat broken; that's common on LW.

a goal which is universally shared among you, me and every other Homo Sapiens, which lasts through time

Some goal, any goal? Sure: survival. Nice terminal goal, universally shared with most living things, lasts through time, allows for a refreshing variety of instrumental goals, from terminating a threat to subscribing to cryo.

Comment author: blankcanvas 27 June 2017 06:43:53PM *  0 points [-]

It doesn't make sense to have internally generated goals, as any goal I make up seems wrong and does not motivate me in the present moment to take action. If a goal made sense, then I could pursue it with instrumental rationality in the present moment, without procrastination as a means of resistance. Because it seems as if it is simply resistance to enslavement by forces beyond my control. Not literally, but you know, conditioning in the schooling system etc.

So what I would like is a goal which is universally shared among you, me and every other Homo Sapiens, which lasts through time. Preferences which are shared.

Comment author: Lumifer 27 June 2017 06:33:26PM *  1 point [-]

What goal should I have?

First, goals, multiple. Second, internally generated (for obvious reasons). Rationality might help you with keeping your goals more or less coherent, but it will not help you create them -- just like Bayes will not help you generate the hypotheses.

Oh, and you should definitely expect your goals and preferences to change with time.

Comment author: cousin_it 27 June 2017 06:31:42PM *  1 point [-]

A somewhat pedantic but important question; are these chances independent of each other?

No, mutually exclusive. Also X + Y can be less than 100, the rest is status quo.

Comment author: Lumifer 27 June 2017 06:30:28PM *  2 points [-]

This is not really a theory. I am not making predictions, I provide no concrete math, and this idea is not really falsifiable in its most generic forms. Why do I still think it is useful? Because it is a new way of looking at physics, and because it makes everything so much more easy and intuitive to understand, and makes all the contradictions go away.

Let's compare it with an alternative theory that there are invisible magical wee beasties all around who make the physics actually work by pushing, pulling, and dragging all the stuff. And "there are alternative interpretations for explaining relativity and quantum physics under this perspective" -- sometimes the wee beasties find magic mushrooms and eat them.

  • Not making predictions? Check.
  • No concrete math? Check.
  • Not really falsifiable? Check.
  • New way of looking at physics? Check (sufficiently so).
  • So much more easy and intuitive to understand? Check.
  • Makes all the contradictions go away? Check.
  • Not a theory, but a new perspective? Check.

It's a tie! But the beasties are cuter, so they win.

Comment author: blankcanvas 27 June 2017 06:20:41PM *  0 points [-]

So all of your actions in the present moment are guided toward your brother's happiness? I didn't mean switching between goals as situations change, only one goal.

Comment author: blankcanvas 27 June 2017 06:19:29PM 0 points [-]

That's why I am asking here. What goal should I have? I use goal and preference interchangeably. I'm also not expecting the goal/preference to change in my lifetime, or multiple lifetimes either.

Comment author: Screwtape 27 June 2017 06:01:30PM 0 points [-]

Your preferences can include other people's well being. I have a strong preference that my brother be happy, for example.

Comment author: Screwtape 27 June 2017 05:58:31PM 1 point [-]

It seems to me like the most important piece of data is your prior on the default conditions. If, for example, there was a 99% chance of the universe winding up getting tiled in paperclips in the next year then I would accept very unlikely odds on X and likely odds on Y. Depending on how likely you think an "I Have No Mouth but I Must Scream" situation is, you might 'win' even if Y happens.

Hrm. A somewhat pedantic but important question; are these chances independent of each other? For example, let's say X and Y are both 50%. Does that mean I'm guaranteed to get either X or Y, or is there a ~25% chance that both come up 'tails' so to speak and nothing happens? If the latter, what happens if both come up 'heads' and I get my utopia but everyone dies? (Assuming that my utopia involves at least me being alive =P)

The second most important piece of data to me is that X is the odds of "my best imaginable utopia" which is very interesting wording. It seems to mean I won't get a utopia better than I can imagine, but also that I don't have to compromise if I don't want to. I can imagine having an Iron Man suit so I assume the required technology would come to pass if I won, but HPMoR was way better than I imagined it to be before reading it. Let's say my utopia involves E.Y. writing a spiritual successor to HPMoR in cooperation with, I dunno, either Brandon Sanderson or Neil Stephenson. I think that would be awesome, but if I could imagine that book in sufficient fidelity I'd just write it myself.

My probability space looks something like 10% we arrive at a utopia close enough to my own that I'm still pretty happy, 20% we screw something up existentially and die, 40% things incrementally improve even if no utopia shows up, 20% things gradually decline and get somewhat worse but not awful, 9% no real change, 1% horrible outcomes worse than "everybody dies instantly." (Numbers pulled almost entirely out of my gut, as informed by some kind of gestalt impression of reading the news and looking at papers that come to my attention and the trend line of my own life.) If I was actually presented that choice, I'd want my numbers to be much firmer, but let's imagine I'm pretty confident in them.

I more or less automatically accept any Y value under 21%, since that's what I think the odds are of a bad outcome anyway. I'd actually be open to a Y that was slightly higher than 21%, since it limits how bad things can get. (I prefer existence in pain to non-existence, but I don't think that holds up against a situation that's maximizing suffering.) By the same logic, I'm very likely to refuse any X below 10%, since that's the odds I think I can get a utopia without the deal. (Though since that utopia is less likely to be my personal best imagined case, I could theoretically be persuaded by an X that was only a little under 10%.) X=11%, Y=20% seems acceptable if barely so?

On the one hand, I feel like leaving a chance of the normal outcome is risking a 1% really bad outcome, so I should prefer to fill as much of the outcome space as possible with X or Y; say, X=36% and Y=64%. On the other, 41% of the outcomes if I refuse are worse than right now, and 50% are better than right now, so if I refuse the deal I should expect things to turn out in my favour. I'm trading a 64% chance of bad things for a 41% chance of bad things. This looks dumb, but it's because I think things going a little well is pretty likely- my odds about good and bad outcomes change a lot depending on whether I'm looking at the tails or the hill. Since I'm actually alright (though not enthusiastic) with things getting a little worse, I'm going to push my X and Y halfway towards the values that include things changing a little bit. Let's say X=45% and Y=55%? That seems to square with my (admittedly loose) math, and it feels at first glance acceptable to my intuition. This seems opposed by my usual stance that I'd rather risk moderately bad outcomes if it means I have a bigger chance of living, though, so either my math is wrong or I need to think about this some more.
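Laying the bucket arithmetic out explicitly (these are just the gut numbers from above, treating the deal as X = utopia, Y = everyone dies, remainder = status quo):

```python
# My default-outcome buckets (gut estimates, nothing rigorous).
default = {
    "utopia close enough to mine": 0.10,
    "existential catastrophe":     0.20,
    "incremental improvement":     0.40,
    "gradual decline":             0.20,
    "no real change":              0.09,
    "worse than extinction":       0.01,
}
assert abs(sum(default.values()) - 1.0) < 1e-9

better_than_now = default["utopia close enough to mine"] + default["incremental improvement"]
worse_than_now  = (default["existential catastrophe"] + default["gradual decline"]
                   + default["worse than extinction"])
print(round(better_than_now, 2), round(worse_than_now, 2))  # 0.5 0.41

# The proposed answer: X = 45% utopia, Y = 55% everyone dies, 0% status quo.
X, Y = 0.45, 0.55
print(X, Y)
```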

TLDR, I'd answer X=45% and Y=55%, but my gut is pushing for a higher chance of surviving or taking the default outcome depending on how the probabilities work.

Comment author: Lumifer 27 June 2017 05:58:16PM 0 points [-]

Why, my, preferences?

What are your other options?

Comment author: Lumifer 27 June 2017 05:57:40PM 1 point [-]

the first AI can describe the second AI as knowing that there is a 50% chance, but caring more about the heads outcome.

First of all, this makes sense only in the decision-making context (and not in the forecast-the-future context). So this is not about what will actually happen but about comparing the utilities of two outcomes. You can, indeed, rescale the utility involved in a simple case, but I suspect that once you get to interdependencies and non-linear consequences things will get more hairy, if possible at all.

Besides, this requires you to know the utility function in question.

Comment author: blankcanvas 27 June 2017 05:41:10PM 0 points [-]

I don't know what goal I should have to be a guide for instrumental rationality in the present moment. I want to take this fully seriously, but for the sake of instrumental rationality in and of itself, with presence.

"More specifically, instrumental rationality is the art of choosing and implementing actions that steer the future toward outcomes ranked higher in one's preferences.

Why, my, preferences? Have we not evolved rational thought further than simply anything one self cares about? If there even is such a thing as a self? I understand, it's how our language has evolved, but still.

Said preferences are not limited to 'selfish' preferences or unshared values; they include anything one cares about."

Not limited to selfish preferences or unshared values: what audience is rationality for?

https://wiki.lesswrong.com/wiki/Rationality

Comment author: entirelyuseless 27 June 2017 05:37:52PM 0 points [-]

I think the idea is that if one AI says there is a 50% chance of heads, and the other AI says there is a 90% chance of heads, the first AI can describe the second AI as knowing that there is a 50% chance, but caring more about the heads outcome. Since it can redescribe the other's probabilities as matching its own, agreement on what should be done will be possible. None of this means that anyone actually decides that something will be worth more to them in the case of heads.

Comment author: Jiro 27 June 2017 05:17:46PM *  0 points [-]

My understanding is that P depends only on your knowledge and priors.

A per-experiment P means that P would approach the number you get when you divide the number of successes in a series of experiments by the number of experiments. Likewise for a per-awakening event. You could phrase this as "different knowledge" if you wish, since you know things about experiments that are not true of awakenings and vice versa.
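Assuming this is the standard Sleeping Beauty setup (one awakening on heads, two on tails; the comment doesn't spell it out), the two frequencies can be simulated directly:

```python
# Simulation sketch under the standard Sleeping Beauty assumptions:
# heads -> 1 awakening, tails -> 2 awakenings. The per-experiment frequency
# of heads approaches 1/2; the per-awakening frequency approaches 1/3.

import random

random.seed(0)
n_experiments = 100_000
heads_experiments = 0
heads_awakenings = 0
total_awakenings = 0

for _ in range(n_experiments):
    heads = random.random() < 0.5
    if heads:
        heads_experiments += 1
        heads_awakenings += 1   # the single heads awakening
        total_awakenings += 1
    else:
        total_awakenings += 2   # two tails awakenings

print(heads_experiments / n_experiments)    # ~0.50 (per-experiment P)
print(heads_awakenings / total_awakenings)  # ~0.33 (per-awakening P)
```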

Comment author: toonalfrink 27 June 2017 05:14:51PM 0 points [-]

What about the research agendas that have already been published?

Comment author: toonalfrink 27 June 2017 05:13:27PM 0 points [-]

Am not associated. Just found the article in the MIRI newsletter

Comment author: turchin 27 June 2017 04:14:33PM 0 points [-]

I think there are two other failure modes, which need to be resolved:

A weaker side might drag out negotiations if doing so helps it gain power.

A weaker side could fake the size of its army (like North Korea did with its wooden missiles at its last military parade).

Comment author: Lumifer 27 June 2017 03:47:27PM 1 point [-]

sharing all information is doable

In all cases? Information is power.

before flipping a coin I can say that all good things in life will be worth 9x more to me in case of heads than tails

There is an old question that goes back to Abraham Lincoln or something:

If you call a dog's tail a leg, how many legs does a dog have?

Comment author: cousin_it 27 June 2017 03:45:54PM *  0 points [-]

Yeah, Schelling's "Strategy of Conflict" deals with many of the same topics.

A: "I would have an advantage in war so I demand a bigger share now" B: "Prove it" A: "Giving you the info would squander my advantage" B: "Let's agree on a procedure to check the info, and I precommit to giving you a bigger share if the check succeeds" A: "Cool"

Comment author: turchin 27 June 2017 03:44:44PM 0 points [-]

But EA should be mostly #2?

Comment author: cousin_it 27 June 2017 03:40:17PM *  0 points [-]

Mostly #1. Is there a reason to build AIs that inherently care about the well-being of paperclippers etc?

Comment author: turchin 27 June 2017 03:40:07PM 0 points [-]

BTW, the book "On Thermonuclear War" by Kahn is exactly an attempt to predict the ways of war, negotiations and bargaining between two presumably rational agents (superpowers). Even the idea of moving all resources to a new third agent is discussed, as I remember - that is, donating all nukes to the UN.

How could B see that A has hidden information?

Personally, I feel like you have a mathematically correct, but idealistic and unrealistic model of relations between two perfect agents.

Comment author: Dagon 27 June 2017 03:37:02PM 2 points [-]

Note that there are two very distinct reasons for cooperation/negotation:

1) It's the best way to get what I want. The better I model other agents, the better I can predict how to interact with them in a way that meets my desires. For this item, an external agent is no different from any other complex system.

2) I actually care about the other agent's well-being. There is a term in my utility function for their satisfaction.

Very weirdly, we tend to assume #2 about humans (when it's usually a mix of mostly 1 and a bit of 2). And we focus on #1 for AI, with no element of #2.

When you say "code for cooperation", I can't tell if you're just talking about #1, or some mix of the two, where caring about the other's satisfaction is a goal.

Comment author: cousin_it 27 June 2017 03:30:50PM *  0 points [-]

I think sharing all information is doable. As for priors, there's a beautiful LW trick called "probability as caring" which can almost always make priors identical. For example, before flipping a coin I can say that all good things in life will be worth 9x more to me in case of heads than tails. That's purely a utility function transformation which doesn't touch the prior, but for all decision-making purposes it's equivalent to changing my prior about the coin to 90/10 and leaving the utility function intact. That handles all worlds except those that have zero probability according to one of the AIs. But in such worlds it's fine to just give the other AI all the utility.
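Here's a concrete way to see the equivalence (my sketch; the 9x scaling and the 90/10 prior are the numbers from above): scaling all heads-world utilities by 9 under a 50/50 prior produces expected utilities that are exactly 5 times those computed with a 90/10 prior and unscaled utilities, so every decision ranking comes out identical.

```python
# "Probability as caring": the two formulations rank actions identically,
# since 0.5*(9*h) + 0.5*t == 5 * (0.9*h + 0.1*t) for any utilities h, t.

def eu_scaled_utility(u):
    """Prior 50/50, but everything in the heads-world is worth 9x more."""
    u_heads, u_tails = u
    return 0.5 * (9 * u_heads) + 0.5 * u_tails

def eu_scaled_prior(u):
    """Prior 90/10, utilities left alone."""
    u_heads, u_tails = u
    return 0.9 * u_heads + 0.1 * u_tails

# Hypothetical actions with (utility-if-heads, utility-if-tails).
actions = {"a": (3.0, 10.0), "b": (5.0, 2.0), "c": (1.0, 20.0)}

rank1 = sorted(actions, key=lambda a: eu_scaled_utility(actions[a]), reverse=True)
rank2 = sorted(actions, key=lambda a: eu_scaled_prior(actions[a]), reverse=True)
print(rank1, rank2)   # ['b', 'a', 'c'] both times
assert rank1 == rank2
```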

Comment author: Screwtape 27 June 2017 03:26:54PM 1 point [-]

If there was a briefcase full of hundred dollar bills over there that someone told me was mine by right, I'd be pretty attached to it. If they then added the caveat that there was a massive dragon squatting on the thing who also believed the briefcase was theirs, I do not think I would try and steal the briefcase. Would you?

Comment author: entirelyuseless 27 June 2017 03:24:08PM 1 point [-]

This feels like no. Or rather, it feels about the same as 80/20, which inclines me to say no.

Comment author: Lumifer 27 June 2017 03:24:03PM *  0 points [-]

put an enormous heap of money in front of a few poor travellers

Put an enormous heap of money with a big nasty dragon on top of it in front of a few poor travelers...

In response to comment by gjm on Any Christians Here?
Comment author: entirelyuseless 27 June 2017 03:20:11PM 0 points [-]

I agree that hell is a bad idea. That said, many Christians are tending more toward the position that very few people actually go to hell. It could be that only the very worst people go there, together with a few of the better people who could have done far more good than they actually did. If the numbers are small enough, there might be a significantly larger number of people who did a great amount of good, but would not have done it, if they had not believed in hell. In that case, there might be a large impact per person going to hell.

Obviously, there are still many problems with this, like whether that large impact can justify anything eternal.

Comment author: Lumifer 27 June 2017 03:19:26PM *  0 points [-]

The whole point of the prisoner's dilemma is that the prisoners cannot communicate. If they can, it's not a prisoner's dilemma any more.

Comment author: Lumifer 27 June 2017 03:18:39PM 0 points [-]

A meme necessarily looks better than the actual source :-/

Comment author: cousin_it 27 June 2017 03:16:43PM 0 points [-]

Would you accept at 40/10? (Remaining 50% goes to status quo.)

Comment author: Lumifer 27 June 2017 03:15:33PM 1 point [-]

Sufficiently rational agents will never go to war, instead they'll agree about the likely outcome of war, and trade resources in that proportion.

Not if the "resource" is the head of one of the rational agents on a plate.

The Aumann theorem requires identical priors and identical sets of available information.

Comment author: Dagon 27 June 2017 03:14:32PM 1 point [-]

You're not addressing me, as I say morality is subjective. However, even within your stated moral framework, you haven't specified the value range of a marginal animal life. I'm extremely suspicious of arguments that someone else's life (including a factory-farmed animal's) has negative value. If you think that they're lower value than other possible lives, but still positive, then the equilibrium of creating many additional lives, even with suffering, is preferable to simply having fewer animals on earth.

So yes, suffering is worse than contentment. Is it worse than never having existed at all? I don't know, and suspect not.

Comment author: entirelyuseless 27 June 2017 03:13:05PM 1 point [-]

I feel like I would accept approximately at 90/10, and would be reluctant to accept it with worse terms. But this might be simply because a 90% chance for a human being is basically where things start to feel like "definitely so."

Comment author: cousin_it 27 June 2017 03:07:32PM *  1 point [-]

Compared to the current status quo, which we don't understand fully. The question isn't supposed to be easy, it ties into a lot of things. You are supposed to believe the offer though.

Comment author: Dagon 27 June 2017 03:05:44PM *  0 points [-]

I think there are close to zero humans who make this tradeoff. Scope insensitivity hits too hard.

The first question, though, is "compared to what?" If I reject the deal, what's the chance intelligence will attain utopia at some point, and what's the chance of extinction? The second question is "why should I believe this offer?"

Comment author: Screwtape 27 June 2017 02:59:07PM 4 points [-]

I live in a tiny rural town, and get the majority of my meat from farmer's markets. Having been raised on a similar farm to the ones I buy from, I'm willing to bet those cows are happy a greater percentage of their lives than I will be. I recognize this is mostly working because of where I live and the confidence I have in how those farms are run. In the same way that encouraging fewer animals to exist in terrible conditions (by being vegan) is good I feel that encouraging more animals to exist in excellent conditions (by eating meat) is good. I don't stop eating meat (though I do eat less) when I go on trips elsewhere even though I'm aware I'm probably eating something that had a decidedly suboptimal life because switching on and off veganism would be slow.

That's my primary argument. My secondary, less confident position is that since I prefer existing in pain and misery to not existing, my default assumption should be that animals prefer existing in pain and misery to not existing. I'm much less confident here, since I'm both clearly committing the typical mind fallacy and have always had some good things in my life no matter how awful most things were. Still, when I imagine being in their position, I find myself preferring to exist and live rather than not have existed. (Though I consider existing and not being in pain the superior outcome by a wide margin!)

Comment author: cousin_it 27 June 2017 02:57:11PM *  0 points [-]

Yeah, bargaining between AIs is a very hard problem and we know almost nothing about it. It will probably have all sorts of deception tactics. But in any case, using bargaining instead of war is still in both AI's common interest, and AIs should be able to achieve common interest.

For example, if A has hidden information that will give it an advantage in war, then B can precommit to giving A more share conditional on seeing it (e.g. by constructing a successor AI that visibly includes the precommitment under A's watch). Eventually the AIs should agree on all questions of fact and disagree only on values, at which point they agree on how the war will likely go, so they skip the war and share the bigger pie according to the war's predicted outcome.

Comment author: turchin 27 June 2017 02:42:20PM 0 points [-]

So they meet before the possible start of the arms race and compare each other's relative advantages? I still think that they may try to demonstrate higher bargaining power than they actually have, and that it is almost impossible for us to predict how their game will play out because of its complexity.

Thanks for participating in this interesting conversation.

Comment author: MrMind 27 June 2017 02:34:34PM 0 points [-]

If agent 1 creates an agent 2, it will always know for sure its goal function.

That is the point, though. By Loeb's theorem, the only agents that are knowable for sure are those with less power. So an agent might want to create a successor that isn't fully knowable in advance, or, on the other hand, if a perfectly knowable successor could be constructed, then you would have a finite method to ensure the compatibility of two source codes (is this true? It seems plausible).

Comment author: MaryCh 27 June 2017 02:32:46PM 0 points [-]

None, if I knew exactly this and nothing more.

Comment author: Screwtape 27 June 2017 02:31:14PM 0 points [-]

Assuming one or two a month counts as "avid" then I count.

Some combination of what's on sale, what my friends/family/favourite bloggers are reading, and what's in a field I feel like I could use more background in for the fiction I write. (Anything that's all three at once gets read first, then anything with two of the three, and then whatever fits at least one of those.)

Comment author: MaryCh 27 June 2017 02:30:37PM 1 point [-]

Isn't it odd how fanon dwarves [from 'Hobbit'] are seen as 'fatally and irrationally enamoured' by the gold of the Lonely Mountain? I mean, any other place and any other time, put an enormous heap of money in front of a few poor travellers, tell them it's theirs, by right, and they would get attached to it and nobody would find it odd in the least. But Tolkien's dwarves get the flak. Why?

Comment author: cousin_it 27 June 2017 02:26:15PM *  0 points [-]

The paperclipper doesn't need to invest anything. The AIs will just merge without any arms race or war. The possibility of an arms race or war, and its full predicted cost to both sides, will be taken into account during bargaining instead. For example, if the paperclipper has a button that can nuke half of our utility, the merged AI will prioritize paperclips more.

Comment author: turchin 27 June 2017 02:14:58PM *  0 points [-]

So if the price of turning off the paperclipper is Y, and Y is higher than X/2, we should cooperate?

But if we agree on this, we create for the paperclipper an incentive to increase Y until it reaches X/2. To increase Y, the paperclipper has to invest in defense mechanisms or offensive weapons. That creates an arms race, until negotiations become more profitable. However, an arms race is risky and could turn into war.

Edited: higher.

Comment author: cousin_it 27 June 2017 01:47:43PM *  0 points [-]

If we can turn off the paperclipper for free, sure. But if war would destroy X resources, it's better to merge and spend X/2 on paperclips.

Comment author: entirelyuseless 27 June 2017 01:41:19PM 1 point [-]

The simple answer is that I care about humans more than about other animals by an extremely large degree. So other things being equal, I would prefer that the other animals suffer less. But I do not prefer this when it means slightly less utility for a human being.

So my general response about "veganism" is "that's ridiculous," since it is very strongly opposed to my utility function.

Comment author: turchin 27 June 2017 01:40:12PM 1 point [-]

But should we merge with the paperclipper if we could turn it off?

It reminds me of Great Britain's policy towards Hitler before WW2, which was to give him what he wanted in order to prevent the war. https://en.wikipedia.org/wiki/Appeasement

Comment author: cousin_it 27 June 2017 01:35:32PM *  0 points [-]

An AI could devise a very secure merging process. We don't have to code it ourselves.

Comment author: turchin 27 June 2017 01:29:18PM 0 points [-]

I could imagine some failure modes, but surely I can't imagine the best one. For example, "both original AIs shut down" simultaneously is vulnerable to defection.

I also have some business experience, and I found that almost every deal includes some cheating, and the cheating is something new every time. So I always have to ask myself - where is the cheating from the other side? If I don't see it, that's bad, as it could be something really unexpected. Personally, I hate cheating.

Comment author: cousin_it 27 June 2017 01:07:24PM *  0 points [-]

I imagine merging like this:

1) Bargain about a design for a joint AI, using any means of communication

2) Build it in a location monitored by both parties

3) Gradually transfer all resources to the new AI

4) Both original AIs shut down, new AI fulfills their combined goals

No proof of rationality required. You can design the process so that any deviation will help the opposing side.
