Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

timtyler comments on Robust Cooperation in the Prisoner's Dilemma - Less Wrong

69 Post author: orthonormal 07 June 2013 08:30AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (145)

You are viewing a single comment's thread. Show more comments above.

Comment author: timtyler 09 June 2013 02:26:08PM *  3 points [-]

I don't have any background in automated proofs; could someone please outline the practical implications of this, considering the algorithmic complexity involved?

Probably pretty low. Most real-world cooperation involves genetic or memetic kin or repetition or reputations. It is likely to be rare to meet a stranger and have reliable access to their source code.

Comment author: DanArmak 09 June 2013 07:37:28PM 2 points [-]

This is true for meat-based evolution. In a software-based world, it can be trivial to prove what your source code is, and sometimes it's advantageous too.

For instance, software might be distributed with its code, advertised as doing something. The software benefits if it can convince you that it will do that something ("cooperate"), because in exchange, you run it.

An em that's good at some kind of work might want to convince you to run copies of it: it gets to live, you get useful work done. But you're afraid that it will spy on you or sabotage its work. In a world of ems, the assumption is that nobody knows how to write superhuman (superintelligent) AIs, so we emulate human brains. That means you can't just inspect an em's software to determine what it's going to do, because you don't really understand how it works.

It would be nice if the em could provide a formal proof that in a certain scenario it would behave in a certain way (cooperate), but for software of an em's complexity that's probably never going to be possible.

Comment author: timtyler 09 June 2013 10:16:52PM *  3 points [-]

For instance, software might be distributed with its code, advertised as doing something. The software benefits if it can convince you that it will do that something ("cooperate"), because in exchange, you run it.

Yes. Open source software does derive some benefits from source code transparency. Today, though, reputations - and technical measures like sandboxes - are the main ways of dealing with the issue of running software from external sources.

Comment author: DanArmak 10 June 2013 07:36:28AM 1 point [-]

Open source (as the term is usually used) is not technically necessary. You could have proofs about native binaries, too. The real distinction is between software you have the code for (the running code, not necessarily the source code) - and software that runs out of your control, you don't know its code, and you communicate with it remotely.

Sandboxes are good but suffer from the "boxed AI" problem, so it's nice to have defense in depth.

Comment author: Eugine_Nier 12 June 2013 08:12:33AM 2 points [-]

This is true for meat-based evolution. In a software-based world, it can be trivial to prove what your source code is, and sometimes it's advantageous too.

So hide the malicious behavior in hardware or in the compiler.

Comment author: DanArmak 12 June 2013 09:42:23AM 3 points [-]

I was thinking of a scenario where the verifier controls the hardware and the compiler.

An em that's good at some kind of work might want to convince you to run copies of it

E.g. it sends you its source code to run on your own hardware because it wants to work for you.

Comment author: Eugine_Nier 15 June 2013 07:31:43AM -1 points [-]

So hide it in some other computer you interact with.

Comment author: DanArmak 15 June 2013 03:41:52PM 1 point [-]

Concerns about what I do on my own computer are separate from proofs about the software I send to you - they might still be valid, and if so, trusted and influential.

Yes, I could achieve malicious ends by outsmarting you: I could give you software that does exactly what it says (and a formal proof of that), but through its seemingly innocent interactions with other servers (which are perfectly valid under that proof), actually create a malicious effect you haven't foreseen. But that concern doesn't invalidate the usefulness of the proof.