All of GraceFu's Comments + Replies

Is there a version of the Sequences geared towards Instrumental Rationality? I can find (really) small pieces such as the 5 Second Level LW post and intelligence.org's Rationality Checklist, but can't find any overarching course or detailed guide to actually improving instrumental rationality.

3John_Maxwell
Or this?
3John_Maxwell
Maybe this?

There is some on http://www.clearerthinking.org/

When I write I try to hit instrumental rationality, not epistemic (see here: http://lesswrong.com/r/discussion/lw/mp2/), and I believe there is a need for writing along the lines of instrumental guides. (Also see the boring advice repository: http://lesswrong.com/lw/gx5/boring_advice_repository/)

As far as I know there has been no effort to generate a sequence on the topic.

Is there a specific area you would like to see an instrumental guide in? Maybe we can use the community to help find/make one on the specific topic you... (read more)

What if the thermodynamic miracle has no effect on the utility function because it occurs elsewhere? Taking the same example, the AI simulates sending the signal down the ON wire... and it passes through, but the 0s that came after the signal are miraculously turned into 0s (a miracle that changes nothing).

This way the AI does indeed care about what happens in this universe. Assuming the AI wants to turn on the second AI, it could have sent another signal down the ON wire and ended up simulating failure due to any kind of thermodynamic miracle, or it could have sent the ON signal,... (read more)
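A toy numeric sketch of this loophole (my own construction, assuming only that a "miracle" means some bit on the wire gets overwritten, possibly with the value it already had):

```python
from itertools import product

# Toy sketch (an illustration, not from the original post): if "a thermodynamic
# miracle occurred on the wire" includes miracles that overwrite a bit with the
# value it already had, then conditioning on the miracle does not screen off
# the ordinary world where the ON signal gets through.

signal = (1, 0, 0)  # the ON signal, followed by trailing 0s

def after_miracle(sig, pos, val):
    """Wire contents after a miracle overwrites bit `pos` with `val`."""
    out = list(sig)
    out[pos] = val
    return tuple(out)

# Under this loose definition, every (position, value) pair counts as a miracle.
miracle_worlds = [after_miracle(signal, p, v)
                  for p, v in product(range(len(signal)), (0, 1))]

unchanged = [w for w in miracle_worlds if w == signal]
print(len(unchanged), "of", len(miracle_worlds),
      "miracle-worlds still carry the intact ON signal")
# 3 of 6 -- the AI's conditional expectation still puts weight on worlds that
# look exactly like the ordinary one, so it still cares what the signal does.
```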

1Stuart_Armstrong
Where it occurs, and other such circumstances and restrictions, need to be part of the definition for this setup.

Tried doing any of the above and failed

I used to sleep at 2200 punctually every day (a useful habit), but over the past 2 weeks my schedule has completely fallen apart again. I shall try to rebuild it, since it did work out for about 6 months before being interrupted by a vacation.

Fix: I hope that simply by posting this here I'll be aware of the problem enough for it to fix itself. Ironically, I'm posting this 15 minutes before midnight...

Tried doing any of the above and failed

I managed to make myself feel good when I worked hard in school and revised t... (read more)

Real world gatekeepers would have to contend with boredom, so they'd read their books, watch their anime, or whatever suits their fancy. In the experiment he exploited the format of the experiment and prevented me from doing those things. I would be completely safe from this attack in a real world scenario because I'd really just sit there reading a book, while in the experiment I was closer to giving up just because I had 1 math problem, not 2.

Ah, I see. English is wonderful.

In that case, I'll make it a rule in my games that the AI must also not say anything with real world repercussions.

Neither party may offer any real-world considerations to persuade the other within the experiment itself. For example, the AI party may not offer to pay the Gatekeeper party $100 after the test if the Gatekeeper frees the AI… nor get someone else to do it, et cetera. The AI may offer the Gatekeeper the moon and the stars on a diamond chain, but the human simulating the AI can’t offer anything to the human simulating the Gatekeeper. No real-world material stakes should be involved except for the handicap (the amount paid by the AI party to the Gatekeeper

... (read more)
0[anonymous]
I think the intended parsing of the second rule is "(The AI is understood to be permitted to say anything) with no real world repercussions", not "The AI is understood to be permitted to say (anything with no real world repercussions)" — i.e., any promises or threats the AI player makes during the game are not binding back in the real world.

Not really sure what you mean by "threatening information to the GK". The GK-player probably cares less about this information than the GK, right? In that case, the GK is given an advantage, not a disadvantage.

In this experiment, the GK is given lots of advantages, mainly that the scenario is fictional. Some on IRC argue that the AI is also given an advantage: being able to invent cures for cancer, which an oracle AI may be able to do but not necessarily a near-future AI, so the assumed ability of the AI in these experiments is incredibly high.

Another thing ... (read more)

0Sherincall
As a less extreme example, the AI starts spoiling all the books/tv shows/etc. While the GK would just shrug it off, it also has a negative effect on the GK-player, potentially one strong enough for them to just forfeit.

Update 0: Set up a password manager at last. Removed lots of newsletter subscriptions that were cluttering up my inbox, because I never read them. Finished reading How To Actually Change Your Mind, but have not started making notes on it, so the value I can get out of the sequence is not yet maximal. So far I think most of what happens in that sequence is "obvious" but doesn't actually come to mind, especially when I want to work on problems. For that I am eternally grateful. Possibly the best piece of advice I have gleaned from the sequence is t... (read more)

My call is that it is against the rules. This is certainly something an oracle AI would know, but this is something that the GK-player cares about more than the game itself (probably), and I'd put it in the same class as bribing the GK-player with lots of DOGEs.

2Sherincall
Would you consider it the same as threatening to share some information with the GK, and thus the GK-player as well, which would be damaging to both? While the GK would probably hold out against such torture, the GK-player doesn't care enough about the game to withstand it himself. I have some specific approaches in mind, but I'd rather not share them. I'm just trying to understand where the limits between the game and the real world are, and how dirty the AI can be. Also, slightly on topic - even if the AI persuades the simulated GK, can't the GK-player override that because losing the game has negative real world consequences, as opposed to the perceived positive in-game ones? This is the main reason why I can't comprehend how the AIs actually win in these experiments.

What do you mean by having quite an easy time? As in being the GK?

I think GKs have an obvious advantage, being able to use illogic to ignore the AI's arguments. But never mind that. I wonder if you'll consider being an AI?

0Punoxysm
I might consider it, or being a researcher who has to convince the AI to stop trying to escape. How did your experiment go?

Think harder. Start with why something is impossible and split it up.

1) I can't possibly be persuaded.

Why 1?

You do have hints from the previous experiments. They mostly involved breaking someone emotionally.

0Punoxysm
I meant "cannot comprehend" figuratively, but I certainly do think I'd have quite an easy time.

AI Box experiment over!

Just crossposting.

Khoth and I are playing the AI Box game. Khoth has played as AI once before and, as a result of that, has an Interesting Idea. Despite Khoth losing as AI the first time round, I'm assigning Khoth a higher chance of winning than a random AI willing to play: 1%!

http://www.reddit.com/r/LessWrong/comments/29gq90/ai_box_experiment_khoth_ai_vs_gracefu_gk/

Link contains more information.

EDIT

AI Box experiment is over. Logs: http://pastebin.com/Jee2P6BD

My takeaway: Update the rules. Read logs for more information.

On the other han... (read more)

0lmm
I think it's a legit tactic. Real-world gatekeepers would have to contend with boredom; long-term it might be the biggest threat to their efficacy. And, I mean, it didn't work.
4Sherincall
Tuxedage's (and EY's) ruleset has: Suppose EY is playing as the AI - would it be within the rules to offer to tell the GK the ending to HPMoR? That is something the AI would know, but Eliezer is the only player who could actually simulate that, and in a sense it does offer real world, out-of-character benefits to the GK-player. I used HPMoR as an example here, but the whole class of approaches is "I will give you some information only the AI and AI-player know, and this information will be correct in both the real world and this simulated one." If the information is beneficial to the GK-player, not just the GK, they may (unintentionally) break character.
1Punoxysm
I have wanted to be the Boxer; I too cannot comprehend what could convince someone to unbox (or rather, I can think of a few approaches, like just-plain-begging or channeling Philip K. Dick, but I don't take them too seriously).

Ah! I finally get it! Unfortunately I haven't gotten the math. Let me try to apply it, and you can tell me where (if?) I went wrong.

U = v + (Past Constants) →

U = w + E(v|v→v) - E(w|v→w) + (Past Constants).

Before, U = v + 0, setting (Past Constants) to 0 because we're in the initial state. v = 0.1*Oxfam + 1*AMF.

Therefore, U = 10 utilons.

After I met you, you want me to change to w, which weights Oxfam higher, but only if a compensating constant (the E terms) is given: U' = w + E(v|v→v) - E(w|v→w). w = 1*Oxfam + 0.1*AMF.

What we want is for U = U'.

E(v|v->... (read more)

3Stuart_Armstrong
Thanks for sticking it out to the end :-)

"Well," you say, "if you take over and donate £10 to AMF in my place, I'd be perfectly willing to send my donation to Oxfam instead."

"Hum," I say, because I'm a hummer. "A donation to Oxfam isn't completely worthless to you, is it? How would you value it, compared with AMF?"

"At about a tenth."

"So, if I instead donated £9 to AMF, you should be willing to switch your £10 donations to Oxfam (giving you the equivalent value of £1 to AMF), and that would be equally good as the status quo?"

Question: ... (read more)

1Stuart_Armstrong
In the first situation, you were donating £10 to AMF (10 utilons). Then I asked you to switch to Oxfam. You said yes, if I covered your donation to AMF. This would indeed give you £10 + 0.1*£10 = £11, as you said. I said "hang on." I pointed out that this was pure profit for you, and that if instead I gave £9 to AMF, then this would be equivalent to your first situation (£9 (from me to AMF) + 0.1*£10 (from you to Oxfam) = £10). This is the point at which you are indifferent to changing. We removed those potential issues to get a clearer example.
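To make the arithmetic concrete, here is a minimal sketch (mine, not part of the original exchange) that simply evaluates the original utility v at the three donation arrangements discussed above:

```python
# Minimal sketch (an illustration, not from the original post): evaluate the
# original utility v = 1*AMF + 0.1*Oxfam (in pounds donated) at the three
# arrangements discussed above.

def v(amf_pounds, oxfam_pounds):
    """GraceFu's original utility: AMF at full value, Oxfam at a tenth."""
    return 1.0 * amf_pounds + 0.1 * oxfam_pounds

status_quo  = v(amf_pounds=10, oxfam_pounds=0)   # I donate £10 to AMF
naive_swap  = v(amf_pounds=10, oxfam_pounds=10)  # you cover the full £10, I switch my £10 to Oxfam
indifferent = v(amf_pounds=9,  oxfam_pounds=10)  # you cover only £9 instead

print(status_quo)   # 10.0 -> the baseline
print(naive_swap)   # 11.0 -> £1 of pure profit, so £10 of compensation is too much
print(indifferent)  # 10.0 -> exactly the baseline: the indifference point
```

The £9 figure is exactly where switching leaves the original utility unchanged, which is what the E(v|v→v) - E(w|v→w) compensation term is meant to capture.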