All of accolade's Comments + Replies

You could use a tool like https://visualping.io to track changes on https://www.lesswrong.com/s/TF77XsD5PbucbJsG3 and notify you about them.
(To convert, e.g., from email notifications to RSS, you could surely google another tool; maybe https://zapier.com has something.)

I guesstimate the effect is not negligible.

Input to my intuition:

A study identified over a hundred different strains of bacteria on dollar bills.

Traces of cocaine can be found on almost 80 percent of dollar bills.

(Source: http://theconversation.com/atms-dispense-more-than-money-the-dirt-and-dope-thats-on-your-cash-79624)

A powder called Glo Germ, meant to visualize germ spread, was still visible to the naked eye after 8 handshakes (but not 9) in an informal experiment by YouTuber Mark Rober. ( https://youtu.be/I5-dI74zxPg?t=346 )

((
The link is pretty much deader than disco, but my inet-fu was able to dig up the following excerpts from the original article (from http://newsinfo.inquirer.net/25019/overcoming-procrastination):

“Too many people set goals that are simply unrealistic. Too big, they want it too soon, and they wonder why they don’t have any results in their life. What happens to a person who is consistently setting big goals that are outside of their scope, outside of their belief system, and they keep coming short of them? What kind of pattern does it set up in their mind? That sort o

... (read more)

But they could still use or sell your address for spam that works not through a mail response but through clicking a link. (E.g. shopping for C1/\L|S.)

• Everett branches where Eliezer Yudkowsky wasn’t born have been deprecated. (Counterfactually optimizing for them is discouraged.)

"That which can be destroyed by being a motherfucking sorceror should be"

Brilliant!! x'D x'D

(This might make a good slogan for pure NUs …)

“Effective Hedonism”
“Effective Personal Hedonism”
“Effective Egoistic Hedonism”
“Effective Egocentric Hedonism”
“Effective Ego-Centered Hedonism”
“Effective Self-Centric Hedonism”
“Effective Self-Centered Hedonism”

Why would anyone facing a Superhappy in negotiation not accept and then cheat?

The SH cannot lie. So they also cannot claim to follow through on a contract while plotting to cheat instead.

They may have developed their negotiation habits facing only honest, trustworthy members of their own kind. (For all we know, this was the first alien encounter the SH faced.)

Thank you so much for providing and super-powering this immensely helpful work environment for the community, Malcolm!

Let me chip in real quick... :-9

There - ✓ 1 year subscription GET. I can has a complice nao! \o/
"You're Malcolm" - and awesome! :)

[ TL;DR keywords in bold ]

Assuming freedom of will in the first place, why should you not be able to choose to try harder? Doesn't that just mean allocating more effort to the activity at hand?

Did you mean to ask "Can you choose to do better than your best?"? That would indeed seem similar to the dubious idea of selecting beliefs arbitrarily. By definition of "best", you cannot do better than it. But that can be 'circumvented' by introducing different points in time: Let's say at t=1 your muscle capacity enables you to lift up ... (read more)

And the mock ads at the bottom.

ETA: Explanation: Sometimes the banner at the bottom will contain an actual (randomized) ad, but many of the comics have their own funny mock ad associated with them. (When I noticed this, I went back through all the ones I had already read, so as not to miss out on that content.)

(I thought I'd clarify this, because this comment got downvoted - possibly because the downvoter misunderstood it as sarcasm?)

Never too late to upboat a good post! \o/ (…and dispense some bias on the occasion…)

1gwern
You're a bit late.

Thanks for the feedback on the bold formatting! It was supposed to highlight keywords, sort of a TL;DR. But as that is not clear, I shall state it explicitly.

Even if the author assumed that most people would put considerable (probabilistic) trust in his assertion of having won, he would not maximize his influence on general opinion by employing this bluff of stating he had almost won. This is amplified by the fact that the claim of an actual AI win is more viral.

Lying is further discouraged by the risk that the other party will sing.

0Qiaochu_Yuan
Agree that lying is discouraged by the risk that the other party will sing, but lying - especially in a way that isn't maximally beneficial - is encouraged by the prevalence of arguments that bad lies are unlikely. The game theory of bad lies seems like it could get pretty complicated.

[TL;DR keywords in bold]

I find your hypothesis implausible: The game was not about the ten dollars; it was about a question of high importance to AGI research, including to the Gatekeeper players. If that had not been enough reason for them to sit through 2 hours of playing, they would probably have anticipated that and not played, instead of publicly boasting that there was no way they would be convinced.

0A1987dM
Maybe they changed their mind about that halfway through (and they were particularly resistant to the sunk cost effect). I agree that's not very likely, though (probability < 10%). (BTW, the emphasis looks random to me. I'm not a native speaker, but if I was saying that sentence aloud in that context, the words I'd stress definitely mostly wouldn't be those ones.)
accolade-20

Jung vf guvf tvoorevfu lbh'er jevgvat V pna'g ernq nal bs vg‽

@downvoters: no funny? :) Should I delete this?

Ok, I take it that by "one-way-blind" you mean that each layer gets no new information that is not already in its database, except what is explicitly controlled by the humans. (E.g., I guess each layer should know the human query, in order to evaluate whether the AI's answer is manipulative.)

I also understand that we do look at complex information given by the AI, but only if the security bit signals "ok".

Ideally the AI […] knows as little as possible about humans and about our universe's physics.

That seems problematic, as these kinds of knowledge will be crucial for the optimization we want the AI to calculate.

Persuasion/hyperstimulation aren't the only way. Maybe these can be countered by narrowing the interface, e.g. to yes/no replies, using the AI as an oracle ("Should we do X?"). Of course we wouldn't follow its advice if we had the impression that doing so could enable it to escape. But its strategy might evade our 'radar'. E.g. she could make us empower a person who she knows will free her, while we don't know that.
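A minimal sketch of what such a narrowed interface could look like (the `boxed_ai` object and all names here are hypothetical, purely to make the one-bit idea concrete):

```python
# Sketch: collapse whatever the boxed AI outputs to a single yes/no verdict,
# so no free-form text ever leaves the box.
from enum import Enum

class Verdict(Enum):
    YES = "yes"
    NO = "no"

def ask_oracle(question: str, boxed_ai) -> Verdict:
    """Forward a 'Should we do X?' question; only one bit comes back out."""
    raw_reply = boxed_ai.answer(question)  # arbitrary, possibly manipulative output
    # Anything other than a literal "yes" is mapped to NO: no formatting,
    # no explanations, no side channel through message content or length.
    # (A real setup would also need to pad timing, resource use, etc.)
    return Verdict.YES if str(raw_reply).strip().lower() == "yes" else Verdict.NO
```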

I think you are right; I just shifted and convoluted the problem somewhat, but in principle it remains the same:

To utilize the AI, you need to get information from it. That information could in theory be infected with a persuasive hyperstimulus, effectively making the recipient an actuator of the AI.

Well, in practice the additional security layer might win us some time. More on this in the update to my original comment.


Update

Have the button turn off the AI immediately instead of doing nothing, so she doesn't have time to switch to a plan B of having the persuaded Gatekeeper find a way to actually free her.

Of course she might anticipate that strategy and thoroughly indoctrinate and train him before he pushes the button. Thus, persuaded test Gatekeepers must be considered infected and put into cryogenic sleep* until the FAI problem is solved.

Furthermore, great caution would have to be taken in analyzing the discourse between Boxy and Guinea, because her arguments must be... (read more)

I don't get the hint. Would you care to give another hint, or disclose your hypothesis?

0A1987dM
Gur erny-jbeyq fgnxrf jrera'g gung uvtu (gra qbyynef), naq gur fpurqhyrq qhengvba bs gur rkcrevzrag jnf dhvgr ybat (gjb ubhef), fb V jnf jbaqrevat vs znlor gur tngrxrrcre cynlre ng fbzr cbvag qrpvqrq gung gurl unq n orggre jnl gb fcraq gurve gvzr va erny yvsr naq pbaprqrq qrsrng.

Please elaborate: What are the benefits of a Gatekeeper? How could you get them without one?

0Desrtopa
If you would want to have a gatekeeper at all, but definitely don't want to let the AI out, I would think that the benefits of having one would be to permit communication with the AI to draw upon its superhuman intelligence. If you can use the setup you just described, you could skip the step of ever using gatekeepers who actually have the power to let the AI out.

Cool, n=65 already. :) When interpreting the results, mind the bias created by my answer preceding the poll question.

How would humanity harness the AI's potential when the only information that escapes the system is a status bit? (Maybe I misunderstood your model.)

1moridinamael
Bearing in mind that most of my thinking on AI is more in the vein of cooking up science fiction plots ... The idea I was trying to evoke is that if you get a "fail" response from the system, you completely wipe the memory and never look at what happened, and if you get a "pass," you stop everything and go in layer by layer checking that none of the gatekeepers were tricked or blackmailed in any obvious ways, and only then do you tentatively look at what the AI was doing. Ideally the AI has no idea how many layers of containment it's in, no idea that the top level gatekeepers are humans, and knows as little as possible about humans and about our universe's physics. This should restrict its ability to plant tailor-made basilisks in its code or activities. I actually don't think these steps would be sufficient to constrain a superhuman AI, I see them more as the types of things you would do when you were already pretty sure the AI was Friendly.

It is, after all, much harder to convince a group of mutually-suspicious humans than to convince one lone person.

That sounds right. Would you have evidence to back up the intuition? (This knowledge would also be useful for marketing and other real-life persuasion purposes.)

[ TL;DR: Mo' people - mo' problems? ]

I can think of effects that could theoretically make it easier to convince a group:

  • For some reason, Boxy might be better skilled at manipulating social/group dynamics than at influencing a lone wolf.
  • More people make the system more complex.
... (read more)
4JoachimSchipper
My comment was mostly inspired by (known effective) real-world examples. Note that relieving anyone who shows signs of being persuaded is a de-emphasized but vital part of this policy, as is carefully vetting people before trusting them. Actually implementing a "N people at a time" rule can be done using locks, guards and/or cryptography (note that many such algorithms are provably secure against an adversary with unlimited computing power, "information theoretic security").
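As a toy illustration of the kind of information-theoretically secure primitive mentioned here (not the concrete policy described above; the n-of-n XOR split and all names are just one simple, assumed example): an unlock key can be split so that all N shares together recover it, while any N-1 of them are indistinguishable from random noise.

```python
# Toy n-of-n secret split of an "unlock key" via XOR: all n shares are needed;
# any subset of n-1 shares carries no information about the key at all.
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key: bytes, n: int) -> list:
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    last = key
    for s in shares:           # last = key XOR share_1 XOR ... XOR share_(n-1)
        last = xor_bytes(last, s)
    return shares + [last]

def combine(shares: list) -> bytes:
    out = bytes(len(shares[0]))
    for s in shares:
        out = xor_bytes(out, s)
    return out

key = secrets.token_bytes(16)          # e.g. the key that unlocks the AI's I/O channel
shares = split_key(key, 5)             # a "5 people at a time" rule
assert combine(shares) == key          # all five present: key recovered
print(combine(shares[:4]) == key)      # four shares alone: False (except by a 2**-128 fluke)
```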
accolade100

How about fooling both the AI and guinea Gatekeepers into thinking that they can let it out, say, at the push of a big red button which in actual fact does nothing? (Make the AI forget about the button by reverting it to an older backup.)

0accolade
Update

Have the button turn off the AI immediately instead of doing nothing, so she doesn't have time to switch to a plan B of having the persuaded Gatekeeper find a way to actually free her.

Of course she might anticipate that strategy and thoroughly indoctrinate and train him before he pushes the button. Thus, persuaded test Gatekeepers must be considered infected and put into cryogenic sleep* until the FAI problem is solved.

Furthermore, great caution would have to be taken in analyzing the discourse between Boxy and Guinea, because her arguments must be considered highly memetically infectious. Staff would have to stay under thorough psychological supervision and maybe be replaced after some time.

No matter how much caution, we would be doomed if we consider Boxy's IQ to be ∞. But if we continuously keep on Gatekeeper-testing every so often, starting at a low AI IQ, we might find persuasion patterns that are dangerous, but against which training/immunization is possible.

Of course, there will be a limit to this process. At some point, Boxy will have become smart enough to produce 'mind HIV' - a thought virus we have no cure for yet.

A humorous example of an extremely effective mind virus: The Funniest Joke In The World by Monty Python

----------------------------------------

* ETA: They would have declared consent to the cryogenic sleep before their unwitting 'AI-Box Experiment'.
0Desrtopa
If you could deceive the AI that easily, I think it would probably be simpler to get all the benefits of having a gatekeeper without actually using one.
-2MugaSofer
It'd be a pretty bad sign if you gave p=1 for the AI escaping.