All of Hgbanana123's Comments + Replies

Scott Alexander has an interesting little short on human manipulation: https://slatestarcodex.com/2018/10/30/sort-by-controversial/ 
So far, everything I'm seeing, both fiction and anecdotes, is consistent with the notion that humans are relatively easy to model and emotionally exploit. I also agree with CBiddulph's analysis: while the paperclip/stamp failure mode requires the AI to be capable of planning, generating manipulative text doesn't require a goal at all. If a system generates text that is maximally controversial (or maximises some related metric) and that text is disseminated, that by itself may already do damage.
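To make the goal-free failure mode concrete, here is a minimal, purely illustrative Python sketch: a selector that scores candidate texts by a made-up controversiality metric (disagreement among simulated reader ratings) and emits the highest scorer, with no planning or persuasion objective anywhere in the loop. The metric, function names, and candidate texts are my own assumptions for illustration, not anything from the story.

```python
# Hypothetical sketch: pick the most "controversial" text by a simple metric,
# with no goal, world model, or planning involved.
import statistics

def controversy_score(ratings: list[float]) -> float:
    # High when readers split sharply (half love it, half hate it),
    # low when they broadly agree.
    return statistics.pvariance(ratings)

def most_controversial(candidates: dict[str, list[float]]) -> str:
    # Return the candidate text whose simulated ratings are most polarised.
    return max(candidates, key=lambda text: controversy_score(candidates[text]))

if __name__ == "__main__":
    candidates = {
        "bland statement": [3.0, 3.1, 2.9, 3.0],
        "divisive statement": [5.0, 1.0, 5.0, 1.0],
    }
    print(most_controversial(candidates))  # -> "divisive statement"
```

The point of the sketch is only that "optimise a metric over text and publish the winner" is already enough for harm; nothing in it resembles an agent pursuing a goal.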

tamgent
I like it - interesting how much is to do with the specific vulnerabilities of humans, and how humans exploiting other humans' vulnerabilities was what enabled and exacerbated the situation.
tamgent
Whilst we're sharing stories...I'll shamelessly promote one of my (very) short stories on human manipulation by AI. In this case the AI is being deliberative at least in achieving its instrumental goals. https://docs.google.com/document/d/1Z1laGUEci9rf_aaDjQKS_IIOAn6D0VtAOZMSqZQlqVM/edit

I think that 

"Don't say things that you believe to be literally false in a context where people will (with reasonably high probability) persistently believe that you believe them to be true"

is actually in line with the "Bayesian honesty" component/formulation of the proposal. If one is known to lie, one's words carry little or no evidential weight, and therefore don't raise other people's Bayesian credence in false statements. However, it seems this is not a behaviour that Eliezer finds morally satisfactory. (I agree with Rob Bensinger that this formulation is more practical in daily life.)
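As a rough illustration of the Bayesian point, here is a toy Bayes-rule update, under the simplifying assumption that a "known liar" is someone whose assertions the listener treats as uncorrelated with the truth; all numbers are made up.

```python
# Toy Bayes-rule sketch: how much a listener updates toward a (false) claim
# depends on how informative they believe the speaker's assertions are.

def posterior(prior: float, p_assert_if_true: float, p_assert_if_false: float) -> float:
    """P(claim is true | speaker asserted it), by Bayes' rule."""
    evidence = p_assert_if_true * prior + p_assert_if_false * (1 - prior)
    return p_assert_if_true * prior / evidence

prior = 0.1  # listener's prior that the (actually false) claim is true

# Speaker believed honest: the assertion is strong evidence, listener is misled.
print(posterior(prior, p_assert_if_true=0.9, p_assert_if_false=0.05))  # ~0.67

# Speaker whose assertions are believed uncorrelated with truth: no update at all.
print(posterior(prior, p_assert_if_true=0.5, p_assert_if_false=0.5))   # 0.1
```

In the second case the posterior equals the prior, which is the sense in which expected lies can't push anyone's credence toward a falsehood, even though they may still be morally objectionable.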