This is a 4-minute read, inspired by the optimized writing in Eukaryote's Spaghetti Towers post.

My thinking is that generative AI has potential for severe manipulation, but that the 2010s AI used in social media news feeds and other automated systems is a bigger threat: it tells us much more about the future of international affairs and about governments' incentives to race to accelerate AI, fewer people are aware of it, it carries a substantial risk of being used to attack the AI safety community, and the defenses are easy and mandatory to deploy. This post explains why this tech is powerful enough to be a centerpiece of people's world models.

The people interviewed in The Social Dilemma (2020), most notably Tristan Harris, did a fantastic job describing the automated optimization mechanism in a quick and fun way (transcript).

[Tristan] A [stage] magician understands something, some part of your mind that we’re not aware of. That’s what makes the [magic trick] illusion work. Doctors, lawyers, people who know how to build 747s or nuclear missiles, they don’t know more about how their own mind is vulnerable. That’s a separate discipline. And it’s a discipline that applies to all human beings...

[Shoshana] How do we use subliminal cues on the Facebook pages to get more people to go vote in the midterm elections? And they discovered that they were able to do that. 

One thing they concluded is that we now know we can affect real-world behavior and emotions without ever triggering the user’s awareness. They are completely clueless.

Important note: all optimization here is highly dependent on measurability, and triggering the user's awareness is a highly measurable thing.

[embedded video clip]

If anything creeps someone out, they use the platform less; that is incredibly easy to measure, and the causation is easy to isolate. To get enough data, these platforms must automatically reshape themselves to feel safe to use.

This obviously includes ads that make you feel manipulated; it is not surprising that we ended up in a system where >95% of ad encounters are not well-matched.

This gives researchers plenty of degrees of freedom to try different kinds of angles and see what works, and even automate that process.
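
To make that automation concrete, here is a minimal sketch of the try-angles-and-see-what-works loop as an epsilon-greedy bandit; the variant names and the reward signal are hypothetical stand-ins, not anything from a real platform.

    import random
    from collections import defaultdict

    # Hypothetical "angles" to test; a real system would generate these too.
    VARIANTS = ["guilt_angle", "outrage_angle", "curiosity_angle", "status_angle"]
    EPSILON = 0.1  # fraction of impressions spent exploring new angles

    totals = defaultdict(float)  # summed measured response per variant
    counts = defaultdict(int)    # impressions per variant

    def choose_variant() -> str:
        # Explore occasionally, otherwise exploit the best-performing angle so far.
        if random.random() < EPSILON or not counts:
            return random.choice(VARIANTS)
        return max(counts, key=lambda v: totals[v] / counts[v])

    def record_outcome(variant: str, reward: float) -> None:
        # reward is any measurable response: a click, dwell time, a scroll hesitation.
        totals[variant] += reward
        counts[variant] += 1

The point is that nothing in this loop requires a human to understand why an angle works; the measurement does all the work.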

[Tristan] We’re pointing these engines of AI back at ourselves to reverse-engineer what elicits responses from us. Almost like you’re stimulating nerve cells on a spider to see what causes its legs to respond. 

Important note: my model says that a large share of data comes from scrolling. The movement of scrolling past something on a news feed, with either a touch screen/pad or a mouse wheel (NOT arrow keys), literally generates a curve. 

It is the perfect biodata to plug into ML; the 2010s social media news feed paradigm generated trillions of these curves for the linear algebra to chew on, based on different people's reactions to the different concepts and ideas they scrolled past.
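
As an illustration of why these curves are such convenient inputs, here is a minimal sketch that turns raw (timestamp, scroll offset) events into a couple of numeric features; the field names and the two features chosen are hypothetical.

    import numpy as np

    def scroll_features(events, post_top, post_bottom):
        """events: list of (t_seconds, y_offset_px) pairs from one scroll session."""
        t = np.array([e[0] for e in events], dtype=float)
        y = np.array([e[1] for e in events], dtype=float)
        velocity = np.gradient(y, t)                        # the scroll curve, in px/sec
        near_post = (y >= post_top) & (y <= post_bottom)    # samples while the post is in view
        dwell = float(np.sum(np.diff(t)[near_post[:-1]]))   # seconds spent near the post
        mean_speed = float(np.abs(velocity[near_post]).mean()) if near_post.any() else 0.0
        return {"dwell_seconds": dwell, "mean_speed_near_post": mean_speed}

Every user scrolling past every post yields one small feature vector like this, which is what makes the dataset so large.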

So, it really is this kind of prison experiment where we’re just, you know, roping people into the matrix, and we’re just harvesting all this money and… and data from all their activity to profit from. And we’re not even aware that it’s happening.

[Chamath] So, we want to psychologically figure out how to manipulate you as fast as possible and then give you back that dopamine hit...

[Sean Parker] I mean, it’s exactly the kind of thing that a… that a hacker like myself would come up with because you’re exploiting a vulnerability in… in human psychology...

[Tristan] No one got upset when bicycles showed up. Right? Like, if everyone’s starting to go around on bicycles, no one said, “Oh, my God, we’ve just ruined society. Like, bicycles are affecting people. They’re pulling people away from their kids. They’re ruining the fabric of democracy. People can’t tell what’s true.” 

Like, we never said any of that stuff about a bicycle. If something is a tool, it genuinely is just sitting there, waiting patiently. If something is not a tool, it’s demanding things from you... It wants things from you. And we’ve moved away from having a tools-based technology environment to an addiction- and manipulation-based technology environment. 

That’s what’s changed. Social media isn’t a tool that’s just waiting to be used. It has its own goals, and it has its own means of pursuing them by using your psychology against you.

In the documentary, Harris describes a balance between three separate automated optimization directions from his time at tech companies: the Engagement goal, to drive up usage and keep people scrolling; the Growth goal, to keep people coming back and inviting friends; and the Advertising goal, which pays for server usage.

In fact, there is a fourth slot: optimizing people's thinking in any measurable direction, such as making people feverishly in favor of the American side and opposed to the Russian side in proxy wars like Ukraine. The tradeoffs between these four optimization directions are substantial, and the balance/priority distribution is determined by the preferences of the company and the extent of the company's ties to its government and military (afaik mainly the US and China are relevant here).
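
Here is a minimal sketch of how those four slots could be traded off against each other as a single weighted ranking objective; the weights and score names are hypothetical illustrations of the priority distribution described above, not any company's actual code.

    from dataclasses import dataclass

    @dataclass
    class Scores:
        engagement: float   # predicted watch/scroll time
        growth: float       # predicted invites and return visits
        advertising: float  # predicted ad revenue
        steering: float     # predicted shift toward some measurable opinion target

    # The priority distribution; who sets it is the political question.
    WEIGHTS = {"engagement": 0.4, "growth": 0.2, "advertising": 0.3, "steering": 0.1}

    def rank_score(s: Scores) -> float:
        return (WEIGHTS["engagement"] * s.engagement
                + WEIGHTS["growth"] * s.growth
                + WEIGHTS["advertising"] * s.advertising
                + WEIGHTS["steering"] * s.steering)

Any weight moved into the steering slot comes out of the revenue-linked slots, which is why the balance depends on the company's preferences and its government ties.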

Clown Attacks are a great example of something to fit into the fourth slot: engineer someone into thinking a topic (e.g. the lab leak hypothesis) is low-status by showing them low-status clowns talking about it and high-status people ignoring or criticizing it.

It's important to note that this problem is nowhere near as important as Superintelligence, the finish line for humanity. But it's critical for world modelling; the last 10,000 years of human civilization only happened the way it did because manipulation capabilities at this level did not yet exist.

Information warfare was less important in the 20th century, relative to military force, because it was weaker.  As information warfare becomes more powerful and returns on investment grow, more governments and militaries invest in information warfare than historical precedent would imply, and we end up in the information warfare timeline.

In the 2020s, computer vision makes eyetracking and large-scale facial microexpression recognition/research possibly the biggest threat. Unlike the hopelessness of securing your operating system or having important conversations near smartphones, the solution here is easy and worthwhile for people in the AI safety community (which is why my posting this is net positive). It only takes a little chunk of tape and aluminum foil over the webcam (easy enough for me to routinely peel most of it off and put it back afterwards).

I've written about other fixes and various examples of ways for attackers to hack the AI safety community and turn it against itself. Please don't put yourself and others at risk.

Comments (26)
[gull]

such as making people feverishly in favor of the American side and opposed to the Russian side in proxy wars like Ukraine.

Woah wait a second, what was that about Ukraine?

[Viliam]

A page from the Russian propaganda textbook: there is an American side and a Russian side to each conflict, but there is no such thing as a Ukrainian (or any other) side. The rest of the world is not real.

This allows you to ignore everything that happens, and focus on the important question: are you a brainwashed sheep that uncritically believes the evil American propaganda, or are you an independently thinking contrarian? Obviously, the former is low-status and the latter is high-status. But first you have to agree with all the other independently thinking contrarians in order to be recognized as one of them.

We can talk about the sophisticated propaganda methods of the future, but in the meanwhile, it is the simple ones that work best, when applied persistently.

Thanks for pointing this out. I don't know much about US-Russia affairs outside of the US-China context, including Russian propaganda, so I wouldn't have known about that dynamic even if I had more space to write about it. 

It's always helpful to see various state ideologies and propaganda tactics summarized.

We can talk about the sophisticated propaganda methods of the future, but in the meanwhile, it is the simple ones that work best, when applied persistently.

Also, just to clarify, this post is about the propaganda methods of the present (specifically the late 2010s), not the future. And this post thoroughly disproves the idea that persistence is the "best"; empiricism and optimization power are the best, and modern automated systems are overwhelmingly capable of optimizing for measurable results.

Yes, if these capabilities weren't deployed in the US during the Ukraine war, that falsifies a rather large chunk of my model (most of the stuff about government and military involvement). It wouldn't falsify everything (e.g. maybe the military cares way more about using these capabilities for macroeconomic stabilization to prevent economic collapses larger than 2008, maybe they consider that a lose condition for the US and public opinion is just an afterthought).

We'll have to wait years for leaks though, and if it didn't happen then we'll be waiting for those leaks for an awful long time, so it might be easier to falsify my model from the engineering angle e.g. spaghetti towers or the tech company/intelligence agency competence angle.

I'd caution against thinking that's easy though, I predict that >66% of tech company employees are clueless about the true business model of their company (it's better to have smaller teams of smarter, well-paid, conformist/nihilistic engineers due to Snowden risk, even if larger numbers of psychologists are best for correlation labelling). Most employees work on uncontroversial parts like AI capabilities or the pipeline of encrypted data. 

I've also encountered political consultants who basically started out assuming it's not possible because they themselves don't have access to the kind of data I'm talking about here, but that's an easy problem to fix with just a conversation or two.

We need to grind hard to produce the first wave of AI propaganda delivering the targeted payload that AI propaganda must be addressed.

This will likely bring it to the attention of adversarial actors who were not fully grasping its capabilities. Plenty of adversarial actors already grasp it, of course, so this may be an acceptable tradeoff, but it might change the dynamics for weaker adversarial actors who would otherwise take a while to realize just how far they could go, once they see advertising specifically focused on informing people about new AI manipulation capabilities. The reputation of whoever ran this advertising campaign would likely be tainted, imo correctly.

The project that does this would presumably be defunded on succeeding, because none of the techniques it developed would work any more.

But wouldn't you forgive its funders? Some people construct pressures for it to be self-funding, by tarring its funders by association, but that is the most dangerous possible model, because it creates a situation where they have the ability and incentive to perpetuate themselves beyond the completion of their mission.

This will likely bring it to the attention of adversarial actors who were not fully grasping its capabilities. Plenty of adversarial actors already grasp it, of course, so this may be an acceptable tradeoff, but it might change the dynamics for weaker adversarial actors who would otherwise take a while to realize just how far they could go

I actually wrote a little about this last November, but before October 2023, I was dangerously clueless about this very important dynamic, and I still haven't properly grappled with it.

Like, what if only 5% of the NSA knows about AI-optimized influence, and they're trying to prevent 25% of the NSA from finding out because that could be a critical mass and they don't know what it would do to the country and the world if awareness spread that far? 

What if in 2019 the US, Russian, and Chinese intelligence agencies knew that the tech was way too powerful, but the Iranian, South African, and North Korean ones didn't, and the world becomes worse if they find out?

What if IBM or JP Morgan found out in 2022 and started trying to join in the party? What if Disney predicted that a storm was coming and took a high-risk strategy to join the big fish before it was too late (@Valentine)?

If we're in an era of tightening nooses, world modelling becomes so much more complicated when you expand the circle beyond the big American and Chinese tech companies and intelligence agencies.

If Elon did this, everyone would leave Twitter. Twitter lists are the next best thing, because they delegitimize other social media platforms (which can't do the same because they would lose all their money), in a way that benefits Twitter.

Oh, you assume X is the only app that would be open to doing this? Hm

  • If it actually requires detailed user monitoring, I wonder if there are any popular platforms that let some advertisers have access to that.
  • I think it might not require that, in which case support from the app itself isn't needed.

Oh, you assume X is the only app that would be open to doing this? Hm

That's a good assumption to call out; I've been thinking about X and lists a lot recently, but I haven't properly considered who would be working on improving civilization's epistemics.

Elon Musk and his people seem to be in favor of promoting prediction-market-style improvements, which I think would probably more than compensate for influence tech, because widespread prediction market adoption facilitates sanity and Moloch-elimination, and the current state of influence tech is primarily caused by civilizational derangement and Moloch.

However, Twitter isn't secure or sovereign enough to properly do optimized influence, because they would get detected and counteracted by the sovereign big fish like Facebook or the NSA. They can, on the other hand, pump out algorithms that reliably predict truth, and deploy them at scale, even if this threatens disinformation campaigns coming out of US state-adjacent agencies.

I'm not sure whether to trust Yudkowsky, who says community notes seem to work well, or Wikipedia, which claims to list tons of cases where community notes are routinely wrong or exploited by randos (rather than just state-level actors who can exploit anything); if Wikipedia is right and the tech isn't finished, then it's all aspirational and we can't yet evaluate whether Elon and Twitter are committed to civilization.

My understanding of advertisers is that they're always trying to get a larger share of the gains from trade. I don't know how successful advertisers tend to be, due to information asymmetry and problems with sourcing talent; this gets interesting with powerful orgs like Wall Street banks such as JP Morgan, which seems to be trying out its own business model for sensor-related tech.

I don't think Wikipedians are generally statistically minded, so when they say "routinely" they could mean it happens like 20 times a year. They'd probably notice most of them.

Hm, my intuition would be that platforms aren't totally on point at processing their data and would want to offload that work to advertisers, and there are enough competing big platforms now (YouTube, Instagram, X, TikTok) that they might not have enough bargaining power to defend their integrity.

my intuition would be that platforms aren't totally on point at processing their data

That's interesting, we have almost opposite stances on this. 

My intuition is that Instagram, YouTube, and possibly TikTok are very on-point with processing their data, but with occasional catastrophic failures due to hacks by intelligence agencies which steal data and poison the original copy, and also due to incompetence/bloat (like spaghetti towers) or unexpected consequences of expanding into uncharted territory. Whereas Twitter/X lacks the security required to do much more than just show people ads based on easy-to-identify topics of interest.

Uh, I guess I meant that there's no way they can do enough to give advertisers 30% of the value their data has without giving many data brokers (whom advertisers contract) access to the data, because advertisers' needs are too diverse and the skill ceiling is very high. This equilibrium might not have been realized yet, but I'd guess it eventually will be.

Can you explain the camera tape thing? I clicked the link and it didn’t really explain anything. What’s the idea here, that someone has planted malware on my computer and they are using it to watch me through my camera without my knowing about it? And… this is the biggest threat (in… some larger category of threats? I don’t entirely grasp what this category is supposed to be)?

Yes, the NSA stockpiles backdoors in Windows and likely every other major operating system as well (I don't think that buying one of the security-oriented operating systems is a solution; that is what the people they want to spy on would try, and you still have to worry about backdoors in the chip firmware anyway). I think intelligence agencies and the big 5 tech companies are likely to use those video files for automated microexpression recognition and eyetracking, to find webs of correlations and research ways to automate optimized persuasion. For example, finding information that makes you uncomfortable because it is supposed to be secret, by comparing it to labelled past instances of people's facial microreactions to reading information that was established to be secret. This becomes much easier when you have millions of hours of facial microexpression data in response to various pieces of content.

This is the biggest category of threat from anyone who tries to use modern systems to manipulate you, for any reason, at some point in the 2020s or beyond. I think that people in the AI safety community are disproportionately likely to be targeted due to proximity to AI, which is a geopolitically significant technology (for these reasons, and also for military hardware like cruise missiles, and for economic growth/plausibly being the next internet-sized economic paradigm). This is a greater risk than in the 2010s because the tech is more powerful now and the ROI is higher, resulting in stronger incentives for use.

Do you have any evidence at all that this sort of use of webcams is happening or has happened…?

Also, can you say more about what you mean by “finding information that makes you uncomfortable because it is supposed to be secret, by comparing it to labelled past instances of people’s facial microreactions to reading information that was established to be secret” and “millions of hours of facial microexpression data in response to various pieces of content”? You are suggesting that photos are being taken constantly, or… video is being recorded, and also activity data is being recorded about, like… what webpages are being browsed? Is this being uploaded continuously, or…? Like, in a technical sense, what does this look like?

Do you have any evidence at all that this sort of use of webcams is happening or has happened…?

Currently no. My argument focuses on the incentives for tech companies or intelligence agencies to acquire this data illicitly, in addition to the existing legal app permissions that people opt into. It makes a solid case that these incentives are very strong; however, hacking people's webcams at large scale is risky, even if you get large amounts of data from smarter elites and better targets that way, and select targets based on low risk of the traffic or the spyware being detected. My argument is that the risk is more than sufficient to justify covering up webcams; I demonstrate that leaving webcams uncovered is actually the extreme action.

Also, can you say more about what you mean by “finding information that makes you uncomfortable because it is supposed to be secret, by comparing it to labelled past instances of people’s facial microreactions to reading information that was established to be secret” and “millions of hours of facial microexpression data in response to various pieces of content”? You are suggesting that photos are being taken constantly, or… video is being recorded, and also activity data is being recorded about, like… what webpages are being browsed? Is this being uploaded continuously, or…? Like, in a technical sense, what does this look like?

Yes. A hypothetical example is the NSA trying to identify FSB employees who are secretly cheating on their spouses. The NSA steals face and eyetracking data on 1 million Russians while they are scrolling through Twitter on their phones, and manages to use other sources to confirm 50 men who are cheating on their spouses and trying very hard to hide it. The phones record video files, and spyware on the system simplifies them to facial models before encrypting and sending the data to the NSA. The NSA has some computer vision people identify trends that distinguish all 50 of the cheating men but are otherwise rare; as it turns out, each of them exhibits a unique facial tic when exposed to the concept of poking holes in condoms. They test it on men from the million, and it turns out that trend wasn't sufficiently helpful at identifying cheaters. They find another trend: the religious men among the 50 scroll slightly faster when a religion-focused influencer talks specifically about the difference between heaven and hell. When the influencer talks about the difference, rather than just heaven or hell, they exhibit a facial tic that turns out to strongly distinguish cheaters from non-cheaters among the million men the NSA stole data from. While it is disappointing that they will only be able to use this technique to identify cheating FSB employees who are religious and use social media platforms that the NSA can place that specific concept into, it's actually a pretty big win compared to the 500 other things their systems discovered that year. And possibly I'm describing this process as much less automated and streamlined than it would be in reality.

For steering people's thinking in measurable directions, the only non-automated process is figuring out how to measure/label successes and failures.
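
For concreteness, here is a minimal sketch of the labelled-correlation search that hypothetical describes: synthetic per-user features, a handful of confirmed labels, a simple model, and an inspection of which features distinguish the labelled group. Everything here (the data, the feature count, the model choice) is an assumption for illustration.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n_users, n_features = 10_000, 50            # e.g. per-concept microexpression/scroll statistics
    X = rng.normal(size=(n_users, n_features))  # synthetic stand-in for collected feature data
    y = np.zeros(n_users, dtype=int)
    y[:50] = 1                                  # the 50 confirmed positives

    model = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X, y)

    # Features with the largest coefficients are candidate "tells"; each candidate
    # would then have to be validated on held-out users, as in the story above.
    top = np.argsort(np.abs(model.coef_[0]))[::-1][:5]
    print("candidate distinguishing features:", top)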

[anonymous]

Trevor, when I see this problem, which expands to a general problem that any image, audio, or video recording can in the near future be faked - I don't see any obvious solutions except for 'white channel' protection.  You are talking about a subset of the problem - content that has been manipulated to cause a particular response from the victim.

Meaning that since in the near future any website on the internet can be totally falsified, any message or email you get from a 'real human' may be from an AI - the only way to know if anything is real is to have some kind of list of trusted agents.  

Real human identities for emails, chats etc, some kind of license for legitimate news agencies, digital signatures and security mechanisms to verify senders and websites as belonging to actually legitimate owners.

Cameras almost need to be constantly streaming to a trusted third-party server; kind of like how, if your phone uploads to Google Photos shortly after an event happens, or the police upload body camera footage to evidence.com, it is less likely with present technology that the records are completely fake.

Have you thought more on this?  Is there any legislation proposed or discussion of how to deal with this problem?

Yes, my thinking about deepfakes is that they haven't been succeeding at influence, making no money, and everyone worries about them; whereas automated targeting and response prediction have been succeeding at influence with flying colors, making trillions of dollars and dominating the economy, and nobody worries about them.

I'm sympathetic to the risk that generative AI will be able to output combinations of words that are well-suited to the human mind, and also generated or AI-altered images with totally invisible features that affect the deep structures of the brain to cause people to measurably think about a targeted concept more frequently. But I see lots of people working on preventing that even though it hasn't been invented yet, and I don't currently see that tech succeeding at all without measurability and user data/continuous feedback. 

I don't see this tech as a global catastrophic risk to steal priority from AI or Biorisk, but as an opportunity for world modelling (e.g. understanding how AI plays into US-China affairs) and for reducing the AI safety community's vulnerability to attacks as AI safety and orgs like OpenAI become more important on the global stage.

Yes, my thinking about deepfakes is that they haven't been succeeding at influence, making no money, and everyone worries about them; whereas automated targeting and response prediction have been succeeding at influence with flying colors, making trillions of dollars and dominating the economy, and nobody worries about them.

Anecdotally, the ads I get on mobile YouTube have had an increasingly high ratio of AI-aided image generation. I recognize it because I like to mess with the tech and know intuitively what it can't currently do well, but I expect most people wouldn't realize how the pictures have been edited, and the advertiser is one of those very big companies that you hadn't heard of a year ago.

[anonymous]

whereas automated targeting and response prediction have been succeeding at influence with flying colors, making trillions of dollars and dominating the economy, and nobody worries about them.

Do you have direct data on that?  Consumer preferences are affected by advertising, although they are also affected by cost and how mainstream products tend to be pretty good simply from generations of consumer preference.  

For example on the margin, does recsys "make" trillions?  

Sorry, I was referring to the big 5 tech companies, which made trillions of dollars for investors in aggregate. I was more thinking about how prominent the tech is compared to deepfakes, not the annual revenue gained from that tech alone (although that figure might also be around 1 trillion per year depending on how much of their business model is totally dependent on predictive analytics).

[anonymous]

If the big 5 earned a trillion in annual revenue from ads (they don't; annual global ad spending is about 1T), you would have to model it as utility gain. In a world where the analytics were cruder - using mere statistics similar to FICO and not DL - how much less value would they create / would advertisers pay for? A big part of the revenue for big tech is just monopoly rents.

I would guess it's well under 100B, possibly under 10B.

If all the ai models in the world bring in under 10B combined this would explain the slow progress in the field.

Apple makes ~400B in revenue per year and Microsoft makes around ~200B/y, and my argument in other posts is that human manipulation is a big part of their business model/moat, including government ties which in turn guarantee favorable policies. I remain uncertain (this is clearly an update against), but I still think it's possible that the value generated is around 1 trillion per year; it's hard to know what these companies would look like if they lost the moat they gained from influence capabilities, only that companies like Netflix and Twitter don't have the security or data required and are much less valuable.

Although in the original context, I actually was basically saying "deepfakes make like no money whereas targeted influence makes like, trillions of dollars".

[anonymous]

I had a little bit of a thought on this, and I think the argument extends to other domains.

Major tech companies, and all the advertising-dependent businesses, are choosing what content to display mainly to optimize revenue. Or,

content_shown = argmax(estimated_engagement( filtered(content[]) ) )

Where content[] is all the possible videos YouTube has the legal ability to show, a newspaper has the legal ability to publish, Reddit has the legal ability to allow on their site, etc.
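
Here is a minimal runnable version of that pseudocode, with hypothetical stand-ins for the engagement model and the advertiser-safety filter:

    def estimated_engagement(item: dict) -> float:
        return item["predicted_watch_minutes"]      # stand-in for a learned engagement model

    def advertiser_safe(item: dict) -> bool:
        return not item.get("brand_unsafe", False)  # the main filter used in practice

    def content_shown(content: list[dict]) -> dict:
        candidates = [c for c in content if advertiser_safe(c)]
        return max(candidates, key=estimated_engagement)

Every item the filter removes is an item that might have out-scored the winner, which is the revenue cost of filtering in point 1 below.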

Several things immediately jump out at me.  First:

    1. Filtering anything costs revenue. That is to say, in the aggregate, any kind of filter at all makes it more likely that the most engaging content won't be shown. So the main "filter" used is for content that advertisers find unacceptable, not content that the tech company disagrees with. This means that YouTube videos disparaging YouTube are just fine.

    2. Deep political manipulation mostly needs a lot of filtering, and this lowers revenue. Choosing what news to show is the same idea.

    3. Really destructive things, like how politics are severely polarized in the USA, and mass shootings, are likely an unintended consequence of picking the most engaging content.

    4. Consider what happens if a breaking news event happens, or a major meme wave happens, and you filter it out because your platform finds the material detrimental to some long-term goal. Well, that means that the particular content is missing from your platform, and this sends engagement and revenue to competitors.

General idea: any kind of ulterior motive other than argmaxing for right now - whether it be tech companies trying to manipulate perception, or some plotting AI system trying to pull off a complex long-term plan - costs you revenue and future ability to act, and has market pressure against it.

This reminds me of the general idea where AI systems are trying to survive, and somehow trading services with humans for things that they need. The thing is, in some situations, 99% or more of all the revenue paid to the AI service goes to just keeping it online. It's not getting a lot of excess resources for some plan. Same idea for humans: what keeps humans 'aligned' is that almost all resources are needed just to keep themselves alive, and to try to create enough offspring to replace their own failing bodies. There's very little slack for, say, founding a private army with the resources to overthrow the government.