Disclaimer: I'm writing this with the awareness that Zvi has done research and synthesis that I am probably not capable of and certainly have not been doing; and that in this specific instance, Zvi's research has run into my area of expertise, so I can offer some constructive criticism in the hope that it helps Zvi outperform journalists even further than he already does.
Twitter spent a lot of time preparing for the "algorithm" release, at least one month and probably many more (possibly the idea was proposed years ago). This implies they had plenty of time to change their actual systems into something palatable for open source scrutiny.
This is exactly the sort of thing that we would see in a world where 1) a social media platform faced an existential threat, 2) distrust was one of the main factors, and 3) they still had enough talented engineers to think this up, evaluate the feasibility, and ultimately pull off a policy like this.
Whether the algorithm we see facilitates manipulation is a much more difficult question to answer. Like bills written with loopholes and encryption built with backdoors, we don't know how easy this system is to hijack, for instance with likes strategically placed by botnets. Establishing whether manipulation remains feasible (primarily by third-party actors, which is what you should expect) is a security-mindset question, about how things could be (easily) broken, not a question of whether the day-to-day machinery seems to fit together.
Regarding the bot detection, I'm not surprised that LLM bots leave artifacts behind, but I don't think they should generally be this easy to spot in 2023. Botnets and bot detection have been using AI adversarially for nearly 10 years, and have probably been gainfully iterated on by engineers at large companies for ~5 years. There are probably other obvious artifacts that a person can spot, and maybe "as an AI language model" is less of an edge case than I think it is (I don't have much experience with language models), but it definitely seems like an extreme case of bots being much easier to spot than they should be. It's important to note that not all botnet wielders have the same level of competence; things are moving fast with GPT-4, and I wouldn't be surprised if I vastly underestimated the number of basement operators who get away with mistake-filled operations.
Idea for skin-in-the-game for moderation appeals. Mod attention is an expensive, valuable resource. Allow people to appeal by placing a deposit. If your appeal is approved, you get the money back. If rejected, the deposit is lost.
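As a sketch, the mechanics could be as simple as an escrow with a refund flag. Everything here, including the deposit amount, is hypothetical, not any real platform's API:

```python
# Toy sketch of a deposit-backed appeal; names and amounts are invented.

class AppealEscrow:
    """Holds deposits staked on moderation appeals."""
    def __init__(self):
        self.held = {}  # user -> amount currently staked

    def stake(self, user, amount):
        self.held[user] = self.held.get(user, 0) + amount

    def resolve(self, user, amount, appeal_upheld):
        """Refund the stake if the reviewer sides with the user;
        otherwise the platform keeps it."""
        self.held[user] -= amount
        return amount if appeal_upheld else 0

DEPOSIT = 10  # purely illustrative

escrow = AppealEscrow()
escrow.stake("alice", DEPOSIT)
refund = escrow.resolve("alice", DEPOSIT, appeal_upheld=True)  # 10 back
```

The design choice doing the work is that frivolous appeals now cost the appellant something, while well-founded ones remain free in expectation.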
Early in the Ivermectin/COVID discussion, I posted on Twitter the best peer-reviewed study I could find supporting Ivermectin for COVID and the best study (a peer-reviewed meta-analysis) I could find opposing it. My comment was that it was important to read reputable sources on both sides and reach informed conclusions. That tweet linking to peer-reviewed research was labeled "misinformation", and my account got its first suspension.
A second tweet (yes, I'm a slow learner; I thought adding good data to that discussion was essential) contained a link to a CDC study, at a CDC.gov web address, investigating whether Ivermectin worked for COVID. That tweet was also taken down as misinformation and my account suspended again, even though the only information I added to the link was a brief but accurate summary of the study. Again, this reputable link was labeled "misinformation".
I appealed both suspensions and lost both times. Back when I assumed these decisions would be thoughtful and fact-based, I would have put money down that my appeals would win. And yes, I would have been willing to take Twitter to court over censoring peer-reviewed research and links to CDC studies, if I could have found a lawyer and a legal basis. Those lawsuits would be a negative for Twitter, and adding financial harm to the personal offense of having fact-based posts censored would be a strong negative as well. I don't think their content moderation team is competent enough that Twitter can afford to raise the stakes.
Yeah, for the company. Ideally this is not passed on to the person doing the moderation. But yes, some better, more incentive-balanced approach would be ideal.
This reminds me of a problem I heard about a few years ago; I am not sure whether it still exists:

The problem was that scientific papers are usually checked from the scientific perspective, but horrible English is also a frequent problem (typically from authors who do not speak English as their first language). So some journals added a "language review" as the first step of their review process, and if an article was not in correct English, they told the author to rewrite it, or offered a paid service to rewrite it into proper English.

The paid service turned out to be so profitable that some journals simply started requiring it of all authors submitting from non-English-speaking countries, regardless of the actual quality of their English. Specifically, native English speakers found that if they moved to a different country and started submitting their papers from there, they were suddenly told their English was not good enough and they had to pay to have it checked. In effect, this just became an extra tax on scientists based on their country.
Similarly, I am pessimistic about the willingness of companies to isolate potentially profit-generating employees from the financial consequences of their decisions.
Previously: The Changing Face of Twitter
Right after I came out with a bunch of speculations about Twitter and its algorithm, we got a whole bunch of concrete info detailing exactly how much of Twitter’s algorithms work.
Thus, it makes sense to follow up and see what we have learned about Twitter since then. We no longer have to speculate about what might get rewarded. We can check.
We Have the Algorithm
We have better data now. Twitter ‘open sourced’ its algorithm – the quote marks are because we are missing some of the details necessary to recreate the whole thing. There is still a lot of useful information. You can find the announcement here and the GitHub repo here. Brandon Gorrell describes the algorithm at Pirate Wires.
Here are the parts of the announcement I found most important.
This matches my experience with For You. Your follows are very much not created equal. The accounts you often interact with will get shown reliably, and even shown when replying to other accounts. Accounts that you don’t interact with, you might as well not be following.
Thus, if there is an account you want to follow within For You, you’ll want to like a high percentage of their tweets, and if you don’t want that for someone, you’ll want to avoid interactions.
What about out-of-network?
So that’s super interesting on both fronts. The algorithm is explicitly looking to pattern match on what you liked. My lack of likes perhaps forced the algorithm, in my case, to fall back on more in-network Tweets. So one should be very careful with likes, and only use them when you want to see more similar things.
More than that, who you follow now is doing two distinct tasks. It provides in-network tweets, but only for those accounts you interact with. It also essentially authorizes those you follow to upvote content for you by interacting with that content.
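In sketch form, the two roles would look something like this. Every name here is hypothetical, a paraphrase of the described behavior rather than the repo’s actual code:

```python
# Hypothetical sketch of the two jobs a follow does, per the description
# above; not the actual repo code.

def in_network_candidates(my_follows, my_interactions, tweets):
    """A follow only surfaces directly if you actually interact with it."""
    active = {f for f in my_follows if my_interactions.get(f, 0) > 0}
    return [t for t in tweets if t["author"] in active]

def out_of_network_candidates(my_follows, tweets):
    """Every follow also acts as a curator: their likes and replies
    pull strangers' tweets into your candidate pool."""
    return [t for t in tweets
            if t["author"] not in my_follows
            and any(u in my_follows for u in t["engaged_by"])]
```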
That implies a strategy of two kinds of follows. You want to follow accounts whose Tweets you want to see, and interact with them aggressively. You also want to follow accounts whose tastes you want to copy, whether or not you like their own content at all, except that then you want to avoid interactions.
This means that if you have follows who often interact with things you want to see less of, such as partisan political content, you are paying a higher price than you might realize. Consider re-evaluating such follows (as with all of this, assuming you care about the For You tab).
Exclusively maximizing engagement is a clear Goodhart’s Law problem; that is not what you or Twitter should want. Worth noticing.
If one were focusing on For You or using a hybrid approach, this is another good reason to follow or unfollow someone. Do you want them used as social proof?
The ranking is in two stages. First the ‘light’ ranking to get down to ~1500 candidates, then the ‘heavy’ ranking to choose among them.
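A minimal sketch of that funnel, with invented signals and coefficients rather than Twitter’s actual ones. Note how replies can be absent from the cheap pass yet dominant in the expensive one, which previews the oddity discussed below:

```python
# Illustrative two-stage funnel; the scoring functions are stand-ins,
# not Twitter's actual features or coefficients.

def cheap_score(tweet):
    # Light ranking: fast, simple signals only.
    return tweet["likes"] + 0.5 * tweet["retweets"]

def model_score(tweet):
    # Stand-in for the heavy ranker's learned model.
    return tweet["likes"] + tweet["retweets"] + 20 * tweet["replies"]

def rank_timeline(candidates, light_limit=1500):
    """Cut the pool to ~1500 with the cheap pass, then spend the
    expensive model only on the survivors."""
    survivors = sorted(candidates, key=cheap_score, reverse=True)[:light_limit]
    return sorted(survivors, key=model_score, reverse=True)
```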
What else do we know? These all, I think, refer to the first-stage ‘light’ ranking:
So replies essentially don’t matter in light ranking? This is so weird. Replies are real engagement, likes are only nominal engagement at best. Which the ‘heavy ranking’ understands very well, as discussed later.
It’s not obvious what a 2.0 boost means in practice, in terms of magnitude.
This makes sense provided the threshold is sufficiently low. I don’t think I’ve ever had a problem with that.
That makes sense, provided it is normalized to follower counts.
It’s currently 4.0 in-network, 2.0 out-of-network, and soon the plan is to exclude non-blue out-of-network entirely in many forms. So it’s a big deal whether or not you are already followed.
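For a sense of magnitude (answering the earlier question about what a multiplier means in practice), here is what those numbers imply if the boost is a straight multiplier on the final score. That reading is my assumption; the released code does not make the mechanics fully obvious:

```python
# Assumes the Blue boost multiplies the final ranking score directly;
# this is an assumption, not confirmed by the repo.

def boosted_score(base_score, author_is_blue, in_network):
    if not author_is_blue:
        return base_score
    return base_score * (4.0 if in_network else 2.0)

# An in-network Blue tweet with base score 10 outranks a non-Blue
# stranger at 35, since 10 * 4.0 = 40 > 35.
```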
The Ukraine downgrade is the first thing that outright surprised me. The other things listed here all make sense, whether or not you like the principles used. But why would you downgrade posts about Ukraine?
I have two guesses. One is relatively benign. Hopefully it’s not the other one.
There are other organic reasons why it makes sense to ‘stay in your lane’ on Twitter, as this ensures the people who follow you are interested in your content. Now we find out the algorithm is actively punishing ‘hybrid’ accounts, discouraging me (for example) from posting about both rationality and AI, and then also posting about something else like sports or Magic: The Gathering.
Then again, perhaps by using such targeting this actually gives effective permission to exit your lane at times.
I will note that I have seen posts with misspellings do well, so enough engagement can overcome even this level of penalty.
Later, we found out something very different about the ‘heavy’ ranking: it relies much more on strong (‘real’?) engagement metrics.
If you want to boost engagement, sounds like you should reply to your replies.
If you want to help a Tweet out a lot, then it looks like these extended engagements have a big impact – you’ll want to click into and then like, not merely like, if you don’t want to reply. Ideally, you should reply, even if you don’t have that much to say.
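To make the magnitudes concrete, here is a toy version of the heavy ranker’s objective. The weights below resemble the widely reported values from the released code, but treat the exact numbers as approximate:

```python
# Weighted engagement objective in the style of the heavy ranker.
# Weights approximate widely reported values; don't treat them as exact.

WEIGHTS = {
    "like": 0.5,
    "retweet": 1.0,
    "click_then_engage": 11.0,        # click into the tweet, then engage
    "reply": 13.5,
    "reply_engaged_by_author": 75.0,  # the author responds to your reply
}

def predicted_value(probs):
    """Score = sum over actions of P(action) * weight(action)."""
    return sum(probs.get(action, 0.0) * w for action, w in WEIGHTS.items())

# A tweet with a 10% chance of an author-engaged reply (0.1 * 75 = 7.5)
# beats one with a 100% chance of a bare like (1.0 * 0.5 = 0.5).
```

If the author-reply weight is anywhere near that large, replying to your replies is by far the cheapest boost available.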
You may note I’m making it a habit whenever possible to engage with anyone who replies and isn’t making my life actively worse by doing so. It’s a win-win.
Yeah. That seems about right.
This makes it seem even more overdetermined that you want to use your best stuff at the more popular times.
From Steven Tey: remember to keep a high TweepCred. Which essentially means, I think, that you need to have enough interactions and follows to provide social proof. My presumption is that most ‘normal users’ will get there, but if you have few followers you might want to be careful about following too many people.
The effect is that if you are over 65 TweepCred, you can post more and still have your content considered, whereas if you’re too low your content isn’t considered at all.
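A sketch of that gate as described; the below-threshold cap is my guess at the shape of the rule, not a number from the repo:

```python
# Hypothetical version of the TweepCred gate described above.

def tweets_considered(recent_tweets, tweepcred, low_rep_cap=0):
    """Over the ~65 threshold, all recent tweets stay in the candidate
    pool; below it, at most `low_rep_cap` do (possibly none)."""
    if tweepcred >= 65:
        return recent_tweets
    return recent_tweets[:low_rep_cap]
```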
Alternative Methods Perhaps
A nice brainstorm is to ask, what if you had more control over the algorithm as it applied to you? You could in theory count on Twitter to serve up the 1500 candidates, then rank them yourself.
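A sketch of what ‘rank them yourself’ could look like, with the user supplying their own weights. No such client API exists today, so every name here is imagined:

```python
# Hypothetical client-side re-ranker over Twitter's ~1500 candidates.
# No such API exists; feature names and weights are invented.

MY_WEIGHTS = {"replies": 5.0, "likes": 0.2, "is_politics": -10.0}

def my_score(tweet):
    return sum(MY_WEIGHTS.get(f, 0.0) * v
               for f, v in tweet["features"].items())

def my_timeline(candidates):
    """Sort Twitter's candidate pool by my own taste, not theirs."""
    return sorted(candidates, key=my_score, reverse=True)
```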
Eliezer Yudkowsky recently picked up a lot of new Twitter followers, and he is here to report that, algorithmically and experientially speaking, it’s not going great.
I hope Elon takes him up on this offer, whether or not they ever also end up talking about AI. It is so strange to me that what Eliezer is asking for here is hard for him to get.
It would be highly amusing if Eliezer and Elon got together to talk and didn’t discuss AI, despite Elon once again doing what Eliezer thinks is about the worst possible thing. Still seems way better than not talking.
Algorithm Bonus Content: Canadian YouTube
Canada is preparing to pass a law requiring a third of YouTube links be to Canadian content.
If that happens, it will be because YouTube chose that result, sacrificing the quality of the YouTube experience in order to punish Canada for its insolence.
If YouTube treats content surfaced at the whim of a Canadian regulator as if it was recommended by the algorithm, and evaluates customer reactions on that basis, then yes, this would severely punish Canadian creators.
However, that is clearly a distortion, and thus a stupid way to handle this situation. Instead, YouTube, if its goal is to serve up the best videos possible, should adjust its evaluations to account for the poor product-market fit, or perhaps (if it didn’t have a better option because everyone is busy working on Bard) simply throw out the data on videos that its algorithm would not have served on its own.
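Concretely, the sane version is close to a one-line filter in the evaluation pipeline; the field name here is made up:

```python
# Sketch: don't let quota-forced impressions count as evidence about
# video quality. The `served_by_quota` field is invented.

def impressions_for_evaluation(impressions):
    """Drop views that only happened because a regulatory quota surfaced
    the video, so the recommender isn't judged or trained on them."""
    return [imp for imp in impressions if not imp["served_by_quota"]]
```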
What about the issue of potentially violating content? Twitter is making such actions more transparent; Colin Fraser highlights the inherent dilemmas here.
As every moderator knows, the last thing you want to do is call attention to the thing you are making a choice not to call attention to, nor do you want to have to justify every decision. It rarely goes well. If they are going to allow appeals here, they are going to need to ensure that the appeal comes with skin in the game – if a human looks at your Tweet and does decide it is offensive, there must be a price.
The Great Polling Experiments
So far I’ve run two giant polling threads on Twitter.
The first one polled an AI doom scenario where an ASI (artificial superintelligence) attempted to gather resources, take over the world and then kill all humans, without the ability to foom or itself develop new innovative tech. This experiment went well, engagement was strong, good discussion happened and I learned a lot.
The second one polled the 24 predictions from On AutoGPT. That thread flopped on engagement, with the first post ending up with less than 10% of the views and votes of the first polling thread, although still enough votes to get a clear idea. You need 300+ for a robust poll on an election, but 50 votes is plenty for ‘do people more or less believe or expect this?’ I confirmed some things but didn’t learn as much.
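The sample-size intuition checks out on the back of the envelope: for a yes/no question, the worst-case margin of error shrinks with the square root of n, so 50 votes already puts you within about ±14 points (ignoring the self-selection of Twitter polls, which only adds uncertainty):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a simple proportion (worst case at p=0.5)."""
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(50), 3))   # 0.139 -> roughly +/- 14 points
print(round(margin_of_error(300), 3))  # 0.057 -> roughly +/- 6 points
```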
I am still holding off on the analysis post for now; hopefully I will get to that soon.
What I miss most in these situations is correlations. I can’t tell to what extent people’s answers make sense and are consistent. I can’t tell whether people’s answers represent plausible cruxes, either, unless I explicitly ask that, and you don’t want to overstay your welcome in such situations.
I would try other forms of polling, but I’d expect engagement numbers to drop off dramatically, and to do so in ways that skew the data. I asked the person I should obviously ask to see if they had any advice, we’ll see if anything comes of that.
In particular, a few threads I want to do in the future, suggestions welcome:
What else? What questions should be in those? I figure maybe do one of these a week as a Monday special.
What Does the Future of Twitter Look Like?
Observing my reactions to knowing the Twitter algorithm, I see myself doing things that seem mostly net good for Twitter, while also getting more use out of the platform. I am slightly worried about an exodus by former blue checks, but only slightly.
Two recent departures were NPR and CBC, both of which were protesting being labeled ‘state media’ merely because they are public broadcasters funded in large part by the state. I get why they are upset about the label, yet I don’t see how one can call it inaccurate.
As for the celebrities who leave? I won’t miss them.
For a while, the uncertainty about Twitter’s future was uncertainty about Elon Musk and his plans, and whether the website would fall apart or Twitter would go bankrupt or everyone would leave in droves.
I no longer worry much about those scenarios. Instead, even in the context of Twitter, I almost entirely worry about AI.
The intersection of those two issues famously includes Twitter bots. An ongoing problem, as you can see:
Reports are they identified almost 60,000 accounts this way. I doubt there were many false positives.
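Assuming the method was simply searching for the telltale phrase, which is what the reports suggest, the whole detector fits in a few lines; this is my reconstruction, not anything confirmed by Twitter:

```python
# Assumes detection was a plain substring search for the giveaway
# phrase; the reports suggest as much, but this is a reconstruction.

TELL = "as an ai language model"

def flag_bot_candidates(tweets):
    return [t for t in tweets if TELL in t["text"].lower()]
```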
Twitter will rise or fall based on how AI transforms our experiences and the internet – if we’re still around and doing things where Twitter fits in, it’ll be great. If not, not.
The thing about the Twitter bots is there are a lot of them, but mostly they don’t matter. Look at the five posts above where we see view counts. The total is seventeen views, or at most maybe five views a minute from all 60k accounts combined. Given how the current model works, almost all the utility lost from bots is due to DM spam, which is made possible because people like me keep our DMs open and find a lot of value in that. So what if I have to block a spam account once a week?