Bloomberg reported 2 weeks ago that Twitter resumed paying Google Cloud: https://www.bloomberg.com/news/articles/2023-06-21/twitter-resumes-paying-google-cloud-patching-up-relationship
Twitter owner Elon Musk continues to be surprised by how Twitter works. Last week he learned that their code ‘shadowbanned’ any account with low reputation score, preventing them from trending, and the calculation was based on ‘how many times were you reported’ so every big account got shadowbanned.
I roll to disbelieve. If this were the case, virtually zero tweets by large accounts would ever trend. There must either be some additional code that overrides this shadowban or some list that nullifies its effects.
It's still hilarious that it was in the codebase at all.
I don't know what to think about Musk not paying the GCP bill. He obviously has the money. Does he really not want to sell more Tesla stock that badly? Why would you risk a 44 billion dollar investment involving a ton of your own money (not to mention that of many of your friends) over a 1 billion dollar bill?
After a ~year of not reading Twitter, I coincidentally returned the exact day that the rate limits were added. I'll say, this externally imposed daily limit is precisely the feature I've wanted in social media, and each day I've been glad of it.
The number is way too high for that. I use Twitter almost an hour a day (way, way too much time) and I don't hit the rate limit.
I was initially using a brand new account and hit the rate limits in under an hour. Perhaps the rate limits were different for that account. I notice I didn't hit it yesterday (after I managed to log in to my old account).
We know that Twitter was in multiple datacenters, including their own datacenter, plus Google Cloud, plus AWS. We know that they were trying to get out of these contracts, possibly using default as a negotiating tactic. We know that their technical-debt level was extraordinarily bad. After joking that they probably had a history of engineers who were spies obfuscating things in order to make it easier to hide their sponsors' backdoors, I thought about it a bit more and decided that was probably literally true. They were (and probably still are) using orders of magnitude more computing resources than running a service like Twitter ought to take, if it were well engineered. And we know that they started having capacity problems, with timing that seems to line up suspiciously with what we might infer is a monthly billing cycle.
But there are a bunch of very different interpretations of this, which we can't easily distinguish.
One thing I can say, from running LW, is that from a capacity perspective crawlers are a much bigger issue for websites than you'd naively expect. (LessWrong does have per-IP-address rate limits; they're just high enough that you won't ever hit them under normal usage.) So even if there was a capacity reduction related to their supply of hardware, it may still be the case that most of their capacity was going to scrapers, and that rate-limiting was an attempt to rein in the scrapers as a way of regaining capacity. It seems fairly likely that the rate-limiting option was set up in advance as a quick-response option for any sort of capacity issue (including capacity issues created by things like a developer accidentally deploying slow code, or surges in usage).
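To make that concrete, here is a minimal sketch of the kind of per-IP rate limiting described above, assuming a token-bucket design. The names and numbers are illustrative, not LessWrong's actual implementation:

```python
# Hypothetical token-bucket limiter, one bucket per IP address. The limits
# are placeholders: high enough that a human browsing normally never hits
# them, low enough that an aggressive scraper does.
import time
from collections import defaultdict
from dataclasses import dataclass, field

REFILL_RATE = 10.0   # tokens added back per second
BURST = 600.0        # bucket capacity

@dataclass
class Bucket:
    tokens: float = BURST
    last: float = field(default_factory=time.monotonic)

buckets = defaultdict(Bucket)

def allow(ip: str) -> bool:
    """Return True if this IP may make another request right now."""
    b = buckets[ip]
    now = time.monotonic()
    # Refill tokens for the time elapsed since the last request, capped at BURST.
    b.tokens = min(BURST, b.tokens + (now - b.last) * REFILL_RATE)
    b.last = now
    if b.tokens >= 1.0:
        b.tokens -= 1.0
        return True
    return False  # over the limit; serve a 429 instead
```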
The main problem with crawlers is that their usage patterns don't match those of regular users, and most optimization effort is focused on the usage patterns of real users, so bots sometimes wind up using the site in ways that consume orders of magnitude more compute per request than a regular user would. And some of these bots have been through many iterations of detection and counter-detection, and are routing their requests through residential-IP botnets, with fake user-agent strings trying to approximate real web browsers.
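As a concrete (and entirely hypothetical) illustration of how a crawler's usage pattern can cost orders of magnitude more per request: a bot that pages all the way back through history on an endpoint built around OFFSET pagination forces the database to walk past every skipped row, while a keyset/cursor query seeks directly via the index. The schema and numbers here are invented for the demo:

```python
# Hypothetical demo, not Twitter's actual storage layer: deep OFFSET pages
# (the kind only a crawler ever requests) scan linearly; cursor pagination
# at the same depth is an index seek.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO tweets (body) VALUES (?)",
    (("tweet %d" % i,) for i in range(1_000_000)),
)

def page_by_offset(page, size=20):
    # OFFSET forces the database to walk past every skipped row.
    return conn.execute(
        "SELECT id, body FROM tweets ORDER BY id LIMIT ? OFFSET ?",
        (size, page * size),
    ).fetchall()

def page_by_cursor(after_id, size=20):
    # A keyset query seeks directly via the primary-key index.
    return conn.execute(
        "SELECT id, body FROM tweets WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, size),
    ).fetchall()

for page in (0, 49_999):  # a regular user's first page vs. a deep crawler page
    t = time.perf_counter()
    page_by_offset(page)
    print(f"offset page {page}: {time.perf_counter() - t:.4f}s")

t = time.perf_counter()
page_by_cursor(after_id=999_980)
print(f"cursor at the same depth: {time.perf_counter() - t:.4f}s")
```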
As for the shadowbanning thing: the real bug was probably a bit more subtle than the tweet-length description, but the bug itself is not surprising, and given the high-technical-debt codebase, probably not nearly as stupid as it sounds. Or rather: the effect may have been that stupid, but the code itself probably didn't look that bad on cursory inspection. I.e., I would assign pretty high probability to that code containing an attempt to normalize for visibility that didn't work correctly, or an unfinished todo item to add a visibility correction to the score. A codebase like Twitter's is going to have bugs like this; they can only be discovered by skilled programmers doing forensic investigations, and executives will only know about them during the narrow time window between when they're discovered and when they're fixed.
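For what it's worth, a hypothetical reconstruction of that kind of bug might look like the sketch below: a low-quality flag driven by a raw report count, with the visibility normalization left as an unfinished todo. Nobody outside Twitter has seen the actual code; the thresholds and names are invented:

```python
# Invented illustration of the hypothesized bug, not Twitter's real code.
REPORT_THRESHOLD = 100       # placeholder absolute-count cutoff
REPORT_RATE_CUTOFF = 1e-4    # placeholder normalized cutoff

def is_low_quality(report_count: int, impressions: int) -> bool:
    # TODO: normalize by impressions so big accounts aren't penalized
    # just for being seen (and therefore reported) more often.
    return report_count > REPORT_THRESHOLD

def is_low_quality_normalized(report_count: int, impressions: int) -> bool:
    # The presumably intended version: reports per impression, so raw
    # exposure cancels out.
    if impressions == 0:
        return False
    return report_count / impressions > REPORT_RATE_CUTOFF

# A big account: 10M impressions, a tiny 0.005% report rate.
print(is_low_quality(500, 10_000_000))             # True  -- shadowbanned
print(is_low_quality_normalized(500, 10_000_000))  # False -- fine once normalized
```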
The main problem with crawlers is that their usage patterns don't match those of regular users, and most optimization effort is focused on the usage patterns of real users, so bots sometimes wind up using the site in ways that consume orders of magnitude more compute per request than a regular user would.
And Twitter has recently destroyed its API, I think? Which perhaps has the effect of de-optimizing the usage patterns of bots.
And some of these bots have been through many iterations of detection and counter-detection, and are routing their requests through residential-IP botnets, with fake user-agent strings trying to approximate real web browsers.
As someone who has done scraping a few times, I can confirm that it's trivial to circumvent protections against it, even for a novice programmer. In most cases, it's literally less than 10 minutes of googling and trial & error.
And for a major AI / web-search company, it could be a routine task, with teams of dedicated professionals working on it.
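For example (a minimal sketch of the kind of ten-minute workaround being described): many naive anti-scraping checks look only at the User-Agent header, so sending a browser-like one is often enough. The URL and header string here are placeholders:

```python
# Placeholder demo: a request that presents a browser-like User-Agent,
# which defeats checks that merely block default library user agents.
import requests

headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/114.0.0.0 Safari/537.36"
    )
}
resp = requests.get("https://example.com/some-page", headers=headers, timeout=10)
print(resp.status_code)
```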
I think both explanations can be true at the same time:
One likely scenario is that Google itself is a main culprit.
E.g. Elon learned that Google is scraping Twitter data on an industrial scale to train its AIs, without paying anything to Twitter. This results in massive infrastructure expenses for Twitter, to be paid to... Google. Outraged, Elon stormed into Alphabet headquarters, but was politely asked to get lost. Hilarity ensues.
The situation is evolving rapidly. Here’s where we stand as of the morning of July 4th.
Well You See What Happened Was…
Oh no! To be clear, by 'twitches,' I mean 'Elon refused to pay the cloud bill.'
As a result, Twitter has been forced to rate limit users.
That fourth one hurts my process. Navigation is somewhat slower and more annoying. In particular, forced threading breaks chronological order assumptions and one's ability to use duplication to locate one's place, and zooming in to move around twisting Twitter threads is so bad you need to jump to Twitter itself. Navigation to zoom back requires clicking in annoying places. I was unable to configure the column order without deleting them all and then creating them again, although this was quick. Column width and monitor real estate use are screwed up in subtle ways. Oh, and now its settings are linked to Twitter's even though I want them to be different. Sheesh.
Another little thing is that the tab icon is now identical to Twitter’s. So annoying.
This is still vastly better than the period when Tweetdeck stopped working.
The third is brutal for some of my readers. Many report they can’t view any links.
What to do, if this doesn’t end soon?
The Plan
Three parts: How I will deal with processing info, how I will change how I present info, and how you can adjust to the new situation.
Also, clarifying some policies on how Twitter threads work here.
Oh No, That’s Not the Reason, Except It Is
This Maggie Johnson-Pint thread offers a cool explanation of what else might have caused this situation, written before we knew the answer.
Twitter owner Musk explains, with a different justification than the real one:
Three obvious responses.
Some people pointing out the obvious, links signify distinct threads:
People Dealing With It
There is no way to see how close you are to the rate limit.
A strange dynamic is how many people complained about the rate limit being too low for them, yet did not think getting around this was worth $8.
There are also other benefits to paying, which now include Tweetdeck. That can be used as a justification, avoiding stigma or looking silly. Which is a dumb worry. If people are going to treat you badly because you are paying a reasonable price to improve your experience of a valuable product you are both using a lot? Screw 'em.
Cutting out low-quality follows, blocking or muting people you don't want around, and otherwise improving the quality of your feed is always a great idea. Most Twitter users need to do a lot more of this, so using this moment for 'spring cleaning' seems great.
I still am baffled by the degree of refusal to pay. I share the instinct to avoid subscriptions. This still seems like a clear case where heavy users should make an exception.
Other Twitter News
Twitter owner Elon Musk continues to be surprised by how Twitter works. Last week he learned that their code ‘shadowbanned’ any account with low reputation score, preventing them from trending, and the calculation was based on ‘how many times were you reported’ so every big account got shadowbanned.
When I insist that the algorithms used in large companies are counterproductive, stupid, and ill-considered, trying to solve the wrong problems using the wrong methods based on a wrong model of the world, and that all their mistakes have failed to cancel out, I mostly didn't mean this stupid. A counting statistic on reports to designate low quality? How did anyone think that was a good idea? Did anyone think about it for a minute, let alone test it?
Every time someone says ‘no one would be so stupid as to tell an AI system to…’ remember that no, someone will totally be exactly that stupid.
General Twitter Outlook Going Forward
Until now, none of the changes to Twitter substantially impacted my experience, utility or workflow. If there was one main downside, it was listening to people complain about Twitter.
This time is different. My experience and workflow got worse, as did the experience of many of my best readers. I don’t see the importance of avoiding having a Twitter account at all, but others do, and I respect people who know themselves in such ways.
None of that is especially scary in and of itself. If the new situation proves stable, I am confident it will be fine.
The danger is that the new situation is not obviously stable. Musk has altered the deal. Pray that he does not alter it any further. Twitter depends on its network effects. So far, those network effects have held firm, and almost all talk of abandonment has been exactly that, talk. Things now seem much closer to potential tipping points, where the core network effects could become endangered.
If that did happen, it would be extremely bad if it didn't rapidly result in another site with similar functionality reassembling those network effects and providing a place to make sense of the world in real time and generate a customized information flow. At best, converging on a new location and reestablishing the social graphs there would take years.
A plausible new alternative is about to launch, called Threads, which is closely connected to Instagram. It would be extremely bad if this resulted in Twitter effectively becoming part of the Meta empire, and Meta then had root control of all of that data and its thumb on all the various scales.
Their privacy policies do not seem great, as one would expect, although I'd be more worried about the data being used to train Meta's LLMs.
Thus, I will continue to rely on and support Twitter in these trying times. There are problems, but all the alternatives are clearly far worse.