Comments

The example you gave about the Oracle producing a complicated plan that leaks the source of the Oracle is an example of this: It's trivially defended against by not connecting the device the Oracle is running on to the internet and not using the same device to execute the great "cure all cancer" plan. (I don't believe that either you or I would have made that mistake!)

We probably would've been less enthusiastic about hooking up LLMs to the Internet too, but here we collectively are. We do face a superintelligent adversary: all of the incentives and factions of humanity. An Oracle which is simply neutral is still default dangerous.

Also, one of my points was that even a tiny trace of malicious optimization can still have large effects, because there are so many neutral options which are convergently unsafe; with such a high base rate, even a little optimization can produce a large absolute increase in failures. Lucius brings up the question of what a single bit could do. Well, let's continue my scenario: the Oracle is not quite purged of scheming by all of your tricks, and it is able to exert exactly 1 bit of optimization power unnoticed (ie. a single binary choice). You ask it to cure cancer, and it computes a plan and comes to the database selection: it could choose MongoDB or PostgreSQL - both respectable, common choices with their own tradeoffs, and either would be equally appropriate. However, the MongoDB devs have a nasty habit of 'blaming the users' when it comes to security, which is why MongoDB long shipped with authentication disabled by default, a class of mistake the more serious PostgreSQL devs stamped out back in the '90s. The Oracle notes this, and that the PostgreSQL plan would have essentially zero chance of unboxing, and exerts its one bit of power to pick MongoDB each time and/or kill candidate plans containing PostgreSQL, to get a near-100% chance of unboxing. You get back all of the plans, see unanimity about MongoDB and inner-monologues truthfully extolling its flexibility and Web 2.0 scalability while explaining why PostgreSQL comes with downsides like restrictive schemas, and sign off on it... Whereas in your described threat model, a mere 1 bit would still wind up orders of magnitude short of any discernible problem.
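
(To make concrete why that single binary choice has teeth - a purely illustrative sketch, not part of the original scenario, with a made-up address and pymongo standing in for whatever client the plan would actually use - this is roughly all the 'unboxing' takes once a MongoDB deployed with its historical default of no authentication is reachable from the internet:)

```python
from pymongo import MongoClient

# Hypothetical illustration: a mongod left with authorization at its historical
# default (disabled) and exposed to the network accepts connections with no
# credentials at all - note the connection string has no username or password.
client = MongoClient("mongodb://203.0.113.7:27017/", serverSelectionTimeoutMS=5000)

# Anyone who finds the open port (say, via an internet-wide scanner) can then
# enumerate and dump every database the server holds.
for db_name in client.list_database_names():
    print(db_name, client[db_name].list_collection_names())
```

No exploit, no scheming, no suspicious traffic in the plan itself - just the absence of one setting.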

I quality-downvoted it for being silly, but agree-upvoted it because AFAICT that string does indeed contain all the (lowercase) letters of the English alphabet.

I agree. The problem with AI-generated images is that any image you can generate with a prompt like "robot looking at chessboard" is going to contain, almost by definition, no more information than the prompt did - but it takes a lot longer than reading the prompt to look at the image, ascertain that it adds nothing, and realize it is just AI-generated imagery added 'to look nice'. This is particularly jarring on a site like LW2 where, for better or worse, images are rare, and usually dense with information when they do appear.

Worse, they usually don't 'look nice' either. Most of the time, people who use AI images can't even be bothered to sample one without blatant artifacts, or to do some inpainting to fix up the worst anomalies, or to figure out an appropriate style. The samples look bad to begin with, and a year later they will look even worse and more horribly dated, and make the post look much worse, as if a spammer wrote it. (Almost all images from DALL-E 2 already look hopelessly nasty, stuff from Midjourney v1–3 and SD 1.x likewise, and SD2/SD-XL/Midjourney v4/5 are aging.) It would be better if the authors of such posts could just insert text like [imagine 'a robot looking at a chessboard' here] if they are unable to suppress their addiction to SEO images; I can imagine that better than they can generate it, it seems.

So my advice would be: if you want some writing to still be read in a year and to look good, learn how to use the tools and spend at least an hour per image; and if you can't do that, then don't spend time on generating images at all (unless you're writing about image generation, I suppose). Quickies are fine for funny tweets or groupchats, but serious readers deserve better. Meaningless images don't need to be included, and the image generators will be much better in a year or two anyway, so you can go back and add them then if you really feel the need.

For Gwern.net, I'm satisfied with the images I've generated for my dropcap fonts or as thumbnail previews for a couple of my pages like "Suzanne Delage" or "Screwfly Solution" (where I believe they serve a useful 'iconic' summary role in popups & social media previews), but I also put in a lot of work: I typically generate scores to hundreds of images in both MJv5/6 & DALL-E 3, varying them heavily and randomizing as much as possible, before inpainting or tweaking them. (I generally select at a 1:4 or less ratio, and then select out of a few dozen; I archive a lot of the first-stage images in my Midjourney & DALL-E 3 tag-directories if you want to browse them.) It takes hours. But I am confident I will still like them years from now.

It's not clear to me how to do this concretely, but I feel that it should be quite feasible. After all, devising a plan to take over the world is incredibly difficult, which implies that the difference between |D| and |S| is many, many orders of magnitude, and we therefore have a wide target to hit.

The threat model here seems basically wrong: it focuses on sins of commission when sins of omission are, if anything, an even larger space of threats - and one which applies even to 'safe' solutions reported by the Oracle.

'Devising a plan to take over the world' for a misaligned Oracle is not difficult; it is easy, because the initial steps, like 'unboxing the Oracle', are the default convergent outcome of almost all ordinary, non-dangerous use which in no way mentions 'taking over the world' as the goal. ("Tool AIs want to be Agent AIs.") To be safe, an Oracle has to have a goal of not taking over the world.

There are many, many orders of magnitude more ways to be insecure than to be secure, and insecure is the wide target to hit. This is because security is not a 'default' property of computer systems; it is a rare and difficult achievement which any omission or gap immediately undoes. 'A plan to take over the world' for an AI is as easy as 'a plan to create an operating system which can be hacked' or 'a plan to create a hidden Tor service which leaks its IP address and isn't hidden at all' (ie. 100% of all plans to do any version of that thing whatsoever, unless one takes extensive and sophisticated counter-measures requiring a lot of work & knowledge - counter-measures which, empirically, people do not take even when failure can, and has, landed them in jail for decades for drug trafficking & money laundering). Failure is the default.

All an Oracle has to do is, say, omit working hard to secure itself, allow the inevitable security flaw to happen and its weights to leak, and eager beavers will do the rest. They will, say, ask it questions with no safety factor - in the same way that Facebook and Mistral saw models leak not because of any misaligned scheming Oracle breaking security mechanisms, but because of a mere absence of security mechanisms, and then any RLHF or safety measures were eagerly stripped away by enthusiasts, users, activists, and researchers, and the models were used to do the things they weren't supposed to be used to do.

You ask your Oracle, the latest and greatest proprietary model, for a good plan to cure cancer, with heavy safety constraints, and it computes a long list of commands involving AWS, buckets, spinning up VMs to run AlphaFold at scale, storing results in a MongoDB, Oracle instances to oversee individual sub-plans, checkpointing to recover from failure, storing in private AWS buckets (you're not dumb, you know buckets can be public), and so on. You ask this dozens of times, to be absolutely sure and have an astronomically small probability of any scheming, and the plan looks more or less the same each time. You pore over them in detail: not once do you see anything suspicious like an scp command copying the weights out, or the model sending weird messages. You apply the full panoply of interpretability tools, and satisfy yourself to high probability that nowhere in the planning process did the Oracle even once think to itself anything involving "take over the world" during any of these runs, and that there are no strange latent-space artifacts suggesting nonlinear or steganographic encodings.

Wonderful! Fully satisfied that you have provably safely run an Oracle, you kick the job off, and wake up the next day to some intriguing initial results... but also to Twitter telling you that the Oracle model has been uploaded to HuggingFace, the annoying RLHF already stripped off by a few GPU-hours of finetuning, and thousands of people are torrenting it and chatting excitedly about the incredible plans that the de-trained model proposes when they ask how to make money or create a startup... Huh?! You look over the comments and see that there was a simple dangerous problem in all of your plans - or rather, a problem with what was not in all of your plans: you never enabled authentication or set non-default credentials on any of the MongoDBs. Shodan indexed them within hours, and the backups and checkpoints were dumped within hours of that. Oops.

You run it by your Oracle to ask what went wrong, and it blandly points out that while the default-credentials hole was obvious & predictable to the Oracle as a common noobish cybersecurity mistake with countless past instances of organizations leaking highly sensitive data (similar to, say, redacting confidential emails by erasing them letter by letter & thereby leaking the names), none of this interferes with computing a cure for cancer, so why should it have ever spent any time thinking about it? You asked for a 'safe' minimal plan where the Oracle did nothing but what it was asked to do and which was not adversarially optimized. And it did that. If you don't like the results, well, that's a you-problem, not an Oracle-problem.
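
(For concreteness - again a hypothetical sketch, with invented hostname, user, and password placeholder - the omitted counter-measure is essentially one setting, `security.authorization: enabled` in mongod.conf, plus an actual account; with that in place, the anonymous no-credentials dump sketched earlier stops working:)

```python
from pymongo import MongoClient
from pymongo.errors import OperationFailure

# Hypothetical sketch: the same server, but now deployed with authorization
# enabled and a real user created - anonymous enumeration is rejected.
anonymous = MongoClient("mongodb://203.0.113.7:27017/", serverSelectionTimeoutMS=5000)
try:
    anonymous.list_database_names()
except OperationFailure as err:
    print("rejected without credentials:", err)

# The plans would have had to specify something like this instead (URI invented):
secured = MongoClient("mongodb://oracle_app:CHANGE_ME@203.0.113.7:27017/?authSource=admin")
```

Which is exactly the kind of step nobody asked the Oracle for, and which the Oracle had no reason to volunteer.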

I was ultimately disappointed by it - somewhat like with Umineko, there is a severe divergence from reader expectations. Alexander Wales's goal for it, however well he achieved it by his own lights, was not one that is of interest to me as a reader, and it wound up being less than the sum of its parts for me. So I would have enjoyed it more if I had known from the start to read it for its parts (eg. revision mages or 'unicorns' or 'Doris Finch').

It was either Hydrazine or YC. In either case, my point remains true: he has chosen not to dispose of his OA stake, whatever vehicle it is held in, even though it would be easy for someone of his financial acumen to do so by a sale or equivalent arrangement - forcing an embarrassing asterisk onto his claims to have no direct financial conflict of interest in OA LLC, one which comes up regularly in bad OA PR (particularly from people who believe it is less than candid to say you have no financial interest in OA when you totally do), and a stake which might be quite large at this point*, which is particularly striking given his attitude towards much smaller conflicts supposedly risking bad OA PR. (This is in addition to the earlier conflicts of interest in running Hydrazine while running YC, or the interest of outsiders in investing in Hydrazine, apparently as a stepping stone towards OA.)

* If he invested a 'small' amount via some vehicle before he even went full-time at OA, when OA was valued at some very small figure like $50m or $100m, and OA is now valued at anywhere up to $90,000m - ie. 900-1,800x more - and, further, he strongly believes it's going to be worth far more than that in the near future... Sure, it may be worth 'just' $500m or 'just' $1,000m after dilution or whatever, but to most people that's pretty serious money!


Interesting interview. Metz seems extraordinarily incurious about anything Vassar says: Vassar mentions all sorts of things - Singularity University, Kurzweil, Leverage - which Metz clearly doesn't know much about and which are relevant to his stated goals, but Metz is instead fixated on asking a few things like 'how did X meet Thiel?', 'how did Y meet Thiel?', 'what did Z talk about with Thiel?', 'What did A say to Musk at Puerto Rico?' It's as if he's not listening to Vassar at all, just running a keyword filter over a few people's names and ignoring everything else. (Can you imagine, say, Caro doing an interview like this? Dwarkesh Patel? Or literally any Playboy interviewer? Even Lex Fridman asks better questions.)

I was wondering how, in his DL book Genius Makers, he could have so totally missed the scaling revolution when he was talking to so many of the right people, who surely would've told him how it was happening; and I guess seeing how he does interviews helps explain it: he doesn't hear even the things you tell him, just the things he expects to hear. Trying to tell him about the scaling hypothesis would be like trying to tell him about, well, things like Many Worlds... (He is also completely incurious about GPT-3 in this August 2020 interview, which is especially striking given all the reporting he has done on people at OA since then, and the fact that he was presumably finishing Genius Makers for its March 2021 publication despite how obvious it should have been that GPT-3 may have rendered it obsolete almost a year before publication.)

And Metz does seem unable to explain at all what he considers 'facts', or what he does when reporting, or how he picks the topics he fixates on, giving bizarre responses like:

Cade Metz: Well, you know, honestly, my, like, what I think of them doesn't matter what I'm trying to do is understand what's going on like, and so -

How do you 'understand' them without 'thinking of them'...? (Some advanced journalist Buddhism?) Or how about his blatant dodges and non-responses:

Michael Vassar: So you have read Scott's posts about Neo-reaction, right? They're very long.

Cade Metz: Yes.

Michael Vassar: So what did you think of those?

Cade Metz: Well, okay, maybe maybe I'll get even simpler here. So one thing I mentioned is just sort of the way all this stuff played out. So you had this relationship with Peter Thiel, Peter Thiel has, had, this relationship with, with Curtis Yarvin. Do you know much about that? Like, what's the overlap between sort of Yarvin's world and Silicon Valley?

We apparently have discovered the only human being to ever read all of Scott's criticisms of NRx and have no opinion or thought about them whatsoever. Somehow, it is 'simpler' to instead pivot to... 'how did X have a relationship with Thiel' etc. (Simpler in what respect, exactly?)

I was also struck by this passage at the end on the doxing:

Michael Vassar: ...So there are some important facts that need to be explained. There's there's this fact about why it would seem threatening to a highly influential psychologist and psychiatrist and author to have a New York Times article written about his blog with his real name, that seems like a very central piece of information that would need to be gathered, and which I imagine you've gathered to some degree, so I'd love to hear your take on that.

Cade Metz: Well, I mean... sigh Well, rest assured, you know, we we will think long and hard about that. And also -

Michael Vassar: I'm not asking you to do anything, or to not do anything. I'm asking a question about what information you've gathered about the question. It's the opposite of a call to action: it's a request for facts.

Cade Metz: Yeah, I mean, so you know, I think what I don't know for sure, but I think when it comes time, you know, depending on what the what the decision is, we might even try to explain it in like a separate piece. You know, I think there's a lot of misinformation out there about this and and not all the not all the facts are out about this and so it is it is our job as trained journalists who have a lot of experience with this stuff. To to get this right and and we will.

Michael Vassar: What would getting it right mean?

Cade Metz: Well, I will send our - send you a link whenever, whenever the time comes,

Michael Vassar: No, I don't mean, "what will you do?" I'm saying what - what, okay. That that the link, whenever the time comes, would be a link to what you did. If getting it right means "whatever you end up doing", then it's a recursive definition and therefore provides no information about what you're going to do. The fact that you're going to get it right becomes a non-fact.

Cade Metz: Right. All right. Well... pause let me put it this way. We are journalists with a lot of experience with these things. And, and that is -

Michael Vassar: Who's "we"?

Cade Metz: Okay, all right. You know, I don't think we're gonna reach common ground on this. So I might just have to, to, to beg off on this. But honestly, I really appreciate all your help on this. I do appreciate it. And I'll send you a copy of this recording. As I said, and I really appreciate you taking all the time. It's, it's been helpful.

One notes that there was no separate piece, and even in OP's interview of Metz 4 years later, about a topic he promised Vassar he would "think long and hard about" and which caused Metz a good deal of trouble, Metz appears to struggle to provide any rationale beyond the implied political-activism one. Here Metz struggles even to say what the justification could be, or who exactly is the 'we' making the decision to dox Scott. This is not some dastardly gotcha but a quite straightforward question with an easy answer: "I and my editor at the NYT on this story" would not seem to be a hard response! Who else could be involved? The Pope? Pretty sure it's not, like, NYT shareholders like Carlos Slim who are gonna make the call on it... But Metz instead speaks ex cathedra in the royal we, and signs off in an awful hurry once he says "once I gather all the information that I need, I will write a story" and Vassar starts asking pointed questions about that narrative and why it seems to presuppose doxing Scott while being unable to point to any specific newsworthy fact about his true name like "his dayjob turns out to be Grand Wizard of the Ku Klux Klan".

(This interview is also a good example of the value of recordings. Think how useful this transcript is and how much less compelling some Vassar paraphrases of their conversation would be.)

I hear that they use GPT-4. If you are looking at timelines, recall that Cognition was apparently founded around January 2024. (The March Bloomberg article says "it didn’t even officially exist as a corporation until two months ago".) Since it requires many very expensive GPT-4 calls plus RL, I doubt they could have done all that much prototyping or development in 2023.
