All of O O's Comments + Replies

O O10

I wonder if giving it an example of the intended translated writing style helps. 

O O10

Not just central banks, but the U.S. going off the gold standard too, then fiddling with bond yields to cover up the ensuing inflation, maybe?

O O20

It's quite strange that owning all the world's (public) productive assets has only beaten gold, a largely useless shiny metal, by 1% per year over the last 56 years.

Even if you focus on rolling metrics (this is 5-year rolling returns):

there are lots of long stretches of gold beating world equities, especially in recent times. There are people with the suspicion (myself included) that there hasn't been much material growth in the world over the last 40 or so years compared to before. And that since growth is slowing down, this issue is worse if y... (read more)
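
For reference, a minimal sketch of the metric in question, computed on synthetic monthly series (the drift and volatility numbers are placeholders, not market estimates); swap in actual gold and world-equity indices:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
months = 56 * 12  # ~56 years of monthly data, matching the window above
# Synthetic log-return series; drifts/vols are placeholders, not estimates.
gold = pd.Series(np.exp(np.cumsum(rng.normal(0.004, 0.05, months))))
equities = pd.Series(np.exp(np.cumsum(rng.normal(0.005, 0.045, months))))

def rolling_cagr(prices: pd.Series, years: int = 5) -> pd.Series:
    """Trailing annualized return over a `years`-long window (monthly data)."""
    window = years * 12
    return (prices / prices.shift(window)) ** (1 / years) - 1

gold_wins = rolling_cagr(gold) > rolling_cagr(equities)
print(f"Share of 5y windows where gold beats equities: {gold_wins.mean():.0%}")
```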

1Mis-Understandings
Global GDP growth over the same period was around 3 percent. The question is how equities outperformed GDP growth. I think this has to do with changes in asset prices in general.
2Mitchell_Porter
Gold was currency, and is still used as a hedge against fiat currency.  I assume most of that growth occurred in China.  What can central banks do to affect GDP growth? 
O O42

I just think they’ll be easy to fool. For example, historically many companies would get political favors (tariff exemptions) by making flashy fake headlines such as promising to spend trillions on factories.

O O-1-3

China has no specific animal welfare laws. There are also some Chinese who regard animal welfare as a Western import. Maybe the claim that they have no concept at all is too strong, but it was certainly minimized by previous regimes.
E.g.:

> Mao regarded the love for pets and the sympathy for the downtrodden as bourgeois

And China's average meat consumption being lower could just be a reflection of their GDP per capita being lower. I don't know where you got the 14% vegetarian number; I can find 5% online, about the same as US numbers.

O O4-25

I don't think the Trump admin has the capacity to meaningfully take over an AGI project. Whatever happens, I think the lab leadership will be calling the shots.

1Tachikoma
The heads of AI labs are functionally cowards who would fold at the first knock on their door by state agents. Some, like Altman and Zuckerberg, have preemptively bent the knee to get into the good graces of the Trump admin to accelerate their progress. While Trump himself might be out of the loop, his administration is staffed by people who know what AGI means and are looking for any sources of power to pursue their agenda.
O O31

Looked up a poll from 2023. Though maybe that poll is biased by people not voicing their true opinions?

O O6-9

Chinese culture is just less sympathetic in general. China practically has no concept of philanthropy or animal welfare. They are also pretty explicitly ethnonationalist. You don't hear about these things because the Chinese government has banned dissent and walled off its inhabitants.

However, I think the Hong Kong reunification is going better than I'd expect given the 2019 protests. You'd expect mass social upheaval, but people are just either satisfied or moderately dissatisfied.

S M1611

Claiming China has no concept of animal welfare is quite extraordinary. This is wrong both in theory and in practice. In theory, Buddhism has always ascribed sentience to animals, long before that was popular in the West. In practice, 14% of the Chinese population is vegetarian (vs. 4.2% in the US), and China's average meat consumption is also lower.

5Jayson_Virissimo
How are you in a position to know this?
O O2-3

quantum computing, nuclear fusion

O O90

I recently interviewed with them, and one of them said they’re hiring a lot of SWEs as they shift to product. Also many of my friends are currently interviewing with them.

O O42

I mean, some hard evidence is them currently hiring a lot of software engineers for random product-y things. If AGI were close, wouldn't they go all in on research and training?

3Thane Ruthenis
Interesting. Source? Last I heard, they're not hiring anyone because they expect SWE to be automated soon.
7Thane Ruthenis
Orrr he's telling comforting lies to tread the fine line between billion-dollar hype and nationalization-worthy panic. Could realistically be either, but it's probably the comforting-lies thing. Whatever the ground-truth reality may be, the AGI labs are not bearish.
O O2-2

Yes, the likely outcome of a long tariff regime is that China replaces the U.S. as the hegemon + AI race leader, and they can't read LessWrong or EA blogs there, so all this work is useless.

5robo
LessWrong is uncensored in China.
2Mateusz Bagiński
VPNs exist and are probably widely used in China + much of "all this work" is on ArXiv etc.
O O32

These tariffs may get rid of the compute disadvantage China faces (i.e., Taiwan starts to ignore export controls). We might see China being comfortably ahead in a year or two, assuming we don't see Congress take drastic action to eliminate the president's tariffing powers.

2Nathan Helm-Burger
Some ask, "what should the US gov have done instead?" Here's an answer I like to that question, from max_paperclips: https://x.com/max_paperclips/status/1909085803978035357 https://x.com/max_paperclips/status/1907946171290775844
O O30

The costs of capex go way up. It costs a lot more to build datacenters. It will cost a lot more to buy GPUs. It might cost less to buy energy? Lenders will be in poorer shape. AI companies will lose funding. I think it's already quite tenuous, given how little moat AI companies have. Costs are exploding, and the returns to pretraining scaling seem too diminished to be worth it. It's also not clear how AI labs will solve the reliability issue (at least to investors).

I also expect Taiwan to start ignoring export controls if our obscenely high tariffs on them remain.

O O30

The U.S. tariffs, if kept in place, will very likely cede the AI race to China. Has there been any writing on what a China-led race looks like?

1robo
Could you spell this out?  I don't see how AI has much to do with trade.  Is the idea that AI development is bounded on the cost of GPUs, and this will raise the cost of outside-China GPUs compared to inside-China GPUs?  Or is it that there will be less VC money e.g. because interest rates go up to combat inflation?
O O10

The point is the money or food just won’t get to them. How do you send food to a region in a civil war between 2 dictators? 

O O10

A lot of them are trapped in corrupt systems that are very costly and have ethics concerns blocking change. We have the money to feed them, but it would take far more money to turn a bunch of African countries into stable democracies. Overthrowing dictatorships might also raise ethics concerns about colonialism.

The easiest solution would just be lots of immigration, but host populations reject that because of our evolutionary peculiarities.

1otto.barten
I agree that changing systems is difficult. But providing basic means isn't, really. I personally think we should feed starving people even if they live in a dictatorship.
O O10

Isn't it a distribution problem? World hunger has almost disappeared, however. (The issue is that hungrier nations have more kids, so progress is somewhat hidden.)

1otto.barten
Wikipedia: in 2023, there were 733 million people suffering from hunger. That's 9% of the population. Most of these people just don't have the money to buy food. That's a 'distribution problem', for money, in the sense that we don't give it to them. Also, world hunger is actually rising again. Some more data: https://www.linkedin.com/posts/ottobarten_about-700-million-people-in-the-world-cannot-activity-7266965529762873344-rvqK We could easily solve this if we wanted to, but apparently we don't. That's one data point for why I fear intent-aligned superintelligence.
O O*1-1

In a long-timelines AGI world, isn't it very plausible that it gets developed in China, and thus basically all efforts to shape its creation are pointless, since LessWrong and associated efforts don't have much influence in China?

O O40

I've always wondered, why didn't superpowers apply MAIM to nuclear capabilities in the past?

> Speculative but increasingly plausible, confidentiality-preserving AI verifiers

Such as?

O O10

> I mean, I don't want to give Big Labs any ideas, but I suspect the reasoning above implies that the o1/DeepSeek-style RL procedures might work a lot better if they can think internally for a long time.

I expect GPT-5 to implement this, based on recent research and how they phrase it.

2Cole Wyeth
Yes, this is the type of idea big labs will definitely already have (also what I think ~100% of the time someone says "I don't have to give big labs any ideas").
O O10

I thought OpenAI’s deep research uses the full o3? 

O O20

Why are current LLMs, reasoning models, and whatever else still horribly unreliable? I can ask the current best models (o3, Claude, deep research, etc.) to do a task, like generating lots of code for me using a specific pattern or making a chart with company valuations, and they'll get it mostly wrong.

Is this just a result of labs hill-climbing a bunch of impressive-sounding benchmarks? Unless there's progress on reliability that I just can't perceive, I think this should delay timelines a bit.

2cubefox
So far o3 isn't released yet, so it might be able to do it.
O O61

> SWEs won't necessarily be fired even after becoming useless

I'm actually surprised at how eager/willing big tech is to fire SWEs once they're sure they won't be economically valuable. I think a lot of priors for them being stable come from the ZIRP era. Now, these companies have quite frequent layoffs, silent layoffs, and performance firings. Companies becoming leaner will be a good litmus test for a lot of these claims.

O O10

https://x.com/arankomatsuzaki/status/1889522974467957033?s=46&t=9y15MIfip4QAOskUiIhvgA

o3 gets IOI Gold. Either we are in a fast takeoff or the "gold" standard benchmarks are a lot less useful than imagined.

4Vladimir_Nesov
The tweet links to the 3 Feb 2025 OpenAI paper that discusses specialized o1-ioi system based on o1 that competed live during IOI 2024, and compares its performance to later results with o3. I think the most it says about the nature of the distinction between o1 and o3 is this (referring to results of o3): This suggests that o3 is based on the same base model, or even a shared RL checkpoint, but still ambiguously. So doesn't clearly rule out that o3 starts with a different base model and then also does more RL training than o1 did. On the other hand, there's this: The cut-off for o3 is in 2023, which is consistent with GPT-4o or GPT-4 Turbo, and for any other base model this probably also puts start of pretraining to early 2024 at the latest.
O O-20

I feel like a lot of Manifold is virtue signaling.

9momom2
There are three traders on this market; it means nothing at the moment. No need for virtue signalling to explain a result you might perceive as abnormal, it's just not formed yet.
O O10

Just curious: how do you square the rise in AI stocks taking so long? Many people here thought it was obvious as early as 2022 and made a ton of money.

2ozziegooen
I'm somewhere between the stock market and the rationalist/EA community on this.  I'm hesitant to accept a claim like "rationalists are far better at the stock market than other top traders". I agree that the general guess "AI will do well" generally was more correct than the market, but it was just one call (in which case luck is a major factor), and there were a lot of other calls made there that aren't tracked.  I think we can point to many people who did make money, but I'm not sure how much this community made on average. 
O O10

Keep in mind the current administration is replacing incompetent bureaucracies with self-assembling corporations. The organization is still there, just more competent and under a different name. A government project could just look like telling the labs to create one data center, throwing money at them, and cutting red tape for building gas plants.

O O62

Seems increasingly likely to me that there will be some kind of national AI project before AGI is achieved as the government is waking up to its potential pretty fast. Unsure what the odds are, but last year, I would have said <10%. Now I think it's between 30% and 60%.

Has anyone done a write-up on what the government-led AI project(s) scenario would look like?

2Mitchell_Porter
You first might want to distinguish between national AI projects that are just about boosting the AI economy or managing the use of AI within government, and government-backed research which is specifically aimed at the AGI frontier. Presumably it's the latter that you're talking about.

There is also the question of what a government would think it was doing, in embarking on such a project. The commercial enterprise of creating AI is already haunted by the idea that it would be bad for business if your creation wiped out the human race. That hasn't stopped anyone, but the fear is there, overcome only by greed.

Now, what about politicians and public servants, generals and spymasters? How would they feel about leading a race to create AI? What would they think they were doing? Creating artificial super-scientists, super-soldiers, super-strategists? Compared to Silicon Valley, these people are more about the power motive than the profit motive. What, apart from the arms race, do they have to lure them along the AI path, comparable to the dream of uber-wealth that drives the tech oligarchs? (In dictatorships, I suppose there is also the dream of absolute personal power to motivate them.)

Apart from the arms race, the vision that seems to animate pro-AI western elites is economic and strategic competition among nations. If China takes the lead in AI, it will have the best products and the best technologies and it will conquer the world that way. So I guess the thinking of Trump 2.0's AI czar David Sacks (a friend of Thiel and Musk), and the people around him, is going to be some mixture of these themes - the US must lead because AI is the key to economic, technological, and military superiority in the 21st century.

Now I think that even the most self-confident, gung-ho, born-to-rule man-of-destiny who gets involved in the AI race, is surely going to have a moment when they think, am I just creating my own replacement here? Can even my intellect, and my charisma,
4Noosphere89
My guess is that this is reasonably plausible assuming the short timelines are in fact going to happen, but it's going to be up against a backdrop of a shock to government competence, such that the people who could do a national project completely fail to even get started, let alone complete a herculean effort, since all the possible choices for the role are selected based on loyalty, not competence. I expect the new administration to damage the competence of the presidency and agencies by extreme amounts, such that I wouldn't be totally surprised if by the end of the administration there were a complete inability to have a national AI project or nationalize the business at all.
5Seth Herd
I agree that expecting nobody in power to notice the potential before AGI is takeover-capable seems implausible on the slow-takeoff path that now looks likely. It seems like the incoming administration is pretty free-market oriented. So I'd expect government involvement to mostly be giving the existing orgs money, and just taking over control of their projects as much as seems necessary - or fun.
O O10

It might just be a perception problem. LLMs don't really seem to have a good understanding of a letter being next to another one yet, or of what a diagonal is. If you look at ARC-AGI with o3, you see it doing worse as the grid gets larger, while humans don't have the same drawback.

EDIT: Tried on o1 pro right now. Doesn't seem like a perception problem, but it still could be. I wonder if it's related to being a successful agent. It might not model a sequence of actions on the state of the world properly yet. It's strange that this isn't unlocked with reasoning.
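
One concrete version of the perception hypothesis, as a toy sketch (assuming row-by-row serialization, the usual way grids reach an LLM): vertically adjacent cells land far apart in the input, and the gap grows with grid width.

```python
def serialize(grid: list[list[int]]) -> str:
    """Row-by-row text encoding of an ARC-style grid."""
    return "\n".join("".join(str(c) for c in row) for row in grid)

for width in (5, 10, 30):
    grid = [[0] * width for _ in range(width)]
    text = serialize(grid)
    # Distance in characters between a cell and the cell directly below it:
    vertical_gap = text.index("\n") + 1  # one row of cells plus the newline
    print(f"{width}x{width}: vertically adjacent cells are "
          f"{vertical_gap} characters apart in the serialized input")
```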

O O10

Ah so there could actually be a large compute overhang as it stands?

7Vladimir_Nesov
There is a Feb 2024 paper that predicts high compute multipliers from using more finer-grained experts in MoE models, optimally about 64 experts activated per token at 1e24-1e25 FLOPs, whereas MoE models with known architecture usually have 2 experts activated per token. DeepSeek-V3 has 8 routed experts activated per token, a step in that direction. On the other hand, things like this should've already been tested at the leading labs, so the chances that it's a new idea being brought to attention there seem slim. Runners-up like xAI and Meta might find this more useful, if that's indeed the reason, rather than extremely well-done post-training or even pretraining dataset construction.
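
For concreteness, here is a minimal top-k MoE routing sketch in PyTorch. The sizes are illustrative, not DeepSeek-V3's real dimensions; the point is that `k`, the number of experts activated per token, is the knob being discussed:

```python
import torch

def moe_layer(x, experts, router, k=8):
    """x: [tokens, d]. Route each token to its top-k experts and mix outputs."""
    probs = router(x).softmax(dim=-1)                  # [tokens, n_experts]
    weights, idx = torch.topk(probs, k, dim=-1)        # k experts per token
    weights = weights / weights.sum(-1, keepdim=True)  # renormalize over top-k
    out = torch.zeros_like(x)
    for slot in range(k):
        for e, expert in enumerate(experts):
            mask = idx[:, slot] == e                   # tokens using expert e in this slot
            if mask.any():
                out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
    return out

d, n_experts = 64, 256  # many small experts = "finer-grained" in the paper's sense
experts = torch.nn.ModuleList(
    torch.nn.Sequential(torch.nn.Linear(d, d), torch.nn.ReLU(), torch.nn.Linear(d, d))
    for _ in range(n_experts))
router = torch.nn.Linear(d, n_experts)
y = moe_layer(torch.randn(10, d), experts, router, k=8)  # k=8, as in DeepSeek-V3
```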
O O50

Does DeepSeek-V3 imply current models are not trained as efficiently as they could be? It seems they used a very small fraction of previous models' resources, and it is only slightly worse than the best LLMs.

9Vladimir_Nesov
Its pretraining recipe is now public, so it could get reproduced with much more compute soon. It might also suggest that scaling of pretraining has already plateaued, that leading labs have architectures that are at least as good as DeepSeek-V3, pump 20-60 times more compute into them, and get something only marginally better.
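
A back-of-envelope illustration of that last point, assuming a Chinchilla-style power law for loss versus compute (the constants below are made up for illustration, not fitted values):

```python
# L(C) = L_inf + a * C**(-b): irreducible loss plus a slowly decaying term.
L_inf, a, b = 1.7, 70.0, 0.05   # placeholder constants, not fitted values

def loss(compute_flops: float) -> float:
    return L_inf + a * compute_flops ** -b

base = 1e24
for mult in (1, 20, 60):
    print(f"{mult:>2}x compute -> loss {loss(mult * base):.3f}")
# With a small exponent b, even a 60x compute increase moves the loss only
# modestly, which is one way "20-60x more compute, marginally better" happens.
```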
O O30

They did this on the far easier training set though?

An alternative story is they trained until a model was found that could beat not just the training set but many other benchmarks too, implying that there may be some general intelligence factor there. Maybe this is still Goodharting on benchmarks, but there's probably truly something there.

3Thane Ruthenis
There are degrees of Goodharting. It's not Goodharting to ARC-AGI specifically, but it is optimizing for performance on the array of easily-checkable benchmarks. Which plausibly have some common factor between them to which you could "Goodhart"; i. e., a way to get good at them without actually training generality.
O O43

No, I believe there is a human in the loop for the above if that’s not clear.

You’ve said it in another comment. But this is probably an “architecture search”.

I guess the training loop for o3 is similar but it would be on the easier training set instead of the far harder test set.

2Knight Lee
Wow it does say the test set problems are harder than the training set problems. I didn't expect that. But it's not an enormous difference: the example model that got 53% on the public training set got 38% on the public test set. It got only 24% on the private test set, even though it's supposed to be equally hard, maybe because "trial and error" fitted the model to the public test set as well as the public training set. The other example model got 32%, 30%, and 22%.
O O41

I think there is a third explanation here. The Kaggle model (probably) does well because you can brute-force it with a bag of heuristics and gradually iterate by discarding ones that don't work and keeping the ones that do.
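
A minimal sketch of that iterate-and-discard loop; everything here is a hypothetical stand-in (a real Kaggle pipeline proposes hand-coded grid transformations, not random task subsets):

```python
import random

def propose_heuristic(rng):
    # Placeholder: a real pipeline proposes a hand-written rule (flip, recolor,
    # tile, ...) with random parameters; here a heuristic is just modeled as
    # the set of task indices it happens to solve.
    return frozenset(i for i in range(100) if rng.random() < 0.05)

def score(bag):
    # Fraction of the 100 public tasks solved by trying every kept heuristic.
    return len(set().union(*bag)) / 100 if bag else 0.0

def greedy_bag(iters=500, seed=0):
    rng = random.Random(seed)
    kept, best = [], 0.0
    for _ in range(iters):
        cand = propose_heuristic(rng)
        if score(kept + [cand]) > best:   # keep only heuristics that help...
            kept.append(cand)
            best = score(kept)
        # ...and silently drop the rest. Iterating this against the public
        # set is exactly how the approach can overfit to it.
    return best

print(f"public-set score after greedy selection: {greedy_bag():.0%}")
```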

3Thane Ruthenis
Do you not consider that ultimately isomorphic to what o3 does?
1Knight Lee
I agree. I think the Kaggle models have more advantages than o3. They have far more human design and fine-tuning than o3. One can almost argue that some Kaggle models are very slightly trained on the test set, in the sense that the humans making them learn from test-set results and empirically discover what improves them. o3's defeating the Kaggle models is very impressive, but o3's results shouldn't be directly compared against other untuned models.
O O31

> I have ~15% probability humanity will invent artificial superintelligence (ASI) by 2030.

The recent announcement of the o3 model has updated me to 95%, with most of the remaining 5% being regulatory slowdowns involving unprecedented global cooperation.

5samuelshadrach
Sorry if this is rude, but your comment doesn’t engage with any of the arguments in the post, or make arguments in favour of your own position. If you’re just stating your view without proof then sure, that works. 
O O-10

Do you have a link to these?

O O4-1

I think a lot of this is wishful thinking from safetyists who want AI development to stop. This may be reductionist, but almost every pause historically can be explained by economics.

Nuclear - war usage is wholly owned by the state and developed to its saturation point (i.e. once you have nukes that can kill all your enemies, there is little reason to develop them more). Energy-wise, supposedly, it was hamstrung by regulation, but in countries like China, where development went unfettered, it is still not dominant. This tells me a lot it not being deve... (read more)

O O-4-5

We still somehow got the steam engine, electricity, cars, etc.  

There is an element of international competition to it. If we slack here, China will probably raise armies of robots with unlimited firepower and take over the world. (They constantly show aggression.)

The longshoremen's strike is only allowed (I think) because the west coast did automate and is somehow less efficient than the east coast, for example.

4Thane Ruthenis
Counterpoints: nuclear power, pharmaceuticals, bioengineering, urban development. Or maybe they will accidentally ban AI too due to being a dysfunctional autocracy, as autocracies are wont to do, all the while remaining just as clueless regarding what's happening as their US counterparts banning AI to protect the jobs. I don't really expect that to happen, but survival-without-dignity scenarios do seem salient.
O O80

Oh I guess I was assuming automation of coding would result in a step change in research in every other domain. I know that coding is actually one of the biggest blockers in much of AI research and automation in general.  

It might soon become cost-effective to write bespoke solutions for a lot of labor jobs, for example.

O O40

Why would that be the likely case? Are you sure it's likely or are you just catastrophizing?

O O64

> catastrophic job loss that destroys the global economy?

I expect the US or Chinese government to take control of these systems sooner rather than later to maintain sovereignty. I also expect there will be some force to counteract the rapid nominal deflation that would happen if there were mass job loss. Every ultra-rich person now relies on billions of people buying their products to give their companies the valuation they have.

I don't think people want nominal deflation even if it's real economic growth. This will result in massive printing from the Fed that probably lands in people's pockets (like COVID checks).

4Seth Herd
I also expect government control; see If we solve alignment, do we die anyway? for musings about the risks thereof. But it is a possible partial solution to job loss. It's a lot tougher to pass a law saying "no one can make this promising new technology even though it will vastly increase economic productivity" than to just show up to one company and say "heeeey so we couldn't help but notice you guys are building something that will utterly shift the balance of power in the world.... can we just, you know, sit in and hear what you're doing with it and maybe kibbitz a bit?" Then nationalize it officially if and when that seems necessary.
6Noosphere89
I think this is reasonably likely, but not a guaranteed outcome, and I do think there's a non-trivial chance that the US regulates it way too late to matter, because I expect mass job loss to be one of the last things AI does, due to pretty severe reliability issues with current AI.
O O912

While I'm not surprised by the pessimism here, I am surprised at how much of it is focused on personal job loss. I thought there would be more existential dread. 

Existential dread doesn't necessarily follow from this specific development if training only works around verifiable tasks and not for everything else, like with chess. Could soon be game-changing for coding and other forms of engineering, without full automation even there and without applying to a lot of other things.

O O40

It’s better at questions but subjectively there doesn’t feel like there’s much transfer. It still gets some basic questions wrong.

O O1-4

O1’s release has made me think Yann Lecun’s AGI timelines are probably more correct than shorter ones

2Stephen McAleese
Why? o1 is much more capable than GPT-4o at math, programming, and science.
O O3-14

Why is the built-in assumption for almost every single post on this site that alignment is impossible and we need a 100 year international ban to survive? This does not seem particularly intellectually honest to me. It is very possible no international agreement is needed. Alignment may turn out to be quite tractable.

7Seth Herd
It's not every post, but there are still a lot of people who think that alignment is very hard. The more common assumption is that we should assume that alignment isn't trivial, because an intellectually honest assessment of the range of opinions suggests that we collectively do not yet know how hard alignment will be.

Yudkowsky has a pinned tweet that states the problem quite well: it's not so much that alignment is necessarily infinitely difficult, but that it certainly doesn't seem anywhere as easy as advancing capabilities, and that's a problem when what matters is whether the first powerful AI is aligned:

> Safely aligning a powerful AI will be said to be 'difficult' if that work takes two years longer or 50% more serial time, whichever is less, compared to the work of building a powerful AI without trying to safely align it.

A mere 5% chance that the plane will crash during your flight is consistent with considering this extremely concerning and doing anything in your power to avoid getting on it. "Alignment is impossible" is not necessary for great concern, isn't implied by great concern.

O O10

I guess in the real world the rules aren't harder per se, but just less clear and not written down. I think both the rules and tools needed to solve contest math questions at least feel harder than the vast majority of rules and tools human minds deal with. Someone like Terence Tao, who is a master of these, excelled in every subject when he was a kid (IIRC).

I think LLMs have a pretty good model of human behavior, so for anything related to human judgement, in theory this isn’t why it’s not doing well.

And where rules are unwritten/unknown (say biology), ar... (read more)

2Vladimir_Nesov
The facts are in there, but not in the form of a sufficiently good reward model that can tell as well as human experts which answer is better or whether a step of an argument is valid. In the same way, RLHF is still better with humans on some queries, hasn't been fully automated to superior results by replacing humans with models in all cases.
O O5-2

o1 probably scales to superhuman reasoning:

o1, given maximal compute, solves most AIME questions (one of the hardest benchmarks in existence). If this isn't gamed by having the solution somewhere in the corpus, then:

-you can make the base model more efficient at thinking

-you can implement the base model more efficiently on hardware

-you can simply wait for hardware to get better

-you can create custom inference chips

Anything wrong with this view? I think agents are unlocked shortly along with or after this too.
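
As a sketch of what "given maximal compute" plausibly means operationally: sample many reasoning chains and majority-vote the final answer (self-consistency), so accuracy climbs with inference compute. `solve_once` is a hypothetical stand-in for one stochastic o1 attempt:

```python
from collections import Counter
import random

def solve_once(problem: str, rng) -> str:
    # Placeholder: one sampled chain-of-thought attempt at an AIME problem;
    # the answer distribution here is made up for illustration.
    return rng.choice(["042", "042", "042", "017", "123"])

def solve_with_budget(problem: str, samples: int, seed=0) -> str:
    rng = random.Random(seed)
    votes = Counter(solve_once(problem, rng) for _ in range(samples))
    return votes.most_common(1)[0][0]  # modal answer across all samples

# More samples -> more compute -> higher chance the modal answer is right,
# which is the sense in which accuracy scales with inference compute.
print(solve_with_budget("AIME problem text here", samples=64))
```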

5Carl Feynman
I was all set to disagree with this when I reread it more carefully and noticed it said “superhuman reasoning” and not “superintelligence”.  Your definition of “reasoning” can make this obviously true or probably false.  
4Vladimir_Nesov
A reasoning model depends on starting from a sufficient base model that captures the relevant considerations. Solving AIME is like winning at chess, except the rules of chess are trivial, and the rules of AIME are much harder. But the rules of AIME are still not that hard, it's using them to win that is hard. In the real world, the rules get much harder than that, so it's unclear how far o1 can go if the base model doesn't get sufficiently better (at knowing the rules), and it's unclear how much better it needs to get. Plausibly it needs to get so good that o1-like post-training won't be needed for it to pursue long chains of reasoning on its own, as an emergent capability. (This includes the possibility that RL is still necessary in some other way, as an engine of optimization to get better at rules of the real world, that is to get better reward models.)
2ChristianKl
While that tweet says good things about his relationship with truth, his defense of talking about immigrants eating cats and dogs, because constituents told him, without checking whether or not that's true, was awful. Maybe he felt like he needed to do it because of political pressure and felt dirty doing it, but it was still awful by rationalist standards. I think there's a good chance that JD Vance is better than the average US politician, but there's no good reason to see him as a rationalist.
8DirectedEvolution
The rationalist movement is associated with LessWrong and the idea of “training rationality.” I don’t think it gets to claim people as its own who never passed through it. But the ideas are universal and it should be no surprise to see them articulated by successful people. That’s who rationalists borrowed them from in the first place.
-1lesswronguser123
Well, that tweet can easily be interpreted as overconfidence for their own side. I don't know whether Vance would continue being more of a rationalist and analyse his own side evenly.