But maybe the point at which you get into AI safety is less clear once you are on LW. And FO is something that can be named more clearly. So it could all just be the availability heuristic.

Elizabeth's Shortform

Gunnar_Zarncke3d40

Why do you think that is the case? I mean a) why do they reveal that only after a few drinks and b) why was that the convincing story - and not HPMoR?

Gunnar_Zarncke's Shortform

Gunnar_Zarncke4d40

[Linkpost] China's AI OVERPRODUCTION

Claim by Balaji:

China seeks to commoditize their complements. So, over the following months, I expect a complete blitz of Chinese open-source AI models for everything from computer vision to robotics to image generation.

If true, what effects would that have on the AI race and AI governance?

Gunnar_Zarncke's Shortform

Gunnar_Zarncke5d40

Yes! That's the right intuition. And the LLMs are doing the same - but we don't know their world model, and thus, the direction of the simplification can be arbitrarily off.

Drilling down on the simplifications, as suggested by Villiam, might help.

Gunnar_Zarncke's Shortform

Gunnar_Zarncke5d20

This is an interesting UI proposal and, if done right, might provide the needed transparency. Most people wouldn't read it, but some would, esp. for critical answers.

How far along Metr's law can AI start automating or helping with alignment research?

Gunnar_Zarncke5d20

Yes, but it didn't mean that AIs could do all kinds of long tasks in 2005. And that is the conclusion many people seem to draw from the METR paper.

Gunnar_Zarncke's Shortform

Gunnar_Zarncke5d20

As we use the term, yes. But the point (and I should have made that clearer) is that any mismodeling by the parent of the child's interests and future environment will not be visible to the child, or even to someone reading the thoughts of the well-meaning parent. Many parents want the best for their child but model the child's future wrongly (mostly through status quo bias; the problem is different for AI).

How far along Metr's law can AI start automating or helping with alignment research?

Gunnar_Zarncke5d20

It is a decent metric for chess, but a) it doesn't generalize to other tasks (as people seem to interpret the METR paper), and, less importantly, b) I'm quite confident that people wouldn't beat the chess engines by thinking for years.

How far along Metr's law can AI start automating or helping with alignment research?

Gunnar_Zarncke6d20

No? It means you can't beat the chess engine.

And even if it did, they try to argue in the other direction: if it takes the human time X at time T, it will take the AI duration L. That didn't work for chess either.

How far along Metr's law can AI start automating or helping with alignment research?

Gunnar_Zarncke6d20

That we would have AIs performing year-long tasks in 2005. Chess is not the same as software engineering, but it is still a limited domain.
