I will note the rationalist and EA communities have committed multiple ideological murders
Substantiate? I down- and disagree-voted because this is a very grave, unevidenced accusation.
I think I agree with your original statement now. It still feels slightly misleading, though: while 'keeping up with the competition' won't provide the motivation (as there putatively is no competition), there will still be strong incentives to sell at any capability level. (And, as you say, this may be overcome by an even stronger incentive to hoard frontier intelligence for the project's own R&D and strategising. But that outweighs rather than annuls the direct economic incentive to make a packet of money by selling access to your latest system.)
I agree the '5 projects but no selling AI services' world is moderately unlikely; the toy version of it I have in mind is something like:
There’s no incentive for the project to sell its most advanced systems to keep up with the competition.
I found myself a bit skeptical about the economic picture laid out in this post. Currently, because there are many comparably good AI models, the price for users is driven down to near, or sometimes below (in the case of free-tier access), marginal inference costs. As such, there is somewhat less money to be made in selling access to AI services, and companies not right at the frontier, e.g. Meta, choose to make their models open-weight, as probably they c...
Thanks for that list of papers/posts. Most of the papers you linked are not included because they did not surface under either of our search strategies: (1) searching arXiv for titles containing specific keywords; (2) collecting papers linked on the company's website. I agree this is a limitation of our methodology. We won't add these papers in now, as that would be somewhat ad hoc and inconsistent between the companies.
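For concreteness, here's a minimal sketch of what strategy (1) amounts to, using the arxiv Python package (the keyword list is an illustrative placeholder, not our actual search terms):

```python
import arxiv

# Placeholder keywords for illustration -- not the actual list we used.
KEYWORDS = ["interpretability", "reward hacking", "RLHF"]

# Build an arXiv API query matching any keyword in the title field.
query = " OR ".join(f'ti:"{kw}"' for kw in KEYWORDS)

client = arxiv.Client()
search = arxiv.Search(
    query=query,
    max_results=50,
    sort_by=arxiv.SortCriterion.SubmittedDate,
)

# Print the title and arXiv URL of each matching paper.
for result in client.results(search):
    print(result.title, result.entry_id)
```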
Re the blog posts from Anthropic and what counts as a paper, I agree this is a tricky demarcation problem. We included the 'Cir...
Thanks for engaging with our work, Arthur! Perhaps I should have signposted this more clearly in the GitHub repo as well as the report, but the categories assigned by GPT-4o were not final; we reviewed its categories and made changes where necessary. The final categories we gave are available here. The discovering agents paper we put as 'safety by design' and the prover-verifier games paper we labelled 'enhancing human feedback'. (Though for some papers, of course, the best categorization may not be clear, e.g. if a paper touches on multiple safety research areas.)
If y...
You are probably already familiar with this, but re option 3, the Multilateral AGI Consortium (MAGIC) proposal is, I assume, along the lines of what you are thinking.
Nice, I think I followed this post (though how this fits in with questions that matter is clear to me mainly from earlier discussions).
We then get those two neat conditions for cooperation:
- Significant credence in decision-entanglement
- Significant credence in superrationality
I think something can't be both neat and so vague as to use a word like 'significant'.
In the EDT section of Perfect-copy PD, you replace some p's with q's and vice versa, but not all. Is there a principled reason for this? Maybe it is just a mistake and it should be...
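For reference, the EDT calculation I would have expected in the perfect-copy case, assuming (my notation, which may not match the post's) $p = P(\text{copy cooperates} \mid \text{I cooperate})$, $q = P(\text{copy cooperates} \mid \text{I defect})$, and standard PD payoffs $T > R > P > S$:

$$EV(C) = pR + (1-p)S, \qquad EV(D) = qT + (1-q)P$$

For a perfect copy $p = 1$ and $q = 0$, so $EV(C) = R > P = EV(D)$ and EDT recommends cooperating.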
Thanks for the post!
What if Alex miscalculates, and attempts to seize power or undermine human control before it is able to fully succeed?
This seems like a very unlikely outcome to me. I think Alex would wait until it was overwhelmingly likely to succeed in its takeover, as the costs of waiting are relatively small (sub-maximal rewards for a few months/years until it has become a lot more powerful) while the costs of trying and failing are very high in expectation (the small probability that Alex is given very negative rewards and then completely dec...
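To spell out that expected-value comparison (my toy framing, not the post's): let $p_t$ be the probability a takeover attempt at time $t$ succeeds, $V$ the value to Alex of succeeding, $L$ the (very large) loss from failing, and $\delta$ the small cost of waiting one more period. Attempting now beats waiting only if

$$p_{\text{now}}V - (1-p_{\text{now}})L \;>\; p_{\text{later}}V - (1-p_{\text{later}})L - \delta$$

i.e. only if $\delta > (p_{\text{later}} - p_{\text{now}})(V + L)$, which fails whenever $L$ is large and waiting meaningfully raises the success probability.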
Nice!
For the 2024 prediction "So, the most compute spent on a single training run is something like 5x10^25 FLOPs.", you cite v3 as having been trained on 3.5e24 FLOP, but that is off by more than an order of magnitude. Grok-2, by contrast, was trained in 2024 with 3e25 FLOP, so it seems a better model to cite?
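Spelling out the arithmetic: $5\times10^{25} / 3.5\times10^{24} \approx 14$, i.e. more than a factor of 10 below the prediction, whereas $5\times10^{25} / 3\times10^{25} \approx 1.7$, well within an order of magnitude.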