I can confirm that this is pretty much the best introduction for taking you from 0 to about 80% in using AI.
It is intended for general users, so don't expect technical information on how to use APIs or build apps.
TLDR my reaction is I don’t really know how good these models are right now.
I felt exactly the same after the Claude 3.7 post.
But actually, hasn't LiveBench solved the evals crisis?
It specifically targets the “subjective” and “cheating/hacking” problems.
It also covers a pretty broad set of capabilities.
The number of different benchmarks and metrics we are using to understand each new model is crazy. I'm so confused. The exec summary helps, but...
I don't think the relative difference between models is big enough to justify switching from the one you're currently used to.
Does this mean that Zvi doesn't read the comments on LW?
He seems to be much more active on Substack.
So, the most important things I've learned for myself are:
1. Sam was fired because of his sneaky attempts to get rid of some board members.
2. Sam didn't answer the question of why so many high-ranking people have left the company recently.
3. Sam missed the fact that, for some people, the safety focus was a major factor in deciding to join early on.
There seems to be enough evidence that he doesn't care about safety.
And he actively uses dark methods to accumulate power.
We’re not even preparing reasonably for the mundane things that current AIs can do, in either the sense of preparing for risks, or in the sense of taking advantage of its opportunities. And almost no one is giving much serious thought to what the world full of AIs will actually look like and what version of it would be good for humans, despite us knowing such a world is likely headed our way.
Is there any good post on what to do? Preferably one aimed at a casual person who just uses ChatGPT 1-2 times a month.
The investments in data centers are going big. Microsoft will spend $80 billion in fiscal 2025, versus $64.5 billion on capex in the last year. Amazon is spending $65 billion, Google $49 billion and Meta $31 billion.
About 5 years ago, when Elon promised a $1B investment in OpenAI, it seemed like an unusual leap of faith. And now just 4 top corporations are casually committing over $200B to AI infrastructure. The pace is already crazy.
This is potentially the most powerful technology humanity has ever created. And what's even more interesting is the absence of governments. They were the only entities comfortable with this kind of money. And it feels like they're completely asleep.
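For what it's worth, the "over $200B" is just the sum of the four figures quoted above. A quick sanity check, assuming those numbers are accurate and roughly comparable (the quote mixes Microsoft's fiscal 2025 with the other companies' figures):

```python
# Capex figures as quoted above, in billions of USD.
# Assumption: the four numbers are comparable enough to add up.
capex_billion = {
    "Microsoft": 80,
    "Amazon": 65,
    "Google": 49,
    "Meta": 31,
}

total = sum(capex_billion.values())
print(f"Combined: ${total}B")

# Compare with the $1B pledge mentioned above.
pledge_billion = 1
print(f"That is {total / pledge_billion:.0f}x the $1B pledge")
```

So the combined figure comes to $225B, which is where "over $200B" comes from.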
I think I'm confused here.
Is it fair to say that o3 does math and coding better than the average SWE?
If this is true, then I really don't understand why it hasn't made all the headlines.
Any explanation?
Greg Brockman to Elon Musk, (cc: Sam Altman) - Nov 22, 2015 6:11 PM
In response to this follow-up, Elon first says that $100M is not enough and that he is encouraging OpenAI to raise more money on their own, and he promises to increase the amount they can raise to $1B.
I found this on the OpenAI blog: https://openai.com/index/openai-elon-musk/
There are a couple of other messages there, with the vibe that the OpenAI team felt betrayed by Elon.
We're sad that it's come to this with someone whom we’ve deeply admired—someone who inspired us to aim higher, then told us we would fail, started a competitor, and then sued us when we started making meaningful progress towards OpenAI’s mission without him.
@habryka can you pls check the link? I think these messages could have added more context. Not sure why they weren't also included in the original source, though.
I'm surprised to see no discussion here or on Substack.
This is a well-structured article with accurate citations, clearly explained reasoning, and a peer review, and it updates the best AGI timeline model.
I'm really confused.
I haven't checked the logic deeply enough to say whether the update is reasonable (that's exactly the kind of conversation I was expecting in the comments). But I agree that Davidson's model was previously the best estimate we had, and it's cool to see that this updated version explains why Dario/Sama are so confident.
Overall, this is excellent work, and I'm genuinely puzzled as to why it has received 10x fewer upvotes than the recent fictional 2y takeover scenario.