"We'll be building a cluster of around 22,000 H100s. This is approximately three times more compute than what was used to train all of GPT4.
This bothers me. It's a naive way of seeing compute. It's like confusing watts and watt-hours.
22,000 H100s deliver three times the FLOP/s of what was used to train GPT-4, so you could train it in a third of the time, or with 1/3 of your cluster in the same time.
I think this way of looking at compute encourages naive assumptions about what this compute can be used for. And FLOP/s is not a perfect unit for normal discourse when we're at ×10¹⁵ scales.
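To make the watts vs. watt-hours point concrete, here is a rough sketch of the arithmetic. The throughput and duration below are made-up placeholders (not real GPT-4 or H100 figures), and the function names are just for illustration. The only thing it shows is that a rate (FLOP/s) has to be multiplied by a time to get an amount of compute (FLOP).

```python
def total_flop(rate_flop_per_s: float, seconds: float) -> float:
    """Total work in FLOP = throughput (FLOP/s) x wall-clock time (s)."""
    return rate_flop_per_s * seconds


def time_to_train(budget_flop: float, rate_flop_per_s: float) -> float:
    """Seconds needed to spend a fixed FLOP budget at a given throughput."""
    return budget_flop / rate_flop_per_s


# Hypothetical placeholder numbers, chosen only to make the ratios visible.
BASELINE_RATE = 1.0e18               # FLOP/s of the original training cluster (assumed)
BASELINE_SECONDS = 90 * 24 * 3600    # 90 days of training (assumed)

# The training run is a fixed amount of work (the "watt-hours").
budget = total_flop(BASELINE_RATE, BASELINE_SECONDS)

# The new cluster has 3x the rate (the "watts").
new_rate = 3 * BASELINE_RATE

# Option 1: whole new cluster, same job -> one third of the time.
full_cluster_seconds = time_to_train(budget, new_rate)

# Option 2: one third of the new cluster -> same time as before.
third_cluster_seconds = time_to_train(budget, new_rate / 3)

print(f"budget:             {budget:.3e} FLOP")
print(f"full new cluster:   {full_cluster_seconds / 86400:.1f} days")
print(f"1/3 of new cluster: {third_cluster_seconds / 86400:.1f} days")
```

So "three times more compute" only turns into three times more FLOP if you also say how long the cluster runs.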
If "ancestor" means parent/mother/grandparent etc. but nothing else: obviously non-hunters.
If we count how many people, dead or alive, you are related to: farmers.
Better than 90% of amateur fiction posted.
This gap will only widen over time; China is failing to develop a domestic semiconductor industry, despite massive efforts to do so, and is increasingly cut off from international semiconductor supply chains.
I would say this is a falsehood.
The US export ban on controlled GPUs has really made China push for local semiconductor manufacturing and accelerate its projects. They don't have 5nm TSMC-quality wafers, fine, but they're developing the full stack.
I mean, if this were "The AGI race between the US and Russia doesn't exist", okay, fine. But seeing how more than half the papers that land on arXiv have Chinese authors, plus the fact that China does 90% of the world's electronics manufacturing, I don't understand how you come to the conclusion that China is hopelessly dead in the water.
The day the US export ban on GPUs happened, okay, most of us really wondered, but seeing how they're operating 6 months to a year afterwards, it's just obvious that they will be able to make it happen.
I don't think the difficulty of the task has much to do with the outcome.
I mean, I take your comment at face value and update to "it's going to get powerful faster" and not the other way around.
A "moonshot idea" I saw brought up is getting Yudkowsky's Harry Potter fanfiction translated into Chinese (please never ever do this).
Can you expand on this? Why would it be a bad idea? I have interacted with mainland Chinese people (outside of China) and I'm not really making the connection.
How do you think discharge rates would affect the battery? Would it behave like an LFP cell, which basically outputs mostly the same amps across the full range of charge (voltage varies little with discharge %)?
Do you think an approach like this produces batteries with long lifetimes?
Did you expect balancing of these particular batteries to be particularly complicated?
Basically, tell us more!
Well, evals and that work OpenAI did on predicting loss could be a starting point for working on the tables.
But we don't really know; I guess that's the point EY is trying to make.
Use tables for concrete loads and compare them experimentally with the to-be-poured concrete; if a load is off, reject it.
We don't even have the tables for ML. Start making tables; don't build big bridges until you've got the fucking tables right.
Enforce bridge making no larger than the Yudkowsky Airstrike Threshold.
I don't think this is a good take.
The Cybertruck does not break on that pull. It breaks on this one: 0:27