I've been following the company Imbue and their podcast, Generally Intelligent, since they started. They've said thoughtful and creative things on the podcast, and I think they're making impressive progress toward AGI considering their relatively small size.
Just wanting to keep people apprised. If they did hit on something unusually potent, would they get acquired by a larger actor? Hard to know.
What is it that they're releasing? In addition to their 70B param model...
11 sanitized and extended NLP reasoning benchmarks including ARC, GSM8K, HellaSwag, and Social IQa
An original code-focused reasoning benchmark
A new dataset of 450,000 human judgments about ambiguity in NLP questions
A hyperparameter optimizer for scaling small experiments to a 70B run
Infrastructure scripts for bringing a cluster from bare metal to robust high-utilization training
An interesting quote about their hyperparameter optimizer:
It is possible to run resource-efficient pre-training experiments that can effectively scale to a large model. Using CARBS, we could reliably predict the performance of any model with a given number of parameters according to well-defined scaling laws, lowering the barrier to entry to building large models.
I think this quote is also relevant to a discussion in the comment section of a different post.
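To make the scaling-law idea in that quote concrete, here's a minimal sketch in Python of the general approach: fit a power-law loss curve to results from small, cheap training runs, then extrapolate to a much larger parameter count before committing the compute. This is emphatically not CARBS itself, and the data points, functional form, and initial guesses are all made up for illustration.

import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n_params, a, b, c):
    """Hypothetical power-law loss curve: loss = a * N^(-b) + c,
    where c is an assumed irreducible loss floor."""
    return a * np.power(n_params, -b) + c

# Illustrative (made-up) results from small-scale experiments:
# (parameter count, final validation loss)
n = np.array([125e6, 350e6, 1.3e9, 2.7e9, 6.7e9])
loss = np.array([3.30, 3.05, 2.75, 2.62, 2.48])

# Fit the three free parameters of the curve to the small-run data.
(a, b, c), _ = curve_fit(scaling_law, n, loss, p0=[10.0, 0.1, 1.5], maxfev=10000)

# Extrapolate to a 70B-parameter run before spending the compute on it.
print(f"Predicted loss at 70B params: {scaling_law(70e9, a, b, c):.3f}")

As I understand it, CARBS itself is a cost-aware Bayesian hyperparameter optimizer, so it does considerably more than a single curve fit; the sketch only shows the extrapolation step the quote emphasizes.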