Introduction
My goal is to put my expectations on record and to hear what others expect regarding the relative performance of Gemini vs. GPT-4.
My expectations
GPT-4 to Gemini will likely not be as big a jump in capabilities as GPT-3 to GPT-4 was.
Gemini could bring surprises by being more agentic than GPT-4, i.e. better at planning and longer-horizon tasks. But this is likely difficult to achieve; otherwise strong LLM agents would already be making a buzz.
Comparison
From GPT-3 to GPT-4
- Scaling Factor: x100 more compute than GPT-3.
- Optimization: Uses Chinchilla scaling laws (for MoE) rather than OpenAI/Kaplan scaling laws.
- MoE Over Dense: Utilizes Mixture of Experts (MoE) instead of dense layers.
- Data Quality: Likely higher-quality data, though I am not sure.
- Image Generation: Not publicly released, possibly due to subpar performance or security risks.
- Tools are added during finetuning.
- Algorithmic Gains: 3 years between GPT-3 and GPT-4.
- GPT-4 may already employ process-based feedback.
- GPT-4 aimed for training compute efficiency. GPT-4 was not designed to be commercially deployed at scale.
GPT-4 to Gemini
- Scaling Factor: ~x5 (x20) more compute than GPT-4.
- Supercomputer Constraint: No existing supercomputer could feasibly provide x100 more compute than was used for GPT-4. (I am not sure, but this seems likely.)
- Multimodal: maybe image, audio, speech.
- Data Efficiency: Possibly better quality data like Google Books, fewer epochs.
- Tools could be added either during finetuning or pretraining.
- Algorithmic Gains: ~1 year between GPT-4 and Gemini.
- Gemini more likely aims for inference efficiency, given its intended extensive usage by Google, maybe at the cost of training efficiency.
- Gemini trained to be more agentic, better at planning, etc. ("GPT-4 + AlphaGo").
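To make the scaling factors in the two lists above more concrete, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not in the lists themselves: the common approximation that training compute is about C ≈ 6·N·D (N parameters, D tokens), and the Chinchilla-style result that compute-optimal N and D each grow roughly as the square root of C. Both were derived for dense models, so for MoE models this is only indicative. The function name is hypothetical, and treating "~x5 (x20)" as two candidate values is my reading, not a confirmed one.

```python
import math


def compute_optimal_multipliers(compute_multiplier):
    """Split a training-compute multiplier between parameters and tokens.

    Assumes C ~ 6 * N * D and that compute-optimal N and D both grow as
    sqrt(C) (roughly equal scaling, as reported for dense models).
    Returns (parameter_multiplier, token_multiplier).
    """
    if compute_multiplier <= 0:
        raise ValueError("compute_multiplier must be positive")
    per_axis = math.sqrt(compute_multiplier)
    return per_axis, per_axis


# Compute multipliers taken from the lists above. Reading "~x5 (x20)" as two
# candidate values is an assumption made for illustration.
scenarios = {
    "GPT-3 -> GPT-4 (x100)": 100.0,
    "GPT-4 -> Gemini (x5)": 5.0,
    "GPT-4 -> Gemini (x20)": 20.0,
}

for name, multiplier in scenarios.items():
    params, tokens = compute_optimal_multipliers(multiplier)
    print(f"{name}: ~x{params:.1f} parameters, ~x{tokens:.1f} tokens")
```

Under these assumptions, x100 more compute corresponds to about x10 more parameters and x10 more tokens, while x5 corresponds to only about x2.2 of each (about x4.5 for x20). This is one way to see why the jump from GPT-4 to Gemini would be expected to be smaller.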
Note: I drafted these expectations before news of Gemini's release and capabilities, but failed to finish writing them up... Since then, there have been some reports of Gemini being roughly at the level of GPT-4...
Footnote on the claim that GPT-4 was not designed to be commercially deployed at scale: this comes from OpenAI saying they didn't expect ChatGPT to be a big commercial success. It was not a top-priority project.