All of defun's Comments + Replies

Can someone create a forecasting question about which model will score better in benchmarks, Claude 3.5 Opus or GPT-5?

(5) Q1 2026: The next version comes online. It is released, but it refuses to help with ML research. Leaks indicate that it doesn't refuse to help with ML research internally, and in fact is heavily automating the process at its parent corporation. It's basically doing all the work by itself; the humans are basically just watching the metrics go up and making suggestions and trying to understand the new experiments it's running and architectures it's proposing.

@Daniel Kokotajlo, why do you think they would release it?

Daniel Kokotajlo
Twas just a guess, I think it could go either way. In fact these days I'd guess they wouldn't release it at all; the official internal story would be it's for security and safety reasons.

The delta between how fast AI will affect software engineering and how fast AI will transform other (roughly speaking) white collar careers is relatively small.

Agree that the delta is "small", but it might be significant:

  1. LLMs are especially good at coding. Some reasons:
    1. Large amount of training data
    2. Most knowledge is text-based and explicit
    3. Big economic incentive (developers are especially expensive)
    4. Many companies are focused on AI for coding (Devin, GitHub, etc.) and they will probably advance fast because these teams are using their own product (Figma's CPO o
...