SWE, speculative investor
I've always wondered: why didn't superpowers apply MAIM to nuclear capabilities in the past?
> Speculative but increasingly plausible, confidentiality-preserving AI verifiers
Such as?
I mean, I don't want to give Big Labs any ideas, but I suspect the reasoning above implies that o1/DeepSeek-style RL procedures might work a lot better if the models can think internally for a long time.
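A toy illustration of the intuition (my own construction, not any lab's actual training recipe): in outcome-based RL, reward touches only the final answer, so extra hidden computation is free, and a bigger thinking budget buys accuracy. Here the "thinking" is just privately flipping a hidden coin more times before answering.

```python
import random

# Hypothetical toy: the environment hides a coin with bias 0.4 or 0.6;
# the agent "thinks" by privately flipping it think_budget times, then
# answers whether the coin is heads-biased. Reward sees only the answer.

def episode(think_budget: int) -> float:
    p = random.choice([0.4, 0.6])                        # hidden state
    heads = sum(random.random() < p for _ in range(think_budget))
    answer = heads / think_budget > 0.5                  # internal reasoning
    return 1.0 if answer == (p > 0.5) else 0.0           # outcome-only reward

for budget in (1, 10, 100):
    mean = sum(episode(budget) for _ in range(10_000)) / 10_000
    print(f"think budget {budget:>3}: mean reward ~ {mean:.2f}")
```

Mean reward climbs with the thinking budget even though the reward signal never inspects the intermediate flips, which is the sense in which outcome-based RL can pay for longer internal reasoning.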
I expect GPT-5 to implement this, based on recent research and how they phrase it.
I thought OpenAI’s deep research uses the full o3?
Why are current LLMs, reasoning models, and whatever else still horribly unreliable? I can ask the current best models (o3, Claude, deep research, etc.) to generate lots of code using a specific pattern, or to make a chart of company valuations, and they'll get it mostly wrong.
Is this just a result of labs hill-climbing a bunch of impressive-sounding benchmarks? I think this should delay timelines a bit, unless there's progress on reliability that I just can't perceive.
SWEs won't necessarily be fired even after becoming useless
I'm actually surprised at how eager/willing big tech is to fire SWEs once they're sure they won't be economically valuable. I think a lot of the priors about SWE jobs being stable come from the ZIRP era. Now, these companies have quite frequent layoffs, silent layoffs, and performance firings. Companies becoming leaner will be a good litmus test for a lot of these claims.
https://x.com/arankomatsuzaki/status/1889522974467957033?s=46&t=9y15MIfip4QAOskUiIhvgA
o3 gets IOI Gold. Either we are in a fast takeoff, or the "gold standard" benchmarks are a lot less useful than imagined.
I feel like a lot of Manifold is virtue signaling.
Just curious: how do you square the rise in AI stocks taking so long? Many people here thought it was obvious back in 2022 and made a ton of money.
Isn't it a distribution problem? World hunger has almost disappeared, however. (The issue is that hungrier nations have more kids, so the progress is somewhat hidden in the aggregate statistics.)
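A quick worked example (hypothetical numbers) of how population weighting can hide per-region progress in the global figure:

```python
# Each region's hunger rate improves or holds, yet the population-weighted
# global rate barely moves because the higher-hunger region grows faster.
pop_a, rate_a = 100, 0.05    # low-hunger region, period 1 (millions, share)
pop_b, rate_b = 50, 0.40     # high-hunger region, period 1
global_1 = (pop_a * rate_a + pop_b * rate_b) / (pop_a + pop_b)

pop_a, rate_a = 110, 0.05    # slow population growth, rate flat
pop_b, rate_b = 100, 0.30    # fast population growth, rate much improved
global_2 = (pop_a * rate_a + pop_b * rate_b) / (pop_a + pop_b)

print(f"global hunger rate: {global_1:.1%} -> {global_2:.1%}")
# 16.7% -> 16.9%: the aggregate ticks up despite per-region progress.
```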