Thanks. Still not convinced, but it will take me a full post to explain why exactly. :)
Though possibly some of this is due to a difference in definitions. When you say this:
what I consider AGI - which importantly is fully general in that it can learn new things, but will not meet the bar of doing 95% of remote jobs because it's not likely to be human-level in all areas right away
Do you have a sense of how long you expect it will take for it to go from "can learn new things" to "doing 95% of remote jobs"? If you e.g. expect that it might still take several years for AGI to master most jobs once it has been created, then that might be more compatible with my model.
Hmm, some years back I was hearing the claim that self-driving cars work badly in winter conditions, so they're currently limited to the kinds of warmer climates where Waymo operates. I haven't checked whether that's still entirely accurate, but at least I haven't heard any news of progress on that front.
Thanks, this is the kind of comment that tries to break down things by missing capabilities that I was hoping to see.
Episodic memory is less trivial, but still relatively easy to improve from current near-zero-effort systems
I agree that it's likely to be relatively easy to improve from current systems, but just improving it is a much lower bar than getting episodic memory to actually be practically useful. So I'm not sure why this alone would imply a very short timeline. Getting things from "there are papers about this in the literature" to "actually sufficient for real-world problems" often takes a significant time, e.g.:
My general prior is that this kind of work - going from conceptual prototype to robust real-world application - can easily take anywhere from years to decades, especially once we move out of domains like games/math/programming and into ones that are significantly harder to formalize and test. Also, the more interacting components you have, the trickier it gets to test and train.
Thanks. I think this argument assumes that the main bottleneck to AI progress is something like research engineering speed, such that accelerating research engineering speed would drastically increase AI progress?
I think that makes sense as long as we are talking about domains like games / math / programming where you can automatically verify the results, but that something like the speed of real-world interaction becomes the bottleneck once we shift to more open domains.
Consider an AI being trained on a task such as “acting as the CEO for a startup”. There may not be a way to do this training other than to have it actually run a real startup, and then wait for several years to see how the results turn out. Even after several years, it will be hard to say exactly which parts of the decision process contributed, and how much of the startup’s success or failure was due to random factors. Furthermore, during this process the AI will need to be closely monitored in order to make sure that it does not do anything illegal or grossly immoral, slowing down its decision process and thus the whole training. And I haven’t even mentioned the expense of a training run where running just a single trial requires a startup-level investment (assuming that the startup won’t pay back its investment, of course).
Of course, humans do not learn to be CEOs by running a million companies and then getting a reward signal at the end. Human CEOs come in with a number of skills that they have already learned from somewhere else that they then apply to the context of running a company, shifting between their existing skills and applying them as needed. However, the question of what kind of approach and skill to apply in what situation, and how to prioritize between different approaches, is by itself a skillset that needs to be learned... quite possibly through a lot of real-world feedback.
I think their relationship depends on whether crossing the gap requires grind or insight. If it's mostly about grind, then a good expert will be able to estimate it, but insight tends to be unpredictable by nature.
Another way of looking at my comment above would be that timelines of less than 5 years would imply the remaining steps mostly requiring grind, and timelines of 20+ years would imply that some amount of insight is needed.
we're within the human range of most skill types already
That would imply that most professions would be getting automated or having their productivity very significantly increased. My impression from following the news and seeing some studies is that this is happening within copywriting, translation, programming, and illustration. [EDIT: and transcription] Also people are turning to chatbots for some types of therapy, though many people will still intrinsically prefer a human for that and it's not affecting the employment of human therapists yet. With o3, math (and maybe physics) research is starting to be affected, though it mostly hasn't been yet.
I might be forgetting some, but the number of professions left out of that list suggests that there are quite a few skill types that are still untouched. (There are of course a lot of other professions that have seen moderate productivity boosts, but AFAIK mostly not to the point that it would affect employment.)
Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?
We know that AI expertise and AI forecasting are separate skills and that we shouldn't expect AI researchers to be skilled at the latter. So even if researchers have thought sufficiently and sanely about the question of "what kinds of capabilities are we still missing that would be required for AGI", they would still be lacking the additional skill of "how to translate those missing pieces into a timeline estimate".
Suppose that a researcher's conception of current missing pieces is a mental object M, their timeline estimate is a probability function P, and their forecasting expertise F is a function that maps M to P. In this model, F can be pretty crazy, creating vast differences in P depending on how you ask, while M is still solid.
I think the implication is that these kinds of surveys cannot tell us anything very precise such as "is 15 years more likely than 23", but we can use what we know about the nature of F in untrained individuals to try to get a sense of what M might be like. My sense is that answers like "20-93 years" often translate to "I think there are major pieces missing and I have no idea of how to even start approaching them, but if I say something that feels like a long time, maybe someone will figure it out in that time", "0-5 years" means "we have all the major components and only relatively straightforward engineering work is needed for them", and numbers in between correspond to Ms that are, well, somewhere in between those.
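To make the F/M distinction concrete, here's a toy sketch of my own (all the functions and numbers in it are made up for illustration, not drawn from any survey): the same fixed M can yield very different stated timelines if F applies different, noisy heuristics depending on how the question is framed.

```python
# Toy illustration: the same underlying judgment M about missing capabilities
# can produce inconsistent forecasts P under different question framings,
# if the mapping F from judgment to forecast is noisy and framing-sensitive.
import random

random.seed(0)

# M: the researcher's (held fixed) sense of how much is still missing,
# on a made-up scale from 0 (nothing missing) to 1 (everything missing).
M = 0.6

def F_fixed_probability_framing(m):
    """'In what year will there be a 50% chance of AGI?' -> answer in years from now."""
    # Heuristic: scale missing-ness to decades, plus large framing-dependent noise.
    return 10 + m * 50 + random.gauss(0, 15)

def F_fixed_year_framing(m, horizon=20):
    """'What is the probability of AGI within 20 years?' -> answer as a probability."""
    # Different heuristic, anchored on the stated horizon rather than on m alone.
    return max(0.0, min(1.0, (1 - m) * horizon / 25 + random.gauss(0, 0.2)))

years_answer = F_fixed_probability_framing(M)
prob_answer = F_fixed_year_framing(M)

print(f"Same M={M}: '50% by which year' framing -> roughly {years_answer:.0f} years out")
print(f"Same M={M}: 'probability within 20 years' framing -> {prob_answer:.0%}")
# The two answers need not be mutually consistent, even though M never changed:
# the inconsistency lives entirely in F, which is the untrained forecasting skill.
```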
Yeah I'm not sure of the exact date but it was definitely before LLMs were a thing.
Relative to 10 (or whatever) years ago? Sure, I've made quite a few of those already. By this point it'd be hard to remember my past beliefs well enough to make a list of differences.
Due to o3 specifically? I'm not sure, I have difficulty telling how significant things like ARC-AGI are in practice, but the general result of "improvements in programming and math continue" doesn't seem like a huge surprise by itself. It's certainly an update in favor of the current paradigm continuing to scale and pay back the funding put into it, though.
Don't most browsers come with spellcheck built in? At least Chrome automatically flags my typos.