All of HunterJay's Comments + Replies

I agree, I definitely underestimated video. Before publishing, I had a friend review my predictions and they called out video as being too low, and I adjusted upward in response and still underestimated it. 

I'd now agree with 2026 or 2027 for coherent feature film length video, though I'm not sure if it would be at feature film artistic quality (including plot). I also agree with Her-like products in the next year or two!

Personally I would still expect cloud compute to be used for robotics, but only in ways where latency doesn't matter (like a p... (read more)

4mana
2 days ago and I might already have to adjust the timelines. Nvidia's new Digits costs $3K and is the size of a Mac Mini. Two of them can supposedly run a 400B parameter language model, which is crazy. So maybe the hardware issues aren't as persistent for robotics. Hailuo also now has a single-image reference mode that works like a LoRA. It's super consistent for faces, even if the rest is a bit quirky.
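As a rough sanity check on the 400B claim, here's a back-of-the-envelope sketch; the per-unit memory figure and the 4-bit quantization are my assumptions rather than confirmed specs:

```python
# Back-of-the-envelope: can two small desktop units hold a 400B-parameter model?
# Assumptions (mine, not verified specs): ~128 GB of usable memory per unit,
# weights quantized to 4 bits (0.5 bytes) per parameter, ignoring activation/KV overhead.
params = 400e9
bytes_per_param = 0.5              # 4-bit quantization
weights_gb = params * bytes_per_param / 1e9

pooled_memory_gb = 2 * 128         # two units linked together

print(f"Weights alone: ~{weights_gb:.0f} GB")    # ~200 GB
print(f"Pooled memory: ~{pooled_memory_gb} GB")  # 256 GB -- tight, but plausible
```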

One year and 3 months on, I'm reviewing my predictions! Overall, I mark 13 predictions as true or mostly true, 6 as false or mostly false, and 3 as debatable.

Rest of 2023

  • Small improvements to LLMs
    • Google releases something competitive to ChatGPT.
      • Mostly True | Google had already released Bard at the time, which sucked, but this was upgraded to Gemini and relaunched in December 2023. Gemini Ultra wasn’t released until February 2024 though, so points off for that.
    • Anthropic and OpenAI slightly improve GPT-4 and Claude2
      • True | GPT-4 Turbo and Claude 2.1 were
... (read more)
2mana
Gemini 1206 Exp has a 2 million token context window; even if that isn't the effective context, it probably performs much better in that regard than GPT-4o and the like. I haven't tested it yet because I don't want to get rate-limited from AI Studio in case they monitor that. Frankly, the "shorter" conversations I had at a few tens of thousands of tokens were already noticeably more consistent than before, e.g. it referenced previous responses significantly later.

You might be right -- and whether the per-dollar gains were higher or lower than expected would be interesting to know -- but I just don't have any good information on this! If I'd thought of the possibility, I would have added it in Footnote 23 as another speculation, but I don't think what I said is misleading or wrong.

For what it's worth, in a one-year review from Jacob Steinhardt, increased investment isn't mentioned as an explanation for why the forecasts undershot.

10x per year for compute seems high to me. Naïvely I would expect the price/performance of compute to double every 1-2 years as it has been forever, with overall compute available for training big models being a function of that + increasing investment in the space, which could look more like one-time jumps. (I.e. a 10x jump in compute in 2024 may happen because of increased investment, but a 100x increase by 2025 seems unlikely.) But I am somewhat uncertain of this.
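To make that growth-rate comparison concrete, here's a quick illustrative calculation (the 1.5-year doubling time is just a stand-in for the historical trend):

```python
# Two growth assumptions for training compute, compounded over a few years:
# (a) price/performance doubling every ~1.5 years (the long-running hardware trend)
# (b) a sustained 10x per year from hardware + ever-increasing investment
for years in (1, 2, 3):
    hardware_trend = 2 ** (years / 1.5)    # doubling every 1.5 years
    sustained_10x = 10.0 ** years          # 10x every year
    print(f"year {years}: trend ~{hardware_trend:.1f}x  vs  sustained ~{sustained_10x:,.0f}x")

# A one-off investment jump can produce a single 10x year, but sustaining 10x/yr
# implies ~100x after two years -- far ahead of the underlying hardware trend.
```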

For parameters, I definitely think the largest models will keep getting bigger, and for com... (read more)

I wrote this late at night, so to clarify and expand a little bit:

- "Work on more than one time scale" I think is actually an interesting idea to dwell on for a second. Like, when a person is trying to solve a problem, they will often pace back and forth, or talk, etc. They don't have to do everything in one pass, somehow the complex computation which lets them see and move around can work on a very fast time scale, while other problem solving is going on simultaneously, and only starts to effect motor outputs later on. That's interesting. The spinal cord ... (read more)

I'm somewhat skeptical that running out of text data will meaningfully slow progress. Today's models are so sample inefficient compared with human brains that I suspect there are significant jumps possible there. 

Also, as you say:
- Synthetic text data might well be possible (especially for domains where you can test the quality of the produced text externally, e.g. programming).
- Reinforcement-learning-style virtual environments can also generate data (and not necessarily only physics based environments either -- it could be more like playing games or using a computer).
- And multimodal inputs give us a lot more data too, and I think we've only really scratched the surface of multimodal transformers today.

2Vladimir_Nesov
New untested ideas take unpredictable time to develop. Given the current timeline of pure compute/investment scaling, there is no particular reason for all bottlenecks to be cleared just in time for scaling to continue without slowing down. Hence the possibility of it slowing down at the upcoming possible bottlenecks of natural text data and available-on-short-notice hardware, which are somewhat close together.

Sample efficiency (with respect to natural data) can in principle be improved, humans and some RL systems show it's possible, and synthetic data is a particular form this improvement might take. But it's not something that's readily available, known to subsume capabilities of LLMs and scale past them. Also, straying further from the LLM recipe of simulating human text might make alignment even more intractable. In a universe where alignment of LLMs is feasible within the current breakneck regime, the source of doom I worry about is an RL system that either didn't train on human culture or did too much reflection to remain within its frame.

Compared to natural text, multimodal data and many recipes for synthetic data give something less valuable for improving model competence, reducing return on further scaling. When competence improvement slows down, and if AGI in the sense of human-level autonomous work remains sufficiently far away at that point, investment scaling is going to slow down as well. Future frontier models cost too much if there is no commensurate competence improvement.

I am honestly very surprised it became a front page post too! It totally is just speculation.

I tried to be super clear that these were just babbled guesses, and I was mainly just telling people to try to do the same, rather than trusting my starting point here.

The other thing that surprised me is that there haven't been too many comments saying "this part is off", or "you missed trend X!". I was kind of hoping for that!

Agree on lower-depth models being possible; a few other possibilities:

  • Smaller models with lower latency could be used, possibly distilled down from larger ones.

  • Compute improvements might make it practical to run onboard (like with Tesla's self-driving hardware inside the chest of their android).

  • New architectures could work on more than one time scale -- kind of like humans do. E.g. when we walk, not all of the processing is done in the brain. Your spinal cord can handle a tonne of it autonomously. (Will find source tomorrow).

  • LLM-type models could do t

... (read more)
1HunterJay
I wrote this late at night, so to clarify and expand a little bit:

- "Work on more than one time scale" I think is actually an interesting idea to dwell on for a second. Like, when a person is trying to solve a problem, they will often pace back and forth, or talk, etc. They don't have to do everything in one pass; somehow the complex computation which lets them see and move around can work on a very fast time scale, while other problem solving is going on simultaneously, and only starts to affect motor outputs later on. That's interesting. The spinal cord doing processing independent of the brain thing I mentioned is evident in this older series of (rather horrible) experiments with cats: https://www.jstor.org/stable/24945006

- On the 'smaller models with lower latency', we now see models like Mistral-7B outperforming 30B-parameter models because of improvements in data, architecture, and training. I expect this trend to continue. If the largest models are capable of operating a robot out of the box, I think you could take those outputs and use them to train (or otherwise distill) the larger model down to a more manageable size, more specialised for the task.

- On the 'LLMs could do the parts with higher latency', just yesterday I saw somebody do something like this with GPT-4V, where they periodically uploaded a photograph of what was in front of them and got GPT-4V to output instructions on how to find the supermarket (walk further forward, turn right, etc.). Kind of worked; that's the sort of thing I was picturing here, leaving much more responsive systems to handle the low-latency work, like balance, gripping, etc.
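For anyone picturing it, here's a minimal sketch of that kind of slow-planning loop; the helper functions are hypothetical stand-ins, not any particular robot stack or vision API:

```python
import time

def capture_image():
    # Hypothetical stand-in: grab a frame from the robot's camera.
    return b"<jpeg bytes>"

def vision_llm_instruction(image, goal):
    # Hypothetical stand-in: send the frame to a vision-language model and get
    # back a high-level instruction ("walk forward", "turn right", ...).
    return "walk forward"

def execute(instruction):
    # Hypothetical stand-in: hand the instruction to the fast, low-latency
    # controller that handles balance, gait, gripping, etc.
    print(f"low-latency controller executing: {instruction}")

# Slow outer loop: the LLM only needs to keep up with high-level decisions,
# so seconds of latency per step is fine; reflexes stay local to the robot.
goal = "find the supermarket"
for _ in range(3):                 # a few planning steps, for illustration
    frame = capture_image()
    step = vision_llm_instruction(frame, goal)
    execute(step)
    time.sleep(1)                  # re-plan every second or so
```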

I am extremely worried about safety, but I don't know as much about it as I do about what's on the edge of consumer/engineering trends, so I think my predictions here would not be useful to share right now! The main way it relates to my guesses here is if regulation successfully slows down frontier development within a few years (which I would support).

I'm doing the ARENA course async online at the moment, and possibly moving into alignment research in the next year or two, so I'm hoping to be able to chat more intelligently on alignment soonish.

I broadly agree. I think AI tools are already speeding up development today, and on reflection I don't actually think AI being more capable than humans at modeling the natural world would be a discontinuous point on the ramp up to superintelligence.

It would be a point where AI gets much harder to predict, though, which is probably why it was on my mind when I was trying to come up with predictions.

Thanks, fixed. I did mean 3.5 to 4, not 3 to 4.

Side note -- France isn't a great example for your point here ("France, for example, is a very old, well-established and liberal democracy") because the Fifth Republic was only established in 1958. It's also notable for giving the president much stronger executive powers compared with the Fourth Republic!

4alyssavance
The current constitution isn't that old (although 65 years is still longer than most democracies), but with brief interruptions, France has been a democracy for around 150 years, which is far longer than most countries can claim. 

In the spirit of doing low status things with high potential, I am working on a site to allow commissioning of fringe erotica and am looking to hire a second web developer.

The idea is to build a place where people with niche interests can post bounties for specific stories. In my time moonlighting as an erotic author, I've noticed a lack of good sites to do freelance erotic writing work. I think the reason for this is that most people think porn is icky, so despite there being a huge market for extremely niche content, the platforms currently available are... (read more)

Thanks Chris, but I think you linked to the wrong thing there; I can't see your post in the last 3 years of your history either!

2Chris_Leong
Sorry, fixed.

Aye, I agree it is not a solution to avoiding power seeking, only that there may be a slightly easier target to hit if we can relax as many constraints on alignment as possible.

Will check them out, thank you.

I like this story pitch! It seems pretty compelling to me, and a clever way to show the difficulty and stakes of alignment. Good luck!

I am curious if this has changed over the past 6 years since you posted this comment. Do you get the feeling that high profile researchers have shifted even further towards Xrisk concern, or if they continue with the same views as in 2016? Thanks!

8David Scott Krueger (formerly: capybaralet)
There has been continued progress at about the rate I would've expected -- maybe a bit faster. I think GPT-3 has helped change people's views somewhat, as has further appreciation of other social issues of AI.

I took the original sentence to mean something like "we use things external to the brain to compute things too", which is clearly true. Writing stuff down to work through a problem is clearly doing some computation outside of the brain, for example. The confusion comes from where you draw the line -- if I'm just wiggling my fingers without holding a pen, does that still count as computing stuff outside the brain? Do you count the spinal cord as part of the brain? What about the peripheral nervous system? What about information that's computed by the outsid... (read more)

1MikkW
I'm not sure that this is correct. While making the motions is needed to engage the process, the important processes are still happening inside of the brain -- they just happen to be processes that are associated with and happen during handwriting, not when one is sitting idly and thinking.
1[anonymous]
Yes, but take this a step further. If you assume that each synapse is 4 bytes of information (a bit sparse; it's probably more than that), 86 billion neurons times 1000 synapses times 4 bytes = 344 terabytes. How much information do you store when you have 3 fingers up when counting on your fingers? How much data can a page of handwritten notes hold? You can probably neglect it. It doesn't add any significant amount of compute to an AI system to give it perfect, multi-megabyte working memory.
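For what it's worth, the arithmetic checks out under those assumptions:

```python
# Rough storage estimate for the human brain, using the assumptions in the reply above.
neurons = 86e9              # ~86 billion neurons
synapses_per_neuron = 1000
bytes_per_synapse = 4       # the reply's (probably low) assumption

total_bytes = neurons * synapses_per_neuron * bytes_per_synapse
print(f"~{total_bytes / 1e12:.0f} TB")   # ~344 TB, matching the figure in the comment
```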

I really enjoyed this writeup! I'd probably even go a little bit on the pessimistic (optimistic?) side and bet that almost all of this technology would be possible with only a few years of development from today -- though I suppose it might be 20 if development doesn't start/ramp up in earnest.

6Daniel Kokotajlo
If we put Elon Musk in charge of making it happen and gave him a hundred billion dollars, I think it could all be achieved in 5 - 10 years. If we had advanced-but-not-godlike AI, it would take less than a year. That's my prediction anyway.

Thanks!

That's a good point, I'll write up a brief explanation/disclaimer and put it in as a footnote.

Typo corrected, thanks for that.

I agree, it's more likely for the first AGI to begin on a supercomputer at a well-funded institution. If you like, you can imagine that this AGI is not the first, but simply the first not effectively boxed. Maybe its programmer simply implemented a leaked algorithm that was developed and previously run by a large project, but changed the goal and tweaked the safeties.

In any case, it's a story, not a prediction, and I'd defend it as plausible in that context. Any story has a thousand assumptions and events tha... (read more)

3Lucas2000
There are some additional it's/its mistakes in your text, e.g. here:
5Rafael Harth
Yeah, I don't actually disagree. It's just that, if someone asks "how could an AI actually be dangerous? It's just on a computer" and I respond by "here look at this cool story someone wrote which answers that question", they might go "Aha, you think it will be developed on a laptop. This is clearly nonsense, therefore I now dismiss your case entirely". I think you wanna bend over backwards to not make misleading statements if you argue for the dangers-from-ai-is-a-real-thing side. You're of course correct that any scenario with this level of detail is necessarily extremely unlikely, but I think that will be more obvious for other details like how exactly the AI reasons than it is for the above. I don't see anyone going "aha, the AI reasoned that X→Y→Z which is clearly implausible because it's specific, therefore I won't take this seriously".

If I had written this, I would add a disclaimer rather than change the title. The disclaimer could also explain that "paperclips" is a stand-in for any utility function that maximizes for just a particular physical thing.