All of johncrox's Comments + Replies

Appreciate it. My sense is that the LW feed doesn't prioritize recent posts with low karma, so posts that aren't widely shared elsewhere (and upvoted as a result) have a hard time getting visibility. If you think it's a good post, please send it around!

(Criss-cross)

I claim that this is not how I think about AI capabilities, and it is not how many AI researchers think about AI capabilities. For a particularly extreme example, the Go-explore paper out of Uber had a very nominally impressive result on Montezuma's Revenge, but much of the AI community didn't find it compelling because of the assumptions that their algorithm used.

Sorry, I meant the results in light of which methods were used, implications for other research, etc. The sentence would better read, "My understanding (and I think ...

(Crossposted reply to crossposted comment from the EA Forum)

Thanks for the comment! In order:

I think that its performance at test time is one of the more relevant measures - I take grandmasters' considering fewer moves during a game as evidence that they've learned something more of the 'essence' of chess than AlphaZero, and I think AlphaZero's learning was similarly superior to Stockfish's relatively blind approach. Training time is also an important measure - but that's why Carey brings up the 300-year AlphaGo Zero milestone.
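
(Rough numbers, for scale: the AlphaZero paper reports about 80,000 positions searched per second for AlphaZero against roughly 70 million for Stockfish, while a grandmaster considers only a handful of candidate lines. A minimal back-of-the-envelope sketch, with the human figure as an order-of-magnitude guess:)

```python
# Back-of-the-envelope comparison of how much search each player does at test time.
# Stockfish and AlphaZero figures are approximate numbers from the AlphaZero paper
# (Silver et al., 2018); the human figure is only an order-of-magnitude guess.
positions_per_second = {
    "AlphaZero": 80_000,        # MCTS guided by a learned policy/value network
    "Human grandmaster": 100,   # considers only a handful of candidate lines
}
stockfish_pps = 70_000_000      # brute-force alpha-beta search

print(f"Stockfish (2017): ~{stockfish_pps:,} positions/s")
for player, pps in positions_per_second.items():
    ratio = stockfish_pps / pps
    print(f"{player}: ~{pps:,} positions/s (~{ratio:,.0f}x narrower search than Stockfish)")
```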

Indeed we are. And it's not c

...
Rohin Shah:
(Continuing the crossposting) Mostly agree with all of this; some nitpicks:

I claim that this is not how I think about AI capabilities, and it is not how many AI researchers think about AI capabilities. For a particularly extreme example, the Go-explore paper out of Uber had a very nominally impressive result on Montezuma's Revenge, but much of the AI community didn't find it compelling because of the assumptions that their algorithm used.

Tbc, I definitely did not intend for that to be an actual metric. I would say that I have a set of intuitions and impressions that function as a very weak prediction of what AI will look like in the future, along the lines of that sort of metric. I trust timelines based on extrapolation of progress using these intuitions more than timelines based solely on compute. To the extent that you hear timeline estimates from people like me who do this sort of "progress extrapolation" who also did not know about how compute has been scaling, you would want to lengthen their timeline estimates. I'm not sure how timeline predictions break down on this axis.

Hopefully. Yeah, I probably could have used better shorthand.