We absolutely do need to "race to build a Friendly AI before someone builds an unFriendly AI". Yes, we should also try to ban Unfriendly AI, but there is no contradiction between the two. Plans are allowed (and even encouraged) to involve multiple parallel efforts and disjunctive paths to success.
Disagree: the fact that there needs to be a friendly AI before an unfriendly AI doesn't mean building it should be plan A, or that we should race to do it. It's the same mistake OpenAI made when they let their mission drift from "ensure that artificial general intelligence benefits all of humanity" to being the ones who build an AGI that benefits all of humanity.
Making it plan A would mean it deserves more resources than any other path, such as influencing people by various means to build FAI instead of UFAI.
Also mistakes, from my point of view anyway.
As a child I read everything I could get my hands on! Mostly a couple of Silman's books. The appeal to me was quantifying and systematizing strategy, not chess itself (which I bounced off in favor of sports and math contests). E.g. the idea of exploiting imbalances, or planning by backchaining, or some of the specific skills like putting your knights in the right place.
I found these more interesting than Go books in this respect, both due to Silman's writing style and because Go is such a complicated game filled with exceptions that Go books get bogged down in specifics.
I'm not a chess player (have played maybe 15 normal games of chess ever) and tried playing LeelaPieceOdds on the BBNN setting. When LeelaQueenOdds was released I'd lost at Q odds several times before giving up; this time it was really fun! I played nine times and stalemated it once before finally winning, taking about 40 minutes. My sense is that information I've absorbed from chess books, chess streamers and the like was significantly helpful, e.g. avoid mistakes, try to trade when ahead in material, develop pieces, keep pieces defended.
I think the lesson is that a superhuman search is much more powerful over a large search space than over a small one. With BBNN odds, Leela only has a queen and two rooks, and after sacrificing some material to solidify and trade one of them, I'm still up 7 points and Leela won't have enough material to miraculously slip out of every trade until I blunder. By an endgame of, say, KRNNB vs KR there are only a small number of possible moves for Leela, and I can just check that I'm safe against each one until I win. I'd probably lose when given QN or QR odds, because Leela having two more pieces would increase the required ratio of simplifications to blunders.
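For concreteness, a quick back-of-the-envelope sketch of the material arithmetic (my own illustration, using standard piece values; the net 5-point "sacrifice" figure is just assumed to reproduce the numbers above):

```python
# Rough illustration of the head start from each odds setting, in standard piece values.
PIECE_VALUES = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}

def odds_advantage(missing: str) -> int:
    """Material head start, in points, when Leela starts without these pieces."""
    return sum(PIECE_VALUES[p] for p in missing)

print(odds_advantage("BBNN"))      # 12-point head start at BBNN odds
print(odds_advantage("BBNN") - 5)  # 7: still ahead after a net 5-point sacrifice to simplify
print(odds_advantage("QN"))        # 12: similar point advantage...
print(odds_advantage("QR"))        # 14: ...but Leela keeps two more pieces on the board
```

The point of the comparison: QN or QR odds give me a similar point advantage to BBNN, but Leela keeps two more pieces, so the position stays complicated for longer and I need more clean simplifications per blunder.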
Donated the max to both. I can believe there's more marginal impact for Bores, but on an emotional level, his proximity, YIMBY work, and higher probability of winning make me very excited about Wiener.
While the singularity doesn't have a reference class, benchmarks do have a reference class: we have enough of them that we can fit reasonable distributions on when benchmarks will reach 50%, be saturated, etc., especially if we know the domain. The harder part is measuring superintelligence with benchmarks.
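To illustrate what fitting such a distribution could look like, here's a minimal sketch with made-up numbers (the `months_to_50pct` values are hypothetical; real data would come from tracking actual benchmark release and saturation dates in the relevant domain):

```python
import numpy as np
from scipy import stats

# Hypothetical months from benchmark release until models reach 50% of max score
months_to_50pct = np.array([6, 9, 12, 14, 18, 24, 30, 36])

# Fit a lognormal, a common choice for positive, right-skewed durations
shape, loc, scale = stats.lognorm.fit(months_to_50pct, floc=0)
dist = stats.lognorm(shape, loc=loc, scale=scale)

# Probability a new benchmark in the same reference class reaches 50% within 2 years
print(f"P(reach 50% within 24 months) ~ {dist.cdf(24):.2f}")
```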
Do games between top engines typically end within 40 moves? It might be that an optimal player's occasional win against an almost-optimal player comes from deliberately extending and complicating the game to create chances.
Does this meaningfully reduce the probability that you jump out of the way of a car or get screened for heart disease? The important thing isn't whether you have an emotional fear response, but how the behavioral pattern of avoidance generalizes.
Much of my hope is that, by the time we reach a superintelligence level where we need to instill reflectively endorsed values to optimize towards in a very hands-off way (rather than just constitutions, behaviors, or goals), we'll have figured something else out. I'm not claiming the optimizer advantage alone is enough to be decisive in saving the world.
To the point about tighter feedback loops, I see the main benefit as being in conjunction with adapting to new problems. Suppose we notice AIs taking some bad but non-world-ending action like murdering people; then we can add a big dataset of situations in which AIs shouldn't murder people to the training data. If we were instead breeding animals, we would have to wait dozens of generations for mutations that reduce the murder rate to appear and reach fixation. Since these mutations affect behavior through brain architecture, they would have a higher chance of deleterious effects. And if we're also selecting for intelligence, they would be competing against mutations that increase intelligence, producing a higher alignment tax. All this means that with breeding we would have fewer chances to detect whether our proxies hold up (capabilities researchers have many of these advantages too, but the AGI would be able to automate capabilities training anyway).
If we expect problems to get worse at some rate until an accumulation of unsolved alignment issues culminates in disempowerment, it seems to me there is a large band of rates where we can stay ahead of them with AI training but couldn't with evolution.
Agree that your research didn't make this mistake, and MIRI didn't make all the same mistakes as OpenAI. I was responding in the context of Wei Dai's OP about the early AI safety field. At that time, MIRI was absolutely being uncooperative: their research was closed, they didn't trust anyone else to build ASI, and their plan would end in a pivotal act that would probably disempower some world governments and possibly end with them taking over the world. Plus they descended from an org whose goal was to build ASI before Eliezer realized alignment should be the focus. Critch complained as late as 2022 that if there were two copies of MIRI, they wouldn't even cooperate with each other.
It's great that we have the FLI statement now. Maybe if MIRI had put more work into governance we could have gotten it a year or two earlier, but it took until Hendrycks got involved for the public statements to start.