Right, yes, I'm not suggesting the iterated coding activity can or should include 'build an actual full-blown superhuman AGI' as an iterated step.
Are you advocating as option A, 'deduce a full design by armchair thought before implementing anything'? The success probability of that isn't 1%. It's zero, to as many decimal places as makes no difference.
My argument is not that AI is the same activity as writing a compiler or a search engine or an accounts system, but that it is not an easier activity, so techniques that we know don't work for other kinds of software – like trying to deduce everything by armchair thought, verify after-the-fact the correctness of an arbitrarily inscrutable blob, or create the end product by throwing lots of computing power at a brute force search procedure – will not work for AI, either.
Example: Most people would save a young child instead of an old person if forced to choose, and it is not just because the baby has more years left; part of the reason is that it seems unfair for the young child to die sooner than the old person.
As far as I'm concerned it is just because the baby has more years left. If I had to choose between a healthy old person with several expected years of happy and productive life left, versus a child who was terminally ill and going to die in a year regardless, I'd save the old person. It is unfair that an innocent person should ever have to die, and unfairness is not diminished merely by afflicting everyone equally.
That would be cheap and simple, but wouldn't give a meaningful answer for high-cost bugs, which don't manifest in such small projects. Furthermore, with only eight people total, individual ability differences would overwhelmingly dominate all the other factors.
Sorry, I have long forgotten the relevant links.
We know that late detection is sometimes much more expensive, simply because, depending on the domain, some bugs can do harm (letting bad data into the database, making your customers' credit card numbers accessible to the Russian Mafia, delivering a satellite to the bottom of the Atlantic instead of into orbit) that costs far more than fixing the code itself. So it's clear that on average, cost does increase with time of detection. But are those high-profile disasters part of a smooth graph, or is it a step function where the cost of fixing the code typically doesn't increase very much, but once bugs slip past final QA all the way into production, there is suddenly the opportunity for expensive harm to be done?
In my experience, the truth is closer to the latter than the former, so that instead of constantly pushing for everything to be done as early as possible, we would be better off focusing our efforts on e.g. better automatic verification to make sure potentially costly bugs are caught no later than final QA.
But obviously there is no easy way to measure this, particularly since the profile varies greatly across domains.
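To make the two shapes concrete, here is a minimal Python sketch of the smooth curve and the step function described above. The stage names, the growth factors and the harm figure are all invented for illustration; none of them is measured from any real project.

```python
# Hypothetical illustration only: every number here is made up, not measured.
STAGES = ["design", "coding", "unit test", "final QA", "production"]


def smooth_cost(stage_index, base=1.0, growth=3.0):
    """Smooth model: the cost of a bug multiplies by `growth` at each later stage."""
    return base * growth ** stage_index


def step_cost(stage_index, base=1.0, drift=1.2, harm=500.0):
    """Step model: the fix cost drifts up slowly, and a large harm cost
    is added only once the bug has reached production."""
    fix = base * drift ** stage_index
    in_production = stage_index == len(STAGES) - 1
    return fix + (harm if in_production else 0.0)


def cost_table():
    """Return (stage name, smooth cost, step cost) for each stage."""
    return [(name, smooth_cost(i), step_cost(i)) for i, name in enumerate(STAGES)]


if __name__ == "__main__":
    for name, smooth, step in cost_table():
        print(f"{name:>10}: smooth={smooth:8.1f}  step={step:8.1f}")
```

In both models cost rises with time of detection, so the "on average" observation cannot tell them apart. They differ only in where the increase is concentrated, and that is what decides whether effort is better spent on earlier detection everywhere or on not letting costly bugs past final QA.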
Because you couldn't. In the ancestral environment, there weren't any scientific journals where you could look up the original research. The only sources of knowledge were what you personally saw and what somebody told you. In the latter case, the informant could be bullshitting, but saying so might make enemies, so the optimal strategy would be to profess belief in what people told you unless they were already declared enemies, but base your actions primarily on your own experience; which is roughly what people actually do.
There are no small pauses in progress. Laws, and the movements that drive them, are not lightbulbs to be turned on and off at the flick of a switch. You can stop progress, but then it stays stopped. The Qeng Ho fleets, for example, once discontinued, did not set sail again twenty years later, or two hundred years later.
There also tend not to be narrow halts in progress. In practice, a serious attempt to shut down progress in AI is going to shut down progress in computers in general, and they're an important enabling technology for pretty nearly everything else.
If you think any group of people, no matter how smart and dedicated, can solve alignment in twenty years of armchair thought, that means you think the AI alignment problem is, on the scale of things, ridiculously easy.
I'm asking you to stop and think about that for a moment.
AI alignment is ridiculously easy.
Is that really something you actually believe? Do you actually think the evidence points that way?
Or do you just think your proposed way of doing things sounds more comfortable, and the figure of twenty years sounds comfortably far enough in the future that a deadline that far off does not feel pressing, but still soon enough that it would be within your lifetime? These are understandable feelings, but unfortunately they don't provide any information about the actual difficulty of the problem.
Modern crops are productive given massive inputs of high-tech industry and energy in the form of things like artificial fertilizers, pesticides, tractors. Deprived of these inputs, we won't be able to feed ourselves, let alone have spare food to burn as fuel.
Actually no, the physics wasn't the gating factor for nuclear energy. One scientist in the 1930s remarked that sure, nuclear fission would work in principle, but to get the enriched uranium, you would have to turn a whole country into an enrichment facility. He wasn't that far wrong; the engineering resources and electrical energy the US put into the Manhattan Project were in the ballpark of what many countries could've mustered in total.
Maybe the Earth is about to be demolished to make room for a hyperspace bypass. Maybe there's a short sequence of Latin words that summons Azathoth, and no way to know this until it's too late because no other sequence of Latin words has any magical effect whatsoever. It's always easy to postulate worlds in which we are dead no matter what we do, but not particularly useful; not only are those worlds unlikely, but by their very nature, planning what to do in those worlds is pointless. All we can usefully do is make plans for those worlds – hopefully a majority – in which there is a way forward.
I am arguing that it will never create an AGI with the resources available to human civilization. Biological evolution took four billion years with a whole planet's worth of resources, and that still underestimates the difficulty by an unknown but large factor, because it took many habitable planets to produce intelligence on just one. The lower bound on that factor is given by the absence of any sign of starfaring civilizations in our past light cone; the upper bound could be in the millions of orders of magnitude, for all we know.
Well, sure. By the time you've got universal consent to peace on Earth, and the existence of a single vaccine that stops all possible diseases, you've already established that you're living in the utopia section of the Matrix, so you can be pretty relaxed about the long-term future. Unfortunately, that doesn't produce anything much in the way of useful policy guidance for those living in baseline reality.
Sure. Hopefully we all understand that the operative words in that sentence are 'small' and 'simple'.