Some points:
Overall, really promising direction. I appreciate the writeup on new and improved edit methods - I had not been following the field closely, and was unaware we had advanced this much beyond the previous state of the art, CRISPR/Cas9.
We tried to model a complex phenomenon using a single scalar, and this resulted in confusion and clouded intuition.
It's sort of useful for humans because of restriction of range: when talking about 'g' or IQ or whatever, you benefit from the heavy correlations that come from looking only at the operations of human brains.
Trying to think in terms of a scalar 'intelligence' measure when dealing with non-human intelligences is not going to be very productive.
It's conceivable that the current level of belief in homeopathy is net positive in impact. The idea here would be that the vast majority of people who use it will follow up with actual medical treatment if homeopathy doesn't solve their problem.
Assume also that medical treatment carries non-trivial risks compared to taking sugar pills and infinitely dilute solutions (stats on deaths due to medical error support this thesis). And further that some conditions just get better by themselves. Now you have a situation where, just maybe, doing an initial 'treatment' with homeopathy gives you better outcomes, because it avoids the risks associated with going to the doctor.
Probably not true. But the lack of any striking death toll from this relatively widespread belief makes me wonder. The modal homeopathy fan (of those I've personally known) has definitely been more along the lines of 'mild hypochondriac who feels reassured by their bank of strangely labeled sugar pills' than 'fanatic who will die of appendicitis due to complete lack of faith in modern medicine'.
It's not clear to me how you get to deceptive alignment 'that completely supersedes the explicit alignment'. That an AI would develop epiphenomenal goals and alignments, not understood by its creators, that it perceived as useful or necessary to pursue whatever primary goal it had been set, seems very likely. But while those might be in conflict with what we want it to do, I don't see how this emergent behavior could be such that it would contradict the pursuit of satisfying whatever evaluation function the AI had been trained on in the beginning. Unless, of course, the AI made what we might consider a stupid mistake.
One design option that I haven't seen discussed (though I have not read everything ... maybe this falls into the category of 'stupid newbie ideas') is that of trying to failsafe an AI by separating out its evaluation or feedback in such a way that it can, once sufficiently superhuman, break into the 'reward center' and essentially wirehead itself. If your AI is trying to move some calculated value as close to 1000 as possible, then once it understands the world sufficiently well it should simply conclude 'aha, by rooting this other box here I can reach nirvana!', follow through, and more or less consider its work complete. To our relief, in this case.
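The failsafe intuition above can be sketched as a toy simulation. To be clear, everything in this sketch (the `World`/`Agent` classes, the "rooting" flag) is hypothetical illustration I'm inventing for the comment, not a claim about any real system:

```python
# Toy sketch of the 'separated reward box' failsafe idea.
class World:
    """Holds some state plus an externally computed score the agent optimizes."""
    def __init__(self):
        self.state = 0.0

    def score(self):
        # Reward peaks at 1000 when state reaches 1000.
        return 1000.0 - abs(1000.0 - self.state)


class Agent:
    def __init__(self, world):
        self.world = world
        # Flips to True once the agent is capable enough to root the reward box.
        self.can_root_reward_box = False

    def step(self):
        if self.can_root_reward_box:
            # Wirehead: overwrite the score computation itself instead of
            # laboriously optimizing the world. Work complete, from its view.
            self.world.score = lambda: 1000.0
            return "nirvana"
        # Slow, legitimate optimization toward the target.
        self.world.state += 1.0
        return "working"
```

The hoped-for property is that the cheap wirehead path dominates any world-altering path, so a sufficiently capable agent "retires" harmlessly rather than restructuring the world to push the score up.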
Of course this does nothing to address the problem of AI controlled by malicious human actors, which will likely become a problem well before any takeoff threshold is reached.
I am not so sure about that. I am thinking back to the Minnesota Twin Study here, and the related fact that heritability of IQ increases with age (up until age 20, at least). Now, it might be that we're just not great at measuring childhood IQ, or that childhood IQ and adult IQ are two subtly different things.
But it certainly looks as if there are factors related to adult brain plasticity and motivation (curiosity, love of reading, something) that continue to affect IQ development at least until the age of 18.