All of dsbowen's Comments + Replies

I think this nicely lays out the fundamental issue: if we're going to develop powerful AI, we need to make sure that either (1) it isn't capable of doing anything extremely harmful (absence of harmful knowledge), or (2) it will refuse to do anything extremely harmful (robust safety mechanisms against malicious instructions). Ideally, we'll make progress on both fronts. However, (1) may not be possible in the long term if AI models can learn post-deployment or infer harmful knowledge from the benign knowledge they acquire during training. Therefore, if we're going...

Tom DAVID
* Your first two bullet points are very accurate; it would indeed be worthwhile to continue by addressing these points further.
* Regarding your last bullet point, I agree. We do not currently know whether it is possible to develop such safeguards, and even if it were, doing so would require time and further research. I fully agree that this should be made more explicit!