This kind of fine tuning is not a serious danger to humanity. First, it doesn't do anything beyond what the usual "Sure!" at the beginning of a response promt does (I don't think it's harmful to reveal this technique because it's obvious and known to many). Second, the most current models can do is to give away what already exists on the internet. Any recipes for drugs, explosives, and other harmful things are much easier to find in a search than to get them from Transformer models. Third, current and future models on the Transformer architecture do not pose an existential threat to humanity without serious modifications that are currently not available... (read more)
This kind of fine tuning is not a serious danger to humanity. First, it doesn't do anything beyond what the usual "Sure!" at the beginning of a response promt does (I don't think it's harmful to reveal this technique because it's obvious and known to many). Second, the most current models can do is to give away what already exists on the internet. Any recipes for drugs, explosives, and other harmful things are much easier to find in a search than to get them from Transformer models. Third, current and future models on the Transformer architecture do not pose an existential threat to humanity without serious modifications that are currently not available... (read more)