This is a bad idea.
Categories of information like this are commonly used to say "this isn't false, but we want to have an excuse to censor it anyway". Look at how "malinformation" is already being misused.
Thank you for starting a discussion about this. I have two things to say:
1) In the post above, the "inftoxic" adjective means information that is very much false and incorrect. It also means the falseness was intentionally "put into" the data or story with an intent to mislead, cause harm, etc. So the term is in fact different from (and, to me personally, more useful than) "malinformation" (which I likewise find quite unhelpful).
2) Regardless of the usefulness of the terminology I used as an example, do you think we could use new words in and around information that would improve how we conduct the debate, in an attempt to be less wrong?
In the post above, the “inftoxic” adjective means information that is very much false and incorrect.
No, it doesn't. You've defined it to include harmful and deceptive information, not (or at least not just) false information. And censors love to claim that true things their political opponents say are "harmful" and "deceptive", because someone might listen and draw a conclusion that favors those opponents.
Epistemic status: Proposal for new terminology, to make an important concept easier and faster to communicate.
TL;DR: This is a proposal to start using new words such as "inftoxic" to describe information or data intentionally created or spread to cause harm. The post is not meant to present the best way to do this, but to see whether other people would consider this a useful endeavor.
Background
We are becoming more and more surrounded by synthetic data (such as text or images generated by LLMs), and as it mixes with real-world, human-generated data, there is a great need for ways to distinguish between the two, as well as to generally make sense of the informational mess.
While writing about the risks arising from generating synthetic data with potential malicious uses, I realized I lacked the proper vocabulary to effectively describe "a dataset that was intentionally constructed to mislead". I tried looking for a suitable adjective to describe such a dataset and, having failed, asked GPT4 to help me find and coin a new word for it.
The conversation I had with the LLM was quite engaging at the time, and on reflection there still seemed to be some valuable takeaways, so I decided to do this little writeup about it. (Read the original conversation only at your own risk, haha!)
Inftoxicity and related terms
After some brainstorming from the GPT, I asked it to elaborate on the word "inftoxic" as an adjective describing data, information sources, systems, actors, or actions that were created with, or are acting with, malicious intent to damage someone or something, or to manipulate or mislead in a negative way.
This would be in contrast to "biased", which describes something or someone arriving at incorrect or misleading conclusions, yet not necessarily producing a negative outcome or resulting from malicious intent.
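To make these distinctions concrete, here is a minimal, purely illustrative Python sketch. The field names and classification rules are my own attempt to encode the definitions above (and the contrast with "malinformation"), not anything produced in the original conversation:

```python
from dataclasses import dataclass

@dataclass
class InfoItem:
    """A piece of information, reduced to the three properties discussed above."""
    is_false: bool        # is the content factually incorrect?
    is_intentional: bool  # was the falsehood deliberately introduced?
    is_harmful: bool      # is it meant to damage, manipulate, or mislead?

def classify(item: InfoItem) -> str:
    """Rough classification following the definitions in this post."""
    if item.is_false and item.is_intentional and item.is_harmful:
        return "inftoxic"        # deliberately false, made to harm or mislead
    if not item.is_false and item.is_harmful:
        return "malinformation"  # true yet weaponized (the term I find unhelpful)
    if item.is_false and not item.is_intentional:
        return "biased"          # incorrect, but without malicious intent
    return "other"

# A fabricated dataset planted to mislead model training would be "inftoxic":
print(classify(InfoItem(is_false=True, is_intentional=True, is_harmful=True)))
```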
Below is a bunch of derivative words GPT4 was able to flesh out (slightly edited by me). I am intentionally leaving in all eleven it generated, both to amuse the reader and to illustrate this interesting capability for creating novel words and definitions. (A toy sketch of what an "inftoximeter" might do in practice follows the list.)
Inftoxic - adjective
Inftox - noun, invariable
Inftoxicity - noun, singular (plural: inftoxicities)
Inftoxinator - noun, singular (plural: inftoxinators)
Inftoxify - verb
Inftoxifiable - adjective
Inftoxication - noun, singular (plural: inftoxications)
Inftoximeter - noun, singular (plural: inftoximeters)
Inftoxology - noun, singular (plural: inftoxologies)
Inftoxical - adjective
Inftoxosphere - noun, singular (plural: inftoxospheres)
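Of the coinages above, "inftoximeter" is the one that most invites implementation. Below is a toy, hypothetical sketch of what one might look like: a function that asks an LLM to rate a text for inftoxicity. The prompt wording, the 0-10 scale, and the model name are all my assumptions; nothing like this was specified in the conversation with GPT4.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def inftoximeter(text: str) -> int:
    """Return a crude 0-10 'inftoxicity' reading for the given text."""
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name, swap in whatever you use
        messages=[
            {
                "role": "system",
                "content": (
                    "Rate how 'inftoxic' the following text is, i.e. how likely "
                    "it is to be false information intentionally created or "
                    "spread to mislead or cause harm. "
                    "Reply with a single integer from 0 to 10."
                ),
            },
            {"role": "user", "content": text},
        ],
    )
    return int(response.choices[0].message.content.strip())

# Example usage (requires an API key):
# print(inftoximeter("BREAKING: scientists confirm the moon is made of cheese!"))
```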
Implications and further directions
I was part amused, part impressed, and part intrigued by the new terminology above, which I was able to generate within minutes, yet which seemed potentially quite useful to the world at large (especially as LLM adoption becomes more widespread, and as ByteDance is reportedly using GPT4 to create synthetic datasets for training its own LLM).
I continued the conversation by asking about potential strategies for spreading this terminology, so that there would be more awareness of information purposefully made or disseminated for malicious ends. I am not fully convinced I want to spread this terminology, but there is one approach suggested by GPT4 that I particularly liked:
"Spot the Inftox" Challenge
So, after getting all the way through my first post here, do you think this is worth spreading around? Want to help kick off the #SpotTheInftox challenge? Let me know in the comments. 🙏