Is there a useful heuristic for detecting rationally challenged texts (Web pages, forum posts, Facebook comments) that takes relatively superficial attributes, such as formatting choices and spelling errors, as input? Something a casual Internet reader could use to spot possibly unworthy content, so they can suspend belief and research the matter further. Let's call them "text smells" (analogous to code smells), like:
- too much emphasis in text (ALL CAPS, bold, color, exclamations, etc.);
- walls of text;
- little concrete data/links/references;
- too much irrelevant data and references;
- poor spelling and grammar;
- obvious half-truths and misinformation.
Since many crackpots, pseudoscientific con artists, and conspiracy theorists seem to have cleaned up their Web sites in recent years, I wonder whether these low-cost baloney detection tools might still be of real value. Does anyone know of any studies or analyses of the correlation between these basic metrics and the actual quality of the content? Can you think of other smells typical of Web baloney?
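To make the idea concrete, here is a rough sketch of how the more mechanical smells from the list (emphasis, walls of text, lack of links) might be measured. The function name `text_smells`, the metric names, and the three-letter cutoff for ALL CAPS words are my own assumptions for illustration, not an established method, and no thresholds are implied. Spelling, grammar, and half-truths are not covered, since they need more than surface counting.

```python
import re

def text_smells(text: str) -> dict:
    """Compute a few superficial 'smell' metrics (hypothetical, illustrative only)."""
    words = re.findall(r"[A-Za-z']+", text)
    # Words of 3+ letters written entirely in capitals; the cutoff is an
    # arbitrary assumption meant to skip "I", "A", and two-letter acronyms.
    caps = [w for w in words if len(w) >= 3 and w.isupper()]
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    longest = max((len(p.split()) for p in paragraphs), default=0)
    n = len(words)
    return {
        "caps_ratio": len(caps) / n if n else 0.0,
        "exclamations_per_100_words": 100 * text.count("!") / n if n else 0.0,
        "longest_paragraph_words": longest,
        "link_count": len(re.findall(r"https?://\S+", text)),
    }

if __name__ == "__main__":
    print(text_smells("WAKE UP!!! They are LYING to you!"))
```

Whether any of these numbers actually correlates with content quality is exactly the open question here; the sketch only shows that the inputs are cheap to compute.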
I think walls of text are a matter of time and place. My recollection is that Paine, Jefferson, and Franklin wrote in very long paragraphs, with very long sentences in them.
For contemporary texts written for the Web, I'd generally agree. Even if I don't think the writer is a crackpot, I'll stop reading because a wall of text is hard to track visually.