sixes_and_sevens comments on Open Thread, May 19 - 25, 2014 - Less Wrong Discussion

2 Post author: somnicule 19 May 2014 04:49AM

Comment author: sixes_and_sevens 19 May 2014 05:02:36PM 4 points

A while ago I mentioned how I'd set up some regexes in my browser to alert me to certain suspicious words that might be indicative of weak points in arguments.

I still have this running. It didn't have the intended effect, but it is still slightly more useful than it is annoying. I keep on meaning to write a more sophisticated regex that can somehow distinguish the intended context of "rather" from unintended contexts. Natural language is annoying and irregular, etc., etc.

Just lately, I've been wondering if I could do this with more elaborate patterns of language. It's recently come to my attention that expressions of the form "in saying [X] (s)he is [Y]" are often indicative of sketchy value-judgement attribution. This form is also very easy to capture with a regex. It's gone in the list.
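For anyone wanting to try something similar, here is a minimal sketch of what such a pattern might look like in Python. It is not necessarily the regex in the list mentioned above: the name IN_SAYING, the pronoun list and the 200-character cap on [X] are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for "in saying <X> he/she/they is/are/was/were ...".
# <X> is matched lazily and capped at 200 characters (an arbitrary choice)
# so a match cannot sprawl across a whole paragraph.
IN_SAYING = re.compile(
    r"\bin saying\b(?P<claim>.{1,200}?)\b(?:he|she|they)\s+(?:is|are|was|were)\b",
    re.IGNORECASE | re.DOTALL,
)

def find_in_saying(text):
    """Return every matched span so it can be highlighted for the reader."""
    return [m.group(0) for m in IN_SAYING.finditer(text)]
```

The match stops at the verb, so [Y] itself is not captured. That is enough for raising an alert, which is all the original setup does.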

So, my question: what patterns of language are (a) indicative of sloppy thinking, weak arguments, etc., and (b) reliably captured by a regex?

(In the back of my mind, I am imagining some sort of sanity-equivalent of a spelling and grammar check that you can apply to something you've just written, or something you're about to read. This is probably one of those projects I will start and then abandon, but for the time being it's fun to think about.)

Comment author: TsviBT 20 May 2014 03:16:42PM 1 point

"[...]may be the case[...]"

Sometimes this phrase is harmless, but sometimes it is part of an important enumeration of possible outcomes/counterarguments/whatever. If "the case" does not come with either a solid plan/argument or an explanation of why it is unlikely or unimportant, then it is often there to make the author and/or the audience feel like all the bases have been covered. E.g.,

We should implement plan X. It may be the case that [important weak point of X], but [unrelated benefit of X].
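A rough sketch of how this might be caught in Python. The phrase itself is easy to match. The extra check for a ", but" later in the same sentence is an assumption added here as a crude proxy for the concede-then-pivot shape in the example, and no regex can tell whether the benefit that follows is actually unrelated. The names and the naive sentence splitter are illustrative only.

```python
import re

MAY_BE_THE_CASE = re.compile(r"\bmay be the case\b", re.IGNORECASE)
# Assumption: a ", but" after the phrase marks a concession followed by a pivot.
PIVOT = re.compile(r",\s*but\b", re.IGNORECASE)

def flag_concessions(text):
    """Return (sentence, has_pivot) for each sentence containing the phrase."""
    results = []
    # Naive split: break after ., ! or ? when followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        m = MAY_BE_THE_CASE.search(sentence)
        if m:
            # Only look for the pivot after the phrase, not before it.
            has_pivot = PIVOT.search(sentence, m.end()) is not None
            results.append((sentence, has_pivot))
    return results
```

Sentences flagged with has_pivot set are the ones worth rereading first. The others are more likely to be the harmless uses.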

Comment author: satt 26 May 2014 08:36:09PM 1 point

The pair "tend to always" or "always tend to". Sometimes they come off to me as a way to exploit the rhetorical force of "always" while committing only to a hedged "tend to", in which case they can condense a two-step of terrific triviality into three words. There are likely other phrases that can provide plausibly deniable pseudo-certainty but I can't think of any.

More generally, the Unix utility diction tries to pick out "frequently misused, bad or wordy diction", which is a kinda related precedent.
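The "tend to always" / "always tend to" pair is about as regex-friendly as it gets. A minimal Python sketch follows. The inflected forms (tends, tended, tending) are an assumption added here, as is the function name.

```python
import re

# Matches "tend(s/ed/ing) to always" and "always tend(s/ed/ing) to".
# \s+ between words lets the phrase survive a line break in the text.
HEDGED_ALWAYS = re.compile(
    r"\b(?:tend(?:s|ed|ing)?\s+to\s+always|always\s+tend(?:s|ed|ing)?\s+to)\b",
    re.IGNORECASE,
)

def find_hedged_always(text):
    """Return each occurrence of the hedged-always pair as it appears in text."""
    return [m.group(0) for m in HEDGED_ALWAYS.finditer(text)]
```

A plain "tend to" with no "always" nearby is deliberately left alone, since that is just an honest hedge.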

Comment author: sixes_and_sevens 26 May 2014 11:12:15PM 1 point

"two-step of terrific triviality"

When they come in the form of portentous pronouncements, Daniel Dennett calls these "deepities": ambiguous expressions with one meaning that is trivially true but unimportant, and another that is obviously false but would be earth-shatteringly significant if it were true.

Also related in cold reading is the Rainbow Ruse.

Comment author: moridinamael 19 May 2014 06:53:27PM * 1 point

I had the notion a while ago to write a linter for tasks beyond code correctness, one that automatically detects the desired features in a plethora of objects. Kudos on actually doing it, and in a not hare-brained fashion.

Comment author: Punoxysm 20 May 2014 03:38:26AM 0 points

As a former Natural Language Processing researcher, I can say the technology definitely exists. Using general vocabulary combined with many (semi-manually generated) regexes to pick out argumentative or weaselly sentences with decent accuracy should be doable. It could improve over time if you fed it exemplar sentences you came across.
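One way to read "general vocabulary combined with regexes" is as a single sparse feature vector per sentence: ordinary word counts, plus one extra feature for each hand-written pattern that fires. A hedged Python sketch of that reading follows. The two patterns, the "w:" and "rx:" key prefixes and the tokenizer are all assumptions for illustration, not a description of any particular NLP system.

```python
import re
from collections import Counter

# Hypothetical hand-written patterns; both names and regexes are illustrative.
PATTERNS = {
    "rx:may_be_the_case": re.compile(r"\bmay be the case\b", re.I),
    "rx:hedged_always": re.compile(r"\btend to always\b|\balways tend to\b", re.I),
}

def features(sentence):
    """Sparse features: word counts plus a hit count per regex that fires."""
    # Crude tokenizer: runs of lowercase letters and apostrophes.
    feats = Counter("w:" + w for w in re.findall(r"[a-z']+", sentence.lower()))
    for name, rx in PATTERNS.items():
        hits = len(rx.findall(sentence))
        if hits:
            feats[name] = hits
    return feats
```

Exemplar sentences collected over time would then be run through features() and labelled by hand, and the resulting vectors handed to whatever classifier is being trained.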

Comment author: sixes_and_sevens 20 May 2014 11:38:10AM * 2 points

Do you have a recommendation for a good language-agnostic text / reference resource on NLP?

ETA: my own background is as a professional programmer with a reasonable (undergrad) grounding in statistics. I've dabbled with machine learning (I'm in the process of developing this as a skill set) and messed around with Python's nltk. I'd like a broader conceptual overview of NLP.

Comment author: Punoxysm 20 May 2014 08:13:48PM 2 points

I'd recommend this book for a general overview: http://nlp.stanford.edu/fsnlp/

However, steps like parsing are unnecessary for many tasks. A simple classifier on a sparse vector of word counts can be quite effective as a starting point in classifying sentence/document content.
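To make the "simple classifier on a sparse vector of word counts" concrete, here is a minimal sketch using a perceptron in plain Python. The perceptron is just one possible choice of simple classifier, picked here because it fits in a few lines. The four training sentences and their labels are made up for illustration and are far too few to learn anything real.

```python
import re
from collections import Counter, defaultdict

def bag_of_words(text):
    """Sparse word-count vector: only words that occur get an entry."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def train_perceptron(examples, epochs=10):
    """examples: list of (text, label) pairs with label +1 or -1."""
    weights, bias = defaultdict(float), 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = bag_of_words(text)
            score = bias + sum(weights[w] * c for w, c in x.items())
            if label * score <= 0:  # misclassified (or undecided): nudge weights
                for w, c in x.items():
                    weights[w] += label * c
                bias += label
    return weights, bias

def predict(model, text):
    """Return +1 (flag as weaselly) or -1 (leave alone)."""
    weights, bias = model
    x = bag_of_words(text)
    score = bias + sum(weights.get(w, 0.0) * c for w, c in x.items())
    return 1 if score > 0 else -1

# Made-up toy data: +1 = weaselly, -1 = plain statement.
TOY_DATA = [
    ("it may be the case that costs rise but it looks modern", 1),
    ("people always tend to agree with this", 1),
    ("the test suite ran in four seconds", -1),
    ("the file has two hundred lines", -1),
]
```

Usage would be model = train_perceptron(TOY_DATA) followed by predict(model, some_sentence). With real data, a library implementation of logistic regression or naive Bayes would be the more sensible route.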