If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.
I was writing a Markov text generator yesterday, and happened to have a bunch of corpora made up of Less Wrong comments lying around from a previous toy project. This quickly resulted in the Automated Gwern Comment Generator, and then the Automated sixes_and_sevens Comment Generator.
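For anyone who wants to play along, here's a minimal sketch of the kind of word-level Markov generator involved (the corpus filename is hypothetical, and a real version would want better tokenisation than str.split):

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each `order`-word prefix to the words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        chain[prefix].append(words[i + order])
    return chain

def generate(chain, length=300):
    """Random-walk the chain, restarting at a fresh prefix on dead ends."""
    prefix = random.choice(list(chain))
    out = list(prefix)
    for _ in range(length):
        followers = chain.get(prefix)
        if not followers:  # dead end: jump somewhere new
            prefix = random.choice(list(chain))
            out.extend(prefix)
            continue
        out.append(random.choice(followers))
        prefix = tuple(out[-len(prefix):])
    return " ".join(out)

# Hypothetical corpus: one user's comments concatenated into a text file.
with open("gwern_comments.txt") as f:
    print(generate(build_chain(f.read(), order=2)))
```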
Anyone who's ever messed around with text generated from simple Markov processes (or taken the time to read the content of some of their spam messages) will be familiar with the hilarious, and sometimes strangely lucid, garbage they come out with. Here is AutoGwern:
(I should point out that I'm picking on Gwern because his contribution to LW means I have an especially massive text file with his name on it.)
Here's some AutoMe:
I have become weirdly fascinated by these. Although they ditch any comprehensible meaning, they preserve distinctive tics of writing style and vocabulary, and in doing so they preserve elements of tone and sentiment. Without any meaningful content to try to parse, it's a lot easier to observe what the written style feels like.
On a less introspective note, it's also striking how dumping out ~300 words maintains the surface characteristics of Less Wrong posts, like edits at the end, or spontaneous patches of Markov-generated rot13.
Also Yvain could happen when I feel like my brain compels me to be the money pump.
I did this a while back using cobe: dumped in gwern.net, all my IRC logs, LW comments etc. It seems that the key is to set the depth fairly high to get more meaningful language out, but then it works eerily well: with some curation, it really did sound like me! (As long as you were reading quickly and weren't trying to extract real meaning out of it. I wonder if a recurrent neural network could do even better?) You can see some at http://pastebin.com/tPGL300J - I particularly like
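To make the point about depth concrete with the toy generator sketched earlier (not cobe's actual API): "depth" corresponds to the Markov order, i.e. how many preceding words the next word is conditioned on. A rough comparison, continuing from that sketch:

```python
# Continuing from the build_chain/generate sketch above; same hypothetical corpus.
with open("gwern_comments.txt") as f:
    text = f.read()

for order in (1, 2, 4):
    sample = generate(build_chain(text, order=order), length=60)
    print(f"--- order {order} ---\n{sample}\n")
```

Order 1 produces word salad; higher orders reproduce longer verbatim runs from the corpus, which reads far more fluently but shades into outright quotation if the corpus is small, hence the need for some curation.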