"Our model significantly outperforms a competitive baseline and generates funny jokes 16% of the time, compared to 33% for human-generated jokes."
From this paper:
Unsupervised joke generation from big data
Sasa Petrovic and David Matthews
The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013)
Sofia, Bulgaria, August 4-9, 2013
Abstract
Humor generation is a very hard problem. It is difficult to say exactly what makes a joke funny, and solving this problem algorithmically is assumed to require deep semantic understanding, as well as cultural and other contextual cues. We depart from previous work that tries to model this knowledge using ad-hoc manually created databases and labeled training examples. Instead we present a model that uses large amounts of unannotated data to generate "I like my X like I like my Y, Z" jokes, where X, Y, and Z are variables to be filled in. This is, to the best of our knowledge, the first fully unsupervised humor generation system. Our model significantly outperforms a competitive baseline and generates funny jokes 16% of the time, compared to 33% for human-generated jokes.
From The Register:
It uses 2,000,000 noun-adjective pairs of words to draw up jokes "with an element of surprise", something the creators claim is key to good comedy.
...
Jokes calculated by the software include:
- I like my relationships like I like my source code... open
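To make the template concrete, here is a toy sketch of filling in "I like my X like I like my Y, Z" from noun-adjective pair counts. It is not the paper's model: the counts are made up for illustration, and the scoring rule (the product of the two co-occurrence counts) and the names `PAIR_COUNTS`, `shared_adjectives`, and `make_joke` are assumptions of this sketch.

```python
from collections import Counter

# Hypothetical toy counts of (noun, adjective) pairs; not data from the paper.
PAIR_COUNTS = Counter({
    ("relationships", "open"): 5,
    ("relationships", "long"): 7,
    ("source code", "open"): 8,
    ("source code", "clean"): 6,
    ("coffee", "strong"): 9,
})


def shared_adjectives(x, y, counts):
    """Return adjectives seen with both nouns, highest score first.

    The score is count(x, adj) * count(y, adj), an assumed stand-in
    for whatever the real system uses.
    """
    scores = {}
    for (noun, adj), count_x in counts.items():
        if noun != x:
            continue
        count_y = counts.get((y, adj), 0)
        if count_y > 0:
            scores[adj] = count_x * count_y
    return sorted(scores, key=scores.get, reverse=True)


def make_joke(x, y, counts):
    """Fill the template with the best shared adjective, or return None."""
    candidates = shared_adjectives(x, y, counts)
    if not candidates:
        return None
    return f"I like my {x} like I like my {y}... {candidates[0]}"


print(make_joke("relationships", "source code", PAIR_COUNTS))
```

With these toy counts, "open" is the only adjective attached to both nouns, so it is the one that gets picked. A pair of nouns with no shared adjective yields no joke at all.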
They get bonus points for their metrics of LOcal Log-likelihood (aka LOL-likelihood) and Rank OF Likelihood (aka ROFL).
They also get demerits for not discussing the error bars on their estimates, given that they had only five testers.
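For a sense of what those error bars might look like, here is a rough sketch using a Wilson score interval for a proportion. The sample sizes are hypothetical, since nothing quoted above says how many jokes were rated, and treating every rating as an independent trial is a simplifying assumption that five raters judging many jokes would not really satisfy.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical sample sizes; the 16% rate is the figure quoted from the abstract.
for n in (50, 200, 1000):
    low, high = wilson_interval(round(0.16 * n), n)
    print(f"n={n}: {low:.3f} to {high:.3f}")
```

The interval narrows as the number of ratings grows, which is why the number of rated jokes matters as much as the number of testers.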