keefe comments on What are you working on? April 2012 - Less Wrong

1 Post author: David_Gerard 01 April 2012 06:40PM


Comments (60)


Comment author: keefe 10 April 2012 12:32:05PM 2 points

I would start with something like the Reuters API, WordNet (http://wordnet.princeton.edu/), and some research on these guys: http://pdos.csail.mit.edu/scigen/. This is a fairly well-studied problem among spammers, so I'd also look there.

Comment author: imonroe 11 April 2012 05:04:40PM 0 points

Thanks for the tips! I've been playing with the Alchemy API for NLP (http://www.alchemyapi.com/) and an API called DayLife (http://developer.daylife.com/) for news sources, etc.

I'm trying to do my best to make it as un-spammy as possible, but how far I can get with that remains to be seen. I have a plan to take advantage of the inverted pyramid story structure so common in news reporting, along with entity extraction on the paragraph level, to get something out of it that's more or less readable. I'll post an example when my prototype works.
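To make the plan above concrete, here is a rough sketch of the idea, not the actual prototype. It assumes the inverted pyramid means the first paragraph of each article carries the key facts, and it uses a naive capitalized-word matcher as a stand-in for a real entity extractor such as the Alchemy API. All function names (`extract_entities`, `lead_paragraph`, `summarize`) are hypothetical.

```python
import re
from collections import Counter

# Capitalized words that are usually not entities (tiny illustrative list).
STOPWORDS = {"The", "A", "An", "In", "On", "At", "It", "This", "That"}


def extract_entities(paragraph):
    """Naive stand-in for a real NER service: runs of capitalized words."""
    entities, current = set(), []
    # Punctuation is tokenized too, so it breaks a run of capitalized words.
    for token in re.findall(r"[A-Za-z]+|[^\sA-Za-z]", paragraph):
        if token[0].isupper() and token not in STOPWORDS:
            current.append(token)
        else:
            if current:
                entities.add(" ".join(current))
            current = []
    if current:
        entities.add(" ".join(current))
    return entities


def lead_paragraph(article):
    """Inverted pyramid assumption: the first non-empty paragraph is the lead."""
    for paragraph in article.split("\n\n"):
        if paragraph.strip():
            return paragraph.strip()
    return ""


def summarize(articles, top_n=1):
    """Keep the leads that mention the most frequently shared entities."""
    leads = [lead_paragraph(a) for a in articles]
    per_lead = [extract_entities(p) for p in leads]
    counts = Counter(e for ents in per_lead for e in ents)
    top = {e for e, _ in counts.most_common(top_n)}
    return [p for p, ents in zip(leads, per_lead) if ents & top]


if __name__ == "__main__":
    articles = [
        "Acme Corp bought Widget Inc on Monday.\n\nAnalysts were surprised.",
        "Shares of Acme Corp rose after the deal.\n\nMore details later.",
        "Rain is expected in Boston this week.\n\nBring an umbrella.",
    ]
    for line in summarize(articles):
        print(line)
```

Stitching the selected leads together is only a starting point; a real entity extractor and some sentence-level smoothing would presumably be needed to get readable output.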