Kaj_Sotala comments on PSA: Eugine_Nier evading ban? - Less Wrong Discussion
Comments (68)
A friend of mine did it via computational complexity, using gzip as an approximation of Kolmogorov complexity (KC) to attribute classical Latin literature to the respective authors: check which texts add the least additional complexity when compressed together (due to shared writing style, word choice, etc.), then cluster. Worked like a charm.
ETA: These were large bodies of text, however. Probably not gonna work for a bundle of comments, except for me, due to my overuse of "obviously", obviously.
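For anyone who wants to try the gzip idea, here is a minimal sketch in Python. It is not the friend's actual code: the normalized compression distance formula, the function names (`compressed_size`, `ncd`, `attribute`) and the toy corpus are all assumptions made for illustration, and the clustering step is swapped for a simpler nearest-author lookup.

```python
import gzip


def compressed_size(text: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 encoding of text (stand-in for KC)."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: how much extra complexity y adds to x.

    Assumed formula (not confirmed by the comment above):
    (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    Values near 0 mean the texts share a lot of structure.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def attribute(unknown: str, corpus: dict) -> str:
    """Return the corpus label whose reference text is closest to unknown by NCD."""
    return min(corpus, key=lambda author: ncd(unknown, corpus[author]))


if __name__ == "__main__":
    # Toy corpus: one famous opening line per author, repeated to give gzip
    # something to work with. Real use would need long, genuine texts.
    corpus = {
        "vergil": "arma virumque cano troiae qui primus ab oris " * 20,
        "caesar": "gallia est omnis divisa in partes tres quarum unam " * 20,
    }
    sample = "arma virumque cano troiae qui primus ab oris " * 5
    print(attribute(sample, corpus))
```

One caveat: gzip's DEFLATE only looks back 32 KB, so for very long texts the second half of the concatenation cannot "see" most of the first half. Chunking the texts or using a compressor with a larger window would be a reasonable variation.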
I thought this was pretty impressive:
[...]
[...]
The difference was one of scale. It's much easier to take three dozen(?) pieces of classical Latin literature, some of which were different parts of the same magnum opus, and then see them cluster by author and with the other parts of the same work. More of a "put the pieces into the box" task than a 100,000-piece puzzle. In the latter case, you just know most of the puzzle pieces will show either the blue sky or the blue sea, both a similar shade of blue.