Douglas_Knight comments on K-complexity of everyday things - Less Wrong

11 Post author: cousin_it 04 December 2011 02:54PM


Comments (16)


Comment author: Douglas_Knight 05 December 2011 01:53:05AM *  8 points

If anyone is curious about regular archivers, Joyce became less compressible throughout his life. The compression ratios (compressed bytes per 100 characters) for Dubliners, Portrait, Ulysses, and Wake are 38, 38, 42, 47 with gzip -9 and 24, 24, 26, 33 with paq8l -7. LZMA and PPMd interpolate these numbers in unsurprising ways. Dubliners and Portrait seem about as compressible as other fiction in English.
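For anyone who wants to repeat this kind of measurement, here is a minimal sketch in Python. It is not the procedure used for the numbers above: it assumes the text is plain ASCII (so bytes and characters coincide), it uses the standard library's `gzip` and `lzma` modules as stand-ins for the command-line tools, and it leaves out paq8l and PPMd, which are not in the standard library. The names `bytes_per_100`, `COMPRESSORS`, and `report` are made up for illustration.

```python
import gzip
import lzma


def bytes_per_100(data: bytes, compress) -> float:
    """Return compressed size in bytes per 100 bytes of input."""
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    return 100.0 * len(compress(data)) / len(data)


# Hypothetical table of compressors; each maps raw bytes to compressed bytes.
# gzip.compress at level 9 approximates `gzip -9` (header details may differ).
COMPRESSORS = {
    "gzip -9": lambda d: gzip.compress(d, compresslevel=9),
    "lzma": lambda d: lzma.compress(d, preset=9),
}


def report(data: bytes) -> dict:
    """Compute bytes-per-100 for every compressor in COMPRESSORS."""
    return {name: bytes_per_100(data, fn) for name, fn in COMPRESSORS.items()}
```

To use it, read a novel's text file in binary mode (for example `open(path, "rb").read()`) and pass the bytes to `report`. Results will differ slightly from the command-line tools because of container headers and default settings.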

Of course, I performed these calculations using a server in Australia, where Finnegans Wake is in the public domain.

This comment was prompted by Finnegans Wake seeming like an odd choice of a novel. War and Peace is a more prototypical novel, so you probably didn't mean anything by the choice.

Can anyone suggest other hard-to-compress novels?

Comment author: cousin_it 05 December 2011 02:31:02AM *  1 point

Thanks for the pointer to paq8l. And it won the Hutter Prize too! That's funny because my post can be viewed as a comment on the relevance of the Hutter Prize.

Finnegans Wake was just my first idea for a "novel that's hard to compress".