I'll be brief, omit needless words.
Intelligence is prediction is compression because
Compression is finding a code that makes the data shorter
And codeword lengths are probabilities
So codes are probability distributions
But probability distributions are prediction strategies.
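
A minimal sketch of the middle two steps, assuming a toy four-symbol alphabet with dyadic probabilities so the arithmetic is exact. The alphabet and the helper names (`code_lengths`, `implied_probs`, `kraft_sum`) are made up for illustration. A symbol with probability p gets a codeword of about -log2(p) bits, and going the other way, a codeword of length l implies probability 2^-l.

```python
import math

def code_lengths(probs):
    """Shannon code lengths: probability p maps to ceil(-log2 p) bits."""
    return {s: math.ceil(-math.log2(p)) for s, p in probs.items()}

def implied_probs(lengths):
    """Reverse direction: a codeword of length l implies probability 2**-l."""
    return {s: 2.0 ** -l for s, l in lengths.items()}

def kraft_sum(lengths):
    """Kraft sum; a prefix code with these lengths exists iff this is <= 1."""
    return sum(2.0 ** -l for l in lengths.values())

# Hypothetical toy distribution (an assumption, not from the comment above).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

lengths = code_lengths(probs)
expected_bits = sum(probs[s] * lengths[s] for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())

print(lengths)                 # bits per symbol
print(kraft_sum(lengths))      # how much of the "probability budget" the code uses
print(expected_bits, entropy)  # average code length vs. entropy
```

With dyadic probabilities the round trip is lossless: the lengths recover the original distribution, and the average code length equals the entropy. For other distributions the ceiling costs less than one extra bit per symbol on average.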
Did you really need to say that you'd be brief? Wasn't it enough to say that you'd omit needless words? :)
But then he'd lose the Strunk and White allusion.