Omid comments on Stupid Questions March 2015 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (199)
How do I make a hash? In case I'm using the wrong word, I want to encrypt a message, then put the encrypted message in a public place, then decrypt the message in a way that proves that I actually encrypted a message (I didn't just write a nonsense string and later retcon an encryption scheme that makes it meaningful).
Let's make sure we're talking about cryptographic hashing functions.
Let's say I make a prediction that Barack Obama is a werewolf. I don't want anyone to know I've made this prediction, because otherwise the Secret Werewolf Police will come and eat me, but I do want to lord it over everyone after the fact.
I take the string "Barack Obama is a werewolf" and I put it through a hashing function, (for purposes of this example, the hashing function MD5). This produces the output 37ecb0a3164e6422bedc0f8db82e45ec. The original string about Barack Obama is not recoverable from this output, because MD5 is a lossy function, but anyone else putting that string through the MD5 hashing function will get the same output. The hash output is like a fingerprint for the original string.
So if I put 37ecb0a3164e6422bedc0f8db82e45ec in a public place a year before Barack Obama transforms into a werewolf on live television, after the fact I can give people the original string and they can verify it for themselves against the hash output.
I use a Chrome plugin for most tasks that involve basic encoding of text values. (If I'm honest, I mostly use it for rot13, and more exotic uses aren't that common). There are also online hash generators for different hashing functions. Some popular hashing functions for this purpose are MD5 and SHA1.
Does this answer your question?
You understood my question better than I did. Thank you.
Is the following paragraph correct?:
If I had unlimited computing power, I could search for all the inputs that return 37ecb0a3164e6422bedc0f8db82e45ec from the MD5 function. Then I could search those inputs and see which ones are meaningful sentences in English, and then make an educated guess at what the message is. But in reality, it would take too much computing power to find even a single string that returns 37ecb0a3164e6422bedc0f8db82e45ec.
It's not quite correct, but you've broadly got the thrust of it.
When two different inputs produce the same output from a hashing function, this is called a hash collision. Finding collisions in the SHA256 hashing function is part of how bitcoin mining works. It is very computationally expensive (which is kind of the point, re: bitcoin mining), but it's certainly tractable to find some input that generates the same output as another one.
If you were to try and search the space of all possible inputs for MD5, you'd quickly(ish) find an input that collided with the Obama Werewolf input, but it'd be garbage. If you had some system for quickly and reliably generating comprehensible English sentences, and just searched that space, you'd probably find a comprehensible English sentence that collided with it, but it would almost certainly not be the Obama Werewolf sentence. It's worth noting that MD5 will take an input of any length, and the space of possible comprehensible English sentences of unbounded length is basically infinite.
For short snippets of text, it's hard to find two comprehensible English sentences that collide under MD5, but in the link gjm provides above, there's a method for forcing the MD5 hash of a PDF by exploiting how MD5 works. For this reason and others, if you're doing any cryptographic heavy lifting, you probably want to use something more robust than MD5.
Really? Last I checked, the best known preimage attack against MD5 was too slow to be practical. Finding collisions is drastically easier, though I don't know any method for doing it with arbitrary plain-text English sentences.