ChristianKl comments on A pair of free information security tools I wrote - Less Wrong

17 Post author: Nanashi 11 April 2015 11:03PM

Comment author: ChristianKl 13 April 2015 08:56:31PM 2 points [-]

The "decoy" pictures are indistinguishable from any other picture on your or your recipients' camera rolls, and unless you have the passphrase, then the original image is thoroughly inaccessible.

What does "indistinguishable" mean in that sentence? Do you claim that a skilled attacker can't know that there is metadata added?

Comment author: Nanashi 13 April 2015 10:02:07PM 2 points [-]

Yes. Without the password, even a skilled attacker cannot confirm the presence of any metadata.

Comment author: ChristianKl 13 April 2015 10:38:36PM 2 points [-]

What do you mean by "confirm"? Can an attacker show that the image isn't of the type produced by the normal photo app?

Comment author: Nanashi 14 April 2015 01:17:28AM 1 point [-]

"Confirm" meaning an attacker cannot demonstrate with ~100% certainty that the image isn't of the type that could normally be found on the camera roll.

Comment author: Nornagest 13 April 2015 11:55:37PM *  3 points [-]

I don't think that attack is practical, as long as Decoy leaves the metadata alone and works only on the image data. You'd need to reproduce the inputs to a particular implementation of the image encoding exactly, which is impossible unless you're snooping the raw data -- my phone camera produces images in JPEG format (high quality, but it's still lossy compression) and does the conversion before the raw image data even leaves RAM.

If you're dealing with images originating off the device, things get both easier and more difficult. Easier because there will typically be unchanged images in the wild to compare against; more difficult because there will typically be several different copies of an image floating around, and I don't think it's practical to reconstruct every possible chain of encodings. Many popular image-hosting sites, for example, reencode everything they get their grubby little paws on. Send an image as a text, that's another reencoding. And so forth.

As I've mentioned elsewhere, though, decoy images may be statistically distinguishable from an untouched JPEG even if you can't conclusively match it to an origin or e.g. validate against its EXIF tags -- though I could be proven wrong here with the right analysis, and I'd like to be.

Comment author: Nanashi 14 April 2015 01:36:06AM 2 points [-]

Your first paragraph nails it. Unless your phone is both jailbroken and seriously compromised, there is no means of viewing the "original" version of either picture. Also, re: the second paragraph: the app forces you to take a picture with your device to use as the "Decoy"; it will not allow you to use an off-device image. (You CAN use an off-device image as the hidden picture.)

As for the statistical analysis, it's mostly irrelevant. The encoding algorithm is both reversible and published. So you can extract "Decoy data" from ANY picture that you find, Decoy or no. The only thing that will confirm it one way or the other is a successful decryption. The best you could do is say, "Based on certain telltales, there's a 10% chance this image is a Decoy" or whatever the odds may be.

Such an attack has little to no value. If you are an attacker with a specific target, isolating which pictures are decoys removes a trivial amount of entropy from the equation, especially compared to the work of trying to brute-force an AES-encrypted ciphertext.

Comment author: Nornagest 14 April 2015 05:17:05AM *  4 points [-]

As for the statistical analysis, it's mostly irrelevant. The encoding algorithm is both reversible and published. So you can extract "Decoy data" from ANY picture that you find, Decoy or no.

I understand that, and I understand that it should be impractical to decrypt the hidden image without its key given that strong attacks on AES have not yet been publicly found (key exchange difficulties, which are always considerable, aside). But I think you're being far too nonchalant about detection here. The fact that you can extract "decoy data" from any image is wholly irrelevant; it's the statistical properties of those bits that I'm interested in, and with maybe a million bits of data to play with, the bias per bit does not have to be very high for an attacker to be very confident that some kind of steganography's going on.

That does not, of course, prove that it's being used to hide anything interesting from an attacker's point of view; but that was never the point of this objection.
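The "bias per bit" point can be made concrete with a little arithmetic. Below is a minimal sketch, assuming the attacker simply counts 1-bits among n extracted low bits and compares the count against a fair coin; the function names are made up for illustration, and a real steganalysis tool would use stronger tests than this.

```python
import math

def bias_z_score(n_bits: int, bias: float) -> float:
    """Expected z-score of the 1-bit count when each bit is 1 with
    probability 0.5 + bias, tested against a fair-coin null hypothesis."""
    # Under the null, the count of ones has mean n/2 and std dev sqrt(n)/2.
    expected_excess = n_bits * bias
    null_std = math.sqrt(n_bits) / 2
    return expected_excess / null_std

def two_sided_p_value(z: float) -> float:
    """Two-sided tail probability of a standard normal at |z|."""
    return math.erfc(abs(z) / math.sqrt(2))

if __name__ == "__main__":
    # Hypothetical bias values, chosen only to show how detection scales.
    for bias in (0.0005, 0.001, 0.005):
        z = bias_z_score(1_000_000, bias)
        print(f"bias={bias}: z={z:.1f}, p={two_sided_p_value(z):.3g}")
```

The z-score grows as 2 * bias * sqrt(n), so with a million bits a bias of only half a percentage point already sits ten standard deviations from a fair coin. The same arithmetic applies whichever population carries the bias (untouched images or stego images), as long as the two differ.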

Comment author: Nanashi 14 April 2015 02:01:27PM *  0 points [-]

--moved to previous comment

Comment author: Nanashi 14 April 2015 10:33:08AM *  2 points [-]

Well, my point has never been that it's impossible for an attacker to be confident that you're using steganography. Rather, it's that an attacker cannot prove it with certainty.

The "decoy picture" aspect of the protocol is intended to provide social protection and ensure plausible deniability can be maintained. It is not intended as cryptographic protection; that is what the AES is for.

"Confidence" is only useful to an attacker when it comes to determining a target. But an attacker has to already be confident in order to perform such a test in the first place. Which means you've already been selected as a target. Furthermore they would have to compromise enough of your security to access your image data. If that happens, then the benefit of gaining further confidence is marginal at best.

Incidentally, regarding the specific details of such a detection method:

We (and the attacker) already know that the distribution of base64 characters in an AES-encrypted ciphertext is approximately random and follows no discernible pattern. We also know that the ciphertext is encoded into the last 2 bits of each 8-bit pixel. So, we can, with X amount of confidence, show that an image is not a Decoy if we extract the last 2 bits of each pixel and discover the resulting data is non-randomly distributed.

However, because it is possible for normal, non-Decoy, compressed JPEGs to exhibit a random distribution of the data in the last 2 bits of each pixel, the presence of randomness does not confirm that an image is a Decoy.

The only viable attack here would be to pull images which are "visually similar" (a trivial task by simply using Google image search), reduce them to the same size, compress them heavily, and then examine the last 2 bits of each of their pixels. If there is a significant difference in the randomness of the control images vs. the randomness of the suspected image, you could then suggest with X% confidence that the suspected image has been tampered with.

However, because it is possible for an image to be tampered with and yet NOT be a Decoy image, even then you could still not, with any legitimate amount of confidence, use such a test to state that an image is a Decoy.
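The test described above can be sketched roughly as follows. This assumes the image has already been decoded into a flat list of 8-bit sample values (the decoding step is not shown), and it checks only whether the four possible 2-bit values occur equally often, which is a much weaker notion of "random" than a full randomness test suite. The function names are hypothetical.

```python
def low_two_bits(samples):
    """Return the low 2 bits (a value 0..3) of each 8-bit sample."""
    return [s & 0b11 for s in samples]

def chi_square_uniform(symbols, k=4):
    """Pearson chi-square statistic against a uniform distribution over k values."""
    counts = [0] * k
    for s in symbols:
        counts[s] += 1
    expected = len(symbols) / k
    return sum((c - expected) ** 2 / expected for c in counts)

# Critical value for 3 degrees of freedom at the 1% level (about 11.34).
CRITICAL_1PCT_DF3 = 11.34

def looks_non_uniform(samples):
    """True if the low-2-bit frequencies deviate significantly from uniform."""
    return chi_square_uniform(low_two_bits(samples)) > CRITICAL_1PCT_DF3

if __name__ == "__main__":
    flat_region = [128, 129] * 50_000      # made-up data: only two low-bit values occur
    print(looks_non_uniform(flat_region))
```

A True result would argue against the image being a Decoy; a False result says nothing either way, for the reason given above.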

Comment author: ChristianKl 14 April 2015 12:37:47AM 1 point [-]

though I could be proven wrong here with the right analysis, and I'd like to be.

If you were to put a probability on it, how likely do you think it is that a proper security audit would prove you wrong?

Comment author: Nornagest 14 April 2015 05:04:18AM *  1 point [-]

Wrong on what count? I intended that sentence to refer only to the last paragraph of my post, and I'd expect that to be very implementation-dependent. Generally speaking, the higher the compression ratio the more perfectly random I'd expect the low bits to be -- but even at low ratios I'd expect them to be pretty noisy. I'm fairly confident that some JPEG implementations would leave distinguishable patterns when fed some inputs, but I don't have any good way of knowing how many or how easily distinguishable. To take a shot in the dark, I'm guessing there's maybe a 30% chance that an arbitrarily chosen implementation with arbitrarily chosen parameters would be easily checked in this way? That's mostly model uncertainty, though, so my error bars are pretty wide.

If we exclude that sort of statistical analysis, I'd estimate on the order of a 10 or 20% chance that Decoy images are distinguishable as such by examining metadata or other non-image traces -- but that comes almost entirely from the fact that I haven't read Nanashi's code, I'm not a JPEG expert, and security is hard. A properly done implementation should not be vulnerable to such an attack; I just don't know if this is properly done.

Comment author: Nanashi 14 April 2015 01:44:11AM 2 points [-]

.01%

Comment author: itaibn0 14 April 2015 02:16:47PM 2 points [-]

How much money are you willing to bet on that?

If the amount is less than $50,000, I suggest you just offer it all as a prize to whoever proves you wrong. The value to your reputation will be more than $5, and due to transaction costs people are unlikely to bet with you directly with less than $5 to gain.

Comment author: Nanashi 14 April 2015 02:38:57PM *  2 points [-]

I'd be willing to bet 50% of the market value of a feasible distinguishing-attack against AES. Under the condition that whoever proves me wrong discloses their method to me and only me.

In other words: a shitload. Such an attack would be far more valuable than any sum I'd possibly be able to offer.

Comment author: Nornagest 13 April 2015 09:24:18PM *  2 points [-]

Short answer is I don't know. The long answer will take a little background.

I haven't bothered to read through Decoy's internals, but this sort of steganography usually hides its secret data in the least significant bits of the decoy image. If that data is encrypted (assuming no headers or footers or obvious block divisions), then it will appear to an attacker like random bytes. Whether or not that's distinguishable from the original image depends on whether the low bits of the original image are observably nonrandom, and that's not something I know offhand -- although most images will be compressed in some fashion and a good compression scheme aims to maximize entropy, so that's something. And if it's mostly random but it does fit a known distribution, then with a little more cleverness it should be possible to write a reversible function that fits the encrypted data into that distribution.

It will definitely be different from the original image on the bit level, if you happen to have a copy of it. That could just mean the image was reencoded at some point, though, which is not unheard of -- though it'd be a little suspicious if only the low bits changed.

Comment author: khafra 07 May 2015 01:20:09PM 1 point [-]

If that data is encrypted (assuming no headers or footers or obvious block divisions), then it will appear to an attacker like random bytes. Whether or not that's distinguishable from the original image depends on whether the low bits of the original image are observably nonrandom, and that's not something I know offhand

It's super-easy to spot in a histogram, so much so that there's ongoing research into making it less detectable.
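One commonly described histogram test for low-bit replacement looks at groups of sample values that differ only in the replaced bits. Overwriting those bits with random-looking data tends to even out the counts inside each group, while untouched images usually show uneven counts. Here is a minimal sketch of that idea, assuming 2 replaced bits per 8-bit sample; it is not necessarily the exact test meant above, and the function name and the sparsity threshold are made up.

```python
from collections import Counter

def group_chi_square(samples, bits=2):
    """Chi-square statistic measuring how unevenly sample values are spread
    within each group of values that share the same upper bits.

    Returns (statistic, degrees_of_freedom)."""
    size = 1 << bits
    hist = Counter(samples)
    stat = 0.0
    dof = 0
    for base in range(0, 256, size):
        counts = [hist.get(base + i, 0) for i in range(size)]
        total = sum(counts)
        if total < 5 * size:      # skip sparsely populated groups
            continue
        expected = total / size
        stat += sum((c - expected) ** 2 / expected for c in counts)
        dof += size - 1
    return stat, dof

if __name__ == "__main__":
    # Made-up data: one heavily skewed group, as an untouched image might show.
    print(group_chi_square([100] * 1000 + [101] * 10))
```

A statistic that is not significantly larger than its degrees of freedom means the groups are about as even as random embedding would make them, which is what an analyst would treat as suspicious. A statistic far above that points to an untouched histogram.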

Comment author: Nanashi 13 April 2015 10:08:59PM *  2 points [-]

You're mostly correct. The data is encrypted and then broken into a base-4 string. The least significant base-4 digit (the last 2 bits) of each pixel is replaced, leaving 98.4% fidelity, which is higher fidelity than the compression that gets applied. Thus, in terms of image quality, the picture is indistinguishable from any other compressed image.

The encoding is deliberately reversible and also open-sourced. However, you can apply the same algorithm to any image, whether it's a decoy or not, and get a string of possibly-encrypted-data. The only confirmation that the data is meaningful would be a successful decryption which is only possible with the correct passphrase.

All that said, the fact that the picture is indistinguishable from other non-decoy images only adds a trivial amount of entropy to the encryption. An attacker who is determined to brute force their way into your pictures can simply attempt to crack every picture in your camera roll, decoy or no.
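A rough sketch of the encoding described above, for anyone who wants to see the mechanics. Everything here is an assumption made for illustration: the payload stands in for ciphertext that has already been encrypted (the AES step is omitted), the cover is a flat list of 8-bit sample values, and the digit order and layout are guesses rather than Decoy's actual format.

```python
def bytes_to_base4(data: bytes) -> list[int]:
    """Split each byte into four base-4 digits, most significant first."""
    return [(b >> shift) & 0b11 for b in data for shift in (6, 4, 2, 0)]

def base4_to_bytes(digits: list[int]) -> bytes:
    """Inverse of bytes_to_base4; any trailing partial group is ignored."""
    out = bytearray()
    for i in range(0, len(digits) - len(digits) % 4, 4):
        d = digits[i:i + 4]
        out.append((d[0] << 6) | (d[1] << 4) | (d[2] << 2) | d[3])
    return bytes(out)

def embed(samples: list[int], payload: bytes) -> list[int]:
    """Overwrite the low 2 bits of the first samples with the payload digits."""
    digits = bytes_to_base4(payload)
    if len(digits) > len(samples):
        raise ValueError("payload too large for cover")
    stego = list(samples)
    for i, d in enumerate(digits):
        stego[i] = (stego[i] & 0b11111100) | d
    return stego

def extract(samples: list[int], n_bytes: int) -> bytes:
    """Read n_bytes back out of the low 2 bits; works on ANY sample list."""
    digits = [s & 0b11 for s in samples[:n_bytes * 4]]
    return base4_to_bytes(digits)

if __name__ == "__main__":
    cover = [200] * 64                         # made-up cover samples
    secret = b"ciphertext"                     # stands in for encrypted data
    print(extract(embed(cover, secret), len(secret)))
    print(extract(cover, 4))                   # "data" from an untouched cover
```

Each sample moves by at most 3 out of 255, and keeping the upper 6 bits preserves 252/256 (about 98.4%) of each value, which may be where the fidelity figure comes from. The last line shows the point about reversibility: extraction returns bytes from an untouched cover just as readily as from a stego one.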

Comment author: Pentashagon 15 April 2015 07:48:48AM 2 points [-]

Does it change the low bits of white (0xFFFFFF) pixels? It would be a dead giveaway to find noise in overexposed areas of a photo, at least with the cameras I've used.
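To make the concern concrete, here is a minimal sketch of the kind of check I have in mind, assuming you already have the 8-bit sample values from a region you believe is blown out. The function name and the 252 threshold (upper 6 bits all set) are my own choices for illustration.

```python
def low_bit_activity(samples, floor=252):
    """Among samples at or above `floor` (upper 6 bits all ones when floor
    is 252), return the fraction that are not exactly 255."""
    saturated = [s for s in samples if s >= floor]
    if not saturated:
        return 0.0
    return sum(1 for s in saturated if s != 255) / len(saturated)

if __name__ == "__main__":
    clipped = [255] * 1000                 # idealized overexposed region
    scrambled = [252, 253, 254, 255] * 250  # low 2 bits spread evenly
    print(low_bit_activity(clipped), low_bit_activity(scrambled))
```

If a camera really clips overexposed areas to pure white, the fraction stays near zero, while uniformly random low bits would push it toward three quarters. Whether a given camera and JPEG pipeline actually leaves such areas clean is the open question.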

Comment author: Nanashi 15 April 2015 11:05:08AM 3 points [-]

It does. Taking a picture of a solid white or black background will absolutely make it easier for an attacker with access to your data to be more confident that steganography is at work. That said, there are some factors that mitigate this risk.

  1. The iPhone's camera, combined with its JPG compression, inserts noise almost everywhere. This is far from exhaustive but in a series of 10 all-dark and 10 all-bright photos, the noise distribution of the untouched photos was comparable to the noise distribution of the decoy. Given that I don't control either of these, I'm not counting on this to hold up forever.

  2. The app forces you to take a picture (and disables the flash) rather than use an existing one, lessening the chances that someone uses a noiseless picture. Again though, someone could still take a picture of a solid black wall.

Because of this, the visual decoy aspect of it is not meant as cryptographic protection. It's designed to lessen the chances that you will become a target. Any test designed to increase confidence in a tampered image requires access to your data which means the attacker has already targeted you in most cases. If that happens, there are other more efficient ways of determining what pictures would be worth attacking.

My original statement was that an attacker cannot confirm your image is a Decoy. They can raise their confidence that steganography is taking place. But unless a distinguishing attack against full AES exists, they can't say with certainty that the steganography at work is Decoy.

TL;DR: the decoy aspect of things is basically security through obscurity. The cryptographic protection comes from the AES encryption.

Comment author: ChristianKl 15 April 2015 02:29:05PM *  2 points [-]

The iPhone's camera, combined with its JPG compression, inserts noise almost everywhere.

The fact that it distributes noise doesn't mean that the noise is uniformly distributed. It likely doesn't put the same noise in an area which is uniformly colored and an area that isn't uniformly colored.

My original statement was that an attacker cannot confirm your image is a Decoy. They can raise their confidence that steganography is taking place. But unless a distinguishing attack against full AES exists, they can't say with certainty that the steganography at work is Decoy.

I can't say with certainty either that the sun will rise tomorrow.

Comment author: dxu 15 April 2015 03:40:56PM 1 point [-]

I can't say with certainty either that the sun will rise tomorrow.

This seems like deliberate misinterpretation of Nanashi's point. You can't say with certainty that the Sun will rise tomorrow, but you can say so with extremely high probability. An attacker can't confirm that the image is a Decoy with a probability anywhere near as high.

Comment author: Nanashi 15 April 2015 08:13:44PM 1 point [-]

Correct. I'd assign a probability of, say, 99.999999999999999999% that the sun will rise tomorrow.

If I were an attacker analyzing the noise distribution of an image, I could say with maybe 10% probability that an image has been tampered with. From there I have to further reduce the probability because there are hundreds of ways an image could have been tampered with that aren't Decoy.

Comment author: Nanashi 15 April 2015 08:56:54PM *  2 points [-]

For what it's worth, here is a sample of the noise distribution of the iPhone's JPEG compression vs. Decoy's.

(iPhone on left, Decoy on right)

http://i.cubeupload.com/ujKps6.png

(Note that these are not the same picture, because Decoy does not save or store the original version of either photo. It's two pictures where I held the iPhone very close against a wall. So there's a slight color variation)

Comment author: Lumifer 16 April 2015 05:09:32PM 2 points [-]

http://i.cubeupload.com/ujKps6.png

That's pretty useless -- what you want is to look at some statistical measures of the empirical distributions of lower-order bits in these images. See e.g. this outdated page.