I actually did not read the linked thread until now; I came across this post from the front page and thought this was a potentially interesting challenge.
Regarding "in the context of the fiction", I think this piece of data is way too human to be convincing. The noise is effectively a gotcha: "sprinkle some /dev/random into the data".
Why sample with 24 bits of precision if the source image only has 8 bits? And it shows. Then why add fewer than 11 bits of noise, and uniform noise at that? It could work well if you had a 16-bit lossless source image.
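To put numbers on that (the constants come from the generator I work out in my other comments):

```python
# 65793 == 0x010101: the scaling just repeats the source byte three times,
# so the 8-bit structure survives intact in the top bits of each sample.
assert 255 * 65793 == 2**24 - 1

# The noise only spans 1677 values, fewer than 2**11 == 2048, so it
# barely reaches past the low 11 bits of each 24-bit sample.
assert 1677 < 2**11
```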
Morning progress so far:
I figured out how the values (and the noise) are generated.
The source image is an 8-bit-per-channel color image; the source pixel value is chosen from one of the color channels using a Bayer filter, with a blue filter at (0, 0).
The final value is given by clamp(0, 2**24-1, source_value * 65793 + uniform(0, 1676)), where uniform(x, y) is a uniformly chosen random value between x and y inclusive.
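In Python, the whole pipeline looks roughly like this (bayer_channel and encode_sample are my names, and I'm assuming the standard BGGR layout implied by a blue filter at (0, 0)):

```python
import random

def bayer_channel(x, y):
    """Which channel the Bayer mosaic samples at (x, y), blue filter at (0, 0)."""
    if x % 2 == 0 and y % 2 == 0:
        return "B"
    if x % 2 == 1 and y % 2 == 1:
        return "R"
    return "G"

def encode_sample(source_value):
    """Scale an 8-bit sample to 24 bits, add the noise, clamp."""
    # 65793 == 0x010101, so 255 maps exactly to 2**24 - 1 before noise.
    v = source_value * 65793 + random.randint(0, 1676)  # uniform(0, 1676) inclusive
    return max(0, min(2**24 - 1, v))
```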
Without cracking the noise source, the best we can do to encode the noise itself is 465255 bytes.
...because there are 347475 pixels in the image, and each noise value carries log2(1677) ≈ 10.71 bits.
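A quick sanity check of that bound, assuming an ideal entropy coder:

```python
import math

PIXELS = 347475
NOISE_VALUES = 1677  # uniform(0, 1676) has 1677 equally likely outcomes

bits = PIXELS * math.log2(NOISE_VALUES)  # ~10.71 bits of noise per pixel
print(math.ceil(bits / 8))               # 465255
```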
I have further discovered that, in the bulk of the data, the awkward seventh bit is not in fact part of the values in the image; it is a parity bit. My analysis was confused by counting septets from the beginning of the file, which unfortunately seems to be incorrect.
Analyzing bi-gram statistics on the septets helped me figure out that what I previously believed to be the 'highest' bit is in fact the lowest bit of the previous value, and that bit always makes the parity of the septet even.
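The boundary scan looks roughly like this (a sketch, assuming the file has already been unpacked into a flat list of bits; the names are hypothetical, not from my actual script):

```python
def even_parity_fraction(bits, offset):
    """Fraction of 7-bit groups with even parity, grouping from `offset`."""
    groups = [bits[i:i + 7] for i in range(offset, len(bits) - 6, 7)]
    return sum(sum(g) % 2 == 0 for g in groups) / len(groups)

# The correct septet boundary is the phase where (nearly) every group
# comes out even, e.g.:
# best = max(range(7), key=lambda o: even_parity_fraction(bits, o))
```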
I was trying to ignore the header for now, after failing to find values corresponding…
I sat down before going to bed and believe I have made some more progress.
I experimented with what I called the sign bit in the earlier post, and I'm now certain I got it wrong. By ignoring the sign bit, I can reconstruct a much higher-fidelity image. I can also do a non-obvious operation: rotating the bit to the least significant place after inverting it. I can't visually distinguish the two approaches, though.
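Both candidate transforms, sketched under the assumption of 16-bit samples (the width is my guess, and the names are just illustrative):

```python
WIDTH = 16               # assumed sample width
MASK = (1 << WIDTH) - 1
TOP = 1 << (WIDTH - 1)

def drop_sign_bit(v):
    """Option 1: simply ignore the suspect top bit."""
    return v & ~TOP

def invert_and_rotate(v):
    """Option 2: invert the top bit, then rotate it down to the least significant place."""
    top = (v >> (WIDTH - 1)) & 1
    return ((v << 1) & MASK) | (top ^ 1)
```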
I wrote a naive debayering filter and got this image out: https://i.imgur.com/e5ydBTb.png (bit-rotated version, 16-bit color RGB; red channel on even rows…
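For reference, a naive demosaic along those lines, collapsing each 2x2 Bayer cell into one half-resolution RGB pixel (a sketch assuming numpy, even image dimensions, and the blue-at-(0, 0) layout):

```python
import numpy as np

def naive_debayer(mosaic):
    """Collapse each 2x2 cell of a blue-at-(0,0) Bayer mosaic into one RGB pixel.

    Assumes even image dimensions; output is (H/2, W/2, 3).
    """
    b  = mosaic[0::2, 0::2]                     # blue samples
    g1 = mosaic[0::2, 1::2].astype(np.uint32)   # green on blue rows
    g2 = mosaic[1::2, 0::2].astype(np.uint32)   # green on red rows
    r  = mosaic[1::2, 1::2]                     # red samples
    g  = ((g1 + g2) // 2).astype(mosaic.dtype)  # average the two greens
    return np.dstack([r, g, b])
```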
I believe I've gotten off to a good start analyzing the file, although I'm currently slightly stuck on the exact details of the encoding.
Spoilers ahead for people who want to try things entirely by themselves.
My initial finding is that the raw file easily compresses from 2100086 bytes to 846149 bytes with zpaq -m5; it's probably hard to beat its context modeling with other compressors, or even with manual implementations.
I wrote a Python script to reverse the intern's transformation and analyzed the bits; it looks like the file is largely organized into septets of data. I dumped…
The button shows up for me despite low karma. I looked through the client-side code and found this snippet:
This …