I'm a little disappointed that I was able to solve it so quickly, I thought it was going to be a big multi-stage puzzle where the color values and image dimensions/row boundaries would all be important factors...
Still, my solution (in JS) was a little more verbose than yours...
<img src="puzzle2.png" id="puzzle-image">
<script>
/*
Pink (white) 232,3,138
Blue (black) 58,185,229
*/
window.addEventListener("load", function() {
var image = document.getElementById("puzzle-image"),
canvas = document.createElement("canvas"),
context = canvas.getContext("2d");
canvas.width = image.width;
canvas.height = image.height;
context.drawImage(image, 0, 0, image.width, image.height);
// Test 1 - rle
(function() {
var rle = [], prev = null, count = 0;
for (let y = 0; y < image.height; y++) {
for (let x = 0; x < image.width; x++) {
let pd = context.getImageData(x, y, 1, 1).data;
if (prev === null) {
prev = pd[0];
}
if (pd[0] == prev) {
count++;
} else {
rle.push(count);
count = 1;
prev = pd[0];
}
}
}
var r1 = "";
for (let i = 0; i < rle.length; i++) {
r1 += String.fromCharCode(rle[i]);
}
console.log(r1);
var r2 = "";
for (let i = 0; i < r1.length; i += 3) {
r2 += String.fromCharCode(parseInt(r1[i] + r1[i + 1], 16));
}
console.log(r2);
}());
}, false);
</script>
#!/bin/sh
wget https://www.mi5.gov.uk/sites/default/files/u2/puzzle.png
convert puzzle.png puzzle.xpm # Convert the image to XPM with ImageMagick.
# Here is what the contents of puzzle.xpm looks like:
# http://pastebin.com/0PySSAXx. The spaces represent one color and the dots
# the other. We'll replace the spaces with underscores for readability.
<puzzle.xpm tr ' ' _ \
| tr -d "\n" \
| sed -e 's|","||g; s|^.*\*/"||; s|".*$|| # Remove metadata, C framing.' \
| sed -e 's|_\.|_\n.|g; s|\._|.\n_|g # New line on color change. ' \
| awk '{printf "%c", length($0)}' \
| xxd -r -p
The XPM image file format (https://en.wikipedia.org/wiki/X_PixMap) is great for doing things like this. I find it easier to read and to edit by hand than the Netpbm format family. It seems ImageMagick can't write XPM2, though, which makes extra preprocessing necessary.
OK, so, the obvious "congratulations" messages aside, what's really in that file?
Example: Take the PNG, convert it to another lossless format. Then convert that lossless format back to PNG.
Different file size, right? OK, the comment accounts for about 120 bytes.. but what about the rest? Why is it that much bigger?
"As I read, numbers I see. 'Twould be a great shame not to count this art among the great texts of our time" - count this art among the great texts... it's a one time pad, isn't it.
I unpacked the chunks and there doesn't look to be much more (or it's subtle).
The IDAT chunk (which makes up most of the file) is DEFLATE-d but uses full color (rather than palettized color, which would have saved lots of space, and might be how you'd get 40% compression). It also doesn't use any row filtering.
I checked incase something was hidden in the DEFLATE encoding itself (e.g. using sub-optimal block encoding selection) but this round-trips exactly, so I don't think it can be.
I'm always of the opinion that these tests are not necessarily to find someone that can solve the puzzle, it's to find those that can solve it and aren't egotistical enough to brag that they solved it online.
The count-the-pixels thing is too obvious, and doesn't explain why this is a compressed png that's almost twice the size of what you get when you decompress it.
"'Twould be a shame not to count this art among the great texts of our time" makes me think part of the file is a one-time pad against part of some famous book (based on the phrasing, and the Britishness of MI5, probably some 19th century British literary author, maybe Dickens or Austen). Assuming that's the case, the solution would be:
1. Figure out which 'text' they're referring to, or get the text of as many books as possible;
2. Take the chunk of the file between the PNG headers and the Comment;
3. XOR that chunk against the corpus with a sliding window;
4. Score each sample based on number of english words found in the output, keeping the top 10 as you go (or do some map/reduce crap because this would take a while);
5. Read the top outputs and see if anything makes sense.
This would also explain why the file is a bit longer than you'd think it should be based on its visual contents.
If anyone wants to try it, this might be a good start:
I ran `convert -compress None puzzle.png` and the output was ~800 bytes. Clearly that's optimizing the image too, but basically the image seems wastefully large, and that makes me suspicious there's something going on there.
I have a coding challenge for them. Sadly, it's a really easy one that they missed on so far: Do a better-than-random shuffle so the did-you-know questions lower down on that page don't repeat so often.
pngtopnm converts the image to an uncompressed bytestream format (pnm)
xxd converts the bytes to hex representation
-s14 skips the pnm header
-p gives plain hex output
-c groups by 3 byte values (RGB pixel values)
uniq -c counts the run length encoding of the hex triplets
awk / printf %c converts the RLE count to an ascii value
xxd -r reverses the hex dump to the final encoded message
Thought they'd make this challenge more challenging but I reckon anyone who does solve this is pretty smart and worth hiring for them. Remember its government so they aren't getting or even looking for the best of the best... they don't pay enough for the best.
That's the comment at the end of the file, but is this even the challenge?
Because it take takes less than 5 seconds to find it, i think this is a hint not the solution.
EDIT: Use gimp to save it as ASCII portable bitmap (https://en.wikipedia.org/wiki/Netpbm_format), strip the headers and all the junk and you get a message that links to mi5.gov.uk/careers.
Somehow I doubt that's the solution either, you can get to that same link by just using the Careers menu item.
I would think/hope it's a bit more complicated than that.
Looking at the challenge one of the things that has me wondering is why do they repeat the same text twice, but with different font sizes. This suggests some kind of fraction that is somehow relevant to get further in the puzzle.
I did, my comment is about the edit. It feels that there should be more to it than that. I would at least expect the result to point to something you can't get at by just browsing around.
Perhaps I'm just hoping for more than there is to it :).
I got that far too. I don't believe it - there has to be more to it than that. There'll be something else in that file that's better hidden, for sure, and I'm pretty sure it won't be plain ASCII and hex. :)
I'm a bit disappointed there aren't any hidden chunks that take advantage of the PNG format :-(
Uncompressed it's just a binary message of 2 pixel colours, so unless that comment at the end is pointing somewhere else then it's not got much depth...
Iterate over the personal information of 5000 citizens using a for loop and violate their rights by calling the violateHumanRights() function which takes a single parameter which is the personal identification number attribute of the citizen.
3. Strip the PBM header lines and remove newlines so that it's just 0's and 1's.
4. Now feed it through this program:
Output is the following: 5. This smells like ASCII, let's decode it with tr -d - | xxd -p -r> Congratulations, you solved the puzzle! Why dont you apply to join our team? mi5.gov.uk/careers
From: https://archive.rebeccablacktech.com/g/thread/52888920#p5289...
2. Threshold so some of the pixels are black and others are white. Save as ASCII PBM.