Hacker News new | past | comments | ask | show | jobs | submit login
Coding Challenge for MI5 (mi5.gov.uk)
76 points by BerislavLopac on Sept 11, 2016 | hide | past | favorite | 37 comments



1. Open PNG in text editor, see comment:

    As I read, numbers I see. 'Twould be a shame not to
    count this art among the great texts of our time
2. Threshold so some of the pixels are black and others are white. Save as ASCII PBM.

3. Strip the PBM header lines and remove newlines so that it's just 0's and 1's.

4. Now feed it through this program:

    #include <stdio.h>
    main() {
      int c, l=0, n=1;
      while(EOF != (c=getchar())) {
        if(c == l) n++;
        else {
          putchar(n);
          n=1;
        }
        l = c;
      }
      putchar('\n');
      return 0;
    }
Output is the following:

    43-6F-6E-67-72-61-74-75-6C-61-74-69-6F-6E-73-2C-20-79-6F-75-20-73-6F-
    6C-76-65-64-20-74-68-65-20-70-75-7A-7A-6C-65-21-20-57-68-79-20-64-6F-
    6E-92-74-20-79-6F-75-20-61-70-70-6C-79-20-74-6F-20-6A-6F-69-6E-20-6F-
    75-72-20-74-65-61-6D-3F-20-6D-69-35-2E-67-6F-76-2E-75-6B-2F-63-61-72-
    65-65-72-73-20
5. This smells like ASCII, let's decode it with tr -d - | xxd -p -r

> Congratulations, you solved the puzzle! Why dont you apply to join our team? mi5.gov.uk/careers

From: https://archive.rebeccablacktech.com/g/thread/52888920#p5289...


I'm a little disappointed that I was able to solve it so quickly, I thought it was going to be a big multi-stage puzzle where the color values and image dimensions/row boundaries would all be important factors...

Still, my solution (in JS) was a little more verbose than yours...

  <img src="puzzle2.png" id="puzzle-image">
  <script>

  /*
  Pink (white) 232,3,138
  Blue (black) 58,185,229
  */

  window.addEventListener("load", function() {
    var image = document.getElementById("puzzle-image"),
      canvas = document.createElement("canvas"),
      context = canvas.getContext("2d");
    canvas.width = image.width;
    canvas.height = image.height;
    context.drawImage(image, 0, 0, image.width, image.height);
    
    // Test 1  - rle
    (function() {
      var rle = [], prev = null, count = 0;
      for (let y = 0; y < image.height; y++) {
        for (let x = 0; x < image.width; x++) {
          let pd = context.getImageData(x, y, 1, 1).data;
            if (prev === null) {
              prev = pd[0];
            }
            if (pd[0] == prev) {
              count++;
            } else {
            rle.push(count);
            count = 1;
            prev = pd[0];
          }
        }
      }
      var r1 = "";
      for (let i = 0; i < rle.length; i++) {
        r1 += String.fromCharCode(rle[i]);
      }
      console.log(r1);
      var r2 = "";
      for (let i = 0; i < r1.length; i += 3) {
        r2 += String.fromCharCode(parseInt(r1[i] + r1[i + 1], 16));
      }
      console.log(r2);
    }());
  }, false);

  </script>
(Where puzzle2.png is the image in black & white)

(Edited to fix indentation...)


You could have written this up in a Gist or PasteBin, provided a link and a spoiler message.


My solution:

  #!/bin/sh
  wget https://www.mi5.gov.uk/sites/default/files/u2/puzzle.png
  convert puzzle.png puzzle.xpm # Convert the image to XPM with ImageMagick.
  # Here is what the contents of puzzle.xpm looks like:
  # http://pastebin.com/0PySSAXx. The spaces represent one color and the dots
  # the other. We'll replace the spaces with underscores for readability.
  <puzzle.xpm tr ' ' _ \
  | tr -d "\n" \
  | sed -e 's|","||g;   s|^.*\*/"||;   s|".*$||   # Remove metadata, C framing.' \
  | sed -e 's|_\.|_\n.|g;   s|\._|.\n_|g   # New line on color change. ' \
  | awk '{printf "%c", length($0)}' \
  | xxd -r -p
The XPM image file format (https://en.wikipedia.org/wiki/X_PixMap) is great for doing things like this. I find it easier to read and to edit by hand than the Netpbm format family. It seems ImageMagick can't write XPM2, though, which makes extra preprocessing necessary.


Huh. So who benefits from you posting the solution here?


It's not like its a particularly complex puzzle. If you don't want spoilers, don't read it.


Can you spoil the latest Bourne film for me as well please?


Well done!


A few minutes fun diversion.

A bit of guess work and trial and error, and I have the following bash one-liner:

pngtopnm puzzle.png | pnmtoplainpnm | tail -n +4 | sed -e 's/232 3 138/A/g' -e 's/58 185 229/B/g' | tr -d ' \n' | sed -e 's/AB/A\nB/g' -e 's/BA/B\nA/g' | awk '/.*/ { printf "%x", length(); }' | xxd -r -p | sed -e 's/-//g' | xxd -r -p

I'm not sure the "sed -e 's/AB/A\nB/g'" bit is optimal, but it was the first thing I thought of...

...and I definitely have a tendency to use sed when I should be using tr.


OK, so, the obvious "congratulations" messages aside, what's really in that file?

Example: Take the PNG, convert it to another lossless format. Then convert that lossless format back to PNG.

Different file size, right? OK, the comment accounts for about 120 bytes.. but what about the rest? Why is it that much bigger?

"As I read, numbers I see. 'Twould be a great shame not to count this art among the great texts of our time" - count this art among the great texts... it's a one time pad, isn't it.


The alpha channel?

Edit: You're right though. Using imagemagick to clone it like so:

   convert puzzle.png puzzle2.png
results in an image that's at least 40% smaller. It might just be better compression, but there might be another layer of steganography going on.


I unpacked the chunks and there doesn't look to be much more (or it's subtle).

The IDAT chunk (which makes up most of the file) is DEFLATE-d but uses full color (rather than palettized color, which would have saved lots of space, and might be how you'd get 40% compression). It also doesn't use any row filtering.

I checked incase something was hidden in the DEFLATE encoding itself (e.g. using sub-optimal block encoding selection) but this round-trips exactly, so I don't think it can be.


"As I read, numbers I see."

If we move the comma, this becomes: "As I read Numbers, I see"...

OTP on the https://en.wikipedia.org/wiki/Book_of_Numbers, perhaps?

There may well be a lot more to this puzzle than first meets the eye!


I'm always of the opinion that these tests are not necessarily to find someone that can solve the puzzle, it's to find those that can solve it and aren't egotistical enough to brag that they solved it online.


I did HEX > ASCII > BASE64 Decode... Got this.

---

ш4ʄF ꇢʝ 0ʼnÁGۊ㩌挨ﻲn偃Y픤맀㫍zHvB塁ڟङ돒ٱ噭ۻm큔員踥;U35̆^ې5躂FTѥ憆∻bI1隤7糃1sÄàbrXÒ5㨄gP肂2똑jgQ7凮轋DnظƁए4o̤拢۪`z툃䇤1喉鴌桘 xO挎EệhʍϢU뵥f摗䪦驭y秺f¬y靭欅鞞뭡ʮꆫ婨zƛ뫶

---

With some editing out of characters...

Yan Lei who suffer㫍Qie ې Da Cheng ruin Fenghun rope䇤throat Fengchui O destroyed ʍϢU 摗Yu秺 䰢Ren Ju scabbard 뭡 Lun 뫶

---

Not sure if I'm on to something but was interesting seeing the Chinese characters.


I think there is maybe a couple more steps to it but interesting to see what you guys come up with.


The count-the-pixels thing is too obvious, and doesn't explain why this is a compressed png that's almost twice the size of what you get when you decompress it.

"'Twould be a shame not to count this art among the great texts of our time" makes me think part of the file is a one-time pad against part of some famous book (based on the phrasing, and the Britishness of MI5, probably some 19th century British literary author, maybe Dickens or Austen). Assuming that's the case, the solution would be:

1. Figure out which 'text' they're referring to, or get the text of as many books as possible;

2. Take the chunk of the file between the PNG headers and the Comment;

3. XOR that chunk against the corpus with a sliding window;

4. Score each sample based on number of english words found in the output, keeping the top 10 as you go (or do some map/reduce crap because this would take a while);

5. Read the top outputs and see if anything makes sense.

This would also explain why the file is a bit longer than you'd think it should be based on its visual contents.

If anyone wants to try it, this might be a good start:

* https://www.gutenberg.org/ebooks/author/37

* https://www.gutenberg.org/ebooks/author/68


It does seem too obvious, and the comment seems to be pointing somewhere else like you say.

What do you mean about the compressed size?

There are no hidden chunks in the file, and the image data chunk when inflated is 60147 bytes, vs 1270 compressed.


I ran `convert -compress None puzzle.png` and the output was ~800 bytes. Clearly that's optimizing the image too, but basically the image seems wastefully large, and that makes me suspicious there's something going on there.


I have a coding challenge for them. Sadly, it's a really easy one that they missed on so far: Do a better-than-random shuffle so the did-you-know questions lower down on that page don't repeat so often.


Code-golf anyone? Here's my bash one-liner:

    pngtopnm puzzle.png | xxd -s14 -p -c3 | uniq -c | awk '{printf "%c", $1}' | xxd -r -p
Explanation:

   pngtopnm converts the image to an uncompressed bytestream format (pnm)
   xxd converts the bytes to hex representation
      -s14 skips the pnm header
      -p gives plain hex output
      -c groups by 3 byte values (RGB pixel values)
   uniq -c counts the run length encoding of the hex triplets
   awk / printf %c converts the RLE count to an ascii value
   xxd -r reverses the hex dump to the final encoded message


Thought they'd make this challenge more challenging but I reckon anyone who does solve this is pretty smart and worth hiring for them. Remember its government so they aren't getting or even looking for the best of the best... they don't pay enough for the best.


Yes. It would appear that their main criteria are actually mental stability and reliability.


"As I read, numbers I see. 'Twould be a shame not to count this art among the great texts of our time"

Only took about 5 seconds to find.


That's the comment at the end of the file, but is this even the challenge? Because it take takes less than 5 seconds to find it, i think this is a hint not the solution.

EDIT: Use gimp to save it as ASCII portable bitmap (https://en.wikipedia.org/wiki/Netpbm_format), strip the headers and all the junk and you get a message that links to mi5.gov.uk/careers.


Somehow I doubt that's the solution either, you can get to that same link by just using the Careers menu item.

I would think/hope it's a bit more complicated than that.

Looking at the challenge one of the things that has me wondering is why do they repeat the same text twice, but with different font sizes. This suggests some kind of fraction that is somehow relevant to get further in the puzzle.


It's but read the edit, the solution isn't much more complicated than that either :) this is very basic steganography.


I did, my comment is about the edit. It feels that there should be more to it than that. I would at least expect the result to point to something you can't get at by just browsing around.

Perhaps I'm just hoping for more than there is to it :).


Well the full message is

"436F6E67726174756C6174696F6E732C20796F7520736F6C766564207468652070757A7A6C65212057687920646F6E927420796F75206170706C7920746F206A6F696E206F7572207465616D3F206D69352E676F762E756B2F6361726565727320"

"Congratulations, you solved the puzzle! Why don�t you apply to join our team? mi5.gov.uk/careers "

So i doubt there is much more to it ;)


I got that far too. I don't believe it - there has to be more to it than that. There'll be something else in that file that's better hidden, for sure, and I'm pretty sure it won't be plain ASCII and hex. :)


I'm a bit disappointed there aren't any hidden chunks that take advantage of the PNG format :-(

Uncompressed it's just a binary message of 2 pixel colours, so unless that comment at the end is pointing somewhere else then it's not got much depth...


That is most likely just a hint, or misdirection. I'd be greatly disappointed if it's not a multi layered puzzle.


I'm fairly certain that's just a hint.


I am very surprised by the amount of spoilers here. You ruined my fun


Clearly the horizontal span lengths , between each fixed height step, encode a useful sequence.

Numbers. Or an offset sequence. If it's a simple cipher recovery should be quick.

Somehow this whole thing reminds me of Egyptian fractions

How long will hn take to solve if we work together

Okay what if rearranging the red strips so their edges are flush encodes a transposition. Key for the cipher.

I'm betting hn will solve when post is 49 minutes old


Close enough... 43 minutes


Iterate over the personal information of 5000 citizens using a for loop and violate their rights by calling the violateHumanRights() function which takes a single parameter which is the personal identification number attribute of the citizen.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: