
Coding Challenge for MI5 - BerislavLopac
https://www.mi5.gov.uk/careers/opportunities/coding-challenge
======
guessmyname
1\. Open PNG in text editor, see comment:

    
    
        As I read, numbers I see. 'Twould be a shame not to
        count this art among the great texts of our time
    

2\. Threshold so some of the pixels are black and others are white. Save as
ASCII PBM.

3\. Strip the PBM header lines and remove newlines so that it's just 0's and
1's.

4\. Now feed it through this program:

    
    
        #include <stdio.h>
        main() {
          int c, l=0, n=1;
          while(EOF != (c=getchar())) {
            if(c == l) n++;
            else {
              putchar(n);
              n=1;
            }
            l = c;
          }
          putchar('\n');
          return 0;
        }
    

Output is the following:

    
    
        43-6F-6E-67-72-61-74-75-6C-61-74-69-6F-6E-73-2C-20-79-6F-75-20-73-6F-
        6C-76-65-64-20-74-68-65-20-70-75-7A-7A-6C-65-21-20-57-68-79-20-64-6F-
        6E-92-74-20-79-6F-75-20-61-70-70-6C-79-20-74-6F-20-6A-6F-69-6E-20-6F-
        75-72-20-74-65-61-6D-3F-20-6D-69-35-2E-67-6F-76-2E-75-6B-2F-63-61-72-
        65-65-72-73-20
    

5\. This smells like ASCII, let's decode it with tr -d - | xxd -p -r

> Congratulations, you solved the puzzle! Why dont you apply to join our team?
> mi5.gov.uk/careers

From:
[https://archive.rebeccablacktech.com/g/thread/52888920#p5289...](https://archive.rebeccablacktech.com/g/thread/52888920#p52892083)

~~~
f_allwein
Huh. So who benefits from you posting the solution here?

~~~
Retr0spectrum
It's not like its a particularly complex puzzle. If you don't want spoilers,
don't read it.

------
maffydub
A few minutes fun diversion.

A bit of guess work and trial and error, and I have the following bash one-
liner:

pngtopnm puzzle.png | pnmtoplainpnm | tail -n +4 | sed -e 's/232 3 138/A/g' -e
's/58 185 229/B/g' | tr -d ' \n' | sed -e 's/AB/A\nB/g' -e 's/BA/B\nA/g' | awk
'/.*/ { printf "%x", length(); }' | xxd -r -p | sed -e 's/-//g' | xxd -r -p

I'm not sure the "sed -e 's/AB/A\nB/g'" bit is optimal, but it was the first
thing I thought of...

...and I definitely have a tendency to use sed when I should be using tr.

------
jamesbrownuhh
OK, so, the obvious "congratulations" messages aside, what's _really_ in that
file?

Example: Take the PNG, convert it to another lossless format. Then convert
that lossless format back to PNG.

Different file size, right? OK, the comment accounts for about 120 bytes.. but
what about the rest? Why is it that much bigger?

"As I read, numbers I see. 'Twould be a great shame not to count this art
among the great texts of our time" \- count this art among the great texts...
it's a one time pad, isn't it.

~~~
Retr0spectrum
The alpha channel?

Edit: You're right though. Using imagemagick to clone it like so:

    
    
       convert puzzle.png puzzle2.png
    

results in an image that's at least 40% smaller. It might just be better
compression, but there might be another layer of steganography going on.

~~~
maffydub
I unpacked the chunks and there doesn't look to be much more (or it's subtle).

The IDAT chunk (which makes up most of the file) is DEFLATE-d but uses full
color (rather than palettized color, which would have saved lots of space, and
might be how you'd get 40% compression). It also doesn't use any row
filtering.

I checked incase something was hidden in the DEFLATE encoding itself (e.g.
using sub-optimal block encoding selection) but this round-trips exactly, so I
don't think it can be.

------
unfunco
I'm always of the opinion that these tests are not necessarily to find someone
that can solve the puzzle, it's to find those that can solve it and aren't
egotistical enough to brag that they solved it online.

------
dizrupt
I did HEX > ASCII > BASE64 Decode... Got this.

\---

ш 4ʄF ꇢʝ 0ŉÁGۊ㩌挨ﻲn偃Y픤
맀㫍zHvB塁ڟङ돒ٱ噭ۻm큔員踥;U35̆^ې5躂FTѥ憆∻bI1隤7糃1sÄàbrXÒ5㨄gP肂2똑jgQ7凮轋DnظƁए4o̤拢۪`z툃䇤
1喉鴌桘 xO挎EệhʍϢU뵥f摗䪦驭y秺f¬y靭欅鞞뭡ʮꆫ婨zƛ뫶

\---

With some editing out of characters...

Yan Lei who suffer㫍Qie ې Da Cheng ruin Fenghun rope䇤throat Fengchui O
destroyed ʍϢU 摗Yu秺 䰢Ren Ju scabbard 뭡 Lun 뫶

\---

Not sure if I'm on to something but was interesting seeing the Chinese
characters.

~~~
dizrupt
I think there is maybe a couple more steps to it but interesting to see what
you guys come up with.

------
burke
The count-the-pixels thing is too obvious, and doesn't explain why this is a
compressed png that's almost twice the size of what you get when you
decompress it.

"'Twould be a shame not to count this art among the great texts of our time"
makes me think part of the file is a one-time pad against part of some famous
book (based on the phrasing, and the Britishness of MI5, probably some 19th
century British literary author, maybe Dickens or Austen). Assuming that's the
case, the solution would be:

1\. Figure out which 'text' they're referring to, or get the text of as many
books as possible;

2\. Take the chunk of the file between the PNG headers and the Comment;

3\. XOR that chunk against the corpus with a sliding window;

4\. Score each sample based on number of english words found in the output,
keeping the top 10 as you go (or do some map/reduce crap because this would
take a while);

5\. Read the top outputs and see if anything makes sense.

This would also explain why the file is a bit longer than you'd think it
should be based on its visual contents.

If anyone wants to try it, this might be a good start:

* [https://www.gutenberg.org/ebooks/author/37](https://www.gutenberg.org/ebooks/author/37)

* [https://www.gutenberg.org/ebooks/author/68](https://www.gutenberg.org/ebooks/author/68)

~~~
estefan
It does seem too obvious, and the comment seems to be pointing somewhere else
like you say.

What do you mean about the compressed size?

There are no hidden chunks in the file, and the image data chunk when inflated
is 60147 bytes, vs 1270 compressed.

~~~
burke
I ran `convert -compress None puzzle.png` and the output was ~800 bytes.
Clearly that's optimizing the image too, but basically the image seems
wastefully large, and that makes me suspicious there's something going on
there.

------
natch
I have a coding challenge for them. Sadly, it's a really easy one that they
missed on so far: Do a better-than-random shuffle so the did-you-know
questions lower down on that page don't repeat so often.

------
McKayDavis
Code-golf anyone? Here's my bash one-liner:

    
    
        pngtopnm puzzle.png | xxd -s14 -p -c3 | uniq -c | awk '{printf "%c", $1}' | xxd -r -p
    

Explanation:

    
    
       pngtopnm converts the image to an uncompressed bytestream format (pnm)
       xxd converts the bytes to hex representation
          -s14 skips the pnm header
          -p gives plain hex output
          -c groups by 3 byte values (RGB pixel values)
       uniq -c counts the run length encoding of the hex triplets
       awk / printf %c converts the RLE count to an ascii value
       xxd -r reverses the hex dump to the final encoded message

------
slowrabbit
Thought they'd make this challenge more challenging but I reckon anyone who
does solve this is pretty smart and worth hiring for them. Remember its
government so they aren't getting or even looking for the best of the best...
they don't pay enough for the best.

~~~
Retr0spectrum
Yes. It would appear that their main criteria are actually mental stability
and reliability.

------
amgin3
"As I read, numbers I see. 'Twould be a shame not to count this art among the
great texts of our time"

Only took about 5 seconds to find.

~~~
dogma1138
That's the comment at the end of the file, but is this even the challenge?
Because it take takes less than 5 seconds to find it, i think this is a hint
not the solution.

EDIT: Use gimp to save it as ASCII portable bitmap
([https://en.wikipedia.org/wiki/Netpbm_format](https://en.wikipedia.org/wiki/Netpbm_format)),
strip the headers and all the junk and you get a message that links to
mi5.gov.uk/careers.

~~~
daenney
Somehow I doubt that's the solution either, you can get to that same link by
just using the Careers menu item.

I would think/hope it's a bit more complicated than that.

Looking at the challenge one of the things that has me wondering is why do
they repeat the same text twice, but with different font sizes. This suggests
some kind of fraction that is somehow relevant to get further in the puzzle.

~~~
dogma1138
It's but read the edit, the solution isn't much more complicated than that
either :) this is very basic steganography.

~~~
daenney
I did, my comment is about the edit. It feels that there should be more to it
than that. I would at least expect the result to point to something you can't
get at by just browsing around.

Perhaps I'm just hoping for more than there is to it :).

~~~
dogma1138
Well the full message is

"436F6E67726174756C6174696F6E732C20796F7520736F6C766564207468652070757A7A6C65212057687920646F6E927420796F75206170706C7920746F206A6F696E206F7572207465616D3F206D69352E676F762E756B2F6361726565727320"

"Congratulations, you solved the puzzle! Why don�t you apply to join our team?
mi5.gov.uk/careers "

So i doubt there is much more to it ;)

~~~
jamesbrownuhh
I got that far too. I don't believe it - there has to be more to it than that.
There'll be something else in that file that's better hidden, for sure, and
I'm pretty sure it won't be plain ASCII and hex. :)

~~~
estefan
I'm a bit disappointed there aren't any hidden chunks that take advantage of
the PNG format :-(

Uncompressed it's just a binary message of 2 pixel colours, so unless that
comment at the end is pointing somewhere else then it's not got much depth...

------
arisAlexis
I am very surprised by the amount of spoilers here. You ruined my fun

------
cureyourhead
Clearly the horizontal span lengths , between each fixed height step, encode a
useful sequence.

Numbers. Or an offset sequence. If it's a simple cipher recovery should be
quick.

Somehow this whole thing reminds me of Egyptian fractions

How long will hn take to solve if we work together

Okay what if rearranging the red strips so their edges are flush encodes a
transposition. Key for the cipher.

I'm betting hn will solve when post is 49 minutes old

~~~
11010110010101
Close enough... 43 minutes

------
ben_jones
Iterate over the personal information of 5000 citizens using a for loop and
violate their rights by calling the violateHumanRights() function which takes
a single parameter which is the personal identification number attribute of
the citizen.

