
1213486160 has a friend: 1195725856 - TimWolla
http://rachelbythebay.com/w/2016/10/07/magic/
======
ChuckMcM
Hah, one of the side effects of doing embedded programming is spending a lot
of time staring at hex dumps with ASCII in them, hex on the left, actual
characters on the right. As a result you start recognizing a lot of ASCII
characters when you see the hex codes for them.

I was debugging a library that was a 'native' library for a scripting language
and the code seemed to have a much bigger running footprint than I expected.
It kept allocating this odd-sized buffer, a bit over 13,000 bytes in size.
Walking it back to the scripting language's interface to the C code, the buffer
it wanted was 32 bytes long, but the scripting language was passing the length
as the string '32', so it was read as 0x3332 (13,106) bytes. Oops! Reading hex
and seeing ASCII is a very useful skill to develop.
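
The arithmetic is easy to check with a minimal Python sketch. It assumes the two ASCII characters of the string were read as a big-endian 16-bit integer, which is what the 0x3332 figure suggests; the variable names are only for illustration.

```python
# The length the C code wanted, as the scripting language passed it: a string.
length_as_text = b"32"

# Assumption: the two ASCII bytes were read as a big-endian 16-bit integer,
# which matches the 0x3332 in the story.
misread_length = int.from_bytes(length_as_text, "big")

print(hex(misread_length))  # 0x3332
print(misread_length)       # 13106
```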

~~~
schoen
The structure of ASCII itself is very useful for this, if you know the ordinal
positions of letters in the alphabet.

Uppercase letters are 0x40 + position of the letter in the alphabet, so "E",
being the 5th letter, is 0x45, "I", being the 9th letter, is 0x49, and so on.

Lowercase letters are 0x60 + position of the letter in the alphabet, so "e" is
0x65, "i" is 0x69, and so on.

That also means that you can swap the case of a letter by flipping a single
bit (XOR with 0x20).

Finally, digits are 0x30 + the digit's numeric value (including 0), so the
digit "5" is 0x35.

(All of these properties were very intentional on the part of ASCII's
creators.)
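
These rules are small enough to try out directly. The following is a rough Python sketch; the helper names are made up for illustration, and the case swap is only meaningful for the ASCII letters A-Z and a-z.

```python
def upper(position):
    """Uppercase letter for a 1-based alphabet position: 0x40 + position."""
    return chr(0x40 + position)


def lower(position):
    """Lowercase letter for a 1-based alphabet position: 0x60 + position."""
    return chr(0x60 + position)


def digit(value):
    """ASCII digit for a numeric value 0-9: 0x30 + value."""
    return chr(0x30 + value)


def swap_case(letter):
    """Flip bit 0x20; only valid for ASCII letters."""
    return chr(ord(letter) ^ 0x20)


print(upper(5), hex(ord("E")))  # E 0x45
print(lower(9), hex(ord("i")))  # i 0x69
print(digit(5), hex(ord("5")))  # 5 0x35
print(swap_case("E"), swap_case("i"))  # e I
```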

~~~
schoen
Relatedly, if you want to impress people with the ability to "read binary",
and you know that something is plain ASCII text represented in binary, just
look at the rightmost 5 bits of every byte. They will be the ordinal position
of the letter.

"Hello" is

010 _01000_ 8 (h)

011 _00101_ 5 (e)

011 _01100_ 12 (l)

011 _01100_ 12 (l)

011 _01111_ 15 (o)

And when the rightmost 5 bits are all zeroes, it's probably 00100000, the
space character.
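
The same trick as a short Python sketch: mask each byte with 0b11111 to keep the rightmost 5 bits. The function name is hypothetical, and this only gives alphabet positions for plain ASCII letters.

```python
def low_five_bits(text):
    """Rightmost 5 bits of each character code (alphabet position for letters)."""
    return [ord(ch) & 0b11111 for ch in text]


for ch in "Hello":
    bits = format(ord(ch), "08b")
    # Split as 3 high bits + 5 low bits, then show the low bits as a number.
    print(bits[:3], bits[3:], ord(ch) & 0b11111)

print(low_five_bits("Hello"))  # [8, 5, 12, 12, 15]
print(low_five_bits(" "))      # [0]
```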

~~~
pc86
If 01001000 is 'h' what is 01101000?

~~~
schoen
I deliberately wrote the "h" in lowercase, even though the ASCII character is
uppercase, because I was advising looking only at the five least-significant
bits, which won't tell you the case. Sorry for the confusion.

~~~
pc86
Good clarification, thank you :)

------
jstanley
Passing the very first 4 bytes you receive straight to malloc with no sanity
checking? I suspect that application is riddled with other vulnerabilities!

~~~
toast0
If the format is 4 bytes of length followed by that number of bytes -- how
exactly do you sanity check it, if you intend to occasionally send some really
big whatevers at the start of a stream?

~~~
s_kilk
I suppose you could start each message with a magic constant, followed by the
length of the remainder, then the payload.

Any messages that don't start with the magic constant just get ignored.
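
A minimal Python sketch of that framing, under some assumptions: the magic value `DEMO`, the 1 MiB cap, and the function names are all hypothetical, and the length is taken to be a big-endian 32-bit integer. The decoder raises instead of silently ignoring so the caller can decide whether to drop or log.

```python
import struct

MAGIC = b"DEMO"                 # hypothetical 4-byte magic constant
MAX_PAYLOAD = 1 << 20           # hypothetical cap (1 MiB) on accepted lengths
HEADER = struct.Struct(">4sI")  # magic, then big-endian 32-bit payload length


def encode(payload):
    """Frame a payload as magic + length + payload."""
    return HEADER.pack(MAGIC, len(payload)) + payload


def decode(message):
    """Return the payload, or raise ValueError if the frame looks wrong."""
    if len(message) < HEADER.size:
        raise ValueError("short header")
    magic, length = HEADER.unpack_from(message)
    if magic != MAGIC:
        raise ValueError(f"missing magic, got {magic!r}")
    if length > MAX_PAYLOAD:
        raise ValueError(f"length {length} exceeds cap")
    payload = message[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


print(decode(encode(b"hello")))  # b'hello'
```

With this in place, a stray HTTP request fails the magic check before its first four bytes are ever treated as a length.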

~~~
amelius
It could help in obvious cases, but it sounds a lot like "security by
obscurity".

~~~
pdpi
No, it's just a bit of defensive programming against silliness and
misconfiguration. Not every errant connection is an attacker trying to pwn
you, some are just going to be honest mistakes. Picking up on wrong magic
numbers and logging "missing magic from <ip>" would help immensely debugging
this issue. All the other advice about avoiding mallocing an unsanitised
amount of memory still stands, but this would just make it all easier to
figure out.

------
lbrandy
If you search either of these numbers on google you see a ton of errors and
people asking befuddled questions. We're literally doing a public service to
future versions of ourselves by juicing the google results for this post. For
once, it's totally appropriate to upvote for visibility. Upvotes to the left.

~~~
ot
This comment should be the top-voted comment, so that next time I land on this
story I'll remember why it had so many upvotes.

------
cesarb
The lesson from this would be: when creating a network protocol, always start
the stream or packet with a magic number, in both directions. If the magic
number doesn't match, drop the packet or close the connection.

In fact, one could say that these are HTTP's magic numbers: 'HTTP/' for the
response, and a few ('GET ', 'HEAD ', 'POST ', 'PUT ', and so on) for the
request. IIRC, one trick web servers use to speed up parsing a request is to
treat the first four bytes as an integer, and switch on its value to determine
the HTTP method.
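
As an illustration of the idea only (not how any particular web server does it), here is a Python sketch that reads the first four bytes as a big-endian integer and looks the value up in a table. Only a few methods are listed, and the names `key`, `METHODS`, and `method_of` are made up.

```python
def key(prefix):
    """First four bytes of a request line as a big-endian 32-bit integer."""
    return int.from_bytes(prefix[:4], "big")


# Lookup table keyed by the integer value of the first four bytes.
METHODS = {
    key(b"GET "): "GET",
    key(b"PUT "): "PUT",
    key(b"HEAD"): "HEAD",
    key(b"POST"): "POST",
}


def method_of(request):
    """Return the HTTP method name, or None if the prefix is not recognised."""
    if len(request) < 4:
        return None
    return METHODS.get(key(request))


print(key(b"GET "))                         # 1195725856
print(method_of(b"GET / HTTP/1.1\r\n"))     # GET
print(method_of(b"\x16\x03\x01\x02\x00"))   # None
```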

~~~
theoh
Not sure about this. "magic" in the sense of file type recognition has no
cryptographic/security ambitions. On the other hand, when implementing a
network protocol, I'd say one needs to be really careful and expect the
unexpected at any point (not just in the "magic" phase). What does it buy you
(in terms of robustness) to have a fakeable magic number at the start of the
stream?

------
bgrainger
For completeness, we should add 542393671 (0x20544547) and 1347703880
(0x50545448), i.e., the little-endian versions, to the list. Googling those
numbers also turns up a lot of people with strange error messages (caused by
deserializing "GET " or "HTTP" as a 32-bit integer).
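
All four numbers can be reproduced with a couple of lines of Python, and a mystery number can be turned back into text the same way. The helper name `decode_magic` is just for illustration.

```python
def decode_magic(number):
    """Show a 32-bit integer as bytes in both byte orders."""
    return number.to_bytes(4, "big"), number.to_bytes(4, "little")


for text in (b"HTTP", b"GET "):
    # Same four bytes, read as a big-endian and as a little-endian integer.
    print(text, int.from_bytes(text, "big"), int.from_bytes(text, "little"))

print(decode_magic(1213486160))  # (b'HTTP', b'PTTH')
print(decode_magic(542393671))   # (b' TEG', b'GET ')
```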

------
hullo
I was personally expecting a piece about a baffling resurgence in the use of
ICQ.

------
lziest
Previously, Go's TLS library would report "tls oversized record received with
length 20527" when the remote address was not actually handling TLS
connections. The magic number is simply because
[https://github.com/golang/go/issues/11111](https://github.com/golang/go/issues/11111).
Even better, when you google that error, you get all docker-related issues.
Poor docker.
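
One way to see where 20527 can come from (my own reading, not taken from the linked issue): a TLS record header is 1 byte of content type, 2 bytes of version, and a 2-byte big-endian length. If the peer answers in plain HTTP, the first five bytes are "HTTP/", so the length field lands on "P/". A small Python sketch:

```python
import struct

# First five bytes of a plain-HTTP response, misread as a TLS record header.
header = b"HTTP/1.1 400 Bad Request\r\n"[:5]

# TLS record header layout: content type (1 byte), version (2), length (2).
content_type, version, length = struct.unpack(">BHH", header)

print(hex(content_type))  # 0x48, i.e. "H"
print(hex(version))       # 0x5454, i.e. "TT"
print(length)             # 20527, i.e. 0x502f, "P/"
```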

------
ebbv
From the title I was expecting this to be some math post about strange factors
or something. I was really disappointed. Is the youth today really fascinated
by merely translating ASCII strings into numbers?

~~~
CUViper
The context is what makes this interesting, that the number showed up in an
unexpected place.

~~~
ebbv
Except it didn't. It's only unexpected if you're ignorant of what it means, in
which case any number is unexpected. You could literally write this same inane
blog post about dozens of different common phrases in binary, starting with
POST, DELETE, etc.

~~~
bbcbasic
Did you read it?

