
The NSA must have somebody working on the inside at Google. Otherwise it would be extremely difficult to reverse engineer the RPC protocol that Google's servers use to communicate with each other. Even on an unencrypted network, I can imagine it would be very difficult to reverse engineer the protocol without any help.

The RPC protocol is stated to be based on protocol buffers, so the encoding is known. What you would need to do to reverse engineer it is come up with matching message descriptions.
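To illustrate the point about the encoding being known: a minimal Python sketch of walking the public protocol buffers wire format with no message description at all. The function names are made up for illustration, and the sample bytes are the well-known example from the public encoding documentation, not anything from the slides. It only handles wire types 0, 1, 2 and 5.

```python
def read_varint(buf, pos):
    """Read one base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # high bit clear means last byte of the varint
            return result, pos
        shift += 7


def decode_fields(buf):
    """Split a serialized message into (field_number, wire_type, value) tuples.

    Only field numbers and raw values come out; names and meanings live in
    the .proto description, which is not part of the data on the wire.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value = buf[pos:pos + 8]
            pos += 8
        elif wire_type == 2:  # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type == 5:  # fixed 32-bit
            value = buf[pos:pos + 4]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Field 1 = varint 150, field 2 = the string "testing".
sample = bytes.fromhex("089601120774657374696e67")
print(decode_fields(sample))
```

That gets you the structure for free. Working out what field 1 and field 2 actually mean is the "message descriptions" part.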

But, even without that piece of the puzzle, reverse engineering a protocol that doesn't use encryption wouldn't be "extremely difficult". This is not an indication of an inside man.


The only published information is how the values are encoded, not what is encoded (the specifications aren't transported together with the data). So cracking 1622 different protocols involved in authorization alone, according to the NSA slide, is not such a small task, at least if they are interested in more than just recognizing e-mail addresses, which can be found using regexps. And the sheer number of protocols suggests that they were indeed interested in more.
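For comparison, the "just recognizing e-mail addresses" case really is a few lines. A rough Python sketch, assuming a deliberately loose pattern (not a full RFC 5322 matcher) and a made-up payload:

```python
import re

# Loose pattern for things that look like e-mail addresses in raw bytes.
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(payload):
    """Return every e-mail-like string found in a raw byte payload."""
    return [m.decode("ascii") for m in EMAIL_RE.findall(payload)]


# Hypothetical capture: a length-delimited field holding an address,
# followed by a varint field.
payload = b"\x0a\x11alice@example.com\x10\x01"
print(find_emails(payload))
```

This needs no knowledge of the message layout at all, which is why it says nothing about how the other fields were understood.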

The screenshots of the ascii dump of the RPC calls shown in the WaPo article show that there is tons of information to work with, besides just the email account.

You're talking about the NSA here, an outfit which has cracked the cryptosystems of foreign governments in a variety of foreign languages, and even cracked a Russian one-time pad that they had accidentally used more than once.

I don't think it's very hard at all for them to reverse engineer RPC serialization that is not even encrypted if they can crack cryptosystems.

Of course it's not impossible to just reverse engineer the protocols, but we now know that these guys also rightly measure their smartness by taking shortcuts wherever they can. It would be stupid to do unnecessary work to "rediscover" easily accessible information. The right approach is to use the internal documents describing the protocols. Shouldn't be so hard, "it's all in the cloud."

Yes, that's what I meant by "message descriptions". It's definitely not a small task, but again not something that I would say is extremely hard. It's just a lot of work.

I'm curious what 1622 represents here. 1622 different protocols, each with their different messages? Seems like a crazy amount. 1622 different message types for authorization? Even that seems like a stretch.

I guess it's just 1622 fields or message types.

No need to reverse engineer it, it's a public spec: ProtoBufs. It's like saying "it's extremely difficult to reverse engineer JSON".

Or, they could have just hired an ex-Googler familiar with the protocol. No need for someone working for the NSA "on the inside" at Google, and no need for reverse engineering.

Just "being familiar" would probably slow them down too much, considering how many different protocols they attacked. "1622 Google authorization protocols" are listed on one NSA slide here:


I know how hard it would be, I implement low level protocols (not Google's or of men in black).

Years and years ago I reverse engineered the weird binary protocol that Yahoo Messenger used with nothing more than Ethereal and an absolute shit load of packet logs.

I did that for fun, and I was (and am) a mediocre programmer at best. The NSA/GCHQ have some of the best talent around; I doubt they would find it much of a challenge to do this on a bigger, more complex protocol.

Unencrypted traffic is (relatively) easy to reverse engineer even without a protocol description (for example, the Samba guys, the Asterisk folks), as most protocols are designed to be structured (that is kind of the point of having a protocol).
