
Decrypting Blind's Encrypted API - jonluca
https://blog.jldc.me/posts/decrypting-blind
======
kccqzy
This is yet another reminder that good JS minification tools exist that can
absolutely change object properties into short minimal strings instead of
descriptive names. It's called the Closure Compiler in advanced mode. You do
have to write your JS with quite a bit of discipline for that to work, though.
Some languages like ClojureScript actually do this by default, so it doesn't
take much effort.
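As a rough illustration, here is a hand-written before/after sketch of what advanced-mode property renaming does (this is not actual Closure Compiler output, and the short name `a` is invented; the real compiler picks its own names):

```javascript
// Hand-written before/after sketch of advanced-mode property renaming.
// The short name `a` is illustrative -- the compiler chooses its own.
const before = (obj) => { obj.memberNickname = "faRw33"; return obj.memberNickname; };
const after  = (obj) => { obj.a = "faRw33"; return obj.a; };

// Both behave identically as long as every access site is renamed consistently.
```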

Also it helps if you don't have to use objects (with keys) to transfer data.
What I mean is that there's little reason to use

    
    
        {
            "alias": "b6WJEDTp",
            "member_nickname": "faRw33",
            "created_at": "4d",
            "is_auth": "Y",
            "board_id": 114961,
            <snip>
    

when you instead can use a simple array

    
    
        [
            "b6WJEDTp",
            "faRw33",
            "4d",
            "Y",
            114961,
            <snip>
    

if you have some post-processing to transform array indices into object keys.
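That post-processing step can be a one-liner; a minimal sketch, assuming the client knows the field order (the key list below mirrors the example payload above):

```javascript
// Map a positional wire array back to a keyed object.
// FIELDS mirrors the (assumed) order of the example payload above.
const FIELDS = ["alias", "member_nickname", "created_at", "is_auth", "board_id"];

function toObject(row) {
  // Zip the positional values with their field names.
  return Object.fromEntries(FIELDS.map((key, i) => [key, row[i]]));
}

const post = toObject(["b6WJEDTp", "faRw33", "4d", "Y", 114961]);
// post.alias is "b6WJEDTp"; post.board_id is 114961
```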

Both of these approaches also cut down on the amount of data transferred over
the wire, so it saves data and helps speed up the site for users too.

~~~
Pfhreak
The thought of the latter just made me shudder. Removing keys locks your API
into a fixed field order, which constrains it in really unpleasant ways. More
importantly, it's less human readable and harder to reason about. Please don't
do this unless you've got a very specific need for the performance.

~~~
kccqzy
Keys are not really removed in code, only from the wire format. In code you
can still write `myObject.property` but the minifier translates that to say,
`a[6]` instead. Naturally there would be tools for the developer to translate
these arrays back into full objects in the case of debugging in production.
Harder to reason about, yes, but few devs reason about their code by reading
the minified output -- why should they reason about the minified wire format?

(Of course I must admit that this is only suitable for private APIs, not APIs
published explicitly for third parties to use.)

Have you tried looking at, say, Gmail's XHR requests and responses?

~~~
fastball
If you're going to do that, why not just forego human readability entirely and
use an interchange format like protobuf or flatbuffers?

~~~
kccqzy
Because if you use the protobuf wire format, all the parsing code needs to be
written in JS and won't be as performant as JSON parsing built into the
browser.

But yes you can in fact transform protocol buffers into JS arrays in the way I
described. I'm essentially describing protobuf designed for JS. Imagine your
protobuf definitions are read by a compiler which spits out JS classes with
getters and setters. These getters and setters access the underlying array
with an assigned index. Your minifier inlines these getters and setters into
direct array access. Voilà.
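A minimal sketch of that idea (the class and field indices below are invented for illustration, not generated from any real .proto file):

```javascript
// Hypothetical output of a protobuf-to-JS compiler: getters/setters
// over an underlying wire array. An advanced minifier can inline these
// into direct accesses like p.data[0].
class Post {
  constructor(data) { this.data = data; }  // the positional wire array
  get alias()   { return this.data[0]; }
  set alias(v)  { this.data[0] = v; }
  get boardId() { return this.data[4]; }
}

const p = new Post(["b6WJEDTp", "faRw33", "4d", "Y", 114961]);
// p.alias reads data[0]; p.boardId reads data[4].
```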

~~~
yonder
Why do you assume that native JSON parsing is faster than protobuf parsing
using JS?

In most cases protobuf is faster.

~~~
seangrogg
Having spent a LOT of time looking at this: browsers come with built-in
(effectively zero-cost) JSON parsers.

You need a proto parsing lib and a collection of .proto schemas to even begin
using protobufs, so you need to be dealing with at least that much data saved
before proto even starts being a win. While the parsing lib can be cached and
is largely a rounding error over the long term, every iteration of the .proto
files means fetching a new version that contains all the contents of the
previous version (or else sacrificing backwards compatibility).

Beyond the additional payload costs you also have to factor in the API itself.
Any win to keys can largely be obtained via compression so that's only a
nominal win. APIs with many string values are not going to see many benefits,
either, and may actually be better served by compression. The real win for
proto is in large numbers but there aren't many APIs using many values in the
256-65k range (let alone higher). Proto does do really well with booleans and
null, though. Unpacked arrays aren't a really strong win for them, either
(though packed ones are a win for large arrays). They also have weird quirks
for maps that don't let them achieve parity with JSON, IIRC.

Parsing time is not a huge win given normal API response sizes. I was parsing
a JSON blob with 100k values four years ago on a shitty Dell in 2 seconds and
can't think of anything near that size in the wild. Most API responses are
going to be parsed faster than human perception rendering the point mostly
moot.

The real win is the direct impact to spend on bandwidth that scales with size,
but that comes at the cost of developer productivity and not everyone has
Google's warchest and can afford SWEs memeing about how they get promoted by
spending 2 years updating protos.

Having worked at Google, Protobuf is a solid choice when you're working in
multiple languages on multiple internal machines and haven't already bought
into other means of serializing data. But they do not particularly shine when
targeting browsers unless there is a LOT of data going back and forth and your
front-end engineering team doesn't mind working around jspb's quirks, opaque
errors, and subtle nuances.

------
jiofih
Is there any point in encrypting API payloads when the traffic is going via
TLS?

~~~
chocolatkey
Potentially to prevent MITM proxies on company computers from being able to
sniff the traffic. Maybe because of what Blind is about, that would make
sense? Otherwise, if it's secure TLS, then there's no reason at all.

Edit: maybe the reason they use a public key for transmission is that you
can't reverse it, and that would potentially be where your anonymous
complaints (or whatever you do on Blind) would be?

~~~
a1369209993
Note that as thenewnewguy says, anyone who can MITM your connection can also
inject JS spyware (well, more so than usual) to exfiltrate your comments.
That's harder (and much harder still to avoid discovery) than just sniffing
the traffic, so it might be a useful stopgap, but for real security you need
to fix your web browser to reject MITMed connections.

------
eralps
Nice article! I always wonder about the legal aspects of publishing a
reverse-engineering article on a private API. Does the company that the API
belongs to have the right to issue a binding takedown request?

~~~
userbinator
Is it really "private" if everyone with a browser and a brain can see what
it's doing...?

~~~
artificial
This is what's frustrating about accessing content online. Is it fair game if
it's on a web server since the requester cannot determine intent? Legally it
doesn't appear so.

------
bowmessage
I've gone through this same exercise in the past in order to mass-delete a
large number of comments on different threads; I was afraid that Blind may one
day suffer a data leak. I attempted to reimplement the crypto in Ruby, but
ultimately failed and went the JS route, same as the author. I also had to
roll my own session-token refresh logic. Finally, I wondered whether any kind
of data mining could be done with the tool, but I never took it that far.
Thanks for the writeup!

~~~
choppaface
Well they already had at least one breach:
[https://techcrunch.com/2018/12/20/blind-anonymous-app-data-e...](https://techcrunch.com/2018/12/20/blind-anonymous-app-data-exposure/)

~~~
sonicggg
You'd think that engineers from top-tier tech companies would know better
than to share sensitive information on some random website.

~~~
tehlike
People like venting.

~~~
seangrogg
Others didn't even care whether the info was de-anonymized in the first place
and just enjoyed the topics that were more openly discussed there.

------
tomsmeding
So, they used asymmetric encryption for the request so that a MITM can't read
it, but symmetric encryption for the response. Although that still requires a
MITM to fully analyse the code, it then lets them decrypt any response. The
possible reason cited (in the conclusion) is performance.

I think you don't have to resort to symmetric encryption here, even keeping
performance in mind. What you do is generate a new asymmetric keypair on the
client for every session, then send the public key over to the server. Then
the server encrypts every response with that public key, allowing only the
client to decrypt it.

Doing that, the only way to read a session's network traffic, in either
direction, is to read the values of variables on the client -- but anyone who
can do that can read everything anyway. ;)

EDIT: forgot to talk about performance -- you just use a so-called "envelope",
where the sending party first encrypts the data symmetrically with a randomly
generated key, then encrypts that random key with the asymmetric crypto. The
pair (symmetrically encrypted data and the asymmetrically encrypted key) is
sent to the receiver, which can use its private key to decrypt the symmetric
key, with which it decrypts the data.

------
zapttt
the sad state of web developers.

from the silly comments claiming that "infinite scrolling" is definitive
proof of a solid REST API behind it, and that PHP either is or isn't capable
of it (the writing is too ambiguous), to the roundabout amateur obfuscation
(which the author calls encryption) that is entirely akin to the 90s
JavaScript that disabled right click to "copyright" a page's content.

sigh

~~~
strictnein
The code was mildly obfuscated. The data was encrypted.

