
Ask HN: How to protect public APIs from bots when using a BaaS? - raulk
Say you have a listings/classifieds app with public entries served via an API. How do you prevent bots from scraping and stealing your data by hitting your API directly?

If you build your own backend, you could put a gateway like Kong or similar in front to detect and throttle/ban robotic usage patterns.

But how do you achieve this if you use Firebase, Graphcool, or another Backend as a Service (BaaS)?

You could deploy a proxy/gateway, but that would incur an extra hop (= latency) for every single call.

EDIT: Actually, this question is applicable to any API, not just public ones. For private APIs restricted by login, the bot would simply have to create a user first.
======
tyingq
I don't know that there's a generally applicable answer if the API calls go
directly from end user -> public API, on infrastructure you don't control.

The answers would be highly dependent on the specific service, and whatever
capabilities they offer. Firebase, for example, has a concept of custom tokens
where you could implement rules on a per-api-consumer basis.

There does seem to be an opportunity for CDN companies to offer an API gateway
with throttling, scripting, OAuth, conditional caching, bot blocking, etc. I
don't know why they haven't offered one yet. A CDN-hosted Tyk or Kong instance
would likely be popular.

~~~
ig1
There are already companies like Mashape that offer these services.

~~~
tyingq
Not at the edge. A combined CDN and API gateway would be unique and have
benefits a hosted API gateway would not. For one, in this scenario, you would
avoid the circuitous route from client to API gateway to third-party API, since
the CDN would be close to the client.

------
1ba9115454
You could try an API key policy. Any calls to the API without a key would be
throttled down to normal usage levels.

People can apply for an API key and you can monitor for any abuse.
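
To make the idea concrete, here is a rough sketch of what such a policy could look like: a token bucket per caller, where requests without a valid key are bucketed by IP at a low rate and keyed requests get a higher one. Everything here is an assumption for illustration (the `Throttle` and `TokenBucket` names, the two limits, identifying anonymous callers by IP), and it would still have to run somewhere in the request path, which on a BaaS likely means whatever function or proxy layer the service offers.

```python
import time

# Hypothetical limits: (tokens refilled per second, burst capacity).
ANON_LIMIT = (1.0, 5.0)     # callers without a valid key
KEYED_LIMIT = (10.0, 50.0)  # callers with an approved key


class TokenBucket:
    """Classic token bucket: refills at `rate`, holds at most `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so short bursts are allowed
        self.updated = now

    def allow(self, now):
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Throttle:
    """Keeps one bucket per API key, or per client IP when no valid key is sent."""

    def __init__(self, valid_keys, clock=time.monotonic):
        self.valid_keys = set(valid_keys)
        self.clock = clock
        self.buckets = {}
        self.rejections = {}  # per-caller rejection counts, for abuse monitoring

    def allow(self, client_ip, api_key=None):
        now = self.clock()
        if api_key in self.valid_keys:
            ident = ("key", api_key)
            rate, capacity = KEYED_LIMIT
        else:
            ident = ("ip", client_ip)
            rate, capacity = ANON_LIMIT
        bucket = self.buckets.get(ident)
        if bucket is None:
            bucket = self.buckets[ident] = TokenBucket(rate, capacity, now)
        allowed = bucket.allow(now)
        if not allowed:
            self.rejections[ident] = self.rejections.get(ident, 0) + 1
        return allowed
```

A request handler would call `throttle.allow(ip, key)` and answer with HTTP 429 when it returns `False`. The `rejections` counter is the "monitor for abuse" part: a key that keeps hitting its limit is a candidate for review or revocation. Bucketing anonymous callers by IP is only a rough proxy, since a determined scraper can rotate addresses.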

