There are some interesting attack vectors to be aware of if you run a service where users can define webhooks, and your service will call the user-defined webhooks to notify about certain system events. In my case, it's a monitoring service which can send notifications by calling user-defined webhooks.
* Timeouts: the user can set up a webhook receiver that takes very long to generate a response. Your service must be able to deal with that.
* Timeouts (slowloris): the webhook target could be sending back one byte at a time, with 1 second pauses in between. If you are using, say, the "requests" Python library for making HTTP requests, the "timeout" parameter will not help here, because it limits the wait for each read on the socket rather than the total time for the response.
* Private IPs and reserved IPs: you probably don't want users defining webhooks to http://127.0.0.1:<some-port> and probing your internal network. Remember about private IPv6 ranges too.
* Domains that resolve to private IPs: an attacker could set up foo.com which resolves to a private IP. It is not enough to just validate webhook URLs when users set them up.
* HTTP redirects to private IPs. If your HTTP client library follows HTTP redirects, the attacker can set up a webhook endpoint that redirects to a private IP. Again, it is not enough to validate the user-supplied URL.
* Excessive HTTP redirects. The attacker can set up a redirect loop - make sure this does not circumvent your timeout setting.
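To make the slowloris point concrete, here is a minimal sketch of a total-time budget for reading a response body, on top of whatever per-read timeout your client has. The names `read_with_deadline`, `DeadlineExceeded` and `ResponseTooLarge` are made up for illustration. `chunks` is assumed to be any iterator of byte chunks, for example `resp.iter_content(1024)` from a `requests` call made with `stream=True`.

```python
import time
from typing import Callable, Iterable


class DeadlineExceeded(Exception):
    """The whole response took longer than the total time budget."""


class ResponseTooLarge(Exception):
    """The response body grew past the size cap."""


def read_with_deadline(
    chunks: Iterable[bytes],
    total_timeout: float,
    max_bytes: int = 64 * 1024,
    clock: Callable[[], float] = time.monotonic,
) -> bytes:
    """Read all chunks, enforcing a wall-clock budget and a size cap.

    The deadline is only checked between chunks, so the client still needs
    its own per-read timeout to bound how long a single chunk can block.
    """
    deadline = clock() + total_timeout
    body = bytearray()
    for chunk in chunks:
        # A slow sender trickles bytes; each trickle passes the per-read
        # timeout, but the running total eventually crosses the deadline.
        if clock() > deadline:
            raise DeadlineExceeded(f"response exceeded {total_timeout}s budget")
        body.extend(chunk)
        if len(body) > max_bytes:
            raise ResponseTooLarge(f"response exceeded {max_bytes} bytes")
    return bytes(body)
```

The `clock` parameter is only there so the budget can be tested without sleeping. A webhook sender usually does not need the response body at all, so a small `max_bytes` is a reasonable default.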
>Domains that resolve to private IPs: attacker could set up foo.com which resolves to a private IP
There's a clever extension to this attack. A naive way to mitigate it is to do a DNS resolution first to verify it's not a private IP and then do the actual request. An attacker can simply return a public IP on the first DNS resolution (with a 0 TTL) and then return a private IP on the second. This is called a "TOCTOU" (time-of-check to time-of-use) vulnerability. I've written about this and other security best practices on my blog here - https://www.ameyalokare.com/technology/webhooks/2021/05/03/s...
Yeah, resolving twice is a really bad idea. A good rule of thumb for security: if you think you have a clever hack (e.g. checking DNS twice as a workaround to not being able to patch DNS resolution), it probably isn't as clever as you think.
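One way to avoid the double resolution, sketched with only the standard library: resolve once, reject the hostname if any returned address is not public, and then connect to the validated IP instead of handing the hostname back to the HTTP client. The function names and the `resolver` parameter are hypothetical, and how you pin the connection to the returned IP (while keeping the original Host header and TLS server name) depends on your HTTP client, so that part is left out.

```python
import ipaddress
import socket
from typing import Callable, List


class BlockedAddress(Exception):
    """The hostname resolved to an address we refuse to call."""


def is_public_ip(addr: str) -> bool:
    """Return True only for globally routable, non-multicast addresses."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Unparseable input is treated as unsafe.
        return False
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) before checking.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return ip.is_global and not ip.is_multicast


def resolve_public_ips(
    host: str,
    port: int,
    resolver: Callable = socket.getaddrinfo,
) -> List[str]:
    """Resolve `host` once and return its IPs, or raise if any is non-public.

    The caller should connect to one of the returned IPs directly, so the
    address that was checked is the address that gets used.
    """
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    ips = [info[4][0] for info in infos]
    if not ips:
        raise BlockedAddress(f"{host} did not resolve")
    for ip in ips:
        if not is_public_ip(ip):
            raise BlockedAddress(f"{host} resolves to non-public address {ip}")
    return ips
```

Rejecting the whole hostname when any record is private is a deliberately strict choice here; it avoids guessing which record the client would pick.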
It seems like webhooks have enough corner cases for the sender to require a specialized tool to protect itself from malicious users and to stay performant.
Does anyone have suggestions for such tools/services that they might have used in production?
This is the path I've seen be fairly robust at a few tech companies I've helped sort out this defense for. I've helped write libraries too but the proxy is the easiest approach when targeting many languages.
Those are all great points! At Svix (we do webhooks as a service), we disallow redirects because of what you mentioned above, and because it's also just bad for performance (for both sides) and is most likely a configuration error anyway.
Resolution and timeouts: the aiohttp library for Python is slightly better in terms of letting you configure these things, though it's better to just use a sending proxy that does it all for you and is also located in an isolated VPC to make sure that you're protected.
Disallowing redirects altogether is probably too big of a hammer. There are legit reasons to use redirects (like migration to new versions). A limit to the number of redirects seems ideal -- that's what Twilio does, for example.
Migrating to a new service: you can just update the webhook URL. On the other hand, there are at least a couple of problems with allowing even one redirect: it opens a Pandora's box of security implications, and it's a performance penalty that is paid both by the sender and the receiver on every webhook sent. Realistically, 3xx responses are most likely to be misconfigurations (e.g. including a trailing slash where one shouldn't) so I think being noisy about it is a great idea.
Not saying that there aren't valid use-cases (e.g. maybe some sort of dynamic webhook receiving), I'm just saying that we made this choice given the above, and we would be willing to change it if it's ever a barrier for someone.
Fair enough, but often consumers that use multiple vendors receive webhooks in the same service; think about going from /v1/webhooks -> /v2/webhooks, they'd have to change the URL for every vendor. Easier to redirect first, then update the URLs later. I think it's a reasonable expectation that an HTTP client would honor redirects as long as the usage isn't malicious (like loops etc.).
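Whichever policy you pick, the mechanics look about the same: turn off the client's automatic redirect following and handle each hop yourself, re-validating the target every time. A rough sketch, where `deliver`, `send` and `check_url` are hypothetical names: `send` is assumed to make one request without following redirects and return `(status, location)`, and `check_url` is assumed to raise if the URL is not allowed (private IP, wrong scheme, and so on).

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


class RedirectError(Exception):
    """Redirect chain was too long or malformed."""


def deliver(
    url: str,
    send: Callable[[str], Tuple[int, Optional[str]]],
    check_url: Callable[[str], None],
    max_redirects: int = 2,
) -> Tuple[int, str]:
    """Send to `url`, following at most `max_redirects` validated redirects.

    Returns (final_status, final_url). With max_redirects=0 any redirect
    is an error, which gives the "disallow redirects" policy.
    """
    for _ in range(max_redirects + 1):
        # Validate every hop, not just the URL the user originally saved.
        check_url(url)
        status, location = send(url)
        if status not in REDIRECT_STATUSES:
            return status, url
        if not location:
            raise RedirectError("redirect without a Location header")
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RedirectError(f"more than {max_redirects} redirects")
```

A redirect loop just runs into the hop limit, so it cannot stretch a delivery beyond `max_redirects + 1` requests; an overall time budget per delivery is still worth having on top of that.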
Great list. Closing layer 3* access (an inbound firewall rule of deny-all) helps mitigate the common webhook attack vectors as well (you should still do all the L7 mitigation and best practices alluded to above for a multi-layered approach).
My current solution for all of the above is to use libcurl via pycurl. I wrote a wrapper that mimics requests API: https://github.com/healthchecks/healthchecks/blob/master/hc/... (may contain bugs, use at your own risk :-)