There are some interesting attack vectors to be aware of if you run a service where users can define webhooks, and your service will call the user-defined webhooks to notify about certain system events. In my case, it is a monitoring service which can send notifications by calling a user-defined webhook.

* Timeouts: the user can set up a webhook receiver that takes very long to generate a response. Your service must be able to deal with that.

* Timeouts (slowloris): the webhook target could be sending back one byte at a time, with 1 second pauses in between. If you are using, say, the "requests" Python library for making HTTP requests, the "timeout" parameter will not help here, because it limits the wait between bytes received, not the total response time.

* Private IPs and reserved IPs: you probably don't want users defining webhooks to an internal address on <some-port> and probing your internal network. Remember private IPv6 ranges too.

* Domains that resolve to private IPs: attacker could set up foo.com which resolves to a private IP. It is not enough to just validate webhook URLs when users set them up.

* HTTP redirects to private IPs. If your HTTP client library follows HTTP redirects, the attacker can set up a webhook endpoint that redirects to a private IP. Again, it is not enough to validate the user-supplied URL.

* Excessive HTTP redirects. The attacker can set up a redirect loop - make sure this does not circumvent your timeout setting.
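The slow-drip case in the list above comes down to the difference between a per-read timeout and a total deadline for the whole delivery. Here is a minimal stdlib-only sketch of that difference. `slow_drip_response`, `with_total_deadline` and `deliver` are hypothetical names made up for illustration, and the fake receiver stands in for a real HTTP call. In a real client you would reach for whatever total-time limit the library offers (for example aiohttp's `ClientTimeout(total=...)` or libcurl's `CURLOPT_TIMEOUT`).

```python
import asyncio


async def slow_drip_response(n_bytes, pause):
    """Stand-in for a hostile receiver: one byte at a time, pausing between bytes.

    Every individual read finishes quickly, so a per-read timeout never fires.
    """
    body = bytearray()
    for _ in range(n_bytes):
        await asyncio.sleep(pause)
        body += b"x"
    return bytes(body)


async def with_total_deadline(coro, total_seconds):
    """Run one delivery attempt under a single wall-clock budget."""
    try:
        return await asyncio.wait_for(coro, timeout=total_seconds)
    except asyncio.TimeoutError:
        # The attempt is cancelled; the caller treats this as a failed delivery.
        return None


def deliver(n_bytes, pause, total_seconds):
    """Synchronous entry point for the sketch."""
    return asyncio.run(
        with_total_deadline(slow_drip_response(n_bytes, pause), total_seconds)
    )
```

A well-behaved receiver finishes inside the budget, while the dripping one is cut off no matter how regularly it keeps sending bytes.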

My current solution for all of the above is to use libcurl via pycurl. I wrote a wrapper that mimics requests API: https://github.com/healthchecks/healthchecks/blob/master/hc/... (may contain bugs, use at your own risk :-)

>Domains that resolve to private IPs: attacker could set up foo.com which resolves to a private IP

There's a clever extension to this attack; a naive way to mitigate it is to do a DNS resolution first to verify it's not a private IP and then do the actual request. An attacker can simply return a public IP on the first DNS resolution (with a 0 TTL) and then return a private IP on the second. This is called a "TOCTOU" (time-of-check time-of-use) vulnerability. I've written about this and other security best practices on my blog here - https://www.ameyalokare.com/technology/webhooks/2021/05/03/s...
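One way to picture the fix for the check-then-use gap: resolve exactly once, validate every address in that answer, and hand the validated address to the connection code so no second lookup happens. This is a rough sketch under those assumptions, not anyone's production code. `resolve_once` and `BlockedAddress` are hypothetical names, and the `resolver` parameter exists only so the function can be exercised without real DNS.

```python
import ipaddress
import socket


class BlockedAddress(Exception):
    """Raised when a hostname resolves to an address we refuse to call."""


def resolve_once(host, port, resolver=socket.getaddrinfo):
    """Resolve host a single time and return one validated IP to connect to."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    # sockaddr is the 5th field; its first item is the address string.
    # Drop any IPv6 zone suffix such as "%eth0" before parsing.
    addresses = [info[4][0].split("%")[0] for info in infos]
    if not addresses:
        raise BlockedAddress(f"{host} did not resolve")
    for addr in addresses:
        # Reject the whole answer if any record is non-public.
        if not ipaddress.ip_address(addr).is_global:
            raise BlockedAddress(f"{host} resolves to non-public {addr}")
    return addresses[0]
```

The caller then connects to the returned IP directly, while still sending the original hostname in the Host header and for TLS. With libcurl, `CURLOPT_RESOLVE` is one option for pinning a hostname to an address like this.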

I've also built an egress proxy that prevents such attacks here - https://github.com/juggernaut/webhook-sentry

Same caveat applies, use at your own risk :-)

Yeah, resolving twice is a really bad idea. A good rule of thumb for security: if you think you have a clever hack (e.g. checking DNS twice as a workaround to not being able to patch DNS resolution), it probably isn't as clever as you think.

It seems like webhooks have enough corner cases for the sender to require a specialized tool to protect itself from malicious users and to stay performant.

Does anyone have suggestions for such tools/services that they might have used in production?

At Slite, for all outgoing calls we use a sandboxed proxy. It has saved us a few times already. We detailed the trick in a blog post -> https://slite-tech-blog.ghost.io/anti-ssrf-solution/

This is the path I've seen be fairly robust at a few tech companies I've helped sort out this defense for. I've helped write libraries too but the proxy is the easiest approach when targeting many languages.

For half of those it's called a firewall and for the other half you should specify timeouts / max redirects.

Outgoing webhooks dispatched from a Lambda seemed to solve most of the above problems for us

Using lambda doesn't protect you against any of these.

Timeout attacks: instead of failing, you'll now just be paying through the nose. And with enough scale you can also hit Lambda limits and fail.

SSRF: you are still as vulnerable unless the lambda is outside your VPC (but then it's the VPC that solved it, not the lambda).

Stripe built an OSS egress network proxy that prevents webhooks from reaching internal resources:


Just adding to that - don't forget about users defining AWS metadata addresses for a webhook. Returning IAM data to them can be .. bad.
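For what it's worth, the metadata address (169.254.169.254) is link-local, so a plain "is it 10.x or 192.168.x" check misses it. A small sketch using Python's stdlib `ipaddress` module, which allows only globally routable addresses instead of listing bad ranges one by one. `is_public_ip` is a hypothetical helper name, and it only judges an already-resolved IP string, not a hostname.

```python
import ipaddress


def is_public_ip(text):
    """Return True only for globally routable IPv4/IPv6 addresses."""
    try:
        ip = ipaddress.ip_address(text)
    except ValueError:
        # Not an IP literal at all: refuse rather than guess.
        return False
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply to it.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    # is_global is False for private, loopback, link-local, shared (100.64/10),
    # and unique-local IPv6 (fc00::/7) addresses.
    return ip.is_global
```

An allow-only-global check like this covers the private IPv6 ranges and the metadata endpoint with the same rule.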

Those are all great points! At Svix (we do webhooks as a service), we disallow redirects because of what you mentioned above, and because it's also just bad for performance (for both sides) and is most likely a configuration error anyway.

Resolution and timeouts: the aiohttp library for Python is slightly better in terms of letting you configure these things, though it's better to just use a sending proxy that does it all for you and is also located in an isolated VPC to make sure that you're protected.

Disallowing redirects altogether is probably too big of a hammer. There are legit reasons to use redirects (like migration to new versions). A limit to the number of redirects seems ideal -- that's what Twilio does, for example.
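To make the "limit the number of redirects" option concrete, here is a rough sketch that follows redirects by hand, re-checks every hop, and gives up after a fixed count. Everything named here is an assumption for illustration: `follow_redirects`, `UnsafeRedirect`, the `fetch(url) -> (status, location)` callable and the `is_allowed(url)` hook are hypothetical, and the default of 3 hops is arbitrary. The method and body handling that differs between 301/302/303 and 307/308 is left out.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


class UnsafeRedirect(Exception):
    """Raised for a blocked hop or for too many hops."""


def follow_redirects(url, fetch, is_allowed, max_redirects=3):
    """Follow up to max_redirects hops, validating each target before calling it.

    fetch(url) must return (status_code, location_header_or_None) without
    following redirects itself; is_allowed(url) is where address checks go.
    """
    for _ in range(max_redirects + 1):
        if not is_allowed(url):
            raise UnsafeRedirect(f"blocked target: {url}")
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url, status
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise UnsafeRedirect(f"more than {max_redirects} redirects")
```

With the "requests" library, passing `allow_redirects=False` is what lets you see each hop yourself instead of having the library follow it.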

Migrating to a new service: you can just update the webhook URL. On the other hand, there are at least a couple of problems with allowing even one redirect: it opens a Pandora's box of security implications, and it's a performance penalty that is paid both by the sender and the receiver on every webhook sent. Realistically, 3xx responses are most likely to be misconfigurations (e.g. including a trailing slash where one shouldn't), so I think being noisy about it is a great idea.

Not saying that there aren't valid use-cases (e.g. maybe some sort of dynamic webhook receiving), I'm just saying that we made this choice given the above, and we would be willing to change it if it's ever a barrier for someone.

>you can just update the webhook URL

Fair enough, but often consumers that use multiple vendors receive webhooks in the same service; think about going from /v1/webhooks -> /v2/webhooks, they'd have to change the URL for every vendor. Easier to redirect first, then update the URLs later. I think it's a reasonable expectation that an HTTP client would honor redirects as long as the usage isn't malicious (like loops etc).

Great list. Closing layer 3* access (inbound firewall rule of deny-all) helps mitigate the common webhook attack vectors as well (you should still do all the L7 mitigation and best practices alluded to above for a multi-layered approach).

* examples:

https://netfoundry.io/zero-trust-webhook-security/ (blog)

https://youtu.be/TWohosldLP4 (video specifically for a Lambda function)

Some webhook providers implement one-time verifications, CRCs, and heartbeat checks to address some of these problems:

https://webhooks.fyi/security/one-time-verification-challeng...

https://webhooks.fyi/ops-experience/resiliency

Some also don't follow 301s or 302s and will process only specific HTTP responses.
