A) A server that makes HTTP requests.
B) It does so based on a whitelist or blacklist.
C) These whitelists or blacklists are circumvented due to bad URL parsing.
D) This can be used to call unintended services (and sometimes protocols).
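To make C) concrete, here is a minimal Python sketch of how a naive whitelist check and the parser the HTTP client actually relies on can disagree. The hostnames and the naive_check function are made up for illustration, not taken from any real service:

    from urllib.parse import urlsplit

    ALLOWED_HOST = "hooks.example.com"   # hypothetical whitelisted host

    def naive_check(url: str) -> bool:
        # Broken: substring matching on the raw URL string.
        return ALLOWED_HOST in url

    def fetched_host(url: str) -> str:
        # What an HTTP client would typically end up connecting to.
        return urlsplit(url).hostname or ""

    for url in (
        "http://hooks.example.com/callback",            # legitimate
        "http://hooks.example.com@evil.test/callback",  # userinfo trick, really talks to evil.test
        "http://evil.test/?next=hooks.example.com",     # whitelisted name only in the query string
    ):
        print(url, naive_check(url), fetched_host(url))

All three URLs pass the naive check, but only the first one actually points at the whitelisted host.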
I had an idea to build a generic webhook service. My thought was that the actual machine that does the requests should be 100% isolated from our VPS and only have public network access. Would that largely stop this attack? Perhaps it should also have a rate limiter to prevent DDoS attacks.
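For the rate-limiting part, a token bucket per destination host is usually enough to stop such a service from being used as a traffic cannon. A rough sketch; the TokenBucket class, the numbers and the may_deliver helper are hypothetical, not part of any existing service:

    import time
    from collections import defaultdict

    class TokenBucket:
        def __init__(self, rate: float, burst: int):
            self.rate = rate            # tokens refilled per second
            self.burst = burst          # maximum bucket size
            self.tokens = float(burst)
            self.updated = time.monotonic()

        def allow(self) -> bool:
            now = time.monotonic()
            self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    # One bucket per destination host, so no single target gets hammered.
    buckets = defaultdict(lambda: TokenBucket(rate=1.0, burst=5))

    def may_deliver(host: str) -> bool:
        return buckets[host].allow()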
Waiting for every URL parser to be fixed seems impracticable. This to me is a great example of why the WHATWG URL specification is so terrible. It's so much harder to implement than RFC 3986.
Things that aren't browsers should just implement RFC 3986 and reject anything invalid.
You're not the only one who thinks that:
(Be sure to read the link to the WHATWG's GitHub issue.)
Good moment to lock down outgoing requests, I guess. At least to port 25 :)
The wrinkle worth noting here, in the opinion of some, is that this gives SMTP servers something they understand fully: the HTTP server has been convinced to send perfectly well-formed and valid SMTP commands.
This would at least prevent it from attacking your own internal services, but it wouldn't prevent it from being made to send requests to other outside services, effectively leaving you standing as a potential proxy for other attacks. E.g. the SMTP TLS stuff, which ends up with your server sending spam on someone else's behalf. A bit nuts.
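For anyone wondering why the SMTP server is so agreeable, a small illustration (the injected lines and addresses are made up, and the injection vector itself, e.g. CRLFs smuggled through the URL or a raw-bytes scheme like gopher://, is outside this sketch): an SMTP server just reads CRLF-delimited lines, typically rejects the HTTP framing as unknown commands, and acts on any injected lines that happen to be valid commands.

    request = (
        "GET /webhook HTTP/1.1\r\n"
        "Host: victim.internal:25\r\n"
        "\r\n"
        # Lines the attacker managed to smuggle into what the client sends:
        "HELO attacker.test\r\n"
        "MAIL FROM:<spam@attacker.test>\r\n"
        "RCPT TO:<someone@example.com>\r\n"
    )

    # What a line-oriented SMTP server sees, one command at a time:
    for line in request.split("\r\n"):
        if line:
            print("SMTP server reads:", line)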
But that could be done via host firewall rules, to some extent.
Of course, that assumes you're not doing something like running in AWS and using the AWS firewall. And that your firewall never has any issues at all.
In general, defense in depth is a preferable strategy. Your entire defense should never be reliant upon a single control.
Exactly why I qualified the statement with “to some extent”. One tool alone doesn’t necessarily make you safe, but it might be enough for practical purposes depending on your risk profile.
I would rather let the OS connect the way it prefers and, once it is connected, check that the remote is a legitimate IP address before sending anything.
Connecting first and then checking the remote IP before sending anything could work, I suppose, but I think you'd still need to check against a blacklist.
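A minimal sketch of that connect-then-vet approach, assuming plain TCP and using the private/loopback/link-local/reserved ranges as the blacklist (illustrative, not exhaustive):

    import ipaddress
    import socket

    def connect_and_vet(host: str, port: int = 80, timeout: float = 5.0) -> socket.socket:
        sock = socket.create_connection((host, port), timeout=timeout)
        peer_ip = ipaddress.ip_address(sock.getpeername()[0])
        # Refuse anything that isn't an ordinary, globally routable unicast address.
        if (peer_ip.is_private or peer_ip.is_loopback or peer_ip.is_link_local
                or peer_ip.is_reserved or peer_ip.is_multicast or peer_ip.is_unspecified):
            sock.close()
            raise ValueError(f"refusing to talk to {peer_ip}")
        return sock

    # Nothing has been sent when the check runs, so a DNS answer pointing at
    # 127.0.0.1 or 169.254.169.254 is caught even though the TCP connect succeeded.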
So, again in practice, the security consequence of implementing RFC 3986 is typically much worse than that of implementing the WHATWG URL spec. At least the latter actually defines handling for all inputs, instead of leaving implementations to make things up as they go.
The main problem is that the parsing is inconsistent between different subsystems, e.g. the parse function interprets the URL one way and the fetch function interprets it another way.
This is why I think the most practical solution is to add another layer of validation that checks that a given URL belongs to a very small subset of valid URLs, one that is interpreted consistently across all your functions.
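One way to build that layer, as a Python sketch; the specific rules (schemes, ports, hostname shape, rejected characters) are illustrative choices, not a complete SSRF defence:

    import re
    from urllib.parse import urlsplit

    # Lowercase DNS names with an alphabetic last label; this also rejects
    # raw IPv4 literals, IPv6 literals, trailing dots and single-label hosts.
    HOSTNAME_RE = re.compile(r"^([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$")

    def is_boring_url(url: str) -> bool:
        # No whitespace, backslashes, fragments or userinfo anywhere in the
        # raw string; these are characters parsers commonly disagree about.
        if any(c in url for c in " \t\r\n\\#@"):
            return False
        try:
            parts = urlsplit(url)
            port = parts.port          # raises ValueError for junk like ":80x"
        except ValueError:
            return False
        if parts.scheme not in ("http", "https"):
            return False
        if port not in (None, 80, 443):
            return False
        return bool(HOSTNAME_RE.fullmatch(parts.hostname or ""))

If a URL survives this, there is much less room for the parse step and the fetch step to see two different hosts.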
This isn't it, but it has the same title and content.