

Ask HN: Should I block HughesNet satellite provider? - pinkbike

HughesNet is a satellite internet provider, so I assume that traffic from their IP addresses comes through proxies on behalf of their customers. These proxies appear to be broken: they don't resolve relative links in CSS and JavaScript files correctly.

Our pages on www.example.com include CSS and JS files hosted on other hosts, such as static.example.com. These static CSS/JS files contain relative links to other static files, e.g. url(../i/corner.gif). Everything works correctly for normal clients, but the HughesNet IPs are requesting www.example.com/i/corner.gif, and every other static file we use, thousands of times per minute. They even request things like www.example.com/google_ads.js, which are files relatively linked from the Google ads includes.

Clearly these guys are doing something wrong: the relative links are being resolved against the page's host instead of the host the CSS/JS file was served from. As you can imagine, they issue these requests in bursts, so I see things like 100 requests in the same second hitting my web server.

What do you suggest?
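For reference, here is what correct resolution looks like. Per RFC 3986, a relative reference in a stylesheet resolves against the stylesheet's own URL, not the URL of the page that included it. A quick sketch with Python's standard library (the URLs are the example hosts from above):

```python
from urllib.parse import urljoin

# A CSS file served from static.example.com contains url(../i/corner.gif).
# Correct behaviour: resolve against the CSS file's own URL.
css_url = "http://static.example.com/css/site.css"
print(urljoin(css_url, "../i/corner.gif"))
# -> http://static.example.com/i/corner.gif

# The broken proxies seem to resolve it against the including page instead:
page_url = "http://www.example.com/some/page"
print(urljoin(page_url, "../i/corner.gif"))
# -> http://www.example.com/i/corner.gif   (the bogus requests I'm seeing)
```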
======
SwellJoe
Why are you asking us?

Contact Hughes' network administrator. They probably have a prefetching
web-caching server that is misconfigured, or has a bug when interacting with
your particular web server and configuration...or your site may have some
cache-control data botched. Or a little of both.

You have the offending IP addresses...look them up at ARIN.net, and find out
who is responsible for their network. Send them an email with what you've just
told us.
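And if you do decide to block them while you wait for a reply, a minimal sketch of an application-level block using Python's `ipaddress` module (the CIDR below is a placeholder, not Hughes' actual allocation; get the real ranges from the ARIN lookup):

```python
import ipaddress

# Placeholder network; substitute the ranges ARIN reports for the
# offending addresses in your logs.
BLOCKED_NETS = [ipaddress.ip_network(n) for n in ("66.82.0.0/19",)]

def is_blocked(ip: str) -> bool:
    """Return True if the client address falls in any blocked range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETS)
```

Checking membership per request is cheap, and the same list can usually be translated directly into firewall rules if you'd rather drop the traffic before it reaches the web server at all.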

~~~
pinkbike
It would be their systems that are misconfigured.

Well, I'm asking for a few reasons.

1\. To get advice like yours, which I followed in addition to sending emails
to their main site.

2\. This is a somewhat interesting issue, in that more and more sites simply
pass every request to their script engine and then parse the URI to decide
what to do with it. That seems to be true of HN as well, since you get a
response to every request.

If you go to <http://news.ycombinator.com/test.gif>, the request is most
likely processed by the back end. Now multiply that by the 10 or 20 such
requests that accompany every real page load from one of these misconfigured
systems. This of course wastes a lot of resources: at minimum it may start
sessions that don't need to be started; at worst, database connections are
opened before the request can even be examined and rejected.

This is not just some single person on the net with a misconfigured
cache/proxy, but an ISP. That potentially means every user of that ISP is
putting far more load on sites than they need to.

Is this trivial? It may seem so. In the old days it was a non-issue: the
server didn't have the file, and minimal resources were used to send a 404
response. Today that's less likely, since your web app may be running
bootstrapping code, starting sessions, and opening database connections just
to determine that the request is bad.

3\. It's a warning to all of us developers that misconfigured systems like
this do exist out there, and that you can take measures when designing your
site to handle them with minimal resources.
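One such measure, sketched here as a tiny WSGI middleware (the extension list and names are illustrative, not a definitive implementation): answer static-looking paths with a plain 404 before any application bootstrapping runs, so no session is started and no database connection is opened for them.

```python
# Extensions that should never reach the application back end; anything
# matching gets a bare 404 with no session or DB work.
STATIC_EXTENSIONS = (".gif", ".jpg", ".png", ".css", ".js", ".ico")

def static_guard(app):
    """Wrap a WSGI app so misdirected static-file requests fail cheaply."""
    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.lower().endswith(STATIC_EXTENSIONS):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not Found"]
        return app(environ, start_response)
    return middleware
```

The same short-circuit can of course live in the web server's config instead, which is cheaper still; the point is just to reject these requests before the expensive bootstrapping path.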

