Hacker News
1 point by anApple 127 days ago | link | parent

After all, Amazon was right to tell them that everything was in order, as it was.

He should fire his sysadmin for not checking the in/out network traffic...



4 points by spudlyo 127 days ago | link

Let's not be too hasty to play armchair sysadmin. Someone who claimed to be involved said the traffic never reached their servers.

http://news.ycombinator.com/item?id=859941

-----

2 points by anApple 127 days ago | link

They said they moved their servers to 3 availability zones and still had the same problems. What's the probability of having the same attack in 3 zones on a subnet you are randomly assigned to?

Besides, firing would definitely be a little harsh. Everybody deserves a 2nd chance.

-----

1 point by mingdingo 127 days ago | link

Is that possible? How could they be the only ones affected then?

-----

2 points by gojomo 127 days ago | link

Anything's possible. What if there was enough traffic targeted at just BitBucket that one of the 'last hops' to BitBucket's machines, which may just be a virtual hop in Amazon's own infrastructure, was the only one saturated? I suppose it's even possible that the affected machines could only see high-packet loss (and EBS sluggishness), not the arriving packets themselves.

-----

4 points by jespern 127 days ago | link

Correct. Our machines don't allow UDP, even. Either the physical machine our VM runs on, or the segment was flooded, which means we couldn't talk outside it.

-----

1 point by jacquesm 127 days ago | link

Switch statistics should be able to rule that one out for you.

-----

2 points by gojomo 127 days ago | link

Does Amazon make such network-equipment statistics available?

-----

2 points by jacquesm 127 days ago | link

I do not know, but any competent hosting facility will have those stats on call. It's what you base your billing on, so you'd better have them.

For the sites I operate this is my 'general health' indicator. Bandwidth says a lot more than my alarms: if there is a problem, it usually shows up in the bandwidth graphs before the alarms trigger (unless it is a power failure, but those are extremely rare).

Our providers make them available to us, and this has been the case with any provider that we've had to date (The Planet, vxs, LeaseWeb and a couple of smaller ones). I'd imagine Amazon has them too.

According to the Amazon FAQ you have to use 'CloudWatch' to get at this data:

"An Amazon VPC router enables Amazon EC2 instances within subnets to communicate with Amazon EC2 instances in other subnets within the same VPC. They also enable subnets and VPN gateways to communicate with each other. You can create and delete subnets attached to your router. Network usage data is not available from the router; however, you can obtain network usage statistics from your instances using Amazon CloudWatch."

You may have to do some arithmetic to see if a link got overloaded. One telltale on the bandwidth graphs is 'flat caps': even though a machine's own inbound limit has not been reached, you see a fairly flat top on the inbound or outbound bandwidth graph on several machines at the same time (if they're on the same segment, which on Amazon's infrastructure could be quite hard to figure out).
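
To make the 'flat caps' check a bit more concrete, here is a rough Python sketch. It assumes you have already pulled per-machine bandwidth samples (from CloudWatch or your provider's graphs) into plain lists at a common sampling interval. The function names, the 3% tolerance, the 95% "at own limit" cutoff and the minimum run length are all made-up illustration values, not anything Amazon or any provider defines.

```python
def plateau(samples, limit, tolerance=0.03, min_run=5):
    """Return (start, end) of the longest flat top below `limit`, or None.

    `end` is exclusive. A flat top is a run of consecutive samples that all
    sit within `tolerance` of the series peak. All thresholds are assumptions.
    """
    if not samples:
        return None
    peak = max(samples)
    # Idle machine, or a machine that is simply hitting its own limit:
    # neither one points at a saturated shared link.
    if peak <= 0 or peak >= 0.95 * limit:
        return None
    floor = peak * (1 - tolerance)
    best = None
    start = None
    # The -inf sentinel closes a run that lasts until the final sample.
    for i, s in enumerate(list(samples) + [float("-inf")]):
        if s >= floor:
            if start is None:
                start = i
        elif start is not None:
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    if best is not None and best[1] - best[0] >= min_run:
        return best
    return None


def hosts_capped_together(series_by_host, limit, min_hosts=2, **kwargs):
    """Return hosts whose flat tops all overlap in time, else an empty list.

    Simplification: every host that shows a plateau must share at least one
    sample index with every other plateaued host.
    """
    spans = {}
    for host, samples in series_by_host.items():
        span = plateau(samples, limit, **kwargs)
        if span is not None:
            spans[host] = span
    if len(spans) < min_hosts:
        return []
    latest_start = max(start for start, _ in spans.values())
    earliest_end = min(end for _, end in spans.values())
    return sorted(spans) if latest_start < earliest_end else []


if __name__ == "__main__":
    # Hypothetical inbound Mbit/s samples; the per-machine limit is 100.
    traffic = {
        "web1": [10, 20, 40, 40, 40, 40, 40, 40, 20, 10],
        "web2": [5, 15, 41, 40, 40, 41, 40, 40, 41, 12],
        "db1": [1, 5, 30, 12, 3, 8, 2, 1, 4, 6],
    }
    print(hosts_capped_together(traffic, limit=100))
```

A non-empty result only says that several machines flattened out below their own limit during the same window. It does not prove they share a segment, so treat it as a hint to go ask for switch statistics rather than as a diagnosis.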

-----



