
This is using "ping", i.e. ICMP echoes. That's easy to implement, but problematic in more ways than you might at first think. Firstly, some network elements may be configured to give ICMP echo request/response packets a different delivery priority, which could invalidate results for your probably-not-ping-based application. Secondly, a virtual machine may have more jitter in scheduling the transmission and the response. Thirdly, you don't get a picture of how congested the route is - is it good for a gigabit blast, or only a trickle of data?

But there's a deeper problem: ping measures round-trip time.

Interesting fact about IP routing: quite often, route out != route back. There's no requirement that paths in the Internet be symmetric, and very often, they aren't. What's more, congestion is often one-way.

So let's say you have a 92ms RTT between two sites; you can't know, from the ping alone, if that's an even 46ms each way, or 53ms one way and 39ms the other, or perhaps even 83ms + 9ms. If your application is sensitive enough to latency that this tool might be interesting, then it's quite possible that such asymmetric results are also relevant.

(Obviously, the speed of light gives you a lower bound on each direction, and hence bounds on the split, if you know the DC locations.)
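To make the speed-of-light point concrete, here is a rough sketch of how a known distance narrows the possible split of an RTT. The function name, the 6000 km distance, and the 0.67 velocity factor for fiber are illustrative assumptions, not measurements; real paths are longer than the great-circle distance, so the true floor is higher than this.

```python
# Sketch only: bound the one-way latency in each direction from an RTT
# and a known straight-line distance between the two sites.

SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # vacuum, kilometres per millisecond
FIBER_VELOCITY_FACTOR = 0.67           # assumption: light in fiber travels at roughly 2/3 c


def one_way_bounds(rtt_ms, distance_km, velocity_factor=FIBER_VELOCITY_FACTOR):
    """Return (lowest, highest) possible one-way latency in ms for either direction."""
    # Neither direction can be faster than propagation over the distance.
    floor_ms = distance_km / (SPEED_OF_LIGHT_KM_PER_MS * velocity_factor)
    if 2 * floor_ms > rtt_ms:
        raise ValueError("RTT is below the physical minimum for this distance")
    # If one direction sits at the floor, the other takes the rest of the RTT.
    return floor_ms, rtt_ms - floor_ms


if __name__ == "__main__":
    # Hypothetical example: the 92ms RTT from above, sites 6000 km apart.
    low, high = one_way_bounds(92.0, 6000.0)
    print(f"each direction is between {low:.1f}ms and {high:.1f}ms")
```

This only narrows the range. It can rule out an extreme split, but it cannot tell you what the actual split is.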

There have been substantial projects to accurately measure one-way latency. For example, the RIPE Test Traffic project from the RIPE NCC (https://www.ripe.net/analyse/archived-projects/ttm) was a large-scale and long-running observatory that kept more statistics besides, such as packet loss. Sadly the successor to this service appears not to measure one-way latency. For precision, it required both an appliance and a GPS antenna to be installed, so major cloud providers were unlikely to cooperate.




> Secondly, a virtual machine may have more jitter in scheduling the transmission and the response.

I have very recently been over exactly that with a customer who had considerably increased network latency on some VMs on his VMware cluster when the CPU was under load. From the very beginning I was pointing to the hypervisor's scheduler, and ended up measuring the time when the interrupt handler for the network card was active to receive the packet (which no Linux or program setting could really influence). It took some convincing and arguing, but they found the magic setting on the hypervisor that made the problem go away (aptly named Latency Sensitivity).

I'm sure both AWS and GCP teams have that under very tight control, but especially the cheapest instances with the smallest and burstable CPU budgets that you would run for such a project are probably running with the noisiest neighbors on the most oversubscribed hardware.


Exactly. I've seen this on a particular cloud provider too, where an idle VM was discarding 20%+ of inbound packets. That only seems to happen with providers that overcommit their VMs or don't configure their NICs correctly.


Hey, cclatency dev here.

When I wrote this tool, my main purpose was to know where to create my VMs to have a reasonable latency to other services. In my case reasonable means < 100ms. This will definitely vary across products. As for the asymmetric results, you are absolutely right; in fact, if you look at the chart on the site, the round trips from A to B and from B to A are different, which is typical in cloud environments.

I try very hard to emphasise latency and not throughput; in fact, I don't think I mention throughput at all, since it adds another layer of complexity.

As for ICMP vs TCP/IP or any other layer 3/4 protocol, it's just a matter of time. If this becomes a topic of conversation often, I might end up extending the service to support "custom" packets.
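For anyone who wants a non-ICMP number in the meantime, a minimal sketch of timing a TCP handshake with the Python standard library might look like this. This is not how cclatency works; the function name and the example host are placeholders, and the result is still a round-trip figure, so it says nothing about the one-way split.

```python
# Sketch only: time a TCP connect() as a rough stand-in for ping.
import socket
import time


def tcp_connect_rtt_ms(host, port, timeout=2.0):
    """Return the time in ms to complete a TCP handshake with host:port."""
    start = time.perf_counter()
    # create_connection() also resolves the host name, so pass an IP
    # address if you want to keep DNS lookup time out of the measurement.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


if __name__ == "__main__":
    # Placeholder target; substitute the address of your own service.
    samples = [tcp_connect_rtt_ms("example.com", 443) for _ in range(5)]
    print(f"min {min(samples):.1f}ms, max {max(samples):.1f}ms")
```

Taking several samples and looking at the minimum and the spread is more informative than a single connect, given the VM scheduling jitter mentioned upthread.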



