Show HN: Rust Library to check your Internet connectivity

OskarS · on May 28, 2019

So... the test is "can I connect to either the website icanhazip.com or OpenDNS". With the IPv4 addresses of both burned into the code.

So this depends on two external servers never going down and not ever changing IP addresses, one of which is pretty sketchy ("icanhazip.com", seriously?), and it wouldn't work AT ALL in an IPv6-only environment (which I grant is rather unlikely at the moment, but might not be in the future).

I like Rust a lot, but micro-packages like these are giving me real bad "left-pad" vibes.

sushibowl · on May 28, 2019

Am I currently connected to the internet isn't really a question which has a meaningful answer in the first place. The internet is not a monolithic object you can connect to, and achieving a connection to any endpoint doesn't particularly guarantee your connection to anything else.

The actually meaningful question you almost always want answered instead of "am I online?" is "will I be able to connect to X host?" The obvious check is to try to connect to that host.

OskarS · on May 28, 2019

Yeah, I totally agree, but you can sort-of imagine this being useful as like an indicator light in your OS or something. But don't use this library for that. Don't use this library for anything.

jesusprubio · on May 28, 2019

Obviously, you're right. And I can imagine any developer can imagine the same thing :).

But I couldn't find a better way to check if I am online than making a request to any server. In fact, I'm using two over different protocols just in case.

oconnor663 · on May 28, 2019

One area for improvement: Default to a set of domains to check rather than just one, for example google.com, facebook.com, baidu.com. Check them all in parallel, and return success as soon as you get a good answer from 2 out of 3. This avoids falsely reporting "offline" when one host happens to be down, and it avoids slowing down the check when one host happens to be slow. Probably use `mio` to orchestrate this, to avoid the overhead of threads. (Note that this is similar to how musl libc implements DNS.)

Another area: Think about what should be reported in a country like China. For example, facebook.com is blocked. If China blocks google.com tomorrow, does that mean that every computer in China is offline? Probably not. On the one hand, this isn't something that a library can give a one-size-fits all answer for. On the other hand, since China has the most internet users of any country in the world, this is probably something that the design of the library should consider and offer documentation and best practices for.

Another area: Many GUI apps expect to make ongoing connections, e.g. chat apps that are always listening for messages. These apps need to persistently check the status of the network in the background, and many would be willing to dedicate a background thread to this library if it helped with some of the details of that, like 1) detecting hardware-level state changes, like WiFi coming and going, to prompt an immediate re-check, 2) checking in a rapid loop after a network change, but backing off to a lower background rate when the network is stable, 3) "de-bouncing" network events in case the network seems to be coming and going multiple times a second, so that UIs responding to those events don't flicker.

Higher level: This is a common problem. Find other libraries that have solved this problem before, compare the different ways they've solved it, and document the tradeoffs. I appreciate that I'm suggesting that you do a lot of work, but if you intend for production applications to use this code then it's very important work.

jesusprubio · on May 29, 2019

Thank you for the feedback. It's only my first crate I wrote to learn about the environment. It emulates more or less this one in Node.js a lot of people uses: https://github.com/sindresorhus/is-online

I wanted to implement it in this way. But I'm waiting for async/await, it should be trivial with a map function or something similar. But now it already tries a fallback connection if the first one fails.

Good point about the countries, I need to add an issue to cover it.

twic · on May 28, 2019

There's been a push in recent years to make Rust better for web development. One consequence of that is that you get more web developers writing Rust.

TD-Linux · on May 28, 2019

There are a lot better websites to ping for this, that are used for captive portal detection. E.g. Chromium uses http://clients3.google.com/generate_204 and Firefox uses http://detectportal.firefox.com/success.txt.

bspammer · on May 28, 2019

Am I right in thinking that either of those pages going down would result in "Sign in to your network" dialogs being shown in every single instance of their respective browsers? That's crazy.

I mean I know this is google we're talking about, their uptime is second-to-none, but that still seems like a hell of a dependency.

OskarS · on May 28, 2019

I doubt it. I think these are for detecting whether or not you're in a "captive portal". I.e. airport wifi that requires you to enter login credentials or something.

But hey, give it a whirl: block detectportal.firefox.com with your hosts file and see what happens.

baud147258 · on May 28, 2019

I gave it a try, I didn't see any difference.

praseodym · on May 28, 2019

No it does not. It will only show that dialog if the page returns anything different than the expected response, exactly like a captive portal would.

sgeisler · on May 28, 2019

That could even be used to return an enum instead of bool: Online, CaptivePortal with the URL as member and Offline.

timClicks · on May 28, 2019

Rust is growing. Growth means that you'll see a larger number of these sorts of mini projects. My first public Rust project was tiny (although it didn't connect to the network) https://github.com/timClicks/cool_faces

It's easy to make comparisons with left-pad, but left-pad was an issue because of the dependency graph. It wasn't an issue as a package in isolation.

jesusprubio · on May 28, 2019

Couldn't agree more, thank you! :)

liara_k · on May 28, 2019

This is the pro/con of making systems programming more accessible. You're getting a lot of new developers without a lot of experience wanting to start learning a powerful new system. This often means making something semi-useful to get started, and hoping to contribute at least _something_ for the effort.

The downside is that there's definitely a learning curve and you get packages like this early in the process. The upside is that it's bringing a whole bunch of new people in who are willing (and increasingly more able over time as they learn) to help write new open-source software.

StreamBright · on May 28, 2019

This functionality is covered by netcat on Unix but I am not sure how to do this easily on Windows. I am also not sure if this tool would compile on Windows though.

jesusprubio · on May 28, 2019

This why we're called engineers, we put pieces together.

So, your solution to "left-pad" is to write all your own code? xD. I think you need to review what was the problem there.

About the support to different services, those are the ones I use in my applications. But you are right, it would be a nice feature. PR's are welcome by the way ;).

The same about IPv6 support, this is an important feature.

Added both as issues: - https://github.com/jesusprubio/online/issues/5 - https://github.com/jesusprubio/online/issues/6

OskarS · on May 28, 2019

If you want this package to be a quality package that can be used in production, there are many more steps you need to take. Determining if you have access to "the internet" is not a trivial thing. Off the top of my head, here are some things you should consider doing:

1. Use platform-native APIs. Both Windows and macOS has apis for this [0] [1], and you are not going to know better than them. Use the "test websites" option only as a fallback or on platforms that aren't supported.

2. Remove all burned-in IP addresses and use system DNS. Since the whole point of this is to make a generic "can the system reach the internet" indicator that is never going to be 100% accurate, you can pretty much just say that "if you don't have access to DNS, you don't have access to the internet".

3. Make sure it works with IPv6.

4. Have a long list (at least 20-30, preferably more) of servers that you check, of companies you can be very sure will not go down. Things like "google.com", "apple.com", "microsoft.com", etc (other good endpoints have been suggested in this thread). When you want to check if you have internet, pick a random 5 and check if any of them works. Consider censorship here, Wikipedia is not going to work for you because it's blocked in many countries. Also: try and find sites that are cool with you using them like this, which I imagine will be a challenge. Preferably, you should get explicit permission.

(honestly, I'm unsure about this part. It feels very skeevy to use these kinds of sites in this way, but it's hard to think of an alternative. I guess check what sites other libraries that provides this kind of support uses)

5. Preferably provide some extended error information. Is the problem that we don't have any network adapters, or a cable not plugged in?

Point number 1 would probably be the most important thing, since that's a genuinely annoying and difficult thing to do. That might be worthy of a package/crate that properly abstracts away the native APIs. But if I were to even consider using this for anything, I would want a FAR more robust system than what you cooked up.

In regards to your comment about engineers: what your package does is trivial. It makes two requests and than reports if either worked. If this is what I wanted, it would have taken me ten minutes to write my own version.

If I had tasked an engineer with a task that could be solved this trivially, and their solution was to pull in an external dependency you have no control over, that would be unacceptable. It would fail code-review in a second and we would have a serious chat about what their job is. I realize that JavaScript and NPM has a different culture, but for systems programming for production use, this would simply not be an acceptable solution.

An engineer is not just a person who "put pieces together". An engineer also considers questions like "what is the cost of using this piece? What are the risks with bringing in an external dependency? What is the code quality of this dependency? Is any of these costs bigger than writing the code myself, from scratch?" As an engineer, you have a responsibility to ensure the robustness and stability of the whole system over the long term. It not just your job to slap together a few packages and call it a day when the CI build passes. A package/crate is not just "free code, yay!". Every dependency you add to any project brings with it costs and risks.

I hate to be this harsh. I understand your eagerness to contribute to open source, and I commend you for that instinct. But when you "Show HN", we're going to give you our honest opinions.

[0]: https://docs.microsoft.com/en-gb/windows/desktop/api/netlist...

[1]: https://stackoverflow.com/questions/15184490/check-for-activ...

dsl · on May 28, 2019

This library is dangerous and should not be used.

It contains hardcoded values that could result in the library not working as intended, or unintentional DoS attacks against innocent third parties if it were implemented in a widely distributed piece of software.

ay · on May 28, 2019

In general “checking whether the connectivity to internet exists” is an idea that may create a massive pain down the road when implemented, because the heuristics used are not equal to actual connectivity, thus will break one day.

Just attempt to perform the operation you want, and if it fails, give the feedback to the user that it failed. Why does it have to be more complicated than that ? (It is not a rhetorical question)

spuz · on May 28, 2019

Way back in the dial-up days, I wrote a program to measure the periods of time that a user was online and the corresponding phone call charges which varied depending on the time of day. This meant I had to know to the second whether or not the user was 'online'. Since the days of per-minute charges for internet are long gone, I find it hard to imagine other reasons you would need to know if the user is online rather than able to access a specific server.

jesusprubio · on May 28, 2019

It's only my first crate I made to learn about the environment.

But I truly believe in this micro modules way of building things. If it reduces complexity it's welcome. :)

jesusprubio · on May 28, 2019

Agree, answered in other comments but the idea is I truly believe in this micro modules way of building things. If it reduces complexity it's welcome. :)

jesusprubio · on May 28, 2019

About the option to pass other services I agree, answered here: https://news.ycombinator.com/item?id=20028483

About DoS attack, I think you need to review how those work, the library is not implementing amplification or something similar.

dsl · on May 28, 2019

Did you get permission from OpenDNS and Icanhazip? If not, you are effectively launching a distributed attack against them with worthless traffic.

If you don't understand what I mean - just pull the project and do some more research before you release something like this.

jesusprubio · on May 29, 2019

Launching a distributed attack by publishing a crate which makes one request? XD

You can relax, I know how they work. My final year project was about DoS. I would send you the link but it's Spanish.

But I can say you I've written a tool to implement this vector (among others) against VoIP services and it was presented even at the BlackHat conference.

dsl · on June 1, 2019

http://pages.cs.wisc.edu/~plonka/netgear-sntp/

d33 · on May 28, 2019

It's pretty easy to measure how many projects your library is tied to and scale the service up. Yes, ideally it would have a large pool of servers to test against, but I think it's a good idea to abstract the problem away and then work on it.

EDIT: it looks like I'm getting downvoted for this opinion - could any of the downvoters also reply as to why it's a bad idea?

martin_a · on May 28, 2019

Did not downvote you, but I think you might have missed the point: If the package is widely in use, one could change the host that is pinged/whatever. People probably don't look in depth at such packages and will just update it. This leads to probably MASSIVE traffic to endpoints which are not expecting this.

That is pretty much what happend with this dubious WordPress plugin developers that changed their "license check" or keep-alive check or whatever to do some hundred thousand (or so) "checks" to their competitors website per hour.

dsl · on May 28, 2019

> It's pretty easy to measure how many projects your library is tied to and scale the service up.

You are being downvoted because the author pointed at random services not under his control. So if a mobile app with million of users deployed this, an innocent third party that has nothing to do with the author would be hit with millions of requests they didn't ask for.

twic · on May 28, 2019

I don't know of any IP addresses that it's safe to hardcode in software like this.

However, there are hostnames that it's safe to hardcode: the DNS root servers, a.root-servers.net etc.

If you have working hostname resolution, even if intermittently, you could in principle resolve those, and then check connectivity. However, it's probably not a good idea to use these crucial public resources for something so pointless!

Interestingly, all of the root servers except G also have a companion website, ${letter}.root-servers.org, which could perhaps be used for an HTTP connectivity check. Those websites aren't hosted on the same machines as the nameservers.

jesusprubio · on May 28, 2019

Thank you! Very interesting thoughts, adding them to the repo here:https://github.com/jesusprubio/online/issues/7

Please feel free to contribute ;).

georgyo · on May 28, 2019

The universal consensus after anyone looking at the code is that they should not use it.

What I don't understand is why and who are they people who are up voting this and keeping it on the front page.

Do people just see rust, and then up vote? If so, while rust may be good, blind voting everything written in rust is good will cause distrust later.

maltalex · on May 28, 2019

I don't think there's a reason to use TCP for this unless you want to check for TCP connectivity specifically.

Just do a DNS lookup against a list of well-known public DNS servers.

oconnor663 · on May 28, 2019

One reason not to do DNS exclusively is that many networks allow DNS requests (and intercept them) but block HTTP requests.

jesusprubio · on May 29, 2019

BTW, there're parts of DNS over TCP.