The Internet of Unprofitable Things (strugglers.net)
408 points by popey 24 days ago | 123 comments



Device fleets are incredibly hard to get right, and if you have no updateability, you have to nail them the first time. For someone coming from a cloud background, it's a jungle out there, where all sorts of "easy" or "solved" problems are nothing of the sort.

As a sort of PSA, I want to plug the work that we have done at balena.io (formerly known as resin.io) to solve problems like these for everyone deploying Linux IoT devices, including open source / non-paying users.

We've built the open source balenaOS[1] and the (newly) open sourced openBalena[2][3] management server that anyone can use without paying us a penny. If you're about to manufacture a fleet of Linux devices and put them out there, and you don't want to pay for a service (so you're not using balenaCloud) but also don't want to solve problems we've spent years fine-tuning (such as NTP, DNS, cellular modem support, read-only filesystems, host updates, etc.), use the above projects and save yourself a mountain of pain in the ensuing years.

I hate being alarmist, but working with device fleets over the last 7 years, I've seen things you wouldn't believe. I've seen devices catch on fire, drown in storms, launch DDoS attacks, and everything in between. It's hard to overestimate how bad things can get when you put your devices out there with no ability to find them and update them if push comes to shove.

[1] balenaOS: https://www.balena.io/os/
[2] openBalena: https://www.balena.io/open/
[3] openBalena on GitHub: https://github.com/balena-io/open-balena


> I've seen things you wouldn't believe. I've seen devices catch on fire

...off the shoulder of Orion?


Like bad enclosures in rain.


> If you have no updateability, you have to nail them the first time

I would go a bit more radical: I'd say you simply have a problem.


Seriously. I do robots (OTTO Motors) and I just can't imagine shipping _anything_ to the field that I didn't have at least some plan for how to keep up to date. Even if the plan was the Panavision model of only ever leasing hardware and having all hardware periodically make its way back to the factory for updates and refurbishing, that's still more of a plan than a bunch of unidentifiable devices floating around at customer sites.


Hey, I wish the Pi foundation would take up https://github.com/balena-io/wifi-connect - I thought about doing something like this ages ago but never got around to it. Thanks for open sourcing it.


This is one of those things that took us so many iterations to nail. Very happy it finally got to a good place. We'll integrate it into balenaOS by default soon, hopefully.


What kinds of algorithms do you guys use to coordinate fleets?

Also, your story is quite fascinating, can you tell us more?

I am very, and I mean VERY, interested in working on something like that, but I'm not sure how to transition from generic CRUD web programming to your area... :(


To be honest, with fleets in the wild, simple works. We do deploy containers to these devices, but rather than kubernetes-ing the whole thing, it's more like "every device in the fleet must run this container". In IoT/edge, affinity is a big deal, so the payloads are pretty uniform. You don't want your container to migrate away to another device while you're trying to use the device to e.g. fly a drone or open a lock.

The difficulty comes in configuration, where things are much more complex than the cloud, since every device is "special". Different customers, keys, settings of all sorts. So we allow individual environment variable settings for each device.

The real complexity is in ensuring you can reach every device as long as possible, and that the devices behave well when they can't reach the cloud. We set up a VPN to ensure the former, and have a pull-based architecture where the device is responsible for "catching up" to the fleet when it comes back online to handle the latter.
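To make the pull model concrete, here's a minimal sketch of such a "catch up" loop. The endpoint, fields, and intervals here are hypothetical, for illustration only -- not our actual API:

    import time
    import requests

    DEVICE_UUID = "abc123"  # hypothetical device identity
    TARGET_URL = "https://mgmt.example.com/v1/devices/%s/state" % DEVICE_UUID
    current_release = None

    while True:
        try:
            # ask the management server what this device *should* be running
            target = requests.get(TARGET_URL, timeout=30).json()
            if target["release"] != current_release:
                # pull and start the new container image (details elided)
                current_release = target["release"]
        except requests.RequestException:
            pass  # offline: keep running the last known-good state
        time.sleep(60)  # check again in a minute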

There's a bunch of other things we've solved, like container deltas to improve download times by something like 10-70x (and reduce bandwidth costs), but honestly the biggest difficulty has been integrating cloud/hardware/developer/network/OS etc. seamlessly, so people can succeed at building fleets, and fast. It's more like 10,000 papercuts than one or two big problems to solve. It's a long chain of things that all have to work right to make the system work, and each one has to be done in a way that doesn't break after some time passes (see the OP for an example).

To answer your last question with (yet another) plug: you can probably run through our getting started guide in an hour and have your first device you can deploy a JS project to pretty fast. If you spend a few days, you can start to have some real accomplishments. It's not that hard to make the jump these days if you have a web background (hell, most of the founders didn't have IoT experience when we started), but there are things you need to learn about hardware and Linux if you want to do something more advanced. You can learn at your own pace though, and I thoroughly recommend starting with a project you enjoy, something like, but not necessarily, https://www.balena.io/blog/make-a-web-frame-with-raspberry-p...


This looks really cool. I may be deploying a fleet of Raspberry Pi or similar servers in the near future and this might save us a ton of headaches.

The target context is a medical facility in a remote location without Internet (or extremely limited Internet), though. Is there a way to distribute updates by USB stick?

What's the best place to read stories of people who have tried deploying with Balena on RasPi-type servers and/or contexts without Internet access, and learn about their successes and difficulties?

Thanks!


Please do yourself a favor and don't remotely deploy RPis or anything else that uses microSD cards for the OS image and boot. The write lifetime of those, even with "industrial" cards, means a huge failure rate even under very light usage. Only use a real SATA3 or NVMe SSD.


There are ways to reduce the risk, though, depending on your workload. It is possible to use the sd card just to boot, with all actual file systems in a USB attached device. You can also use some of the (very limited) memory as a ram disk for things with lots of writes...
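For example, a hypothetical /etc/fstab fragment for that setup might look like the following (on a Pi, /boot stays on the SD card and cmdline.txt points root= at the USB disk):

    # root filesystem on a USB-attached disk instead of the SD card
    /dev/sda1  /         ext4   defaults,noatime  0  1
    # write-heavy paths in RAM; size them carefully, memory is limited
    tmpfs      /var/log  tmpfs  size=32m,noatime  0  0
    tmpfs      /tmp      tmpfs  size=16m          0  0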


Yes, but even so, the write wear-leveling algorithm and DWPD (drive writes per day) of a $50 consumer-class SSD are vastly better than those of microSD media.


Will second this. A friend of mine had to write his own wear-leveling software to make use of SD cards in his hardware; sadly, the BOM made anything better way too expensive to be commercially viable.


If your friend has written about this anywhere or is able to share any code, I would love to learn more!


Hey there -- we have a bunch of case studies on our blog, click the "case study" links on this page -- https://www.balena.io/customers/

And if you're interested in a board that is as easy to program as the Pi but more robust, you can check out -- https://www.balena.io/blog/introducing-project-fin-a-board-f...


Hi, thanks! From the description at https://www.balena.io/fin/, it looks like I'd be spending ~$170 instead of ~$35 to get:

  - Ability to run on 6V to 24V power
  - Low-power coprocessor
  - Battery-powered real-time clock
The rest of the interfaces all look pretty much the same, except that the Fin loses two USB ports and the headphone jack.

Given that my plan is to run off of 5V USB power and I don't intend to write special software for the coprocessor, can you help me understand whether there might still be good reasons to use a Fin?


I have looked at your stack and it’s very cool. One thing I was most curious about is the long term plan for your docker fork. Are you going to upstream changes or APIs so you don’t have to maintain the fork?

The other question I have is about the stability of docker itself at scale. I worry about the reliability of essentially developer/server tooling in an embedded space where there is no ops team to help it along. Have you had any issues?


Hey there -- so this is a bit of a terminological thing, but we don't consider balenaEngine to be a Docker fork -- it's a sibling project descending from the Moby Project, just like Docker, and of course highly compatible with Docker. The things we've done in balenaEngine aren't things that Docker would do: shrinking the size to a quarter, removing Swarm, adding container deltas, changing the tradeoffs around durability -- most of the changes we have made relate to the embedded use case. As such, we intend to keep evolving balenaEngine alongside Docker, also because it's important for us to control the size of the binary. It goes into our root partition, and if Docker keeps adding things to their engine, we can't have that variable affect our ability to update devices.

We've done a lot of things to make sure balenaEngine is rock solid in an embedded environment, and our fleetOps team is pretty hardcore in helping customers get out of hard situations (which help us further evolve the OS and the engine to avoid those by default).


The way they burned the NTP addresses reminded me of the Douglas Adams quote: "The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair".


I think a pretty neat idea to solve all these time-related issues would be more digital wristwatches.


But making sure that there is a wrist beside each clock is yet another problem.


That's what the towels are for.


As good a place as any to mention my short story:

https://slashdot.org/comments.pl?sid=7132077&cid=49308245

possibly the only work of fiction ever inspired by NTP, IoT and cryptographic protocols.

This struggle with embedded IP addresses seems to echo this part of my story: "These hastily made things flooded the market and soon replaced other well-documented things. At times something failed and its inventors could not say why, they just assembled a new one or went bankrupt."


I absolutely loved this, thanks for posting it.

Do you have a list of your stories available?



Thanks. I like this part of the story: "some needed to be connected in certain places first then moved to the far reaches and re-connected, in order to work properly."


This was wonderful, thanks for sharing! Are you a Borges fan by any chance?


I got butterflies reading this because it's something I can see myself having done early in my career.

A relative newbie finding himself responsible for nontrivial design and implementation decisions for fleets of robots. Luckily they were always updatable. But if you had asked me to set up the NTP story for them (which they had, but people smarter than me worried about it), I would have Googled it for a while and just hoped that I didn't miss any fundamental understanding of how to use NTP.

p.s. this article felt like it was the perfect length. It shares the perfect amount of detail succinctly.


I like that story.

The author responded to the initial problem in the old-fashioned internet "we're all here to make things work" kind of way, without letting himself get taken advantage of.

And then when the problem decided not to play nicely, he increased the pressure in a civilized way.

These days most companies would have just said, "Sucks to be you" and cut off the dummy IoT company.

This also illustrates why big companies like Apple maintain their own NTP services.


I'm trying to coin "Postel decentralisation" for things like this: people assume NTP is a distributed robust system, but in practice it turns out to be run by one very overwhelmed guy in a basement somewhere.

(It could have been set up properly to be distributed here, but they didn't do it)


NTP itself is not decentralized, but the largest NTP provider is the NTP community pool, www.pool.ntp.org. The pool is operated by MANY PEOPLE IN BASEMENTS AROUND THE WORLD, using DNS to distribute the traffic. It's practically pretty robust for most purposes.

However, people just go to a list of NTP servers and copy a few into their code, instead of using the distributed pool. Then it's no surprise that the NTP in a product eventually stops working, while the one very overwhelmed guy who happened to run one of those servers is going to have serious trouble; see https://news.ycombinator.com/item?id=18753835.


David L. Mills, the inventor of NTP, probably looks exactly how you picture him.

https://en.wikipedia.org/wiki/Network_Time_Protocol#/media/F...


Good on the author for playing nice with them. Hardcoding IPs for any purpose is a bad idea; this is literally the reason DNS exists.


I don't think that was really the main problem in this case. Hardcoding any address (DNS or IP) that you do not control is always a horrible idea.


Does anyone really control their IP addresses? I thought that ultimately those were controlled by ARIN, RIPE, et al.


It usually goes like this: if you have an AS number, you have a public presence on the Internet; you apply for or purchase your own dedicated IP addresses, get them assigned to your AS number, and finally announce your routes via BGP. Practically, you own a portion of the Internet, and you own the IP addresses. You can set up redirection, anycast, CDN...
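As an illustration, the announcement step can be a few lines of router configuration -- e.g. this hypothetical FRR fragment, using documentation-reserved ASNs and a documentation prefix (details vary by router and upstream):

    router bgp 64496
     neighbor 192.0.2.1 remote-as 64511
     address-family ipv4 unicast
      network 203.0.113.0/24
     exit-address-family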


If you have your own dedicated IP range and you are advertising routes correctly, then you're as close to ownership of your IP addresses as Google or Facebook are of theirs.

ARIN and RIPE control allocations, but do not control routing to allocated ranges.


I think the point GP was making is that you don’t “own” IP addresses, you effectively “lease” them. If you don’t pay your ARIN (or others) dues, you’ll lose them.


Sure but that is also true of a domain name.


Domains are much more vocal about expiration in my experience, but maybe that’s just the registrars I’ve used.


A few "legacy" address holders in the ARIN region have managed to hold on to legal ownership. https://www.internetgovernance.org/2012/09/22/its-official-l...

But it's true that newcomers these days can't truly "own" their own addresses.


It’s so simple to set up your own Stratum-1 server, I don’t get why more folks don’t. The one I keep in my house is a 1U unit about the size of a 16-port switch and keeps time accurate to 15ns via a small multi-GNSS antenna (yes, serious overkill for home needs, but I already owned it for a use case that used to need it).


Unfortunately, having more Stratum-1 servers does nothing to help with clueless device manufacturers who hardcode the IP address of an individual server and damage the community service. But yes, when we talk about the NTP community in general, we certainly need more Stratum-1 servers. Nowadays, running a Stratum-1 is easy and the hardware is cheap: a single-board computer with a time source from a shortwave radio or GPS. And if one loves electronics, one can build a precision oscillator (even a rubidium standard can be had for 300 USD), or go straight to building a GPSDO for highly reliable timekeeping.
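To give a sense of how simple the software side is, a chrony configuration for a GPS-disciplined Stratum-1 can be just a few lines. This sketch assumes gpsd feeding NMEA time via shared memory plus a PPS signal on /dev/pps0; the offsets are illustrative and need per-setup calibration:

    # coarse NMEA time from gpsd via shared memory (only second-level accuracy)
    refclock SHM 0 offset 0.5 delay 0.2 refid NMEA noselect
    # precise pulse-per-second signal, locked against the NMEA source
    refclock PPS /dev/pps0 lock NMEA refid GPS
    # serve time to the local network
    allow 192.168.0.0/16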

Nevertheless, there are three practical difficulties.

1. Static IP addresses are unavailable on most home connections, and many home broadband providers throttle UDP traffic, dropping packets when the pps rate is high. That makes one's home unsuitable for hosting an NTP server.

2. Unlike Stratum-2, you cannot simply use a "cloud" service for your dedicated server. To run an NTP Stratum-1, you have to physically host your server with the customized hardware in a datacenter, which costs 100 dollars/month in my city, not including network transit and bandwidth. I really want to run one, but I cannot afford it.

3. Shortwave / GPS reception is usually not available in a datacenter, and an antenna installation is usually not allowed. You can be creative, a good way is using the time provided by mobile basestations. But it needs experience.


Re: #3, many good data centers either provide a Stratum-1 to sync to or let you run your own cable/antenna if you are at least renting a cage's worth of space.


> a good way is using the time provided by mobile basestations

That would definitely not be Stratum-1, so I wouldn’t recommend it.


Thanks for reminding me. If a mobile base station is not usable, it seems to me that the best option is a WWVB-disciplined oscillator. Shortwave is certainly not available indoors, but longwave reception is possible. Hopefully it won't be killed by NIST due to funding issues...


I’ve never been successful getting longwave reception in a data center (I didn’t try running cable to an external antenna, though; if that’s doable, why not GNSS instead?).


That and also many DCs I’ve been in barely get cell signal on the floor.


It's referenced in the article, but one of the earlier ntp abuse cases:

http://pages.cs.wisc.edu/~plonka/netgear-sntp/

Makes for interesting reading.


In short, if you are ever going to make an embedded device, or an operating system distribution, or anything with NTP default on, please make sure,

1. NEVER, ever hardcode an individual NTP server (in the form of an IP or a domain). DO NOT just go to a list of NTP servers and copy a few into your code. DON'T ping pool.ntp.org and write down the IP address it returns. DON'T DO ANY OF THESE! PLEASE!

2. DO NOT use Stratum-0 and Stratum-1 servers. Please use Stratum-2 and below. Practically, if you follow Rule No. 1, you are always following this rule.

3. If the scale of your system is small, in the hundreds or a few thousands, PLEASE USE pool.ntp.org, the NTP community cluster backed by a DNS load balancer. Always query DNS, and make sure the IP is not cached locally for too long. If you need more than one server, use 0.pool.ntp.org, 1.pool.ntp.org, 2.pool.ntp.org, etc. (3 is often enough).

4. If the scale of your system is large, say tens of thousands of devices, or you are making a new system, you SHOULD request a customized prefix from pool.ntp.org, such as debian.pool.ntp.org; it helps the community manage the traffic. If your system is a large commercial one, you ARE REQUIRED to donate some servers to the NTP Pool to compensate the community. Another option is running your own private NTP cluster. The policy is here: https://www.ntppool.org/en/vendors.html

5. If possible, run a standard NTP implementation, like NTPd, chrony, or something else as long as it's written professionally. Nowadays even lightbulbs run Linux, so why not run a standard NTPd?

But if you can't, then make sure to...

(a) Implement NTPv4; DO NOT use NTPv1.

(b) Read the new SNTP RFC if you are implementing an SNTP client. http://www.faqs.org/rfc/rfc4330.txt

(c) DO NOT synchronize time at the beginning of an hour, or at 00:00 UTC! Select a random minute within the hour for synchronization.

(d) Use an exponentially increasing retry interval; DO NOT keep retrying at a fixed rate when the server is unreachable, or you are effectively launching a DDoS attack! (See the sketch after this list.)

(e) Support the Kiss-o'-Death (KoD) packet: your client should immediately stop requesting a server, cease and desist, once a KoD packet is received.

(f) Make sure the client stops requesting the built-in list of servers once an alternative server is set by the user.
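To make rules (c), (d), and (e) concrete, here is a minimal sketch of a well-behaved SNTP poll loop in Python, using raw sockets and no third-party packages. The server name and intervals are illustrative only:

    import random
    import socket
    import struct
    import time

    NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01

    def sntp_query(host, timeout=5):
        # NTPv4 client packet: LI=0, VN=4, Mode=3 -> first byte 0x23
        packet = b"\x23" + 47 * b"\x00"
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(timeout)
            s.sendto(packet, (host, 123))
            data, _ = s.recvfrom(48)
        if data[1] == 0:  # stratum 0 in a response means Kiss-o'-Death
            code = data[12:16].decode("ascii", "replace")  # e.g. RATE, DENY
            raise RuntimeError("KoD received: " + code)
        tx_seconds = struct.unpack("!I", data[40:44])[0]
        return tx_seconds - NTP_EPOCH_OFFSET  # server time as a Unix timestamp

    # rule (c): start at a random offset, never at the top of the hour
    time.sleep(random.uniform(0, 3600))
    backoff = 64
    while True:
        try:
            print("server time:", sntp_query("0.pool.ntp.org"))
            backoff = 64  # success: reset the retry interval
            time.sleep(3600 + random.uniform(-60, 60))  # poll roughly hourly
        except RuntimeError:
            break  # rule (e): a KoD packet means stop, cease and desist
        except OSError:
            time.sleep(backoff)
            backoff = min(backoff * 2, 86400)  # rule (d): exponential backoff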

These rules should be written into every textbook that touches on practical networking, but apparently they aren't. People don't even realize that their actions are harmful, and so we have all these problems...

--------------------------------

The NTP community is a complete tragedy of the commons. Even many government institutions cannot keep up with the abusive traffic and have stopped providing public NTP servers.

Today, if we don't count Microsoft's and Apple's NTP, almost all public NTP servers are provided by volunteers from https://www.pool.ntp.org. By using DNS, it forms an NTP cluster to distribute the load. These people provide time for the entire Internet, and they are the ones who withstand all the abuse, day after day.

People just assume they are some random super-servers that always work, and take no responsibility for their own actions, such as hardcoding IP addresses, writing abusive retry code (without exponential backoff), and making a cronjob that initiates a synchronization exactly at midnight (without randomization), effectively a DDoS.

In fact, a device that comes with hardcoded NTP addresses usually indicates that its program is poorly written and its manufacturer is irresponsible. Those devices have the worst homebrew NTP implementations on the planet:

1. They send ancient NTPv1 packets, while the latest version is NTPv4.

2. They synchronize their time at the beginning of an hour, effectively mounting a flooding attack. An even larger flood starts at 00:00 UTC.

3. Their retry interval is around 3 minutes, and if they fail to reach the server they generate even more traffic, rather than backing off exponentially.

4. They still try to talk to the default hardcoded servers, even if an alternative server list is set.

5. They don't support the Kiss-o'-Death packet, so nothing can stop them once they run wild.

Stratum-0/1 servers are the most vulnerable: they have the highest accuracy, with a reference clock. Although the acceptable usage of Stratum-1 is only passing time downstream, or scientific purposes, there are only a handful of them and they are often listed publicly, so they are frequently spotted by those manufacturers and put into their devices by default.

Stratum-0/1 servers are usually provided by universities, or by unpaid volunteers, for the public good of the Internet. If a single server gets hardcoded into those mass-manufactured devices, serious consequences can follow, and the volunteer may literally go bankrupt: your whole institute/school gets kicked off the Internet [0]; and when you come to the manufacturer asking them to pay for the damage they are responsible for, you get threatened by a lawyer from California. [1] The whole Internet community should honor the spirit of self-sacrifice of these NTP volunteers.

The NTP community pool is Stratum-2+, suitable for general use. It has similar abuse issues: once you're in and your server becomes well known on the net, there's no way out, and you keep receiving bad traffic, because some clueless people have hardcoded your IP address or put it in a cache that never expires. Fortunately, given reasonable bandwidth, this is often negligible and safe to ignore. But there are exceptions. [2] One of my NTP servers got DDoSed one day because an ISP had cached the IP address for pool.ntp.org with a huge TTL, and that IP address happened to be mine! The traffic was 40 Mbps...

In contrast, NTPd has proper rate-limiting mechanisms built in, such as KoD and sane polling intervals, so blocking a standard NTPd does NOT cause more user traffic; what increases is the abuser traffic. The damage caused by a standard NTPd plus a silly sysadmin is negligible compared to the Internet of Scary Things.

By the way, it's not only hardware devices that can contain dangerous NTP code, but software too.

As long as manufacturers keep writing broken code, unaware of the proper way to use NTP, nothing will solve this issue. Many of those involved in these misuses and abuses are totally unaware of what they are doing. The proper way to use NTP should be written into every textbook that touches on practical networking.

[0]: Flawed Routers Flood University of Wisconsin Internet Time Server http://pages.cs.wisc.edu/~plonka/netgear-sntp/

[1]: Open Letter to D-Link about their NTP vandalism https://web.archive.org/web/20060423012837/http://people.fre...

[2]: Recent NTP pool traffic increase https://mailman.nanog.org/pipermail/nanog/2016-December/0895...


On the one hand, I completely appreciate this. When I set up NTP, I'm careful to think through consequences and tread lightly. After all, I learned my sysadmin skills at a university before the September that Never Ended, so it really was a collegial place back then.

On the other hand, the internet is a much bigger place. Things are orders of magnitude more complex. The feedback loops that made NTP work well in a 1990s university environment are mostly absent. When a problem happened then, I'd see something in the logs or in packet captures, figure out what was happening, and quickly get the responsible person on the phone. That's not even hazily possible these days.

As much as I'd love to think putting a stern warning in textbooks would fix this, I doubt that would matter at all. What we really need is a major increase in observability or traceability. And failing that, what we'll get is common resources getting sliced up so they fit within domains of accountability.


I do agree. A complete solution would be designing a new generation/revision of the time protocol with accountability and anti-misuse as parts of its design, just as people are implementing ASLR and NX in C programs and starting to use memory-safe languages. Or see how TLSv1.3 removed all unsafe algorithms, so there will be little damage even in the worst-case scenario.

Also, I think NTP needs more publicity. We need people to be aware of it before we can get feedback. The community could then have a watchdog team that spots misuse and publishes alerts.


> designing a new generation/revision of time protocol with accountability and anti-misuse as parts of its design

It would be "time.google.com" and you would need a Google account.


Or perhaps RoughTime when security is needed?

https://roughtime.googlesource.com/roughtime


There doesn't need to be a new revision when you can accomplish all those goals by running your own stratum 1:

https://www.amazon.com/TimeMachines-TM1000A-maintains-broadc...


I have one of these. They are super easy to set up and highly reliable. I can't recommend them enough.


Everything you say is true, but this article is not really about "how to get NTP right". It's about how (not) to deploy embedded software.

If customers can't update their devices' software (or you can't push a remote update), then you need to get the software right in version 1. This seems to be a foreign concept to software developers nowadays, who are used to the world of endless updates and patches. It takes a different kind of development process and a lot of QA to do it right.


Couldn't you CNAME ntp.yourdomain.com to pool.ntp.org?

Baking in domains or IPs you don't own always seems like a bad idea.
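For concreteness, what I have in mind is a single record, something like this hypothetical BIND zone fragment (though I gather the pool's vendor policy asks larger deployments to request a dedicated vendor zone instead):

    ; in the zone file for yourdomain.com
    ntp   IN   CNAME   0.pool.ntp.org.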


Immediately thought of Bruce Sterling's article on anticonventional objects. I guess this company goes in the buildable, but neither desirable nor profitable, section. - https://www.wired.com/2013/10/design-fiction-anticonventiona...


Question for people who read the link-

Do you like the writing style and inclusion of gifs?


... There were gifs??

I guess I'm so used to useless ads and graphics in blog posts, I honestly, scout's honour, did not notice and have no recollection whatsoever!

A damning statement on the Internet of today perhaps, but it neither enhanced nor detracted from my reading of the article.


Same! I had to go back and look because I would have sworn there were no images at all! Now I realize my brain had classified them as ads and simply edited them out of my conscious awareness altogether.


Same here, I don’t even remember there being any images!


I found that the gifs added to the article’s entertainment value. I’d find them annoying if used in a more serious article, though.


Yeah. It was pretty humorous and made for a light read. Definitely would read more stories from this guy.


They were quite effective at conveying the author's frustration humorously; it fits the rhythm of the article well.


It's a trend that seems to have started on Buzzfeed, and I've been seeing more and more blogs adopt this style over the past couple of years.

I'm personally not a fan, as I feel it takes away from the content of the post, but I'd be curious to see how it impacts the readership metrics.


Yes.

Also, I used to work in IoT, and you are fortunate with your outcome. So many OEMs are worse.


I’m okay with it. It wasn’t excessive, like some blogs tend to become with these


If relevant I like the inclusion of media. This is ... meh.


I found the headings a bit weird to read, and apparently my brain thought the GIFs were ads, because I didn't notice them at all.


I'm considering browsing with a default style of img { display: none !important; } and disabling it when something breaks.


I enjoyed it.

Is it practical for you to only play the animations on mouse over, and pause them otherwise? I think that might allow us the entertainment value of the gifs, without the annoyance of them looping endlessly while we're trying to read.


If the author had done that, I'd say there's a 50/50 chance I never would have known they were animated. I had no reason to roll over the images with my mouse.


The writing was good, but I honestly found the gifs as irritating as ads. They were just irrelevant meme content, and were just extra crap to scroll past.

Keep writing, it was an excellent post, but only include pictures if they are relevant.


I use reader mode because my preferred mode is what I've seen referred to as "wall of text"

I rarely (~5%) find pictures of any sort improve the understanding, even in "mainstream" news sources.


I liked both. It kept things light and breezy, a comedy of errors.


I don't dislike them nor did they stop me from reading the article but they were all superfluous. That said, so are most words in blog posts anyway so it's all good.


The writing style, yes. The gifs, not really.


I hate gif inclusion.


I didn't like the gifs, because I dislike all technical posts with gifs and memes, but the writing is good.


I detested the writing style.


What RTC drifts so much in a week (or even a year) as to throw off lighting or HVAC controls timing?


Ugh. So many.

There are tons of different RTCs you can get. Sometimes you don’t really have an “RTC” as much as you have an RC oscillator and a guess at what its frequency should be. A simple crystal oscillator could be as bad as 200 ppm, which is about four minutes per fortnight (200e-6 × 14 × 24 × 60 minutes ≈ 4 minutes). RC oscillators are worse.

Decent wristwatches generally have temperature compensated crystal oscillators (TCXO) which can be calibrated at the factory, often by adjusting the counter values periodically (e.g. every N counts, add X to the counter). NTP is better than this, but only if you run it as a daemon.


An RTC chip without temperature compensation in an extremely variable temperature environment.


How far would a lighting control or HVAC control have to drift before it became impractical to use? I could imagine 45 minutes could still be “eh, who cares?”

45 minutes in a year is under 1 part in 10000.


I can imagine 5 minutes being well into "I care" territory. It could be the difference, say, between pulling into a dark driveway or having light.


Program the lights to come on 30 minutes earlier then. An overcast day and a new moon vs a day with high cirrus overhead and clear skies to the west almost surely has more variability than 5 minutes' difference.


If it drifts at 5 minutes per week you still have to reprogram after 6 weeks.


Sure, but that's ~500ppm drift, which is fairly extreme.


HT1380 drifts 10 seconds every 20 hours under normal conditions. I don't think this is outside the realm of possibility if they chose a cheaper chip or had weird temperature fluctuations.


70 GBP/hr seemed like a low rate for a consultant (what does an electrician or a plumber cost on a subscription contract?). I’m glad the author doubled it.

What happened to NetThing’s customers after they ceased trading? Who took over the lighting management of the car parks, etc.?


Author here. I'm sure you're right. I'm pretty certain I could have charged them much, much more and they'd still have accepted it.

In conversation with the software eng, it was implied that they intended to send someone on site to each of over 500 sites to reimage the devices. That would have cost them way more than £70/month, and the fact that after ~10 months the number of devices actually went up to over 1,000 suggests they were happy to just keep paying.

The thing is, it was essentially no work. All I did was remove a firewall rule. I had to run NTP anyway for my regular customers. Initially more time was spent just in email back and forth and honestly I was enjoying that.

Because of it being basically no work, I had a moral problem with trying to find the absolute highest amount of money they would bear.

I know that is wrong and it does me no good, but I couldn't get past it.

What did annoy me was their inability to pay bills on time, and the time I spent chasing invoices and creating custom late-payment paperwork that is never needed for my usual customers.

That was the main impetus for doubling the rate, and despite my jokingly suggesting that their product was not good enough to be profitable (I have no real data on that either way), I suspect they had much bigger organisational problems to be consistently paying late and ending up insolvent.


I like your attitude. If you were stuck fixing the mess the software engineer had to fix, you'd appreciate a random stranger making your life easier when they didn't need to. Maybe they'll pay it forward. Think of the money you didn't charge as your donation to making the world an ever so slightly nicer place!


Presumably the customers were SoL. That’s the problem with everything being run “in the cloud.”


I should really put some time into writing this up, but my first script that went corporate-wide was an NTP catch-up script.

Now that I think about it, I really don't know why an NTP catch-up script was needed.

Basically, VMware time was not reliable. Windows will by default not catch up unless you get to around 5 minutes off. My script checked every day to see what the drift was and corrected it if it was more than 5 seconds.

The underlying reason for concern was logging - we wanted to make sure that our log times were comparable.
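In rough outline, the check amounted to something like this hypothetical Python reconstruction (assuming the third-party ntplib package and Windows' w32tm; not the original script):

    import subprocess
    import ntplib

    # measure our clock's offset against an NTP server
    offset = ntplib.NTPClient().request("pool.ntp.org", version=4).offset
    if abs(offset) > 5:  # more than 5 seconds of drift
        # trigger an immediate correction instead of waiting for the
        # ~5-minute threshold Windows applies by default
        subprocess.run(["w32tm", "/resync"], check=True)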


> "their innovative reluctance to pay for anything on time"

Thanks, great and educational writeup.


It's pretty cheap to embed a tiny GPS receiver and crappy chip antenna these days. Probably enough for time sync in many situations.


Is it? The cheapest GPS chips are still a couple bucks a pop (not counting the extra costs of more complex integration). And in my experience, these cheap GPS and PCB-antenna solutions are basically useless indoors. NTP is probably still the right solution for these devices, being essentially free to add to a device that already needs an internet connection. Care should just be taken to implement it correctly, obviously.


> And from my experience these cheap gps and pcb antenna solutions are basically useless indoors.

This is true. I know a TV station that has a tiny satellite antenna bolted to the outside wall to run its internal NTP for all of its wired and wireless devices because GPS simply doesn't work inside the building.

I'm not sure if that's a problem because of all the electronic equipment, or the construction of the building, or the fact that the building sits almost underneath a 500-foot-tall tower with several 10 to 20KW transmitting antennae on it.


All too common. This happens for the simple reason that such devices are created by people who have never deployed anything to the field and never worked under anyone who has. That, and of course the pressure to get something to market quickly.


Man I really enjoyed your writing style.

Got a huge kick out of that.


Lesson: politely tell shitbag customers to go f’ themselves.


BWAHAHAHAHA thanks both for the nice post and for politeness you prove to have.

That's not so much a problem of "modern IoT", IMO, as a problem of a modern managerial-driven society that leads to a proliferation of Ford-model workers at ANY level, not only the lowest one.

People simply can't reason autonomously anymore, at any level, and can't really grasp "the big picture" of much of anything: just think of the periodic outcry when $FamousFreeService goes down, and the queue of polemics that follows...


Interesting subject - but I found myself pretty quickly lost in the technical terms. I do not even know what an NTP server is.


I don't really understand comments like this on HN. I've derived a lot of value through the years from thoughtful questions asked to dig further into an article but I've never understood why somebody would assert "I don't even know what X is." Search engines exist, yes?


His argument is likely that you shouldn't have to when a simple explanation could fit into the article.


But it's not an "article". It's a blog post. An informal post, written informally, for an obviously technical audience.

I very, very rarely explain what "AWS" is when I'm casually writing about cloud stuff. It's table stakes. You should know, or you aren't gonna appreciate reading it anyway.


Logged in just to upvote your comments. This is a place of intellectual curiosity, and I don’t understand those who expect knowledge to be spoon fed to them. If you don’t know a term, search engine it and work your way down the stack.

You might find yourself pleasantly surprised you’re providing an NTP server in the NTP global DNS pool.


> search engine it...

That is, go to google.com or DuckDuckGo.com and search for it...

I mean, open the browser, click on location bar (on the top) and write...

That is, if you are on Windows, click Start menu (which is now 4 rectangles),...

Nevermind... /s


I use Safari on macOS.

When I need to google a term, I highlight it, and then press ⌘C, ⌘T, ⌘V, enter. (Copy the search term, open a new tab, paste into url/search bar, search term)

I've gotten quite fast at the keyboard sequence; it takes maybe one second total. I imagine I could make this process even faster with a plugin, But I see no need.

I would like to think that most Windows machines would let you be similarly performant by default. But if not, that's further evidence in my book that Windows just sucks...

I will note that some acronyms can be annoyingly un-googlable, as the same one stands for a wide variety of different terms. This problem does not apply to ntp, however, which comes up right away.


Can't you just right-click and search from the context menu? Since you just highlighted the term, the pointer ought to be in the vicinity.


...y'know, I actually forgot that existed. IIRC, at one point, that opened a search in the current tab rather than a new one, so I got used to my little keyboard shortcut instead.

The behavior appears to have changed at some point, though, because it now opens searches in a new tab. I'll probably change my behavior now. Thanks for the reminder. ^_^


To downvoters: thanks, always appreciate that.

My point was that the OP can't guess their readers' level of knowledge, and it would be impossible for them to cater to all levels (as my attempt to explain searching... failed to show :-/) If readers don't know what NTP is, they should be able to either ignore the blog post or find the missing bits of their knowledge by themselves.


I agree with you, but if you take the time to just maybe write out an acronym once, it's much easier for those of us who aren't quite there yet technically to understand the context, instead of having to search for 4-5 terms. This gives us a chance to enjoy the write-up and maybe gives it a broader audience.


This is a technical person blogging for a technical audience. If you are not tall enough to ride this ride and aren't interested in growing, maybe find another ride.

If this were a project blog post explaining their latest features, I'd agree with you. If the point is outreach, then yes, they should make it accessible. But he's telling a story. A story that requires a relatively deep understanding of the history and practice of operating internet services. Him writing "NTP (Network Time Protocol)" will not make the story much clearer. And if he explains the whole background, then it's no longer a story, it's a general-audience essay. That's a lot of work for you to expect from somebody that you aren't paying.


Then reading what could be an interesting article becomes an exercise in being treated like an intellectual toddler.


Network Time Protocol

An obscure but critical way for servers to set accurate time, maintained by even more obscure people with limited recognition and reward.


You don't actually need to know what NTP is to enjoy this story. I knew I had heard of NTP but had forgotten about it. But it ends in P so it must be some protocol or other, right? Just create a variable in your mind called $protocol and keep reading. It could just as easily be Zuqwatny Protocol at that point for all I care.

That said, I went and looked it up after. Because that's what you do.


If you hover over the first instance of the word NTP on the post, it'll show you that it stands for Network Time Protocol.


I do agree the article could have been much improved by adding a sentence that defined what an NTP server was upfront.


As I noted in the sibling: it's not an "article". It's a blog post, written informally, for a familiar audience.

If you're stepping in on somebody's semi-public journal, it's probably incumbent upon the reader to do their spadework if they care.



