Ha, I guess I made a typo when I wrote the script 6 months ago. At least I can't remember having some clever reason back then for why a day should have 25 hours.
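For what it's worth, one way to dodge that kind of typo is to not type the constant at all and let the date library do the division. A minimal sketch in Python, not the script in question; the function name `days_until` is made up for illustration:

```python
from datetime import datetime, timedelta, timezone

def days_until(expiry: datetime, now: datetime) -> int:
    """Whole days from `now` until `expiry` (negative once expired)."""
    # Floor-dividing by timedelta(days=1) avoids a hand-typed
    # seconds-per-day constant such as 60 * 60 * 25.
    return (expiry - now) // timedelta(days=1)

now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(days_until(datetime(2024, 1, 15, 12, tzinfo=timezone.utc), now))  # 14
```

Passing `now` in as an argument instead of reading the clock inside the function also makes the arithmetic easy to test.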
In case anyone is looking for a good option, I've had great success with InfluxDB's Telegraf utility, which automatically checks both uptime and certificates for HTTPS endpoints, combined with a Grafana dashboard that sends alerts when the deadline approaches.
What shell are you running this under? Unquoted exclamation marks will do things that may be outside your intent for this script. It’ll still fail, just perhaps not in the way you expect.
Indeed, together with "deleted production database" it's a rite of passage, which is fine. The problem is when the same issues pop up over and over again (looking at you, Microsoft, re: SSL renewals) and seemingly nothing is done to prevent them in the future.
I remember a couple of years ago I was on the top of some old volcano on an island, and I was dictating shell commands to a colleague so that we can quickly fix our expired certificate. Good old days.
Maybe browsers should put up a warning when a certificate is about to expire; say two weeks away. Nobody should let their certificate get that close to expiring, but if it does, you'd rather it generate a lot of visible warnings before simply ceasing to work at all.
Browsers are very careful in introducing warnings to users that are meant to go to admins. They dislike the indirect "but users will complain to admins" route and I can't blame them - every warning a user will see and not understand leads to warning fatigue.
Crowd sourcing errors by relying on user generated reports is sometimes a good idea, however in this instance it's not. Cert expiry is something that can be automated and monitored because the expiry is deterministic. Crowd sourcing is for the non-deterministic.
I don't agree. A cert that is just about to expire is as valid as one that's new. It only has two states: expired or not. For the user, it is either expired or not; they needn't be concerned until it actually is expired.
Or how about the responsible engineers/testers/managers/whatevers set up proper testing infrastructure for the multi-million projects run by their huge international corporation? Feels like it's not a lot to ask for the most basic testing: checking when certs expire, setting up automation to renew them, and having fail-safes/error reporting in case the renewal fails.
Indeed, that's true! Not only do you have to write the tests, you also have to verify they're working, and have monitoring connected to ensure they keep working.
But still, multi-million companies should surely be able to handle that.
If you change something, don't you manually check that it's actually doing what you want it to?
If I were an engineer changing a test regarding SSL cert expiry time, I'd test it after the change with a cert that has expired, one that's about to expire, and one that expires far in the future. Manually or automated doesn't really matter, but test your changes after you've made them. Really basic stuff.
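Those three cases could look roughly like this in Python. This is only a sketch: `cert_check_passes` and the 14-day threshold are hypothetical stand-ins for whatever check is actually being changed, and the expiry dates are synthetic rather than read from real certificates.

```python
from datetime import datetime, timedelta, timezone

def cert_check_passes(not_after: datetime, now: datetime, min_days: int) -> bool:
    """True if the cert stays valid for at least `min_days` from `now`."""
    return not_after - now >= timedelta(days=min_days)

def test_cert_check() -> None:
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # Already expired: must fail.
    assert not cert_check_passes(now - timedelta(days=1), now, 14)
    # About to expire (3 days left, threshold 14): must fail.
    assert not cert_check_passes(now + timedelta(days=3), now, 14)
    # Far in the future: must pass.
    assert cert_check_passes(now + timedelta(days=300), now, 14)

test_cert_check()
```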
My point was, there is no way to guarantee bugfreeness. You can reduce bugs by automating and adding another test layer, but each new layer can have bugs too. Even if you prove the code to be correct, you have only proved it against your assumptions about the requirements, which might be wrong (both your assumptions and the requirements).
Because the comment said something like: "There was a bug, so they didn't have a test. Why don't they?" And I replied: tests can have bugs too.
Perhaps a plugin that admins can add to their browsers to act as a reminder for themselves? Having core browsers warn the general public about this all the time will lead to chaos.
That warns only after the fact, not in advance... depending on how the infrastructure (and especially the cert procuring process, for those not using LE for whatever reason) is set up, that may be too late.
Nagios checks have thresholds for WARNING and CRITICAL. With this kind of check it's usually 'days until expiry', so it certainly can be used to warn in advance.
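The threshold logic is simple enough to sketch in a few lines of Python. The exit codes 0/1/2 are the standard Nagios plugin convention; the function name and the 30-day and 7-day defaults are just assumptions for illustration, not what any particular plugin ships with.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

def nagios_status(days_left: float, warn_days: float = 30, crit_days: float = 7) -> int:
    """Map 'days until expiry' to a Nagios status code.

    The thresholds are illustrative defaults; pick whatever matches
    how long your renewal process actually takes.
    """
    if days_left <= crit_days:
        return CRITICAL
    if days_left <= warn_days:
        return WARNING
    return OK

print(nagios_status(90), nagios_status(20), nagios_status(3))  # 0 1 2
```

An already expired cert has a negative `days_left` and so also lands in CRITICAL.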
Once identified, I would expect their "Incident Report" in some way would mention the root cause of the incident. However, that does not seem to be the case here: https://www.githubstatus.com/incidents/4mzhxxpwgvqg
Maybe someone could get fired for this or does it have more to do with Microsoft's stock/public image?
certdays.sh somedomain.com 14
If the certificate for somedomain.com is valid for less than 14 more days, it will fail:
https://github.com/no-gravity/certdays
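For anyone who would rather not shell out, the same idea can be sketched with Python's standard `ssl` module. To be clear, this is not the linked script, just a rough equivalent; the function names are made up, and port 443 and the 10-second timeout are assumptions.

```python
import socket
import ssl
import sys
import time

def days_left(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a cert's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400

def cert_days_left(host: str, port: int = 443, timeout: float = 10.0) -> float:
    """Connect to host:port over TLS and return days until the cert expires."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            not_after = tls.getpeercert()["notAfter"]
    return days_left(not_after, time.time())

# Usage: python certdays.py somedomain.com 14
if __name__ == "__main__" and len(sys.argv) == 3:
    host, min_days = sys.argv[1], int(sys.argv[2])
    remaining = cert_days_left(host)
    print(f"{host}: {remaining:.1f} days left")
    sys.exit(0 if remaining >= min_days else 1)
```

Note that with the default context an already expired certificate fails the TLS handshake, so the script dies with an exception (and a non-zero exit code) instead of printing a negative number.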