Hacker News | ksmith14's comments

I’ve been using OpsGenie’s free tier for a number of years as part of a home automation/monitoring project. Guess it’s time to shop for alternatives.



depending on what you're using it for, Pushover [0] might fit the bill. it's what I use for my home monitoring setup - low-priority alerts just go to an email folder, but a high-priority alert (such as a water leak sensor firing) will get pushed to my phone.

it's a "no frills, in a good way" type of product. dirt-simple API with straightforward pricing ($5 lifetime subscription) and limits so generous I've never worried about hitting them (10k messages/month).

0: https://pushover.net/
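For anyone curious what "dirt-simple" looks like in practice, here is a minimal sketch of pushing a high-priority alert. It assumes the Pushover message endpoint accepts a form-encoded POST with `token`, `user`, `message` and `priority` fields, which is my understanding of their API; the token and user key values and the helper names `build_alert` and `send_alert` are placeholders I made up.

```python
import urllib.parse
import urllib.request

# Message endpoint as I understand the Pushover API; check their docs before relying on it.
PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_alert(token, user, message, high_priority=False):
    """Return the form fields for one Pushover message (hypothetical helper)."""
    return {
        "token": token,      # application API token (placeholder value below)
        "user": user,        # user or group key (placeholder value below)
        "message": message,
        # 0 is normal priority, 1 is high priority.
        "priority": 1 if high_priority else 0,
    }


def send_alert(fields):
    """POST the fields to Pushover and return the HTTP status code."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    request = urllib.request.Request(PUSHOVER_URL, data=data)  # data implies POST
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    alert = build_alert("APP_TOKEN", "USER_KEY", "Water leak sensor fired", high_priority=True)
    print(alert)  # call send_alert(alert) once real credentials are filled in
```

Keeping the payload builder separate from the network call makes it easy to route low-priority alerts somewhere else (an email folder, say) without touching the sending code.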


I was actually using Pushover before I switched to OpsGenie. I typically have my phone set to do not disturb overnight and Pushover didn't have a "critical alerts" option to punch through - OpsGenie did. I may revisit it since I can dredge up the integration code from my Git repo.


Take a look at ilert; they also have a free tier: https://www.ilert.com/pricing


Google staffs SRE teams as either 8 in one location/TZ or two geographically distributed teams of 6 -- often some pairwise combination of U.S., Europe, and Australia to accommodate reasonable on-call shifts.

The on-call compensation varies depending on what tier of service they're offering. Tier 1 (5 minute response time) pays 2/3 of your effective hourly pay for on-call time outside of local business hours; tier 2 (30 minute response time) pays 1/3. Alternatively, you can take time off in lieu.
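To make the arithmetic concrete, here is a rough sketch of the calculation. The hourly rate and hours are made-up numbers, and `on_call_pay` is a hypothetical helper, not anything official.

```python
from fractions import Fraction

# Multipliers from the tiers described above: tier 1 pays 2/3, tier 2 pays 1/3.
TIER_MULTIPLIER = {1: Fraction(2, 3), 2: Fraction(1, 3)}


def on_call_pay(hourly_rate, off_hours, tier):
    """Pay for on-call hours outside local business hours (illustrative only)."""
    return TIER_MULTIPLIER[tier] * hourly_rate * off_hours


if __name__ == "__main__":
    # Assumed example: $60/hour effective pay, 12 off-hours on call.
    print(on_call_pay(60, 12, tier=1))
    print(on_call_pay(60, 12, tier=2))
```

So at the same rate and hours, a tier 1 shift pays exactly twice what a tier 2 shift does.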


Note that this is a minimum; I know some teams with 10-12 folks per location. That has downsides too, since you can end up on call only once a quarter, which most people in the role don't like because the extra vacation is nice.


I have a house with 3 levels and when I moved in, 2 of those levels had analog thermostats. I'd be watching TV in the basement with the heat set to 68, go upstairs to bed, and realize I'd left the heat up, so I'd trudge back downstairs, change the temperature, and then go back to bed. I ended up replacing everything with Nest thermostats within a few months so I could adjust them from my phone if necessary. Then Alexa came along and I could just lie in bed and tell it to do things, which was even more convenient.

I also have kids that are terrible at turning things off, so I've stuck one smart bulb in each of their rooms and quietly conditioned them to prefer that light to some of the switch-activated lighting so I can just centrally switch them all off once they're gone or when leaving the house without trekking around.


Having lived and worked in Poughkeepsie at IBM for several years, I found it strange to see it on HN as well. I've since moved to Massachusetts, but something my wife and I really miss about the Hudson Valley is all of the historic sites along the river (Locust Grove, FDR's house, Vanderbilt mansion, Mills mansion, Olana, etc.) that were often free and were great places to just go sit and visit, as you say.


When car shopping a few months ago I learned that Kia wasn’t offering its UvoLink service in Massachusetts and I put it together pretty quickly. I was surprised, though, to discover that its sibling brand, Hyundai, _does_ offer its telematics service, BlueLink, in MA.


It's good you learned that lesson about having someone else announce the server, but it isn't always that simple. With things like Apache Aurora, jobs would often be announced as soon as the process started. Something like Finagle tends to announce as soon as the component is initialized, instead of waiting until the server is fully initialized and ready to handle requests.

Something that I implemented at my last job and others rediscovered as part of my current job is to implement an "administratively up/down" API as part of the control plane and only have the server announced if it was "up." Decoupling the announcement from process start/initialization complete allowed us to roll out new versions of software in a disabled fashion and then "flip the switch" (red/black deployments). It also enabled us to take individual instances out of service without killing them, enabling developers to debug issues/anomalies more easily.
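A minimal sketch of that idea, not the actual implementation from either job: the instance is only announced when it is both initialized and administratively "up." The `Instance` class, its method names, and the plain `set` standing in for service discovery are all assumptions for illustration.

```python
class Instance:
    """One server instance whose announcement is decoupled from startup."""

    def __init__(self, name, registry):
        self.name = name
        self.registry = registry   # stand-in for a real service-discovery client
        self.initialized = False   # set once the server can handle requests
        self.admin_up = False      # flipped via the control plane

    def _sync(self):
        # Announce only when both conditions hold; otherwise withdraw.
        if self.initialized and self.admin_up:
            self.registry.add(self.name)
        else:
            self.registry.discard(self.name)

    def mark_initialized(self):
        self.initialized = True
        self._sync()

    def set_admin_state(self, up):
        self.admin_up = up
        self._sync()


if __name__ == "__main__":
    registry = set()
    new_version = Instance("web-v2-1", registry)
    new_version.mark_initialized()      # deployed and ready, but still dark
    print(sorted(registry))
    new_version.set_admin_state(True)   # "flip the switch"
    print(sorted(registry))
    new_version.set_admin_state(False)  # out of service, process still alive for debugging
    print(sorted(registry))
```

The process never has to restart to move in or out of rotation, which is what makes both the dark rollout and the debug-in-place cases work.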

Load shedding/backpressure/rate limiting at various layers is also extremely helpful, whether at the load balancer/API gateway or at individual servers. That has saved our bacon numerous times.
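As one example of what per-server rate limiting can look like, here is a small token-bucket sketch. This is a generic technique I'm using for illustration, not necessarily what we ran; the `TokenBucket` class and its parameters are assumptions.

```python
import time


class TokenBucket:
    """Admit at most `rate` requests/second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so tests can control time
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True to serve the request, False to shed it."""
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)
    decisions = [bucket.allow() for _ in range(15)]
    print(decisions.count(True), "admitted,", decisions.count(False), "shed")
```

A shed request would typically get a fast error (an HTTP 429 or 503, for example) so callers back off instead of piling onto an overloaded server.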


This brief excerpt really does not do the Manager Tools feedback model justice. All of the background has been stripped from it in order to fit in a pithy blog post, and without that context it does sound jarring or shallow.

Over the past 3.5 years I've used the Manager Tools "management trinity" (including their feedback model) with my directs and found it works extremely well. If you've got a professional relationship with a direct and have briefed them on changes before enacting them, it doesn't come as a surprise and it's not seen/received as a "drive by."

I encourage folks to listen to a few of the episodes on feedback before judging it: https://www.manager-tools.com/2005/07/giving-effective-feedb...

In addition to Manager Tools they have a second podcast, Career Tools, which is geared towards any working professional. I recommend that one as well.


I worked for VMware from 2007-2014 on things other than the core virtualization products.

I think you’re overestimating the situation. There is a group that did have a fair number of CS PhDs to develop and maintain the core piece, the virtual machine monitor (VMM), but then there’s a large number of plain old everyday SWEs developing the bulk of vSphere and the desktop virtualization products.

I would expect VMW’s policy to apply to just about everyone. A few old timers that have reached the principal engineer or fellow levels might be able to avoid it but that’s a few dozen out of 20,000+


From the blog post announcing the FitbitOS 4.1 SDK it sounds like custom always-on clocks are in the works:

https://dev.fitbit.com/blog/2019-12-19-announcing-fitbit-os-...

(Disclosure: I work at Fitbit but not on any of the SDK or device stuff. This is my own personal opinion and should not be considered the official position of my employer.)


When will the SpO2 feature ever materialize on existing Fitbits that have the hardware for it?


The Google SREs mentioned this in their book; the Chubby locking service had uptime that was so high that folks started to neglect making their own services resilient to Chubby failures: https://landing.google.com/sre/book/chapters/service-level-o...


+1 for this book. As a junior DevOps engineer this book has been super helpful.


the book is structured in a way that makes it pretty easy to jump around and pick and choose which parts you want to read or skip, so it's not a very large commitment to read it


Mine just came in the mail today. Pretty stoked.


Still that's bad design on the clients' part. E.g. - Just because malloc "never" fails doesn't mean it can't fail :) so better error check for it.


Doesn't matter. Engineering around human failure is part of the profession.


That's a beautiful way to put it. I'd read that book.


Well, I'm a Google SRE so...


Failure of malloc() might be a bad example to pick because on Linux most distros overcommit by default, so malloc generally won't fail. Instead, malloc will succeed in allocating the address space just fine, but the RAM only gets allocated on first use. That means that even though malloc gave you a supposedly valid pointer rather than NULL, actually using that pointer can get your program killed if the system turns out not to have the memory.


Other distros may configure this differently and return NULL. Skipping the check is not portable, and it's bad practice anyway.


Is there a way to fix this/switch it off? I never got the rationale for this behaviour.


There's a sysctl: vm.overcommit_memory=2

What most people don't realize is that you will get more OOMs if you disable overcommit.
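If you want to see what a given box is set to, here is a small sketch that reads the setting from procfs and decodes it. The three mode values are the kernel's documented ones; the helper names `describe_overcommit` and `read_overcommit` are mine, and the file only exists on Linux.

```python
# Meanings of vm.overcommit_memory as documented for the Linux kernel.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Decode the text read from the sysctl into a human-readable description."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, "unknown mode")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read and decode the current setting (Linux only)."""
    with open(path) as f:
        return describe_overcommit(f.read())


if __name__ == "__main__":
    try:
        print(read_overcommit())
    except OSError:
        print("no overcommit sysctl here (not Linux?)")
```

On the "more OOMs" point: in mode 2, allocations start failing once the commit limit is reached, even when much of the committed memory would never have been touched.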

