
Then they get fired instead, well done ha.



Comments like this make me wonder if people really expect engineers to be fired because of an outage? I do not work at Google, but none of my workplaces would fire engineers because of a failure. Mistakes happen. As long as they are not repeated, everything is good.

If your company fires people in situations like this, run away and never look back.


Googler here, not speaking on the behalf of the company, my opinions are my own

People do absolutely NOT get fired over incidents. Making mistakes is human. An incident will prompt a review of the systems and safeguards in place to prevent such an incident, much like an airline incident investigation -

basically "somebody fat-fingered it" is never the answer, postmortems are always blameless

EDIT: now that I think of it, the opposite thing happens after a major incident - a systemic failure should be identified, people are being hired to fix it :)


> Googler here, not speaking on the behalf of the company, my opinions are my own

Why are employees at big tech names (FAANG et al.) so often so cautious as to include this as a foreword everywhere? Twitter bios are full of that, for instance.

It is crazy to me; who would expect anything other than your opinions being your own and nothing more? Who would expect that your word (with all due respect) is worth anything with regard to the company's PR?

Is there an actual risk in the US? Have there been trials or anything that push people to add such statements?


Can't speak for other companies, but this is covered in basic training at Google - if you're not authorized to speak on behalf of the company, you must make it clear when your writings may be mistaken or construed to represent the company.

Basically the company has specially trained people that speak on behalf of the company, and that message should not be confounded by personal opinions of other employees. For example, on the recent FB outage, there was an employee posting inside information on reddit - media companies just took it at face value and ran around with it, reporting it as if it were what FB itself was saying about the outage.

I'm not aware of any actual risks in the US, but then again I'm not in the US. For me this seems a minor point, and I actually enjoy separating my public persona from the company for which I work, be it Google or a small startup.


> On the recent FB outage, there was an employee posting inside information on reddit - media companies just took it at face value and ran around with it, reporting it as if it were what FB itself was saying about the outage.

To be fair, I think the media would have done that even with a "speaking only for myself" disclaimer.


Every company says that; the obvious solution is to never name it when you speak. Why do these people need to say "I'm a Googler" and then immediately "but forget it, I speak on my own"... Obviously there's value in the fact they're at Google, and it will color their discourse, which is probably already forbidden.

Don't name your company if you intend to speak for yourself.


It's the same at all large companies - it's CYA boilerplate.

Though I almost got to be the official spokesperson for British Telecom responding on the alt.2600 newsgroup about the Met police VMB hack - the press office was cool but internal security was not.


Would you say: "I live in [city], opinions are my own", or "I am married to [person], opinions are my own"?

If not, why do it for the company that pays you for your skills and time?


I work at a large tech company, and they do mention in the onboarding materials that we represent the company, so we should be careful in our social media profiles. My solution to this is to not associate my social media profiles with my employer. This is technically not really what we’re supposed to do, and I might have to change that approach at some point if I move high enough in the org to start getting attention from people, but this works better for me than disclaimers on all my posts.


if they wanted to control this so bad they'd provision you a managed account like how email addresses are managed.


Yet all these googlers breached it immediately in the first sentence by naming their company...


Sarcastically, "I'm a googler, opinions my own" reads a lot like "I'm a googler, just so you know".

I didn't want to emphasize that in the first comment, but to be honest I find it pedantic, because it's pointless, legally speaking.


> Why are employees at big tech names (FAANG et al.) so often so cautious as to include this as a foreword everywhere?

This isn’t just ‘big tech’ - I work at a relatively small tech company, but I’d never want anything I say about the company to be mistaken as some sort of ‘official statement’ especially if it related to an incident that possibly had a financial impact on external parties, and could conceivably be misused in that context in the future.

I go as far as never writing private emails from my work mail for the same sort of reasons - although that is possibly from an overabundance of caution.


It's in the spirit of full disclosure, which some, including me, appreciate.


Part of it is because the company asks us to. Part of it is because I think it's reasonable to tell people your biases, and it can avoid the situation where substantive conversation gets derailed by "gotchas". If I make a comment about how I think Google Meet has the best noise cancellation of any video chat software, even though I don't work on Meet or anything adjacent to it, it's still a bad look if someone can dig through my comment history and pull out a previous comment about how I work for Google.


> It is crazy to me; who would expect anything other than your opinions being your own and nothing more?

Rumor mill journalists will mine social media and forum comments and write entire articles about "so and so FAANG employee gives hint at future merger" when some dev comments how much they've enjoyed using some library recently.


They want to mention their employer to gain authority in the discussion, but since mentioning their employer is a legal/PR risk, they need to follow it up with a disclaimer (this only partially mitigates the risk, but it’s worth it to get the brag in).


It's because they only hire idiots at Google. I'm from a big company; I just don't name it and assume humans understand my opinions are my own :D


Especially when it's not an opinion.


In fairness, 'opinion' is a horrendously vague and ill-defined word. It does double duty as (i) 'normative value' and (ii) 'personal understanding of the descriptive facts', which two senses are constantly confused - for example right here.

That's why we constantly get "it's just my opinion" used in reference to type-ii opinions (personal understanding of descriptive fact), when it's only really appropriate to type-i opinions (normative value).

Many conversations would be far clearer if it were abandoned in favour of more precise language, IMPUOTDF.


"Many conversations would be far clearer if it were abandoned in favour of more precise language, IMPUOTDF"... um, what's IMPUOTDF? I did try to google it, but only this post was found.


Sorry, I was joking: 'in my personal understanding of the descriptive facts', referring back to that second definition of 'opinion' earlier in the comment.


Yeah, this 'blameless' ethos has definitely trickled down from FAANG to decently-sized decently-reputed places I've worked at - and certainly to #EngTwitter.

I think it's a bit over-applied in some cases. Does it not commit you to the theorem that every process can be made so perfect as to be completely invulnerable to one human being making a mistake? (At least, in the form exemplified by the common tweets to the effect that "your processes are to blame for $incident, not your interns/engineers/etc".)

Even if you required two-person auth for every single thing, two people will make a mistake now and then, and in reality - due to our being social animals - the two probabilities are not truly independent.

I just don't see how this is feasible in reality. A more realistic principle feels like: "people will infrequently make mistakes, and that's of course natural and human and forgiveable, but far fewer incidents should be vulnerable to human error than currently are".
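To put rough numbers on the two-person point above, here is a minimal sketch. It assumes each person has the same chance `p` of making (or missing) the mistake, and that the two errors have a non-negative correlation `rho`. Both names and the sample figures are illustrative assumptions, not measurements from any real review process.

```python
def joint_miss_probability(p: float, rho: float) -> float:
    """Probability that BOTH people get it wrong.

    p   -- each person's individual error probability (assumed equal)
    rho -- correlation between the two error events, 0.0 (independent)
           to 1.0 (they always err together)

    For two Bernoulli events with the same marginal p and correlation rho:
        P(both) = p*p + rho * p * (1 - p)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be between 0 and 1 in this sketch")
    return p * p + rho * p * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical 1% individual error rate at a few correlation levels.
    for rho in (0.0, 0.1, 0.5, 1.0):
        print(f"rho={rho:.1f}  P(both wrong)={joint_miss_probability(0.01, rho):.5f}")
```

With a 1% individual error rate, full independence gives 0.01% for both people being wrong, while a correlation of 0.5 gives about 0.5%, which is most of the way back to a single person's error rate.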


I of course agree that mistakes are inevitable. That being said, the point of blameless culture is not to make a process invulnerable to mistakes. Instead during a post-mortem, we look at how to prevent that particular incident from happening again.


You're totally right and the SRE book by Google goes over this - the company's culture does not allow firing people for outages. If you're somewhere this still happens, run away (or you'd better be getting paid more than top-end ICs at Google).


Why would you fire an engineer you have just spent millions to train?


What tech company spends millions on training anyone?


It's reframing lost revenue, not talking about literal training cost.


The other comments already explained it, but I'm wondering how you haven't come across this 'saying' before. It's so overused and also cheesy in my opinion.


People are born every day. Every day tens of thousands of people will hear about hacker news, the pyramids, Darth Vader being someone's father, for the first time.


It wasn't meant with ill will. I was just wondering. Regarding what you said, I think I understand your point. On the other hand, it makes some difference which people you ask - some things are much likelier to be common knowledge (or at least heard of) in some places than in others. Whatever.


Lol maybe if people actually did spend millions training their people up front we could do better?


139,995 employees at Google * $1,000,000 = $139,995,000,000

$140 billion. On training.

On the one hand... you know what, I'd love to work in an environment like that. Seriously.

On the other hand... what's the argument you make to the CFO in support of this? Honest question, interested to hear answers.


I work part time in the Army. In the Army when you go from their equivalent of junior to mid-level they take you out of your job for eight months of dedicated personal development, before you start your first mid-level job. When you go to their equivalent of senior they take you out for a year.

I don't know how much that costs all-in, including the salaries, instructors, facilities, but might be starting to approach a million.

That's valuing training!


But the army is a cost center! The workers have some money shaved off their salary to pay for an army that allows delinquents and half-disabled to pew-pew guns in the forest, leaving them in peace. It's not comparable to a productive enterprise that needs to build things for people or perish.

For instance, if Google fails and can't profit, it can't just shoot at its clients until they pay. Your organisation can.


> It's not comparable to a productive enterprise that needs to build things for people or perish.

Well we had to build an Army to win against fascism in the Second World War or we all would have perished.

And 'perished' means literally dead or subject to fascism, not just going out of business.


WW2 was 70+ years ago. The USSR fell 30+ years ago. Military budgets are still incredibly high. The army as it is today does not need to be that efficient. And even during WW2 the US did not face a credible threat of invasion. The last time the US faced an invasion on the mainland was during the War of 1812.


> The army as it is today does not need to be that efficient.

You want the Army to be... less efficient? Spend more for less capability?


No I mean that for the US army today the downside to the army being inefficient is that money is wasted. Not great, but not a disaster. For a different country that could mean the country gets invaded or the government collapses (like Afghanistan).


Middle and upper management get there via connections and picking things up on their own. It's unsurprising that they don't want to be subjected to competition with lesser people who can "merely" be trained to do their job as well, or better.


Some jobs can be learned on the job.

But I'm glad that learning to kill people (military) is not taught that way.


If your job is something like a staff officer in a Brigade, you could learn that on the job, the Google way, because exercising is also 'the job', but they don't get you to do that - instead they take the time to fully pull you out of all work commitments for dedicated personal development. These periods of personal development are about personal skills rather than combat training, which you've already done by this point.


Someone who just cost their company millions in revenue is gonna be _extremely_ careful not to make the same mistake again. Hence, million-dollar training.


The GP is a reference to the anecdote about IBM’s Thomas Watson not firing an executive who had made an error costing the company a substantial amount of money.


The implication is that millions were spent training the person who made the mistake when they cost the company millions by making that mistake.


An outage can be pretty expensive, but it is training for those who triggered it and/or those who fix it.


I'd guess because Europe-wide outages are costing more than millions


But firing someone doesn't undo an incident. It just introduces other weird incentives. People become afraid to change things for fear of breaking something, or when something does break they try to hide it rather than feeling like they can immediately ask for and receive help to fix it.

The only time someone should be fired for causing an outage is if they're negligent or sloppy or mess things up all the time. This is rare. Almost always, outages in large systems are the combination of many factors (latent bugs, design flaws, abnormal load, etc.), any one or two of which wouldn't take the site down. But when they combine in a perfect storm that nobody foresaw, things fall over.


But now you have someone in your team who will never, ever make that same mistake again and should be your new go-to guy for all X related changes (X being DNS or what-have-you). Firing someone with that type of experience does not lead to success.

100% of all devs make huge mistakes, at least once.


> But now you have someone in your team who will never, ever make that same mistake again and that should be your new go-to guy for all DNS related changes.

I'm not entirely sure that's always true. For example, I've seen people introduce N+1 issues into a codebase, spend evenings fixing them and refactoring code to fix production issues... just to later introduce those very same types of issues.

Sure, you can learn from mistakes, have post-mortems and so on (provided that your org even does those and that anyone listens and cares about the conclusions from those), but to me it feels like the most foolproof way is to ensure that no-one can make these mistakes again, be it with a checklist (which tend to be ignored, honestly), or better yet, an automated CI step or a new test suite.

In my eyes, it's basically the same as with unit tests - everyone agrees that you need them, but people rarely write enough of them. So if you introduce something to prevent them from not doing what they should, e.g. a quality gate within a CI step which will disallow a merge once the coverage falls below a set margin, suddenly things are a lot better in the long run.
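The quality gate described above can be sketched in a few lines. This is a minimal illustration, not any particular CI product's API: the function names, the line counts and the 80% threshold are all hypothetical, and a real pipeline would take the counts from its coverage tool's report.

```python
def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Line coverage as a percentage; an empty codebase counts as fully covered."""
    if total_lines <= 0:
        return 100.0
    return 100.0 * covered_lines / total_lines


def gate_exit_code(covered_lines: int, total_lines: int, minimum_percent: float) -> int:
    """Return 0 when the merge may proceed, 1 when the gate should block it.

    A CI step would typically pass this value to sys.exit(), since most CI
    systems treat a non-zero exit status as a failed step.
    """
    actual = coverage_percent(covered_lines, total_lines)
    return 0 if actual >= minimum_percent else 1


if __name__ == "__main__":
    # Hypothetical numbers: 790 of 1000 lines covered against an 80% floor.
    code = gate_exit_code(covered_lines=790, total_lines=1000, minimum_percent=80.0)
    print("merge allowed" if code == 0 else "merge blocked: coverage below threshold")
```

Many coverage tools ship an equivalent check built in (coverage.py, for example, has a `--fail-under` option on `coverage report`), so a hand-rolled script like this is mostly useful for showing the idea.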


N+1 issues aren't nearly as devastating as N^2. Commend them for not putting your systems to a complete halt, then teach them how to reason about this properly.

>> a quality gate

Yes, this, also.


> N+1 issues aren't nearly as devastating as N^2.

Depends on the project, I guess: if you're unlucky enough to be working on a monolith and suddenly a page takes 5,000 SQL queries to load as opposed to 100, because someone thought that initializing data through service/DB calls in a loop is "easier" than writing views in the DB, it might still kill the entire system anyway, depending on the number of users.

And once this data initialization is sufficiently complicated and convoluted for you not to be able to rewrite it and them not wanting to rewrite it, all while "the business" is breathing down your neck, you might either want to introduce caching (and possibly run into cache invalidation problems down the road), or just freshen up your CV.

I guess I'd also like to expand on the previous suggestion and advise others to consider performance/load testing as well, especially when coupled with APM solutions like Skywalking or even Matomo analytics, both of which can allow you to aggregate the historical page load times, CPM and overall performance of your applications, to figure out what went wrong when.
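For readers who haven't run into the N+1 pattern discussed above, here is a small self-contained sketch using Python's built-in `sqlite3` module. The `authors`/`posts` schema, the row counts and the `CountingDb` wrapper are made up for illustration, and a plain JOIN stands in for the DB view mentioned in the comment.

```python
import sqlite3


class CountingDb:
    """Thin wrapper (hypothetical helper) that counts how many queries are issued."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.queries = 0

    def query(self, sql: str, params: tuple = ()) -> list:
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def build_db(rows: int = 100) -> sqlite3.Connection:
    """Create an in-memory database with one post per author."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
    conn.executemany("INSERT INTO authors VALUES (?, ?)",
                     [(i, f"author{i}") for i in range(1, rows + 1)])
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                     [(i, i, f"post{i}") for i in range(1, rows + 1)])
    return conn


def load_n_plus_one(db: CountingDb) -> list:
    """One query for the list, then one more query per row: N+1 in total."""
    result = []
    for _post_id, author_id, title in db.query(
            "SELECT id, author_id, title FROM posts ORDER BY id"):
        rows = db.query("SELECT name FROM authors WHERE id = ?", (author_id,))
        result.append((title, rows[0][0]))
    return result


def load_joined(db: CountingDb) -> list:
    """Same data fetched in a single round trip with a JOIN."""
    return db.query(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id")


if __name__ == "__main__":
    slow, fast = CountingDb(build_db()), CountingDb(build_db())
    assert load_n_plus_one(slow) == load_joined(fast)
    print(f"loop: {slow.queries} queries, join: {fast.queries} query")
```

Both functions return the same rows; the only difference is the number of round trips, which is what hurts once each query crosses a network to a shared database.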


Still, that engineer (if it's the engineer's fault) is extremely unlikely to make that mistake again. IMHO, the problem is systemic: why does the system allow such errors (if it's human error) to happen? Given Google's scale, I think a lot of the generally known scenarios are covered, and what you see is tens of services interacting in non-obvious ways. Those non-obvious scenarios manifest in situations like this.


It makes no sense at all. After the outage you have not only a review of the causes and appropriate remedies, but also more experienced people who are now more aware of possible consequences of seemingly unrelated actions and will take extra care not to make these things happen in the future.

Also, such cases are rarely the "fault" of a single person. Or, the direct/immediate cause is often not the main one.



