As an engineer, I've never personally understood the desire to release anyway, even when a system has known critical deficiencies like this one. Sure, perfect is the enemy of good, but I'd also argue that inoperable is the enemy of good: if your team chooses to release an unusable product, it only serves to erode trust in that product.
Some years ago, I worked on a system not quite as bad as what the author described, but close. We released a new product with a known, quite bad security vulnerability (I'd made sure our product team was _extremely_ aware of this), as well as no monitoring to speak of. The deadline had been communicated for around one year, but nobody had ever really discussed the significance of the date, other than it was what we were all death-marching to, and we needed to deliver.
What did that date turn out to be? The head of product management's birthday, which was revealed to the rest of the company on the highly celebrated launch date. People were just kissing ass. I left several months later.
It feels unconscionable to me that a company could have launched an incomplete, insecure, customer-facing product just to give a birthday gift to a leader, but I suspect this sort of thing is common.
There's a whole management philosophy around deadlines: that setting a hard deadline will make the project predictable. For non-engineering projects this is often true. For engineering projects, not so much.
Non-technical managers have a hard time with software engineering projects because they're opaque and unpredictable. Opaque in the sense that no matter how many progress reports they get, the manager will never understand what progress is actually being made (because they can't read the code). And unpredictable because development is inherently unpredictable.
By setting a hard deadline managers seek to control the project. Engineers will come to them with "we can't release this by the deadline because $technical_gibberish". But the temptation is always to just stick to the deadline and refuse to extend it, because once you extend it once then everyone knows it isn't actually a hard deadline and the deadline will get extended again and the thing will never get released.
My worst case was one project where the management team decided that we could skip the testing phase and just test on prod after launch, because that would allow us to meet the deadline. Obviously the application completely failed because there were thousands of bugs in it that hadn't been caught. The post-mortem was a total blamefest. The project got canned and people got fired. All that effort for nothing, just because they wouldn't extend by a few weeks. But we met the deadline!
> But the temptation is always to just stick to the deadline and refuse to extend it, because once you extend it once then everyone knows it isn't actually a hard deadline and the deadline will get extended again and the thing will never get released.
Personally, I've seen so many "super duper important must-meet" deadlines which in the end turned out to be completely unimportant, that I've decided long ago that I'm not gonna do any more overtime to make any deadlines.
It's quite a sobering experience as a developer to stress over making a heroic effort to meet a deadline only to see after the deadline passed that actually it wasn't really so make or break after all, and that there was plenty of time to fix the issues afterwards. It's really not worth it to abuse your body and your mind by working overtime and spending too much time in a bad posture behind a computer just to make someone else happy.
Of course, I'll still do my best (I always do), and I might even agree to moderate amounts of overtime where it's warranted (as in: not because it is necessary because of shitty planning), but I'm not putting any emotional energy into deadlines, ever again.
Last time I did, we got the product shipped by the deadline. Only to find the marketing team had messed up their Google account and couldn't do any of the planned launch advertising or tracking. Took them two weeks to sort that out. No hint of an apology or any of the fireworks that would have happened if we'd been two weeks late delivering the product. So yeah, I'm with you. Never again.
I work better when I have a deadline, but I also expect my manager to allow for slippage if something is wrong with the product.
Currently we have an issue where we are telling the customer that we need a 4 week pilot to make sure that when real data comes in it behaves as we expect, and so we can get back to providers on how to fix their data. If you go live with only synthetic data ever going through your product then you are asking for trouble.
They are trying to negotiate that down to 2 weeks. Why? So their KPIs are met? Maybe it's because we are external, and if it fails they can blame us for the problem rather than their own nickel-and-diming over a couple of weeks.
I want to consult in a different sector after this :/
Whole industries get wrecked by this dynamic. I heard a story once about a well known bank. They tried to develop a new customer facing e-banking system (or was it a trading system, customer facing anyway). They started by outsourcing it to Thoughtworks, a consultancy, and those guys pushed for an agile methodology. They wanted regular feedback, iterations, etc and a contract that was essentially "keep paying us until you're happy and reach product acceptance". Non-technical management thought this sounded insane. They wanted something more like: you tell us how much it will cost and the exact day it will be complete, we pay you, it's done by that day and any delays trigger compensation. So they fired Thoughtworks and outsourced it to India.
India delivers on a fixed price contract, on time. The bank are happy. Launch day gets near. They begin loading customer data into the live system in preparation. At first things are OK but then it starts slowing down. Slower and slower. By 100+ customers the system has slowed to a crawl. Investigation shows ORM abuse to be the root cause. There's no quick fix: the developers didn't understand ORMs or data model design and the whole codebase is just FUBARd performance wise. The project has to be completely written off.
In the wake of this, said large bank becomes terrified of technology projects. They no longer want anything to do with them. They decide to buy a platform from a competitor and rebrand it. Internal execs warn that this strategy is suicidal as it ensures they can never differentiate, but the warnings go unheeded. It apparently took over a decade for the organization to recover its confidence enough to try software projects again.
You see this everywhere outside the tech sector. Non-techs can't abide technical people and the way we do things. It's a low-key culture war. They see tech people as weird and unreasonable (sometimes true, but not more so than the other way around). Nerds like Star Trek and get passionate about their tools and board games and the right way to interview people, and many other things that non-tech people find baffling or sometimes outright offensive (especially hiring). Non-techs can't tell the difference between BS and not, so they constantly fall for yes-men and grifters who just tell them what they want to hear. Then projects randomly explode and nobody can understand why or how to stop it happening again.
One thing they will not countenance under any circumstances: learning enough tech to be able to manage technical projects. If it's ever brought up the idea will be laughed off. Programmers will learn banking/logistics/medicine/tax law/etc if necessary to understand a project, but the reverse is not true and frankly deep down that seems to scare them.
These days most non-tech orgs have given up in my experience. They just want to buy a SaaS and if it means they can't differentiate from their competitors, oh well, too bad so sad but unless a tech startup takes an interest in your line of business it's not like any of your competitors are going to do it better.
> I've never personally understood the desire to release anyway
I work alone with no deadlines. Sometimes I have to release something or I'll never finish. Perfect is the enemy of the good, as the saying goes. I often realise that half the "blockers" are actually nice-to-haves that I'm willing to complete much later.
However I don't consider the thing finished, only released. In a dev team, the post-release bug-fixing and refactoring never takes place. Managers immediately move on.
I used to work in a company where important dates (inauguration of new development building, important product release etc.) were set to the birthday of the owner/CEO's late husband.
Come what may. If your product was to be released somewhere in that half of the year, this date it was.
"Fall back on blaming the process when failure comes around and avoid ever pointing fingers or owning it."
Some organisations can do this, but I've seen plenty that outwardly try to avoid pointing fingers, yet you can tell that, despite the warnings given, they do blame.
I told several leaders that one of their systems was literally on the brink and we were fighting fires on it every other day, and the processes in place were horribly broken.
12 months later shit hit the fan, my feedback was that I wasn't proactive enough, and they basically threw me and my team under the bus.
This despite budget under cutting, limits in hiring, enormous optimisation, education of other teams, research. Just getting 'No' or no real outcomes all the time on any escalation.
The leaders did this serially across the business. They lacked the ownership to even give people the resources and autonomy to do anything about it, yet would come looking when the time came to throw other teams under the bus like this.
Go on record and keep a record so you can see how proactive you are and demonstrate it to others. It may seem you are loud and clear but in reality they didn't hear and you just didn't repeat it again. I am guilty of forgetting that the problem (and that I told about it) is top of only my own memory but for the boss it's very remote and needs repetition to sink in. Also guilty of trying to teach a lesson to make people listen better to me the first time but that's a bad idea too. And if people blame you later you can always throw email transcripts at them.
I don't think any amount of record showing would prevent the problem that GP had. Regardless of how proactive GP was, management would probably simply say “you should have pushed harder”. Hindsight is 20/20 after all.
The difference can be big if you want to sue for unlawful termination, or if you are sued because the company claims you caused them damage. The bosses may have reasons to keep some things off the record, but you may have reasons to keep a paper trail.
I don't think not having evidence was the problem that GP had. Upper management asked GP to be more proactive. So, they expected GP to be more assertive or take measures regardless of management's approval, which would be doomed still in other ways.
They set themselves up for success no matter what (take credit for hard arbitrary deadline being successful or blame whoever can't meet it), which clearly leads them to be in those leadership positions.
>I also learnt to only become as involved and care as much as the customer. If they aren’t willing to go that extra mile then why should I?
There was probably a good portion of that army of ops and developers that felt the same way. And even if you can move mountains to point fingers and clean out all of the bad people, train the average ones, and keep the good ones around, you will likely get little compensation out of it, so no one does that. Even then, it's hard to blame incompetent people getting jobs and holding on to them. There's a lack of standards for skill and training so naturally people slip through the variety of bespoke vetting processes that companies come up with. There's also some economic reasons why people hold on to jobs for far longer than they should, even if they know they're bad at it.
Large amounts of bad complexity slowly erode morale, but bad decisions like this one can set it on fire. Accruing bad complexity may not produce any solid failures to trigger reassessment, which is why it often becomes a problem and stays a problem. It's an obvious problem for new people with fresh perspectives, while incumbents might have gotten used to it.
> I learnt a few things from this. The first is, some companies or organisations within them need failure in order to progress.
Encountered a similar situation a few years back while working for a government client. My advice to the team was that we needed to “let the train wreck happen.”
I agree with this, but I think you also have to be ready to document the course of events and (in the most diplomatic way possible) say “I told you so”.
Otherwise the people who are accountable for the failure will say “oh who could have seen this coming” and refuse to learn anything for the next instance.
I think the reason for this is that observed production failure is _certain_. Hard fails are undeniable. They obviously need to be addressed and obviously deserve resources. The amount of deserved resources can be clearly calculated by projecting the concrete, observed costs of the failure forward in time.
Before prod goes down, there is much more uncertainty:
- even an expert engineering assessment has some level of uncertainty
- engineering may not have a full appreciation for the business context of the work, and might over-weight technical issues relative to other concerns
- if the engineers are contractors, or otherwise organizationally distant from the experience owners, that inserts a trust gap which further increases uncertainty
- the business owner’s projections are themselves uncertain. Is the expected launch volume really that high, or is it aspirational?
- the costs of failure are uncertain too… if the system goes down, how hard will it go down? What will that actually cost in lost revenue? Fuzzier stuff like brand reputation is even harder to quantify.
Meanwhile the costs of paying the contracted development team another 2 months on the same project are quite concrete. The team already spent significant political capital to force a change on an incumbent team. Now they’re saying they want more money because it still doesn’t work??
The big open question is - what was the cost of the failed launch? How long did it take to get the system back up and running at scale? What did it cost in terms of user loyalty? How does that compare to the concrete cost of holding launch until the auth system was upgraded?
Different people will answer those questions in different ways. What matters is how the customer answers those questions, whether their bosses believe that answer, and their bosses' judgment of the overall situation.
Yea this is a good take, but it’s not as pessimistic as it sounds. “I told you so” never works, even if you explicitly reserve the right to use it, and even as a joke.
What does work though is having a prepared refactoring plan ready to go when it all blows up. “Do that risky refactor we’ve been avoiding because of downtime? Well, it’s all down now, isn’t it?”
As a newly minted bad manager, it rankles me that there is somehow a moral judgement of bad managers.
Developers, QA, product can all be incompetent, but when a manager is incompetent, he’s “evil”. In all seriousness, incompetence should never be rewarded and bad behavior should be called out, but in nearly every case we’re all really trying to do our best.
The difference is that it's management's job to identify, correct, or quarantine all those bad reports. There are also easy and cheap ways that devs, QA, and product are labelled as "incompetent."
Proving that management is incompetent is a fairly large task in and of itself, and the people that can hold them accountable are all peers who generally wouldn't dare to "rock the boat" so to speak.
I've termed this whole set of phenomena a "managerial crisis", and it's been written about extensively since the late '60s.
> nearly every case we’re all really trying to do our best.
If you are sitting in the same boat this is a reasonable approach most of the time. Even then you may have to deal with an occasional irrationally acting person. However, if you are in another boat, e.g. providing services or products, there are a lot fewer barriers to misbehavior, and some of it may even be rewarded. Win-win and win-lose are different games, and deluding yourself about which game you are in is a quick way to fail big.
The difference is that a very natural and common reaction by some in positions of power is to blame down the hierarchy.
That's the "evil" part. The unethical "I'll save myself" if things go wrong, and having extra power to sell that narrative due to hierarchy.
You may be trying to do your best but saying nearly every case is pretty false for anyone who has observed the power dynamics for a long time. Hey, the same happens with developers. Depending on the org, your best for the team and your best for the career rarely meet and it's harder to have them meet as a manager.
Companies usually don't have all their parts aligned or even functioning properly. And you cannot change that.
Your work isn't just the product, it is navigating an imperfect structure to achieve what you need to, despite the fact that the imperfect structure is the one contracting the product/service, and is (mostly unwittingly) standing in the way of that.
When warnings are taken seriously, and no disaster happens: Clearly, the warnings were unnecessary.
When warnings are not given and a disaster happens: clearly the team was incompetent and should have seen it coming.
When warnings are not given or ignored and no disaster happens: they were overblown and pointless, why do we even have a disaster recovery team.
When warnings are taken seriously and a disaster still happens: it's the worst of both worlds.
You can't win in this situation. No combination of warning/not warning and disaster/no disaster results in success. If you can't trust your managers to cogently weigh the risks and costs, there's nothing you can do.
My point is that engineers sometimes view anything that goes wrong with their tiny corner of the system as a disaster, which is not always the case.
My (engineering) boss is more risk-tolerant than I am, but after 30 years in the business I respect his judgement that a particular risk is worth running.
A strangely poorly performing authentication system, with a good one a few weeks behind the launch. An oddly configured circular failover strategy. The thing I’ve noticed about entrenched employees who don’t perform is that while they are not so good at creating/building things, they are really good at pulling strings to make themselves look good at the expense of others.
I don’t think it was a great look for them that you were brought in to build this, and once you start pointing out flaws in other people’s systems you become a walking target. I’ve made this mistake too many times in the past, determined not to do it in the future.
The situation is you are brought on board to do work, and you have in-house collaborators and you'll take the blame for their mistakes?
Step back and find an outcome where your work doesn't harm their reputation/security. So how do you make them into allies? Or how do you not matter to them? Figure that out...
> The situation is you are brought on board to do work, and you have in-house collaborators and you'll take the blame for their mistakes?
People really are like that, yes: Monday the boss says "X is now responsible for Y" and lo and behold, Tuesday EVERYONE already says "Y is not working well, X, do something, it's your fault!".
> Step back and find an outcome where your work doesn't harm their reputation/security. So how do you make them into allies? Or how do you not matter to them? Figure that out...
I have figured out part of that and I was genuinely asking if you have additional insights.
> I also learnt to only become as involved and care as much as the customer. If they aren’t willing to go that extra mile then why should I?
This is what I use at work to keep myself sane. If the highly paid VPs, directors, managers don't care about something, why should I?
This leads to a lot of friction with managers when they want to put blame on individuals, months after they ignored the warnings. But realistically, it is their fault and they will get fired soon enough if they try to make everything a problem of engineers.
In my experience the engineers get blamed and often fired if they push back. The managers seem to acquire immunity if they can pass the blame and play politics.
You are right. Engineers are the ultimate scapegoat. Which is why engineers hate management in this industry. Nobody likes "leaders" who will throw reports under the bus when the time comes.
The only way out is for engineers to spend more time recognizing a manager falling off the cliff and switching teams/companies. Doing actual work is not enough to protect one from politics.
I don’t think lambda is the problem. But my guess is that using a highly scalable system that forces you to make convenience tradeoffs for that scale isn’t the best solution here. A monolithic app could handle the scale required with better devex. But I also don’t know much at all about what’s going on in that system. If the serverless tradeoffs have no real downsides for their specific use cases then all I said is moot.
Yeah - I’m not personally a fan of the specific tech discussed in the article - 100 proxy requests/second could be handled by a RPi, but I suspect it was more complex than a simple proxy - but it was pretty clear that this wasn’t really the issue.
The point was that the launch was predictably going to fail, it was predicted, and then it failed. And this was due to the kind of arrogant, disconnected management decisions that are sadly very familiar to many of us here at HN.
I have worked with well over a dozen organizations with ERP systems from SAP, Oracle and IBM. When organizations are large enough to need such systems, the management has grown to the level of being driven by politics and hubris. Any warnings by technical staff are rejected because they contradict the management's belief that they know better.
When you see that people are going to get thrown under the bus, it is prudent to not be in the vicinity. You can't stop managers from doing it, but you can stop volunteering to be one of them.
Just commenting on the "If you succeed you will fail.".
Failure is inevitable, and to succeed at something you have to have had experience encountering a problem. If you succeeded the first time you encountered a problem, it is likely you'll never know the solution to that problem, and thus failure will inevitably come of it eventually.
It is enough to say success is built on a mountain of failures. The only difference between something being a failure and it becoming a success is the attempts made to produce the desired result.
You keep going until something stops you. So if you do everything right and solve every problem thrown at you, that just means you're going to push harder. Eventually something's going to break. This isn't 'failure', it's just how development, business, really everything works.
It’s interesting how such things impact an employee.
There are times when things have to fail in order for change to happen, but employees also generally want to make things that are meaningful, and that work. It makes the path to inevitable failure really frustrating / demoralizing for the employees.