Hacker News

In the last engineering team I managed (by far the most successful one to date), we discarded most velocity measures in favour of a simple solution: every Friday, each team sends a brief "here's what we delivered this week" email to the whole company. It contains some screenshots and instructions like "agents can now update the foobar rate from their mobile app without having to call in". After a month or two of these going out like clockwork, it gave both management and business stakeholders a level of comfort that no amount of KPI dashboards ever could. KPIs compress away details that stakeholders need to form an understanding; a full blast from the task tracker is overkill. This solution worked for us, but of course it required a solid feature flag practice so that half-built features (story slices) could be released weekly for testing without affecting production. That said, we did maintain a quarter-level sum-of-t-shirt-sizes KPI.
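A minimal sketch of the feature-flag gating this depends on (all names here are hypothetical, not the commenter's actual system): half-built story slices ship to production dark and are only enabled for test audiences until they're done.

```python
# Hypothetical minimal feature-flag gate. In a real setup the flag table
# would come from a config service so flags can flip without a deploy.
FLAGS = {
    "mobile_foobar_rate_update": {"enabled_for": {"internal", "beta"}},
}

def is_enabled(flag: str, audience: str) -> bool:
    """Return True if `flag` is turned on for the given audience."""
    cfg = FLAGS.get(flag)
    return cfg is not None and audience in cfg["enabled_for"]
```

The point of the pattern is that the weekly "here's what we delivered" demo can run against production with the flag on for internal users only.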

I’ve watched this backfire, too.

It works great when everyone is delivering day-long or week-long incremental features that lend themselves to nice screenshots.

But then you slowly start accumulating a backlog of difficult tasks that everyone avoids because they won’t sound good for the week. Technical debt accumulates as people cut corners in the interest of getting their weekly presentations out.

You can theoretically avoid it with enough trust, but the trigger for our collapse was a round of layoffs where the people with the most visible features were spared while people working on important backend work and bug fixes got cut. After that, it was game over.

Without an outstanding company culture, most KPIs are almost useless.

While working at a big corporation, we had a velocity initiative supposedly aimed at leading the company toward continuous integration.

"How long a PR stays open" was one of the KPIs in a dashboard.

I said: "Be careful with that!"

People started to close PRs and reopen new PRs with the same code.

Middle managers, and sometimes the division's point of contact for the velocity initiative, were the ones asking people to do that.

The script measuring this KPI was improved to look at the branch name and the code diff. Result? People changed the branch name and the EOL encoding in the new PR.
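For what it's worth, the EOL trick is trivially defeated. A sketch (not the actual script) that fingerprints a diff after normalising line endings, so a reopened PR whose only change is CRLF-to-LF churn hashes the same as the old one:

```python
import hashlib

def diff_fingerprint(diff_text: str) -> str:
    # Normalise line endings first, so re-encoding a file from CRLF to LF
    # (the trick described above) no longer makes the diff look new.
    normalised = diff_text.replace("\r\n", "\n").replace("\r", "\n")
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()
```

Of course, this just moves the arms race one step along, which is rather the commenter's point.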

Learnings? B and C players with questionable ethics screw companies quite rapidly.

In this climate, KPIs and efforts to align them with company values are futile.

When I worked at a FAANG ~15 years ago, a new VP came in and heard, correctly, that our group didn't have enough automated tests. He created a requirement that every developer commit two new tests to the code base every workday. He had automation put in to monitor our compliance.

Within a couple of weeks, scripts were circulating to auto-generate and auto-commit tests. If the JVM we were using had a bug in adding random numbers together, we'd have known about it very quickly.

That's about the time I decided I should move on. I'm glad I did. The company I joined treated developers like adults and developers acted like adults. And we had great automated test coverage.

Obligatory mention: https://en.wikipedia.org/wiki/Goodhart%27s_law

Accountability is a big part of leadership, though. The anecdote basically just says the VP used the wrong tool for accountability, not that "adults" shouldn't be held accountable.

I think what was missing here was developer buy-in into the actual changes implemented, and making sure that they were reasonable and sustainable. Sounds to me like a blanket mandate was handed out without buy-in and also wasn't achievable, and the developers felt they had to work around it to get their real jobs done.

I agree. At the very least, it seems like he didn't communicate the "why". It certainly wasn't "to make as many tests as possible". I suspect the developers knew that, which is why it's also a little unprofessional on their part to treat it like that was the goal.

I think the bigger problem here was inappropriate expectations: "two tests a day". That's not a reasonable way to increase test coverage. And so the developers quite reasonably just tried to minimise the time spent on it.

>Number of unit tests isn't the best proxy for that goal…

However, I don’t think mindlessly creating tests is acting in good faith. It’s bordering on malicious compliance. I doubt they were thinking they could just knock out that metric so they could otherwise create better test coverage. (The OP conceded their coverage wasn’t good.) Better employees would work to create a better understanding/goal. All of that points to some cultural problems.

Malicious compliance is exactly what you get when people think your target is stupid and you put some automated, non-negotiable measurement in place.

If I were a developer there, I'd totally have started adding those generated tests. I have a job to do (ship products) and you put in some stupid requirement which actually interferes with it (since my time is limited, I can either work on tests or on my daily tasks like developing new features or fixing bugs), but we both know that if my primary job suffers, I'll pay for it. So the best solution, from my point of view, is the one that takes that requirement away and lets me get on with actually working.

When we had a similar problem in a previous company, we just created an epic, assigned a couple of people to it and had them churn out tickets for specific improvements, like "Class X has a coverage of Y on this method, add tests for the missing execution branches", which were clear, non-generic and fully integrated in our flow. If anybody complained about our velocity or whatever, we could show them the full trail which led us to choose to work on that ticket and how much we spent on it. The coverage issue got solved in less than a month.
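Turning per-method coverage numbers into those concrete tickets is mechanical. A sketch with hypothetical data (not the commenter's actual tooling):

```python
def coverage_tickets(per_method: dict[str, float], threshold: float = 0.8) -> list[str]:
    # One clear, non-generic ticket per under-covered method,
    # in the style described above.
    return [
        f"{name} has coverage of {pct:.0%}; add tests for the missing execution branches"
        for name, pct in sorted(per_method.items())
        if pct < threshold
    ]
```

The data itself would come from a branch-coverage tool; the win is that each ticket is scoped, estimable, and auditable in the normal flow.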

>If I were a developer there, I'd totally have started adding those generated tests.

If I were a manager there, I probably wouldn’t hire you. You want people who are actively solving problems that matter, not just automatons beating their chest about how stupid everyone else is while they add to existing problems.

If you were a manager there and would approve that metric, no worries, I'd manage to fly it past you.

Any smart manager, having to deal with this kind of policy, would either push back or approve any way to game it and then get the work done. But I doubt that a company which allows this kind of BS can retain smart managers for more than a couple of months.

If you read any of my posts on this thread, it’s pretty obvious I don’t agree with the manager’s approach. But I also don’t agree that the devs are doing the right thing either.

It’s telling that you show such dichotomous and defensive thinking.

So you disapprove of the decision, but would have enforced it in the most asinine way at the expense of actual productivity.

I just hope to never work with you and, especially, never use anything you helped build.

What made you draw that conclusion?

I made the point that the metric used (raw number of tests) is a bad way to enforce a good goal (higher code quality). I also made the point that metrics aren’t inherently bad, but they have to be chosen judiciously.

You seemed to take those two points, extrapolate an entirely different story as a personal affront, and apply motives in a way that, frankly, reads as a bit unhinged. So I also hope we don’t cross paths, because in safety-critical code development, where I come from, bad practices and toxic teammates can kill people.

If I was a dev there, I certainly wouldn't be approving those PRs either.

But if the devs at the org all (or even substantially) tend towards malicious compliance, that's a sign that something has gone wrong, relations-wise

Agree. That’s what I mean when I say it’s more indicative of cultural issues.

Would adult developers not have made sure they had enough automated tests?

It probably depends whether they had been given specific instructions that directly conflicted with that.

And it also depends on whether they're given a timeslot for that and/or incentivized for it. Automated tests take time to maintain, especially when the flow or specs change and the tests need to follow.

I scrap all incentivized metrics when working on something urgent (important and soon), which is often the case. If the metrics are somehow incentivized, we'll start gaming them.

Now, if a dev shares in production support duty, and automated tests are meant to reduce the support workload, and there are time slots allocated for writing them, I bet everyone will start doing it.

If a civil engineer is given specific instructions to make an unsafe bridge, that engineer does not obey those instructions. I think adult developers should behave similarly.

The analogy starts to break down. Most software is less mission-critical than the structural integrity of a bridge, meaning the consequences of failure are probably annoyance rather than risk of death. Not always: I would imagine the software for a nuclear reactor or aircraft controls is in a different category, but most software is not bridges.

Okay, but if developers decide that then at what point do you say it's okay to not treat them like adults?

I’ve often wondered what is the solution to Goodhart’s law. Obviously a business can’t abandon metrics. Perhaps this is where qualitative management skills come into play - humans in the loop making good decisions instead of blindly marching to the output of an automated reporting system.

I'd argue that the point of Goodhart's law isn't that metrics are bad, but that not all metrics are created equal. In the case above, it seems like the real goal was improved code quality. Number of unit tests isn't the best proxy for that goal, so it wasn't the best choice of metric. You don't want developers creating tests for the sake of creating tests. (There's some irony here in that it's not the type of behavior I'd ascribe to professionals.) The "solution" is a metric that's a better measure of what you actually want.

Even professionals have limits. I've worked in companies with the kind of management which kept adding these bad proxy metrics and pushing initiatives which had totally predictable bad effects on product quality. Most devs used to fight the management on this, but grew progressively tired of the continuous fight. At some point the experienced devs either left or just gave up and started giving the management what they asked for. Us juniors followed suit. The management was happy, the actual workload diminished because we let go of "low priority" tasks, and we even got a juicy bonus at the end of the year because of how well we were doing.

The company tanked six months after that, now it doesn't exist anymore.

There's only so much you can do when the management is hellbent on doing stupid things.

You might be misconstruing the point. I certainly wasn’t insinuating more and more metrics. If anything, it’s the opposite: a core understanding of what’s really important helps you focus on the few metrics that matter.

In that context, I’m not really sure what point you’re making, unless it’s just to share a personal anecdote. Are you implying that management shouldn’t have any quantitative measures and should only be qualitative?

You need good quantitative measures, not just random numbers.

If you sell, say, water bottles, you probably want to know how many of them you can sell at any given moment, in order not to overbook and have to reimburse people. In this case, keeping track of how many water bottles you have in stock probably helps; keeping track of how many labels with funny jokes you can stick on a shipping box in an hour doesn't. But if you start tracking the latter and handing down bonuses and layoffs based on it, people will max that metric out, at the expense of your actual stock capacity.

Quantitative measures are dangerous, especially in the hands of people who believe they are better than qualitative ones because they're "objective" or whatever. Because not only are they not, they are also better than qualitative ones at hiding their biases and soothing your own.

> Are you implying that management shouldn’t have any quantitative measures and should only be qualitative?

Many managers would do a lot better this way. They'd still make stuff up, but would at least be forced to admit it.

You realize that you just reiterated the same point I made in the OP, right?

Goodhart's law is always in effect. It can't be solved because it's not a problem, it's a fact of nature with annoying implications. It's the echo of the observation that efficiency is fitness as its consequences ripple from the lowest levels of reality through systems made of people.

You can make an engine as effective as the laws of physics let you, but you can't solve the limits of thermodynamics. You can only do your best within them. Same with this, for maybe the same reason.

In my experience, there is no good long-term solution except changing the metrics themselves every so often. When new metrics come in, as long as they are not totally boneheaded, they improve things. Then people learn to game them: some of the more sociopathic folks start doing so, then others copy by example, and soon enough the metric is useless at best or detrimental at worst, and it is time to move to a new metric.

It helps to have someone with a hacker mindset think about the metric being designed, so the obvious ways in which it could be gamed are taken care of and their own metrics/incentives are aligned with the company goal.

If you didn't have enough tests you weren't acting like adults. Why should he treat you like adults?

Maybe you are jumping to a conclusion too quickly.

How do you know what was really going on?

"He created a requirement that every developer commit two new tests to the code base every workday" seems a stupid requirement if you do not control the quality of the tests.

The same big corporation I wrote above had a goal of 80% code coverage reported on dashboards.

I saw people writing tests just to run lines of code, without effectively testing anything.

Other people were "smarter" and completely excluded folders and modules with low coverage from coverage measurement.

Code coverage percentage numbers on a dashboard are a risky business. They can give you a false sense of confidence, because you can have 100% code coverage and be plagued by a multitude of bugs if you do not test what the code is supposed to do.

Code coverage helps you see where you have untested code, and if it is very low (e.g. less than 50%) it tells you that you need more tests. A high code coverage percentage is desirable but should not be a target.
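A hypothetical illustration of the trap: the "test" below executes every line of the function (100% line coverage) while asserting nothing about what the code is supposed to do, so an obvious bug survives.

```python
def apply_discount(price: float, pct: float) -> float:
    # Bug: `pct` is meant as a percentage (10 for 10%), but is used
    # as a fraction, so a "10% discount" turns the price negative.
    return price - price * pct

def test_apply_discount():
    # Runs every line of apply_discount -> 100% coverage on the dashboard.
    # Asserts nothing, so it catches nothing.
    apply_discount(100.0, 10)
```

Here `apply_discount(100.0, 10)` returns -900.0, yet the coverage number looks perfect.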

The real problem is again the culture.

A culture where it is OK to have critical parts of the code untested. A large part of the solution here is helping people understand the consequences of low code coverage: for example, collecting experiences and, during retrospectives, pointing out where tests saved the day, or where a test would have saved the day, so people can see how tests may spare them a lot of frustration.

But again, when you give people a target and it is the only thing they care about, people find a way to hit it.

He said it himself.

> When I worked at a FAANG ~15 years ago, a new VP came in and heard, correctly, that our group didn't have enough automated tests.

There are a thousand reasons why reasonable people might have found themselves in that position. Maybe they inherited a code base after an acquisition or from some outside consultancy who didn't do a great job. Maybe management made a rational business decision to ship something that would make enough money to keep the company going and knowingly took on the tech debt that they would then have some chance of fixing before the company failed. Maybe it actually had very high numbers from coverage tools but then someone realised that a relatively complex part of the code still wasn't being tested very thoroughly.

If a team has identified a weakness in testing and transparently reported it, presumably with the intention of making it better, then why would we assume that setting arbitrary targets based on some metric with no direct connection to the real problem would help them do that?

had to mention https://fs.blog/chestertons-fence/

if a team does not have automated tests, but still manages to deliver working software - maybe tests are not adding as much value as the VP thinks?

the most important thing is feature delivery, and integration tests, not automated unit tests where you test getters and setters with mock dependencies - absolutely useless busywork

Tests aren't exclusively about asserting current behavior -- they also help you determine drift over time and explicitly mention the implicit invariants that people are assuming.

Chesterton's fence isn't saying that the fence/test isn't necessary. It's saying you need to take the time to understand the broader context rather than make a knee-jerk assumption. To be clear, just because developers don't see the need for better testing doesn't mean more testing isn't needed. But it may indicate the VP didn't do a good job of relating why, which leads to the gamesmanship shown in the story.

Schedule isn't always the most important thing either. It's possible that delivering the software just means you've been rolling the dice and getting lucky. The Boeing 737 MAX scenario gives a concrete example of where delivery was paramount. It's a cognitive bias to assume that "since nothing bad has happened yet, it must be good practice".

This might be relevant if the original comment didn't say "correctly".

Also, "not testing a lot" is not a Chesterton's fence. "Not testing a lot" can't be load-bearing.

Without tests and instrumentation you won't even know if it's not working.

> The real problem is again the culture.

The culture led to fake tests instead of adding tests that were legitimately lacking.

Is that so different from saying they weren't acting like adults?

The phrasing could be called dismissive but I give that a pass because it was mimicking the phrasing from the post it replied to. The underlying sentiment doesn't seem wrong to me.

> Code coverage percentage numbers on a dashboard are a risky business. They can give you a false sense of confidence, because you can have 100% code coverage and be plagued by a multitude of bugs if you do not test what the code is supposed to do.

Or, conversely, I've been in charge of really awkward-to-test code... which ended up being really reliable. Plugin loading, config loading (this was before you Spring Boot'ed everything in Java). We had almost no tests in that context, because testing the edge cases there would've been a real mess.

But at the same time, if we messed up, no dev environment and no test environment would work anymore at all, and we would know. Very quickly. From a lot of sides, with a lot of anger in there. So we were fine.

Need some tests for those tests!

Insufficient test coverage doesn't necessarily mean lack of self-discipline. It can also stem from project management issues (too much focus on features/too little time given for test writing).

An adult would have started to write the tests themselves so they'd understand what was going on around them. You don't just frown at people and hope for the best.

Yep, I worked at a place that was focused on everyone completing 100% of their jira tickets each sprint, just to get the metrics up. You didn't have to actually finish, it just had to look like you did to the bean counters.

If end of sprint came and you weren't done, the manager would close out the ticket, then reopen another similar one named "Module phase 2" or something similar for next sprint. One guy was an expert at gaming the system, and his ticket got closed and opened anew for about 3 or 4 sprints.

> Learnings? B and C players with questionable ethics screw companies quite rapidly.

No one should be surprised when employees respond to incentives, and blaming them seems a clear indicator of managerial failure: failure to tend to morale, failure to reward actually useful behavior, failure to articulate a vision.

> Without an outstanding company culture, most KPIs are almost useless.

Also, with an outstanding company culture, KPIs aren't really necessary.

So, when would they be useful?

I'm not as negative on KPIs as the previous line suggests though. They can be useful to shape direction when used carefully. But don't make them too long-lived, discard and create new ones as soon as they become gameable.

Fire those folks and move on. If you're a subordinate and your leaders are not firing those folks, quit and move on.

Gameable KPIs offer windows into the souls of your colleagues.

Some examples of incrementally delivering tech debt and communicating it to stakeholders:

"This week we found the root cause of why operation X sometimes fails - it's a slow database query which we plan to fix next week. For those interested, here's a command-line demo of the issue."

"With 60 pull requests being submitted per week, and each one triggering a 12-minute automated test run, we were wasting quite a bit of developer resources. This week we brought that time down to 4 minutes. For those interested, here's how we parallelised the tests."

"Remember how the last accounting feature broke a lot of different parts of the system and it took 4 extra weeks to fix? This week we migrated 4 out of 18 core accounting functions to a service completely separate from our main code base. Once the remaining 14 are moved over during the next three weeks, accounting features can be built and tested independently without affecting the main system."
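The test-parallelisation example above can be sketched with the stdlib alone (a real setup would shard across CI machines or use a runner like pytest-xdist): independent tests fan out across workers, so wall-clock time approaches the slowest shard rather than the sum of all tests.

```python
from concurrent.futures import ThreadPoolExecutor

def run_suite(tests, workers: int = 4) -> list:
    # Run independent test callables concurrently instead of serially.
    # With 4 workers and roughly uniform tests, a 12-minute serial run
    # lands near the 4-minute mark quoted above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda test: test(), tests))
```

The caveat, as always, is that the tests must actually be independent: shared databases or global fixtures are what keep suites serial in practice.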

This is why management needs to be technical. Otherwise you just end up with feature, feature, feature … death.

I find management only really ends up being a problem if they get too deeply involved in the contents of technical quality work. If they're limited to budget allocation it's actually better to have them involved.

I almost always tell management that by default I'm spending, say, 30% of my time on technical quality, and that if they'd like to dial it up or down temporarily because of a holiday or a deadline, they can. I will track this and show it to whomever asks (e.g. their boss and boss's boss might be interested if they've explicitly asked for 8 straight weeks of 0% work on quality).

What falls under technical quality label?

Refactoring, CI maintenance, test automation maintenance, that sort of thing - everything except user visible features or bugs, essentially.

This is why management needs to be competent. The average CEO is 58 years old; their first jobs were in the late 1980s.

The economics and tactics around technology have been revolutionized a dozen times over in the last 4 decades. Now, maybe a few rare individuals have kept up, but most likely, they all rely on outdated strategies from out-of-touch MBA programs and buzzwords.

It's the same reason the market kills public technology companies' innovation and they rely on acquisition.

It's like gunpowder has just been invented and leadership still wants large formations marching against each other. "What if we invested in medical training and washing our hands so we can keep more people alive?" "Nah, more guns and marching."

Funnily enough, the best managers I've had are all in their late 40s to mid 50s; they all did 20 years in the "trenches" and decided that management was the new path forward as they started to prefer people work to technical work.

I think it's work experience that matters most with management. You need folks who are people-oriented but understand the job the folks they manage are doing, and the challenges that come with it, both obvious and non-obvious, and can communicate the importance of such work to the broader organization.

There was an article posted here a month or so ago about how this guy was seeing a new class of worker emerge who both understood the necessity of absorbing knowledge from other domains and enjoyed the process. It reminded me of a dev I worked with who'd go on smoke breaks with the art department and then change data structures we'd been using forever.

I was under the assumption this type of worker is what dominated before the industrial revolution constrained people to a small domain in a process. Some may call it artisan, dilettante, etc. Maybe the narrow-domain specialist was the anomaly and not the rule when we expand our lens across history.

I assume these "smoke breaks with the art department" involved more than just tobacco.... :D

> The economics and tactics around technology have been revolutionized a dozen times over in the last 4 decades. Now, maybe a few rare individuals have kept up, but most likely, they all rely on outdated strategies from out-of-touch MBA programs and buzzwords.

Ah, the cult of youth. It is cute, when it isn't killing startups with rookie moves.

It almost sounds like you think management is something like practicing law - like there's a set of rules that are revised on a schedule and they need CLE credits to keep them current.

> It's like gunpowder has just been invented and leadership still wants large formations marching against each other.

Incidentally, marching large formations against each other has been the dominant strategy for most of gunpowder's history. Only in the last 150 years did guns become lethal enough to make this a bad idea.

>It's the same reason the market kills public technology companies' innovation and they rely on acquisition.

The counter-theory is that as companies mature, they have to devote a disproportionate amount of time to maintenance. From that perspective, it's a risk-based approach to maintain the products and services that already have a market and instead effectively out-source the high risk operations of innovation. New companies don't have comparatively much to maintain, so they can focus on innovation. It's not quite such an either-or, but a balancing act. It's just easier to balance when you can outsource risk.

>Average CEO is 58 years old. Their first jobs were in the late 1980s.

I believe there's some neuroscience theory that may illuminate this. When we're young, our intelligence is more plastic and can more readily innovate, but as we get older it becomes more crystalline. While we may lose some of that novelty generation, we gain greater experience. Experience is necessary for putting ideas in their appropriate context, and I would argue that understanding context is more important in high-level, strategic positions.

You had me with competence, but you lost me with blatant ageism

Maybe it's just me, but basically all the best senior managers I've had have been in their 50s and with plenty of experience under their belt. I've had one or two in their late 60s who were starting to lose touch with the field, but most managers in their 50s I've found have kept up pretty well.

Your gunpowder and large formations metaphor doesn't hold. People aren't idiots, on the whole. Do a little more military history research.

Particular people can be utter cretins; they crashed the world economy in 2008 and nearly caused global nuclear war by sheer stupidity:

> the Russian nuclear sub B-59, which had been running submerged for days, was cornered by 11 US destroyers and the aircraft carrier USS Randolph. The US ships began dropping depth charges around the sub.

Officers aboard had every reason to believe that their American counterparts were trying to sink them.

Cut off from outside contact, buffeted by depth charges, the most obvious conclusion for the officers of B-59 was that global war had already begun.

The submarine was armed with 10-kiloton nuclear torpedoes, and its officers had permission from their superiors to launch without confirmation from Moscow.

Two out of three senior officers onboard agreed to launch, but a unanimous vote was required.

I did say, "on the whole."

I was specifically thinking of the claim that large formations were idiocy after the invention of gunpowder. Bret Devereaux's 4-part series "The Universal Warrior" has an excellent discussion of why a unit of soldiers might fight in large formation. https://acoup.blog/2021/02/12/collections-the-universal-warr... He's verbose, but thoroughly enjoyable.

Reminds me of the ill-fated cavalry charges in WW2. Many older officers insisted that "the fundamentals of war are the same" despite new technology.

I can imagine a modern tech leader living in those times and giving the orders to charge the German panzer formation, sabers overhead.

This says more about the kind of places you've worked than anything else.

Even technical management can succumb to this. Agreed with most of the replies to your comment.

Technical and smart. Otherwise you end up with fix tech debt, fix tech debt, fix tech debt ...death.

I have also seen refactor, refactor, refactor, oh actually it's now no better than before. Whoops!

It's like imaginary tech debt.

Or even worse than before - and the sad part is that no-one intended that to happen. All bespoke software is effectively an experiment that may fail. I've become a lot more wary of the "failure is good" mantra that SV is famous for - it's not good in a zero-sum game where labor is spent on stuff that gives no value, and refactoring for DX is definitely expensive but only possibly valuable.

Here is the thing: in a high-trust environment, KPIs work fine too. In fact, almost everything works under that condition.

But modern organizations are quick to destroy trust on any whim.

A hallmark of strong high trust environments I've found is that management beyond the team (or squad, or pod, or whatever you call it) is only concerned with Milestones, not sprints or other velocity metrics, and the check-ins with them revolve around the progress on the given Milestone(s) the team is working on.

It is up to the team to decide on the best way to approach this, and what works best for that team, and the team is free to do the work as they see it, with the only requirement that if something does negatively affect the Milestone(s), it gets raised quickly and early, in case of re-adjustment.

This however, means:

- Product management has to be competent enough to present a relatively elastic vision that is not so concrete it's essentially a waterfall, but not so vague that it's unclear what's being built. Wireframes are usually a good signal here.

- Engineering Management has to be competent enough to communicate (or allow others to communicate) technical challenges that may be involved, and more importantly, what may be unknown, to Product management

- Everyone has to agree that demoing work is more important than talking about work, whenever possible

- Trust in everyone doing the right thing needs to be high; constant interference and meetings will stop this from working right.

The place I'm consulting for at the moment shut down offices for years due to the pandemic. Some people moved out of state and continued to work because there was no signal of ever returning to the office. Then last week they said everyone had to return to the office 3 days per week or be fired with no exceptions. We're losing an MVP on our team due to this. He now lives over 500 miles from the closest office.

The way around this is to open an office just for them. It sounds dumb, but it meets the business requirement, and could also drive further recruitment around their location.

It seems like it'd be way more sensible and efficient to just make an exception for that particular employee (and maybe some other important ones). Though, in my opinion, the best option is to just let anyone work remotely as much as they want. Especially if you go years without ever indicating you intend to return people to the office.

I think I'm luckily grandfathered in due to working fully remotely for years before the pandemic, but if that happened to me I would absolutely start looking for a new job.

That can be more expensive than it appears at first, especially if it's in a different state due to taxes.

I've asked before if I can work remotely for a few weeks at a time and have been told yes, as long as I don't leave the state. In some places, just moving to a different county or city can trigger different tax requirements.

I don't think expense is the driving force here. It would be even less expensive to simply let the worker work remotely. The reason a company wouldn't do this is that it would send an unacceptable message about worker power. If some workers could get exceptions to a new unpopular rule, then why can't everyone? And if everyone could, why have the rule at all? This does not serve the interests of the company's power structure.

If you have competent accountants/payroll, this is no more complicated than anything else. There might be a learning curve to get things set up properly, but the rest is reasonably normal.

It can be quite complicated.

> State corporate or other business activity taxes can apply, if even a single employee is working in a state. In effect, if an employer did not previously have a recognized office in a state, but one employee starts working from there, this can trigger entirely new registration requirements and tax liabilities. It may be necessary to register with the secretary of state and relevant tax authorities, provide a registered agent address, and pay corporate and business activity taxes, sales taxes and employment taxes, including employee withholding. There are often state and local licenses and business permits as well.


Sounds like business as usual for most accountants that do SMBs. Worst case scenario you do this 50 times. The same thing happens in the EU when an employee works in another country, only worse. Sounds like you've got it pretty easy compared to there.

I think you underestimate just how small most businesses are. In the US, something like 90% of businesses have fewer than 25 employees. Dealing with setting up in another state is a significant burden.

I don’t think you understand how this works. At some point, you will surpass the sales tax threshold[1] for most states. Then you’ll be reporting and filing taxes in that state anyway. Adding an employee in a state is not complicated by that point.

So sure, if you're a company doing only local sales, or a company doing less than $300-400k of revenue a year, this is probably complicated. Once you hit ~$1m in revenue, there is probably AT LEAST one other state you are paying taxes to. Adding states should already be a defined process (or being defined), and an employee moving there is only one or two lines different.

1: https://salestax247.com/sales-tax-thresholds-by-state

In the premise Consultant32452 provided, the employees had already moved and been working from different states, so presumably all of that would already have had to be done.

A lot of states made changes during the pandemic to allow remote workers. Those have all been ending over the past year and I assumed that’s what they were talking about.

What? I do not see how any state could not allow remote workers.

And all states have long required employers to register with the state government for tax purposes if they employ someone working in the state, as well as comply with the state’s labor laws.

I didn’t say that right. What I meant to say was that a lot of states allowed remote workers to live there without having to pay local taxes.

They tried this at one of my clients' offices too, but the middle management all stood firm that they and their teams (I'm on one of the teams) would not be returning. And to fire even one person on any of the teams would mean certain death for the company, since it only has a handful of developers and everyone has 7-14 years of built-up knowledge. I am the most recent consultant hired, and I was hired in 2017.

They backpedaled pretty quickly and switched it to anyone within 10 miles of the office has to come back. There is only 1 person that close. He's the "office mom" and he's been in basically every day since the pandemic started.

Even if half the teams hadn't moved further away, there is no way most of us would have gone back to commuting 2-4 hours per day.

That seems rather shortsighted to just assume offices won't reopen instead of actually negotiating a transition to fully remote.

I've found the same.

We used goals and velocity metrics on the highest performing team I worked on. This was also a high trust team that happily raised concerns and adjusted priorities/velocity/etc.

The goals and velocity were still extremely useful for getting everybody on the same page for what we were looking to accomplish and how long it would take. We needed to land in the general area, but never got caught up in meaningless drivel over metrics.

The problem is management wants consistency and expects an explanation when things change. I've found that a team is perceived as failing if they're actually realistic with goals and capabilities.

Yeah, it's like the way in a competent team, just about any software development methodology will work well. The hard bit (honestly, the impossible bit) is making things work adequately with a mediocre team and mediocre leadership.

When you give people a target and it is the ONLY thing they care about, people find a way to hit it.

So be careful what you measure and what you reward.

For every company, cash flow and profits are undoubtedly the most important metrics. It is almost impossible to argue that maximizing those numbers should NOT be a goal.

At the same time when that becomes the ONLY target that matters, the consequences are dreadful.

At least in the US, that is how we ended up with appliances that last only a small fraction of the time they used to 40 years ago.

And even worse, it is how we ended up with the food industry creating more and more addictive food, resulting in 70% of the population being obese. And it is how we ended up with a health system that costs multiples of what it costs in any other country in the world and that, instead of healing people for good, makes them "less sick", addicting them to a few pills for the rest of their life. Because there is no money to be made from a healthy person.

When you build KPIs, make sure you "think a few moves ahead" and put other correcting metrics and checks in place. At least make sure whoever establishes the metrics has a way to become aware of the possible shortcomings and plan corrections in a timely manner.

> For every company cash flow and profits are undoubtedly the most important metric. It is almost impossible to argue that maximize those numbers should NOT be a goal.

Except for VC backed startups. Actually, there are a lot of exceptions. But those mostly revolve around caveats surrounding riskiness and timelines.

There are two types of devs at my job, those that care about shipping lots of new things fast, using new frameworks and pushing metrics. Then we have me and a few others fighting fires that the fast shippers leave behind. I’m fine with it, I enjoy analyzing database queries and figuring out what to index. As long as manager understands where the fires are coming from and why we need to spend time putting them out there’s no problem.

Unfortunately, at far too many companies, those "ship new things fast" developers are the ones who get promoted, get rewarded with the best projects, and grow their careers and influence, while the firefighters just keep fighting fires.

I'm curious if anyone has stories about making this work in the long run.

It seems like such a natural division of labor that it appears everywhere. But I also feel like I've never seen a company explicitly optimize processes around it.

So I'm curious to hear any battle stories.

The problem that I'd imagine is a variation of "Why does Sales get paid so much?". Basically, the more obvious the connection between your contribution and the company making more money, the easier it is to reward you with a slice of the pie.

In a codified "trailblazers and firefighters" model, the trailblazers are much more obviously tied to the company's bottom line. Feature X is required to close deal with company Y, this software team built this feature. Meanwhile, the people frantically cleaning up the mess the first team left is only nebulously connected to the company raking in more cash, relegating them to a second-tier position despite having harder work.

Yes, our team asks for demos every sprint, which I find to be dumb and a remnant of every sprint needing an MVP, which in practice is just hilariously naive. Especially considering that a screen that took 10 minutes to make is deemed much more interesting than more involved code, which really doesn't lend itself to a "demo."

>You can theoretically avoid it with enough trust

I think it ultimately comes down to human values. What's important to the founders and the team? Do they have a clear articulation of those values? KPIs are useful if they're grounded in values that the organization absolutely won't compromise on; KPIs should be a means to an end, not the end themselves.

Whether those values are "we want to make the most money" or "we want to make customers happy" or "we want a sustainable lifestyle for our team," KPIs will only help you pursue them, not define or prioritize them.

Nothing in OPs comment says it should only be completed stuff that goes into the status.

Also, if your team is optimising for the status report, your manager has already failed you.

Aurornis didn't say that. He specifically said "because they won’t sound good for the week".

With time and experience, you'll learn that those who can quickly churn out features and bug fixes are deemed extremely valuable to the company.

Let's say we have a wicked bug that has very little chance of happening. But if it does, it'll bankrupt the company, no questions asked. Spending 3 weeks fixing it is nowhere near as impressive to the company as Joe who's churned out 10 features per week while your status updates are "Hunting for the wicked bug", even if you describe it in more detail.

You are describing a situation where management has failed. No amount of better practices can help in a situation where your management chain cannot evaluate your work correctly.

Even if the detail was something like, "if this wasn't fixed, we were at the risk of losing half our customers" ?

> Even if the detail was something like, "if this wasn't fixed, we were at the risk of losing half our customers" ?

Yes, because this one task appears as a single line item for the manager's manager. They don't care for the complexity or consequences until the fire actually happens. And if the fire happens, they just blame engineers.

If the industry incentives were changed such that manager heads would roll for mass outages, managers would start appreciating bug fixes.

If you spend more than a few hours looking for a bug, you should start writing code to go around the bug completely. Even if it results in technical debt, if the company can go under if the bug isn’t fixed, you’d be stupid NOT to charge that TD card.

That’s why devs who understand that survive layoffs and the ones who spend three weeks “hunting for an extinction level bug” don’t.

> if your team is optimising for the status report

If you're providing a status report, you're optimizing for the status report. Period.

Eh, my personal experience has been that just as often we have a mandatory status report that nobody cares about (including my direct and skip level managers). It's just something that needs to be done for contract compliance.

I admire the confidence with which you hold this opinion.

Engineers are not providing status reports. The manager is collating them. How is it any different to a sprint summary?

Yes. The manager has failed you. Doesn't mean you should put your ass on the line to fix things for the business.

That's why these kinds of incentive structures are dangerous, and why things like OKRs put such heavy emphasis on regular, company-wide failure (i.e. they punish 100% success rates).

This industry already has a huge problem with prioritizing shiny new features over fixing bugs, improving performance, verifying correctness, etc. This seems like a great way to enshrine that into your company culture.

Team A: Added a wizzbazz button that says "Wizz!" when you push it with a cool noise! Look at this cool screenshot!

Team B: Worked on fixing an elusive bug causing rare data corruption, but couldn't figure out what was causing it.

Team A: Re-colored the header to the CEO's favorite color! Look at this cool screenshot!

Team B: Fixed bug causing rare data corruption. Started looking into strange performance bottlenecks in UI.

Team A: Added spiffy animations that play every time the user goes to the next page. Watch this cool video!

Team B: Improved page responsiveness issue introduced by the wizzbazz button. Look at this graph; pay attention to the ninetieth-percentile time-to-first-render (dotted blue line) for dashboard users. It does actually go down quite a bit if you look.

Team A: Added a thousand lines of code to the fluxborg module! Look at this graph! It went from FIFTY lines of code to a THOUSAND lines of code! Look at how much that is on this graph!

Team B: Removed two thousand unnecessary lines of code from the wizzbazz module. We decided not to show a graph because the line goes down and we know you'll all think that's a bad thing.

Which team is getting the axe when it comes time to lay off employees? You know. Come on. You know it's Team B. You know it's true.

This is accurate. My director is obsessed with visible status updates aka UI changes or metrics. So anyone reporting, "fixed a 3 year old bug by refactoring old libraries, that improves maintenance" just doesn't cut it for him.

And unfortunately for engineers, they have to dance to this director's tune even if it comes at the cost of their own sanity.

Corporate dynamics make the shiny object rise up even if it's fool's gold.

The real gold is in improving what is in bad need of improvement.

The companies that stay afloat are often those that are abundant in people that know what really matters and find a way to do it regardless of what the middle management thinks.

> The companies that stay afloat are often those that are abundant in people that know what really matters and find a way to do it regardless of what the middle management thinks.

Which is hard to come by, because those people aren't rewarded.

Think you perfectly described teams working on Windows 11.

KPIs encompass more than just team performance; they also encompass the performance of the product itself. Even if you excel at delivering, a poorly performing product resulting from flawed ideas renders delivery insignificant.

In the above model, product performance could easily be addressed too, as well as demonstrating that the features were data-driven; it doesn't need to be a boring feature list.

For each delivered feature, there could be an overview, e.g.

"Users can now print reports more easily. On our user request tracking platform, this issue had 321 upvotes, and 5 Premium clients requested this as well. We placed the print button THERE because, amongst the three designs, this performed 23% better according to METRIC. For more info on the experiments, see link or ask on #mobile-team. The feature was released for two weeks behind a feature flag; it performed great, and we made it available to everyone last Monday. 5% of users who visited the report page used this feature, and we have already generated 5000 PDF reports. Big client A already complimented the feature, and it unblocked the sales process with Client B."

At the end of each email, there could be an analytics overview of the product with overall useful metrics, significant changes etc.

This format also helps with people being on holidays, sick, etc.

During my tenure as a product manager at Rakuten, I consistently grounded the initiation and culmination of all product endeavors in data. This approach dictated that, prior to embarking on any project modifications, the presence of data was imperative to validate (or refute) the underlying hypotheses. This meant that if we just had a hypothesis and no data, we first added hooks to understand the situation through data (e.g. GA events, etc.).

For instance, when executives contested the efficacy of our registration form for X reasons, it became incumbent upon us to ascertain its current standing. This involved computing the ratio of registrations to site visits and conducting an in-depth analysis of the errors that users encountered while interacting with the form. We achieved this by integrating Google Analytics events with error messages, among other techniques. We first provided a picture, then proceeded to make the changes.

This systematic approach enabled us to gain a comprehensive understanding of the situation, facilitating the implementation of purposeful changes. Subsequently, we gauged the impact of these alterations using key performance indicators (KPIs) such as the registration-to-visits ratio, as well as the event tracking added earlier, to understand whether we had really made a difference.

What about things like "Stuck on XYZ because no one fixes it because it doesn't show any good metrics and management doesn't care about it. So I fixed it at the expense of my own time."

>What about things like "Stuck on XYZ because no one fixes it because it doesn't show any good metrics and management doesn't care about it.

I'm confused by your comment. How did you decide this was something worthy of fixing if it "doesn't show any good metrics"? If you can't quantify the issue in any manner, how do you determine it's worth doing?

Generally, using one's brain.

This is especially important when metrics would not be expected to be available -- for example, if you're designing a nuclear reactor, you need to think hard about ways to prevent a meltdown in advance, rather than collecting meltdown statistics and then fixing the piping problems that correlated with the most nuclear meltdowns.

This is also necessary when the true metric that matters is very hard to evaluate counterfactually. For example, perhaps your real task is "maximize profit for the company", but you can't actually evaluate how your actions have influenced that metric, even though you can see the number going up and down.

And necessary as well when a goal is too abstract to directly capture by metrics, resulting in bad surrogate metrics: for example, "improve user experience" is hard to measure directly, so "increase time spent interacting with website" might be measured as a substitute, with predictable outcomes that bad UI design can force users to waste more time on a page trying to find what they came for.

All of these problems are faced by metric designers, who need to pick directly-measurable metric B (UX design metric) in order to maximize metric A (long-term profits) that the shareholders actually care about, but they cannot evaluate the quality of their own metrics by a metric, for the same reason that they were not using metric A directly to begin with.

(See also the McNamara fallacy, which parent comment is a splendid example of: https://en.m.wikipedia.org/wiki/McNamara_fallacy )

There is also the classic story about returning war planes. You can try to make sense of the damage, the bullet holes, and try to create strategies around how to improve affected areas. The problem is that the damage you actually want to inspect and prevent is on the planes that did not make it back, the ones you do not have an abundance of data on.

The classic war planes story was about Abraham Wald - https://en.wikipedia.org/wiki/Abraham_Wald

Let's say you've got a logging system that sometimes drops lines. And this sometimes makes debugging things hard, because you can't say whether that log line is missing because the code didn't run, or because the log line was lost.

Impact on end users? Nothing measurable. Impact on developers? Frustrating and slows them down, but by how much? It's impossible to say. How often does it happen? Well, difficult to count what isn't there. Would fixing the issue lead to a measurable increase in stories completed per week, or lines of code written, or employee retention? Probably not, as those are very noisy measures.

Nonetheless, that is not a fault I would tolerate or ignore.
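One common way to at least make such drops detectable (a hedged sketch of a general technique, not something the commenter described) is to stamp each log line with a per-process sequence number, so the consumer can count the gaps even though the lost lines themselves are gone:

```python
import re

# Producer side: tag each log line with a monotonically increasing
# sequence number before handing it to whatever ships the line.
class SequencedLogger:
    def __init__(self, emit):
        self._seq = 0
        self._emit = emit  # any function that ships a log line

    def log(self, message):
        self._seq += 1
        self._emit(f"seq={self._seq} {message}")

# Consumer side: gaps in the observed sequence numbers reveal exactly
# how many lines were lost in transit within the observed range.
def count_dropped(lines):
    seqs = sorted(int(re.match(r"seq=(\d+)", line).group(1)) for line in lines)
    expected = seqs[-1] - seqs[0] + 1
    return expected - len(seqs)
```

This doesn't recover the content, but it turns "impossible to count what isn't there" into a measurable drop rate, which also makes "did the code run, or was the line lost?" answerable.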

> I'm confused by your comment. How did you decide this was something worthy of fixing if it "doesn't show any good metrics"? If you can't quantify the issue in any manner, how do you determine it's worth doing?

Because sometimes some things without metrics are incidental to the actual thing you set out to do.

E.g. a large refactor that switches libraries, which is necessary for your new service that gives 10% lower latency. But that library refactor will need to be done, and it will take 2 months.

This is a challenge often encountered by product owners. At times, what appears to be a problem may not actually be one. Conversely, you might discover that a perceived issue holds greater significance than the current circumstances or priorities suggest.

KPIs should do that. However, it is very hard to measure the useful things in advance. The real measure is the bottom line from when the product is released until it stops production. If you have an apprentice plumber cleaning out clogged drains it is easy, as you work for a couple of hours and then you get a check a month later (this is a reasonable example of something easy to measure, but I doubt any plumbing company ever operated this way). However, a lot of software is on a very long release cycle with hundreds of other developers. Even if you are Google and deploy to production often, it can still be a long time between when you start work on some feature and when it is actually useful to customers, and then you are constantly enhancing things for the next version, so that is even more noise making it hard to measure what has a useful impact.

Measuring is not difficult, unless you are attempting to measure something immeasurable. The appropriate sequence would be as follows:

1. Determine whether there is supporting data for the proposed change.
2. If no data is available, implement tracking to establish a reference point.
3. Measure the reference point and ascertain what could lead to success.
4. Implement the change.
5. Review the new data and compare it against past values.
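The sequence above can be sketched in code; the metric and all numbers here are hypothetical illustrations, echoing the registration-to-visits example from earlier in the thread:

```python
def conversion_ratio(registrations, visits):
    """The reference metric (step 3): e.g. registrations-to-visits."""
    return registrations / visits if visits else 0.0

# Step 3: measure the reference point before the change (made-up numbers).
baseline = conversion_ratio(registrations=480, visits=12_000)

# Step 4 happens in the product: ship the change (e.g. behind a flag).

# Step 5: review the new data and compare it against the reference point.
after = conversion_ratio(registrations=620, visits=12_400)
improvement = (after - baseline) / baseline
print(f"baseline={baseline:.3f} after={after:.3f} change={improvement:+.1%}")
```

The only non-obvious part is step 2: if the tracking didn't exist before the change, there is no baseline to compare against, which is why it comes first.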

Sure, but the feedback loops for what I want to measure are 10 years long. You can go from fresh out of school to senior engineer in that time, which only works if we can measure your contributions faster than every 10 years.

But I am not talking about measuring people's performance, but product performance.

This sort of practice killed me as a platform engineer when it was an org-wide expectation that we do these weekly demos of what a feature looks like to a user. Sure, let me just screen record a new environment buildout or a suite of applications running for ten hours and you can observe how often a problem occurs and what happens in response. Even better when an activity for a week is something like we upgraded a storage driver across all environments. What does a user see? Nothing, but the old one had bugs, including at least one CVE. Now we don't.

Lean into it.

"What does a user see? Nothing, but the old one had bugs, including at least one CVE. Now we don't."

That seems like a reasonable content for demo and with just the right amount of self-deprecating humor it can be a slightly entertaining 2 minute presentation.

The smartest folks will recognize "User sees no change" as a good thing. Keeping things stable is extremely important.

Unfortunately, too few people think that way.

This seems perfect to me.

It reminds me of deviantArt tech dev meetings when I worked there (~10 years ago)

Every Monday, each tech/product team did a small demo for the whole tech org (maybe 50 people, growing to nearly 100 by the time I left).

Teams without an interactive demo would just put together a web page with a few pictures and text describing what they accomplished and team lead would present it to the whole org.

Teams competed for dubious accolades like most lines of code deleted, most embarrassing feature, best meme. Prizes were arbitrarily awarded by the VP of technology in the form of credits to buy art from their prints shop.

Similarly to what you describe, this practice relied heavily on feature flags enabling us to release features for testing while they were still in early stages of development. This worked out really well for getting feedback and QA testing early on, while also keeping everyone up to date about everything that was happening outside of our immediate area of focus. It was fun and motivating as well. deviantArt did a lot of things really right though back in the early 2010s. IMO it was a really incredible engineering org and probably the best job I ever had.
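At its core, the feature-flag mechanism this kind of practice relies on can be as simple as a per-audience lookup. A minimal sketch (the flag names and audiences are invented for illustration, not deviantArt's actual system):

```python
# Flags map a feature name to the set of audiences allowed to see it.
# A half-built feature ships to production gated to "qa"/"staff" only,
# so it can be demoed and tested without affecting real users.
FLAGS = {
    "new_print_button": {"qa", "staff"},                # testing only
    "report_page_v2": {"qa", "staff", "everyone"},      # fully rolled out
}

def is_enabled(feature, audience):
    """True if this audience should see the (possibly half-built) feature."""
    allowed = FLAGS.get(feature, set())
    return "everyone" in allowed or audience in allowed
```

Rolling a feature out then becomes a one-line config change (add "everyone" to its set) rather than a deploy, which is what makes weekly demo-able increments cheap.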

That optimizes for Demo-driven development though. I've seen a lot of flashy "clever" stuff created for demos which were later totally abandoned. And a lot of basement-level code wrangling that is difficult to demo.

How did the team report out things that weren’t new features? Performance improvements and bug fixes are often more important than adding more stuff to the product.

I had a similar report structure, and generally we would relay the same information. "Here's a graph of measurements from before and after the code change, showing a 20% reduction in resources saving us $X/year".

Generally that was good enough.

I like this approach with a graph. In the past I’ve run into people ignoring text bullets about the “boring” development wins.

Is there some reason you can't report on performance improvements and bug fixes?

Because a shiny new feature that takes 1/10th of the time gets 10x more praise and attention.

Because naive senior vice president MBAs believe there never should have been bugs in the first place, and every fix is an admission that the bug existed at all.

Because every email is now a weekly contest between teams, and when hard times come the ones who reported on performance improvements and bug fixes will lose their jobs long before the ones who made the header the president's favorite color.

Because every email is your team trying to convince the company of your own worth and it's much harder to show a pretty screenshot of a performance improvement than a new feature (Graphs of before/after performance every week? Every week?)

Because bugs don't always follow your bullshit schedule made by a bullshit manager who is looking for a pat on the back by replacing bullshit A with bullshit B, instead of anyone anywhere in the company ever just trusting anyone.

>Is there some reason you can't report on performance improvements and bug fixes?

Exactly. What does "performance improvement" even mean if you aren't measuring performance before and after?

Not GP but what about:

- Improved p50 latency of endpoint /very/important/endpoint by 22% [attached mini-graph showing the change]

- Fixed this bug that had been reported by 37 clients in the last 30 days as occasionally affecting them
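For context, a p50 figure like the one above is just the median of the raw latency samples, so the before/after comparison is cheap to produce. A sketch with made-up numbers:

```python
from statistics import median

def p50(samples_ms):
    """Median latency: half of all requests complete faster than this."""
    return median(samples_ms)

# Hypothetical latency samples for /very/important/endpoint, in ms.
before = [110, 120, 130, 140, 150, 400, 900]
after = [85, 95, 100, 110, 120, 380, 850]

change = (p50(before) - p50(after)) / p50(before)
print(f"p50 improved by {change:.0%}")
```

Medians (rather than means) are the usual choice here because a few slow outliers, like the 900 ms sample above, would otherwise dominate the number.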


E.g. "Foobar page for lists larger than 1000 now load in under 3 seconds, down from 10 earlier", "You no longer have to log out and login again when we update config Y", "Now when you report money mismatches, you no longer have to send screen shots (we have added logs)" etc.

I like how the most convincing examples of "agile transformations" almost never use the word agile.

In the other cases (the majority), the word Agile is used to describe it. Not actually agile, just whatever the hell Agile is.


> KPIs compress away details that stakeholders need to form an understanding.

Comms. It always boils down to comms. Specifically, comms that drive understanding.

KPIs = knowledge.

But what you did, increased understanding.

And what I like to say is: "(Knowledge isn't power.) Understanding Is Power."

But doesn’t knowledge imply understanding? Otherwise it would just be facts without context.

KPIs are facts. Facts without context don't lead to understanding.

IMO KPIs are metrics to optimize (if they are objective). A set of values is already encoded in them by the specific formula being chosen.

At the extreme, watch Jeopardy. People spitting out their knowledge. Unfortunately, 99 out of 100 times, they have very little *understanding* of the answers they offer.

Knowing the dots doesn't mean you can connect them. And thus, "Understanding Is Power."

I totally get what you mean here, but I don't know if Jeopardy is a great example. I think of that like trivia, which is a skill that you have to practice and exercise. Rote memorization can increase knowledge and definitely helps. (Here comes the but :) ) But I don't consider someone who has a head full of facts (only) to be knowledgeable. They have to be able to talk about the concepts and theory to back up the facts. The very act of connecting the dots demonstrates knowledge.

Anyway I think we are probably saying the same thing and discussing semantics.(and to be fair it is probably just me:) )

I agree. And that is why I began with "At the extreme..." :)

Was your sum KPI burned down on a report during the quarter or was it only reviewed once the quarter was complete?

I did keep an eye on it because it was a good indicator of problems. But I never drove the team based on it, any more than I would try to increase a car's speed by grabbing the speedometer needle.

> every Friday, each team sends a brief "here's what we delivered this week" email to the whole company

I am not sure this scales very well with company size. ;)

It doesn't. I've recently had to add about 30 mail filters to keep all the weekly/bi-weekly area newsletters trumpeting every random "milestone" for every team out of my main inbox.

When it comes time for me to move on from the company, one of my antics is going to be to reply-all to each and every one of them with the word "unsubscribe."

They're brief emails with a few lines and screenshots that can be read in under 2 minutes. Most stakeholders will read through emails from teams developing features for their business area, and skim the rest. They can also opt out entirely if they want to. But then they lose the right to complain that they didn't know what features are being shipped and when, or how to use them correctly (remember: the mail works as a "how to" too).

It would be too much in large software companies. There are thousands of teams, and several products have absolutely no relevance to each other. So that kind of self-inflicted spam attack would not even remotely work.

Have each squad send the email to the platoon, and each platoon send a summarized email to the company.

Depending on company culture that can lead to uniformly very positive feedback on the lower levels (amazing new features, developers show excellent performance, all of them), which does not seem to fit the mess the top management can see at their level (customers complaining, sales dropping).

"the Ministry of Plenty's forecast had estimated the output of boots for the quarter at 145 million pairs. The actual output was given as sixty-two millions. Winston, however, in rewriting the forecast, marked the figure down to fifty-seven millions, so as to allow for the usual claim that the quota had been overfulfilled. In any case, sixty-two millions was no nearer the truth than fifty-seven millions, or than 145 millions. Very likely no boots had been produced at all. Likelier still, nobody knew how many had been produced, much less cared. All one knew was that every quarter astronomical numbers of boots were produced on paper, while perhaps half the population of Oceania went barefoot. And so it was with every class of recorded fact, great or small.

[...preamble] it was not even forgery. It was merely the substitution of one piece of nonsense for another. Most of the material that you were dealing with had no connexion with anything in the real world, not even the kind of connexion that is contained in a direct lie. Statistics were just as much a fantasy in their original version as in their rectified version. A great deal of the time you were expected to make them up out of your head."

- George Orwell, 1984.

All one knew was that every quarter astronomical numbers of features and improvements were produced on paper, while perhaps half the users of SaaS products complained or left.

The irony is this is exactly what we did (albeit not the whole company) before adopting agile.

I was about to say the same. Before Agile, we had one Friday meeting with development manager where we pretty much did this.

So you planned, built, tested (and fixed any issues from testing) and released features (and supported them) each week?

Edit: I see the team is made up of 50 people across multiple teams. That makes a bit more sense.

Interesting. I did the same within just my team. We used Confluence to publish those, describing the accomplishments of every IC. The initiative was started to recognize the team's work, so I was concerned whether anyone outside my team was actually reading those reports, but since I started I never got a question from the Steering Committee along the lines of "what are all of these peeps doing there?"

Question: what format did you use (how many words)? Something like release notes? Was anyone actually reading those?


"Agents can now see their up-to-date balance on the app (previously they had to wait for the weekly email). Here's a screenshot. Here's a direct link to the QA environment so you can try it out yourself."

That's usually it for a given feature.

That works well when the deliverable has shipped and its value can be easily understood by the audience it's targeted at: for example, a product team shipping updates to non-technical people. For backend refactoring, CI/CD pipelines, infra work, and the like, it's only valuable if the audience is engineering; other people aren't going to understand the value of a new CI/CD pipeline, for example.

I run the devrel team at Ultra.io, and tried this for 6 months or so at the beginning of the year.

These types of updates only work in an environment where people actively read them.

Overall this is my standpoint too. But it depends on how the KPIs are set. I think it is more destructive to have no general direction at all.

Non frontend engineers probably despise this.

Did anyone actually read it? The problem with this is the frequency is so high that it would just get tuned out.

My first thought was that I would create a rule to dump these in a folder I never looked at. However, if I were on the team generating that email I would like it if it could truly replace the old ways of providing updates to leadership.

Using velocity as a KPI is to be prevented at all management levels.

scale of team?

About 50 engineers spread across 7 teams

I’m assuming each team has a PM, because it’s impossible for an engineer to communicate; therefore this couldn’t have happened without a PM in place to play “telephone.”

All emails were drafted and sent by engineers. They did require some coaching initially, but eventually they got the hang of it. It's mostly plain English.

Good to hear!
