
I deploy, therefore I am - mrdonbrown
https://www.sleuth.io/post/i-deploy-therefore-i-am
======
Ididntdothis
I really wish people would stop trying to reduce software development to
simple metrics. Number of deploys is as meaningless as lines of code or number
of story points as a metric for judging people's performance.

~~~
not_a_moth
It does indicate a culture where there aren't massive barriers to getting
changes in.

I worked at one startup on the AI team, and we were allowed to deploy at will,
we iterated constantly, pushed multiple times a day, and had a flat hierarchy. We
weren't under the political umbrella of the engineering and product groups. It
was definitely risky but no major issues ever came from it and we gained the
trust of the DevOps team and our CTO.

The back end team, different managers, was very hierarchical, had to schedule
deploys well in advance and inside certain time windows, and non technical
engineering managers would monitor and send summary emails to everyone after
each deploy.

Guess what, that back end team sucked. Their timelines for features were
embarrassingly long, they watched the clock, their managers loved to booze,
they overcomplicated the architecture and never produced anything innovative
for the company.

Deploys can signal culture.

~~~
onion2k
_Deploys can signal culture._

I worked at a company that deployed multiple times a day too. There were no
automated tests and the QA process was terrible, so most of the deploys were
bug fixes.

Deploys can signal culture, but not always in a good way.

------
dvnguyen
When the number of deployments becomes an evaluation metric, I can imagine an
engineering culture like this:

\- Devs rush to make merge requests at the cost of thoughtful design and
testing.

\- Tension around code review turnaround time: how dare you take hours to
review my code when I'm far behind on the leaderboard.

\- Deployment has a bug? Awesome, now I'll have 2 deployments.

\- Deploy faster and break everything.

Frequent and continuous deployment is good. Making it a metric is bad.

~~~
StavrosK
[https://en.wikipedia.org/wiki/Goodhart%27s_law](https://en.wikipedia.org/wiki/Goodhart%27s_law)

------
mnm1
So if I deploy one critical, long-term project this month and my coworkers
deploy around ten each say, how does this system know that the one thing I
deployed was actually more valuable and impactful than all those little
deploys? Or do I need to deploy the unfinished work in progress constantly to
keep up the metrics because this is just another useless measure like lines of
code or commits?

~~~
mrdonbrown
Assuming your project supported continuous development, I'd probably say you
should break that project up into multiple deployments, ideally hidden behind
a flag and each easily revertible. This would reduce the risk of each step and
hopefully eliminate a whole host of potential incidents.

~~~
inertiatic
How does code hidden behind a flag become less risky if it's deployed in
increments?

You will only find out when you allow it to run.

~~~
mrdonbrown
Well, for example, say you were adding a new preference page. First, I'd do a
deploy of the new database table(s), even though they aren't used. Then, I'd
add a link to the page in the UI and put that behind a flag, maybe with a
small non-functional UI so that the designer could play with it. Then, I'd
implement the basic functionality and open it up to my team to start playing
with it. Then, I'd ship other deploys for things like tests, more edge cases,
UI tweaks, etc.

At some point in there, I'd open that page up to my beta customers, trial
customers, or whoever is less risk-averse. Once I'm happy with it, I'd do a
percentage rollout to ensure any issues don't affect all customers at once.
Just an example, but the idea is to reduce risk of each step and make each
step easily reversible/hidden if something comes up.
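
A minimal sketch of the flag-plus-percentage-rollout idea described above, assuming users are identified by a string ID and flags by a name. The function names (`rollout_bucket`, `is_enabled`), the flag name `new-preferences-page`, and the allowlist of beta users are all hypothetical, chosen for illustration; this is not the API of any particular feature-flag product.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99.

    Hashing keeps the assignment deterministic, so a given user sees
    the same result on every request for the same flag.
    """
    key = f"{flag_name}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % 100


def is_enabled(flag_name: str, user_id: str, percentage: int,
               allowlist: frozenset = frozenset()) -> bool:
    """Return True if the flagged feature should be shown to this user.

    allowlist: users who always get the feature (team, beta customers).
    percentage: share of the remaining users (0..100) who get it.
    """
    if user_id in allowlist:
        return True
    return rollout_bucket(flag_name, user_id) < percentage


# Hypothetical usage: the team sees the page first, then 10% of everyone else.
team = frozenset({"alice", "bob"})
show_page = is_enabled("new-preferences-page", "carol", 10, team)
```

Because the bucket for a user never changes, raising `percentage` only ever adds users, and setting it back to 0 hides the feature again for everyone outside the allowlist without another deploy.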

------
bluehatbrit
There's a reason the execs think in terms of revenue, cost, and profit. Why
would you want your engineers, designers, or anyone abstracted away from that?
From my experience, every time you try to do that you end up with the business
and the workforce misaligned.

It's not always easy to quantify value, and sometimes the return on an
investment is slow and that can be unattractive. However, that doesn't mean we
should try and hide away from figuring out the true value of work in business
terms.

In my opinion, we should judge the impact and effectiveness on the same baseline
metrics as the rest of the org. In a for-profit company that means measures such
as increasing customer spend, increasing conversion rates, reducing churn,
reducing operational costs. All of those measures can be directly tied to
revenue, costs, and profit. It's the language the business speaks, why would
you want to speak a different language to the rest of the business? More than
that, why would you want to start measuring performance in a different way to
the rest of the business?

~~~
Spivak
Because, although it oughtn't, that way of thinking produces a culture that
turns out some of the worst code, since it's difficult to quantify the direct
value of a lot of the work that goes into good software engineering.

Security ends up just being a cost, nobody bothers to actually quantify the
risk because nobody really knows. This leads down the path of "the only
security that's quantifiable are my legal requirements."

You can't know the productivity gains/losses from a refactor until long after
you've done it.

You can't know how productive different ecosystems are until long after your
team adopted them. Your only measuring stick is how excited your devs are to
use it. What's productive for one team might not be for another.

Nobody quantifies acquiring tech debt so it piles up while people "deliver"
and the productivity losses come so gradually from it that teams don't even
realize they've slowed down because their velocity stays constant; it's just
that cards gradually get more difficult.

Performance becomes a cost because there's a huge grey area (Jira) between "so
slow I consider it down" and "general annoyance that slowly nudges me away
from the product." Unless you're planet scale or whatever your sample size
isn't big enough to notice these things.

This way of thinking creates feature factories.

I agree 100% that we can do better than being totally disconnected from the
business but this ends up being so much worse.

~~~
bluehatbrit
I do not disagree that it's really tough to quantify a lot of those things and
most of them will be estimates and may be wrong, in particular over longer
periods of time. But in the context of the article (a larger business with
profit generating goals) I don't think you can walk up to management and say
"let's switch technology, we're excited about this new thing" and expect to be
taken seriously.

It's not dissimilar for security and code quality, they are just part of doing
the job properly. They're also part of continually identifying future
business risk. If you find a vulnerability you can absolutely identify the
risk of leaving it there, just like you can identify the level of risk of not
thinking about it up front with a feature.

We absolutely can provide estimates to back up why we should tackle some tech
debt, it may not be incredibly accurate but if you're unable to estimate any
gain from tackling it, is it really a problem? When you see a security issue
you've got legal risk and brand reputation risk. Both of those are very
tangible, while they're not easy to estimate and there's a lot of room for
error I don't know many businesses that would rather take the risk of EU data
breach fines vs paying an engineer to spend a few days fixing it.

That doesn't lead to feature factories, it leads to informed decisions.
Sometimes the business is going to choose to take on the risk of slower
development in an area if they think they're unlikely to revisit it vs doing a
refactor. Sometimes they may also take on a risk of security issues if they
feel it's outside of their threat model. It's up to us as product/engineering
teams to give a clear picture and frame our professional advice in a way they
can understand. In many cases we may need to push to do a refactor or learn a
new technology, but we can't expect to persuade someone while we're speaking a
different language to them. Saying that they're only interested in business
value and that makes them a "feature factory" doesn't make sense in that
regard.

Perhaps the business and leadership really enjoy the taste of risk, and if
that's the case then trying to qualify value in some other language still
won't help. There we have a larger problem with the practices of the
leadership and company as a whole.

I agree that the areas you've mentioned are ones where it's hard to qualify
value in terms the business understands, but it doesn't mean the system is
broken. It's the language used to describe the goals of the organisation. If
you're lucky enough to work in a company which places a goal as team happiness
as well then you can also speak in those terms in some cases. However, in the
context of the article (Atlassian) I struggle to understand why a team lead
would try to abstract away the core business value a team is generating. A
team leader should be rushing to quantify it as accurately and clearly as
possible to give the team a leg up in the org's career ladder.

------
grensley
That scoreboard sounds particularly toxic.

~~~
choward
Pure garbage. I still can't believe they actually think this is a good idea.
That honestly is one of the worst metrics I've ever seen.

------
koz_
Reminds me of when my team at bigcorp noticed that you got given a badge on
your personal page for being quick to respond to code reviews. Cut to suddenly
every code review being immediately responded to with the comment "got it,
will review soon".

That said, it really did change the team's behaviour. Code reviews became a
priority instead of something left to batch up and do later. Whether that was
a positive change or not, it's hard to imagine an edict from management
achieving the same result.

~~~
MatekCopatek
Yep, +1 from me.

They put a PR reviews scoreboard on the TV that displayed a random metrics
dashboard and we immediately started teasing each other about it and did more
PR reviews as a consequence.

We liked it because it wasn't seen by anyone else apart from engineers taking
a coffee break. Many people ignored it and no one thought less of them. If they
told me a part of my bonus would be tied to my PR high score, I would be
seriously demotivated.

But all in all - I think that just proves gamification works. What you're
using it for is what matters in the end.

------
shruubi
So what happens if I'm working on a feature that I can't deploy in small
chunks meaning I make fewer but larger commits and deploys? Even if I'm
deploying good work by this system I'm ranked less than my co-workers by the
nature of the work.

So now according to the company's public ranking system that is viewable to my
bosses and coworkers, I look much worse than everyone else and the nature of
the work keeps me in that situation.

Instead of feeling motivated, I'd go into the office every day feeling like
crap because I'm doing my job and being told I'm less than my coworkers for
it.

But hey, so long as we "move fast and break things" who cares right?

~~~
wolco
So you pick less complex stuff and look good and get promoted.

Or you do more complex stuff. No one else will want to do that so you become
an expert in a certain area and are above the leaderboard in a way. You answer
to no one.

Or you get on the committee who decides these items and you push for a
multiplier for complexity.

Or you hack it.

------
OJFord
I really despise this kind of 'pointed-her' writing style. It was popular in
some older academic texts, and seems to be gaining popularity in blogs and the
like in recent years. I have never written a sentence about a generic 'boss'
as 'boss asks blah what do you tell _him_ ', nor was I taught it, nor would I
expect to read it.

The genderless generic has always been 'they/them', it doesn't require
positive action, nor discourse about a hypothetical generic's
'self'-identified pronoun. It has always been third-person, since long before
anybody gave a shit.

~~~
satyrnein
Wow, I guess you must feel compelled to call out unnecessary male gendering
all the time!

~~~
OJFord
As I said, I consider them both incorrect.

Perhaps it's wrong, but much like incorrect use of 'I' vs. 'me', one annoys me
more because, rightly or wrongly, it reads to me more like deliberate
thought/effort went into it; that the author thought they were getting it
right.

I should also say I don't object to gendering _specific_ hypothetical/imagined
characters in a story-telling sort of way - 'let's say Sally is a software
engineering manager, and her direct report Bob ...' \- but the generic case
should always be third-person, even if interspersed with such story-telling.

~~~
satyrnein
I actually agree with you that the singular they should be used instead of any
gendered pronoun, but I think your focus on only female pronouns is (very
mildly and probably unintentionally) sexist, similar to how applying law
enforcement unevenly can be racist.

~~~
OJFord
Like I said, it just 'perhaps wrongly' stands out more.

I have said the same thing in threads where people are arguing over whether or
not an author's sexist for saying 'him' or something though. Particularly on
HN those cases are more likely to draw other top-level complaints from a
political or societal perspective, I don't care about that, I just think it's
annoying and bad grammar.

------
0xCMP
Focusing efforts on smaller+faster deploys (along with some measure of deploy
quality to weigh those deploys) seems like a far better indicator of a team's
velocity than simply tickets closed.

~~~
mrdonbrown
Exactly. Agreed to all the comments that trying to find a simple metric for
developer productivity is futile, but there is value in encouraging people to
ship more often, and in doing so, ship smaller things that generally have less
risk. Also, ownership is key here as the person writing the code should be the
one that pushes it to production and owns its impacts, as I've found this
results in higher quality, more sense of ownership, and quicker incident
response and frequency reduction.

------
mpoteat
As other people have specified in this thread, the underlying problem is
Goodhart's Law, i.e. these performance tracking systems break down when
pressure is applied to them.

~~~
jacques_chester
There's a difference between "measurement will fail when there is poor
management" and "measurement will always fail".

But having said that, a _leaderboard_ seems like handing out poor-management
tokens.

~~~
choward
I don't understand your point. It's not the measurement itself that is the
failure. It's making that measurement a goal that's the problem. The real
metric that matters for any company is profit. Anything else is just an
arbitrary measurement that is believed to help achieve that goal. The metric
of profit is also gamed which is why a lot of people and companies do shady
things.

------
theamk
Wait, what?

> It didn't matter [...] what tickets I closed, only whether I fixed that bug
> that was bothering that one big customer

That's what "closing the ticket" means, right? You close the ticket when the
bug is fixed. This is what matters.

The deployments are a much more roundabout metric -- maybe you fix 5 bugs with
one deployment, or maybe your bugfix is big, so you had to deploy schema
change first. Only JIRA tickets (of the right type/severity of course) show
the business impact.

------
wrnr
They sell CI tooling, they blog about CI tooling. What, Jenkins just isn't cool
anymore?

