Hacker News
An epic treatise on scheduling, bug tracking, and triage (apenwarr.ca)
183 points by lostdog on Dec 14, 2017 | 65 comments



I’m the author of the article. I apologize for its massive length but wasn’t able to make it shorter without losing my favourite bits :) I’m happy to answer any questions people have here.


I thought this was very insightful and spot on for the most part, but I had one remark.

I object to the notion that it's ok for the bug ingress rate to be higher than the egress rate. For me that's symptomatic of underlying problems. Either (A) those bugs are on important features, and the team is favoring novelty over functionality by prioritizing new feature development over feature maintenance; or (B) the broken features are unimportant and the team is failing to weed out irrelevant functionality from its codebase (unless you grow the team along with the codebase, it's important to remove features as you add them so you don't end up in a zero-progress situation); or (C) the team has bad engineering practices causing a high ingress rate; or (D) many bugs are based on misunderstandings, which points to documentation or UX issues. No matter how you slice it, I see it as never acceptable to let the bug pile grow indefinitely.

Do you see this as a sort of compromise, in that it does point to deeper problems but needs a workaround in the real world, or do you disagree that a growing bug pile is symptomatic of deeper problems?


I disagree strongly with your take on story points:

First, if story points are an indirect measure of time, then the "psychological game" you're playing will be immediately revealed if your engineers are as smart as claimed. There is no reason for me to point something a 2 over a 3 unless you're measuring the time it takes to deliver software based on those measurements. On the opposite end of the spectrum, my confidence in whether a story is an 8 or a 13 becomes significantly weaker as the numbers get bigger.

Using numbers, specifically, is a hint that whoever is handling process just wants to predict when the project will be done. Numbers trick you into thinking they can be added, but margins of error are not additive. My go-to question in these situations is: why don't we just estimate with abstract sizes (Small, Medium, Large, etc.)? Surprisingly, I'm often met with resistance.

Second, if story points are not an indirect measure of time, then why are you pointing stories? What does the pointing gain your team other than fluff? If you say it's for prioritization then you're just invalidating the premise and we're back to an indirect measure of time.

Finally, I have not seen, and you have not presented, evidence that engineers are good at estimating. In fact all that I have read seems to indicate the exact opposite, that engineers are very bad at estimating (in fact that they are largely too optimistic). One could argue that this can be trained, or that you'll get better at estimating as you gain more experience. Which I will concede that you will get faster and better at estimating and implementing the same exact feature, but that is not what we do. We implement new features, things we likely haven't done before.


> First, if story points are an indirect measure of time, then the "psychological game" you're playing will be immediately revealed if your engineers are as smart as claimed.

Adding a level of indirection removes some implicit biases. Points might be indirectly related to time, but only incidentally. Points sound more directly related to task complexity, which is itself indirectly related to time.

It's kind of like how lines of code seems like kind of a poor measure of program complexity, and yet, study after study has shown that regardless of language, framework, etc., the number of lines of code is the best measure we have for latent bug count. This correlation doesn't always make sense if you look at specific, contrived examples, but the property seems to hold in aggregate.


> Second, if story points are not an indirect measure of time, then why are you pointing stories? What does the pointing gain your team other than fluff?

The points can (and should) be assigned before the work arrives. Agile planning is a little more nuanced than that. There's business valuation (points from business development), which then becomes stories that are estimated during another sprint. The points don't correlate directly to hours because, at estimation time, you don't yet know what resources will be on the sprint; handling that is the Scrum master's responsibility. If people are vacationing, sick, replaced, or newly hired, you put a variable time-value on the points. Most importantly, you put a few stories in the current sprint off the top of the queue and pull more as time allows. Telling the engineers that "all these stories must be done by the end of the sprint" negates the whole process.

> Finally, I have not seen, and you have not presented, evidence that engineers are good at estimating.

In general, nobody else is capable of coming close. That being said, I've blown estimates up to the maximum on a majority of tasks by inquiring about specific details (which file, which function, which library, which repo, do you have creds for that, how long are the code reviews, etc.); some engineers are good at estimating specific implementations, but otherwise they concede once the complexity and meta-process are identified. Small, Medium, Large seems like the correct approach.


So what? Even if engineers realize that, as long as you're talking abstract points and not holding me to a deadline, I have no reason to modify my estimate. And even if I -do- modify my estimate, as long as I continue with that new estimation mechanism for some period of time, velocity will change accordingly and the PO can -still- make accurate determinations of when something will be delivered. If I consistently under- or over-estimate things, I am predictable. The more consistent I am, the less time it takes to become uniform, and thus predictable. By not pinning estimates to time, the PO can take the average velocity and the current estimates and have a largely accurate idea of when things will be done. The only time this ever breaks is when I have a reason to change my estimation strategy, which only happens when you start applying incentives for me to change it. Every time my incentives change, my estimates will change; so don't change my incentives. Don't give me a date or a deadline.

We could do this with hours except hours actually translate directly to time. From them I can determine when you think I should be done and thus the implied goal; points I can't. Because of that, I have incentive to overestimate when it comes to hours, to give myself extra time 'just in case'. With points I don't; they don't imply an end time. Even if the PO's estimates are wrong, they can't get onto me, because I never had a due date. Because of that, my estimates tend to be consistent (even if inaccurate), and consistency = predictability. That's the goal, making it predictable when things will be finished (within a given tolerance; hours never give us that). Whether points are big, or small, the PO can determine "This team has an average velocity of 20 points, whatever size those points may be. That means I can expect around 20 points per sprint in terms of stories. I have 50 points I need to get done for this next release, that means I can expect it in three more sprints". All done without the engineers ever having a goal they are trying to make, no incentive to change their estimates, and literally the best predictability (not accuracy; you don't need accuracy) you can achieve, which, it turns out, is actually pretty damn good in practice.
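The forecasting arithmetic in that last example is simple enough to sketch. A hypothetical illustration in Python (the sprint numbers are invented to match the comment's "20 points average, 50 points left" scenario, not anyone's real data):

```python
import math

def average_velocity(points_per_sprint):
    """Average story points completed per sprint over recent history."""
    return sum(points_per_sprint) / len(points_per_sprint)

def sprints_remaining(points_left, points_per_sprint):
    """Forecast sprints until points_left are done.

    Rounded up: a partly used sprint still occupies the calendar.
    """
    return math.ceil(points_left / average_velocity(points_per_sprint))

# "This team has an average velocity of 20 points ... I have 50 points
# I need to get done ... I can expect it in three more sprints."
history = [19, 21, 22, 18]  # hypothetical recent sprints, averaging 20
print(sprints_remaining(50, history))  # -> 3
```

Note that consistency matters more than accuracy here: if every estimate is uniformly 30% low, the measured velocity is correspondingly high and the forecast comes out the same.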

I think you're missing the fact that points only matter relative to each other. There is no absolute meaning. The author hints at that but I don't think is explicit about it, assuming you already know that. As such, a team tends to be consistent with them, even if not objectively accurate. And their mistakes average out into something predictable, even if not accurate. Because there is no incentive, conscious or unconscious, to massage it.

Some people -do- decide to use t-shirt sizes for sizing, rather than points. It still works. This is generally equivalent to 3, 5, and 8 using points (fibonacci). Most people who use points say anything less than a 3 is wrong (make it a three), and anything larger than an 8 should be broken up because otherwise it's difficult to estimate (with -maybe- a 13). So use whichever you like. S, M, L, and maybe XL, if you want to map to four options, as you have with points. Though you need a way to aggregate them to determine a velocity; how many S = 1M, etc. That's why people tend to use points.

Basically, the author said, flat out, it's a psychology game. AND IT IS. What a PO needs from a dev is consistency in how they estimate, not accuracy. From that you can measure the actual work completed over time, and get a measure of velocity, which can be used to accurately predict the delivery of future stories, within a pretty good tolerance. Ensure the psychology for that consistency is there. Points are part of it. No goals, milestones, deadlines, etc, are part of it.


You're missing my point.

My point is that no story points need to exist. If you want to average the amount of work done over a period of time then just count bugs and stories completed. The Central Limit theorem applies equally well to stories over the long term, so just collect data. Automatic.

We add story points, presumably because we don't want to wait to gather enough data on stories, but the neutral position is to not use them because they incur a cost (meetings, training to get consistency right, etc). So why have them? I have not seen a convincing argument.


Sure, we could treat all stories as being of a single size and rely on the central limit theorem, but that takes far, far longer to converge. On the order of years, I suspect, not weeks or months, which is what you get out of pointing. "We should have this done sometime between now and 2022" is not a useful metric for a PO.

We add story points because given consistent incentives, estimates tend to be consistent. Maybe consistently under or consistently over, and obviously there's a bit of wiggle room from estimate to estimate, but they tend to converge quite quickly, and to within a week or two's uncertainty of what we'll have done at any given point within the next six months, and within 6 weeks within the next year. Quite a far cry from not using them.
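The convergence claim can be illustrated with a toy simulation (the duration distributions below are invented for illustration, not data from the article): given the same amount of history, the average of a high-variance mix of story sizes wanders far more than the average of uniformly small bugs.

```python
import random
import statistics

random.seed(0)

def small_bug():
    # Hypothetical small-bug durations, in days: all roughly alike.
    return random.uniform(0.5, 2.0)

def unsized_story():
    # Hypothetical unpointed stories: tiny fixes mixed with month-long work.
    return random.choice([1, 2, 3, 10, 30])

def spread_of_mean(draw, n_tasks, trials=2000):
    """Std. dev. of the sample mean across many simulated histories."""
    means = [statistics.mean(draw() for _ in range(n_tasks))
             for _ in range(trials)]
    return statistics.stdev(means)

# With 20 completed tasks of history, the small-bug average is already
# tight, while the all-stories-are-equal average is still all over the map.
print(spread_of_mean(small_bug, 20))
print(spread_of_mean(unsized_story, 20))
```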

Meetings? Some, sure; you'd have them anyway just defining what it is you're doing, the additional burden to assign points takes up maybe half an hour per person per week or two. Training to get consistency right? There's no training involved. The hardest thing is to get people accustomed to picking a number, relative to the others. But that's not that hard either; we basically just took the first sprint's stories, organized them into a line (much like his rectangles) going from least to most complex, then discussed where to draw three vertical lines, separating them into four distinct sizes, 3s, 5s, 8s, and 13s. We then made sure the largest didn't feel too large compared with the other 13s (or else it might actually be a 21, just compared to what we had agreed a 13 was, and so we had to split it up), and from there we always had 'reference stories' to decide whether it felt more like a 5 or an 8, say. And while we sometimes differed, we could always hash out why we differed and come to an agreement on exactly how large the story was. Again, per the OP, so long as you are consistent with how you address those discrepancies, your overall estimates will be consistent, and give you predictability.
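The exercise described above, lining stories up by complexity and drawing three dividing lines, amounts to a simple bucketing function. A sketch (the stories, complexity scores, and cut points are all invented for illustration):

```python
# Sort stories by perceived relative complexity, then cut the line into
# four buckets: 3, 5, 8, 13. Everything here is hypothetical.

SIZES = [3, 5, 8, 13]
CUTS = [3, 6, 8]  # where the team "drew the vertical lines"

def bucket(complexity_rank):
    """Map a story's agreed relative-complexity score to a point size."""
    for cut, size in zip(CUTS, SIZES):
        if complexity_rank <= cut:
            return size
    return SIZES[-1]  # past the last line it's a 13 (or should be split)

stories = {
    "fix typo in footer": 1,
    "add login rate limiting": 4,
    "new billing report": 7,
    "migrate to the new API": 9,
}
for name, rank in sorted(stories.items(), key=lambda kv: kv[1]):
    print(f"{name}: {bucket(rank)} points")
```

After the first sprint, the buckets themselves become the reference: new stories get compared against remembered 5s and 8s rather than against hours.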

One half hour per week or two is hardly a huge cost to pay when it gives the business the ability to accurately predict when we'll have something delivered.

I don't care if the argument is convincing. I've seen it work. You're free to do whatever you want; I know what I've seen work, and what I've seen not work. I've yet to find something that works so well.


Your claims are outrageous with no supporting evidence.

If you just want to throw out anecdotes, I'll give you my own. I collected data on all the pointed stories at one previous job for three years and found a negative correlation between story points and the time from a story being started to being completed.

Sure, you might claim that it all averages out in the end, except that our velocities were wildly in flux for that entire span as well. But we were just "doing it wrong," right?


Outrageous? Seriously? Hyperbole much?

I doubt that you did, if I understand you. Because it sounds like you're saying that, overall, 3-pointed stories took the most time, and 13s (or whatever your max was) took the least.

But let's say it did. I'd look to see why your estimates fluctuated all over the place, or whether stories were being closed when they were actually finished (i.e., being accurately reported). Did you have deadlines? Did you have delivery pressures? Because that right there is a good reason; as you near a deadline, you start padding estimates more. Did you keep having things come up that broke the sprint? Etc. All manner of things can cause estimates to be wrong. But not negatively correlated, -especially- not with velocities constantly in flux (and I mean seriously in flux; you take an average because velocity can and will vary, especially when there's unexpected stuff, like someone getting sick, that you didn't account for when planning). That, to me, definitely sounds like you were doing something very, very wrong.


If you look at the first Central Limit Theorem slide in the original article, it compares story precision vs precision of unestimated bugs. The short answer is that if your stories are big, not estimating them causes more error (weeks or months) than most people are willing to accept. Not so with small tasks (bugs) which average out, as you suggest.

However, it’s hard to make long-range estimates using only many tiny stories/bugs, because you don’t want to break the job down with such granularity for months or years into the future - plans will change by then, and all that design work will have been wasted. That’s what makes big stories useful; you can estimate months of work in a few minutes. But because they’re so big, you can’t treat them all as equal sized.


Except you can't; as I've said, engineers are not good at estimating. They get significantly worse when you start estimating months out instead of days out.

The problem is not that they're engineers; no one is good at estimating work they've never done before. This is a well-known problem in pretty much every software shop I've ever been in. Teams never deliver what was planned on time; only functional teams cut features for releases. This is not a "win" for estimation.


> Except you can't; as I've said, engineers are not good at estimating. They get significantly worse when you start estimating months out instead of days out.

Because you keep focusing on time estimates instead of point estimates. People have intrinsic biases related to time and their productivity. Like how most people implicitly assume they're above average in intelligence, looks, etc.


Wonderful article; I read it like a blockbuster, with constant clicks going off in my brain. Every time I tried to impose some sort of sprints on myself and the team, it always ended in super-dense Friday coding with half the tasks moving to the next sprint. Thank you for a great argument for why precise estimation is bullshit bingo. I've tried to show PMs that only the steepness of the slope is important and that managers can plan work without exact dev-hours, but without much success; your explanation is clear and easy to understand.


Super interesting points here. You mentioned that the real-world Kanban board forces you to empty slots to make room... I've never seen that in any of the software Kanban-style systems; have you?

I've played around (in spreadsheets and basic apps) with trying to create systems that scaled available slots to team size as a way to force correct granularity.


No, I've never seen it enforced by tools. Physical (index cards) kanban boards have an implicit space limit though, and this is one of the too-seldom-acknowledged reasons why they work as well as they do. Unfortunately the software clones of physical kanban boards copied the unnecessary part (visual appearance of index cards) and not the necessary part (limited space at each phase).


Rally lets you set card limits in each column, though in practice that always seemed too easy to change (just add a couple more to the limit; just "temporarily" turn off the limit) to be entirely beneficial.


Are you talking about work-in-progress (WIP) limits? I've seen tools enforce this so you can't pull another story into the current list. Other tools let you pull but highlight the fact that you're going over the WIP limits by making the whole column eg. red. Or do you mean something else?


I've used LeanKit's Kanban board to enforce per-lane WIP limits. It works quite well and is, in my opinion, one of the most important aspects of using Kanban boards for software development.
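For what it's worth, the "no free slot, no new card" rule is only a few lines to enforce in software. A minimal sketch (this is not LeanKit's or Rally's actual data model; the class and names are invented for illustration):

```python
class Column:
    """One kanban phase with a hard WIP limit, like a physical board
    that simply has no free slot for another index card."""

    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.cards = []

    def pull(self, card):
        if len(self.cards) >= self.wip_limit:
            raise RuntimeError(
                f"'{self.name}' is at its limit of {self.wip_limit}; "
                "finish and move out a card before pulling another")
        self.cards.append(card)

    def move_to(self, card, destination):
        destination.pull(card)  # fails first if the destination is full
        self.cards.remove(card)

doing = Column("Doing", wip_limit=2)
doing.pull("fix login bug")
doing.pull("add CSV export")
# doing.pull("refactor auth")  # would raise: no free slot
```

The interesting design choice, as noted above, is whether the limit is a hard error like this or merely a highlighted warning that's easy to override.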


First, thank you very much for the write-up, I sure found it interesting as well as entertaining.

When using bug trackers, I find the most frustrating aspect is the "non-linearity" of the workflow. By that I mean: how do I answer the question "What am I supposed to do next?" You can sort by project, or by priority, but what I typically end up with is a list of items that I've already looked at a dozen times. And even though I don't want to look at them yet again, I haven't found a way to get a bug tracker to prevent that. Ideally, I would want to look at each task at most twice: once for triaging, and once for working on it. That's it.

So the way I understand it, you're trying to address this, at least partially. A task starts out as untriaged, then you tag it as triaged, and that means you only had to look at it once for triaging. Which is great, because it's a linear workflow.

Some tasks are obviously critical, and will end up in the next open release milestone. Some belong to a feature that is not released yet. But what about the stuff that ends up in the backlog? These smallish, nagging bugs that are not super-critical. That is the big, ugly pile that keeps growing and growing. How do you keep that big pile manageable? Ideally, that pile shouldn't become big in the first place, but how do you prevent that from happening?


The way I like to do it is to keep multiple backlogs, broken down roughly by feature group (ie. groups of things that might eventually become a milestone or story of their own). Then when I decide to prioritize a given story, I can go to the relevant backlog and re-triage only the bugs in there, which should hopefully be a relatively manageable number.

Another benefit of sub-categorizing this way is that it makes it easier to resolve bugs as duplicates. When a new bug is filed, it's hard to see that it's a duplicate when you're comparing it against 10000 other bugs, but it's easier when you're comparing it against only 100 other bugs in the same category.

I doubt you'll ever be able to get it down to "at most twice." But it needs to be much easier than always resorting to looking through the whole list.


>its massive length but wasn’t able to make it shorter

It's always possible to make a piece shorter to help communicate its main idea. (However, it does take extra effort to extract the essence of a long piece.[1]) Your essay is ~15,000 words and desperately needs a short "elevator pitch" of its main points. I'm a very verbose writer myself, so when I think others' writing is verbose, it means everybody is going to drown in the text.

Here's my summary of what I think you're trying to say:

1: There are psychological problems with deadlines and large bug lists that cause counterproductive results

2: I discuss 2 psychological "tricks" in software development to counteract the unwanted behavior

2.a: Estimate software by abstract units such as "points" instead of concrete units such as "time/hours/weeks". The abstract units bypass the human biases that lead to bad estimates. Use the points to determine "relative" sizes of each "story". (E.g. Developers vote to converge on the "size" of each story point.) The last step is to multiply the points by a unit of time to derive a finish date.

2.b: Do not have a big global list of bugs to burn down. The size would be overwhelming and demoralizing to teams. Instead, triage bugs into smaller "hot lists" so they "see" a smaller manageable queue to work on. Also, measuring bug fix times will eventually let you derive an "average bug size" that's reasonably accurate

The tldr would be something like "Here are 2 counterproductive management techniques with setting deadlines and assigning bug fix work -- and here are 2 ways to counteract it with management ideas that take advantage of human psychology."

Somebody else can wordsmith it better than I can, but that's what I think your essay is basically about. The 15,000 words are mostly examples or background ideas leading up to your recommendations (SLO vs SLA, Tesla, what I like and don't like about Agile, Kanban, etc).

I recommend that you put your strongest main points at the very top to give your readers the mental scaffolding to hang the rest of your 15000 words on.

[1] "I would have written a shorter letter, but I did not have the time." -- Blaise Pascal : https://en.wikiquote.org/wiki/Blaise_Pascal


I know what you mean, and in general I try to follow that advice. In the particular case of project management though, I'm frustrated by the huge amount of too-short and contradictory advice floating around on the Internet; adding one more unjustified summary to the pile doesn't help. So I think the extra details are important. And when you have that many details, the tangential expository fluff helps keep it interesting. I hope.

This is also why I didn't summarize everything at the top: that would encourage people to just read the top and stop there. They can do that with project management advice anywhere on the Internet. There's a place for that, but there's plenty of it already.

At least it's shorter than a book. :)


FWIW, I liked this format. It reads like a presentation and digresses and regroups back to the main thread throughout. It may make it less widely read but I think it may also make it more memorable and loved.

The value for the reader is in the act of chewing over a familiar problem along with your guidance. If the main points were made more obvious, then perhaps it also becomes more boring.

If you want to propose a more tangible recommendation/guide/process then yeah, give people hooky, easy to remember bits.


>So I think the extra details are important. [...] This is also why I didn't summarize everything at the top: that would encourage people to just read the top and stop there.

There's an opposite way to look at it: a good summary acts as a "hook" and entices readers to read the rest of 15000 words. I wasn't suggesting you delete the extra details. Instead, the bullet points at the top give the reader a "road map" to the rest of the long article.

>They can do that with project management advice anywhere on the Internet.

Well, you said the other articles out there are contradictory ("doesn't work and makes things worse") so there's your hook: you have a superior method.

If you prefer to write in a style that "unfolds" that's understandable. A writer can hold an opinion on the best way to present his ideas.

That said, I'll offer some counterpoint. A web surfer may have 20 browser tabs open as a "todo list" of unread blogs. The email inbox has a bunch of unread messages. There's also a stack of new candidate resumes he's supposed to read. That random person then clicks on your blog and sees the shaded rectangle in the scroll bar get real tiny which visually indicates it's a very long piece of text. 15000 words is ~1 hour of reading.

Since you're not a household name among famous authors, a lot of people just won't start reading it on faith alone. They don't yet trust you enough to believe it will eventually unfold into an amazing insight. Instead, many busy people will just ignore it because there are so many other items competing for their attention. In particular, the project managers and business executives you most want to reach, and have internalize your recommendations, are especially prone to skip long articles. One-hour articles make a huge demand of multitasking managers, so they need a nudge to see whether it's worth their time.

There's a glut of information overload out there and long articles can act as "RADIOACTIVE - DO NOT ENTER" signs to the people you most want to convince of your ideas.


I agree. This is the internet, after all. The first question is: Is this worth my time and attention?


I decided it was worth my time and attention because the HN comments were mostly positive. Once I skipped the personal intro, the argument was sufficiently engaging to keep me going.

I think the author is partly right that a breakdown might be more harmful because you lose too much information. HN and reddit comments are how I decide whether something is worth my time, so a summary isn't necessarily beneficial.


TL;DR thanks. ++good writing advice


Amazing stuff. It was a real pleasure to read. Despite its length I couldn't get myself to skip ahead or skim.

We're now in the process of switching to a structure similar to Basecamp's 6-week cycles[0]. And those cycles obviously do have a deadline. However, I would still say this kind of deadline is better than your typical one for a couple of reasons:

1. The team is self-managing. Typically the team was involved in the pitch process for the project, so they have vested interest in getting this done. I think this is key to avoiding the Student problem. It's no longer an assignment dropped from above, but something you are keen to push forward.

2. The cycle is 6 weeks rather than a sprint of 2... So this feels more like a slower-pace mini-marathon. And the team has autonomy to drop features or make adjustments. I admit that's a weaker argument for it than the one above.

I wonder what's your take on this?

[0] https://m.signalvnoise.com/how-we-set-up-our-work-cbce3d3d9c...


Agile is meant to be used for non-software delivery projects as well.

In your view, with the parts you crossed out (including the psychological/motivational structures), would Agile still be widely applicable outside of software-oriented projects?

A couple of, say, hypothetical examples of non-software projects:

-- developing & submitting scientific grant application

-- organizing a non-trivial longitudinal survey

-- looking for college for kids

-- designing a motorcycle with unique frame/engine layout

-- planning and shooting a movie


In my opinion (and you should take it as an opinion :)), the "good parts" of Agile are the same for both software and non-software projects. Estimation, strict prioritization, and (automated) progress tracking are the keys to any successful project management.


Very interesting article, thanks for writing it and sharing it, I forwarded it to my PM :) . I do think it could have been edited without losing your favourite bits though :)


Nice work. It should be three blog posts/chapters however.


Interesting article, I have a few comments/questions

Story points - there is a lot in theory I like about these, and you hit on those points. But in practice, where my teams have struggled is still in the definition of a point. People understand the relative-sizing concept, but they still want a definition of what one point means, and that invariably winds up being some kind of time unit, which means all points get thought of in time units. What techniques have you used to solve this problem?

Defects - some good points were made, but you never really discussed how these are managed as work. Stories and defects get worked on at the same time, but how, exactly? Are defects just lumped in with stories, with the PM prioritizing them? I do not think so, but you are not clear in this area. If a PM prioritizes a story but the team spends all of its time resolving defects, then how is value being delivered as expected? You seem to have left this out completely, or I missed it.

Finally, also on defects: while your points on not estimating them make some sense, someone has to decide what to work on, and generally you need some idea how long things will take to make that decision. Even in your wording, you acknowledge that some defects take a long time to fix, whereas others are super quick, and in the end they all average out. Still, someone in the organization cares about dates and delivery. It made a lot of sense to me when you indicated that a story point estimate can be valuable to the PM for prioritizing: when they see a story with a high estimate, that might cause them to lower its priority compared to other stories they can get delivered quicker. But seemingly defect fixing would have to factor into this somewhere too, and then wouldn't the same concept apply? If defects are not estimated, how can the time it will take to resolve them inform the decision-making process?


You're right, those two points are inadequately covered in the article. Thanks for reading so carefully :)

Story point size: the usual thing to do is to have "baseline tasks" (which the whole / almost the whole team did in the past) that you choose way back at the beginning, and then continue to use them as reference whenever estimating in the future. To do this, a couple of people who know the "approximate size" of a point estimate some, say, 2-point and 5-point past tasks, but don't tell anyone else how big a point was. And after that, they try to forget how big a point was during the baselining process. But you never, ever let people ask about time; you just say "was it bigger or smaller than the baseline 5-point task"?

Defects: in the model as it's being discussed, we assume (perhaps too optimistically) that we generally fix bugs before adding new features. This is why, in the second simulation slide [1], each subsequent feature takes longer than the last. Eventually, you cannot sustain this method if you still want to launch new features and your team size hasn't grown and you haven't contained the number of new bugs somehow; I can't tell you what to do when that happens. It's hard.

If you have mostly small bugs (which is common; just fix them) and a few large bugs (also common), then if the bugs are important, they can probably be described by writing a story. At that point you can elevate them to the estimation and PM prioritization process.

Beware, however, that in general, if a bug is introduced by adding a new feature, you should almost always fix it before launching that feature. Otherwise you have basically lied about how long it took to implement that feature, and as that gets worse over time, it progressively upsets your estimates.

[1] http://apenwarr.ca/log/?m=201712#slide16


I actually reject the idea of story points as a unit-less number, mostly because people can't help but think about them in terms of time anyway (whether they want to or not) [1].

Rather, I like to use a discrete list of story point values as scalar value that represents a probability distribution for how long the task might take. As the story point gets bigger, not only does the mean time get bigger, but the variance grows as well.

For example, a 1 is 2-4 hours, but a 13 is 2-3 weeks, and a 40 is 1-2 months. The idea is that not only do more complex tasks take more time, but the precision of our estimates goes down.

This makes engineers happy because they get to be more honest about estimates, and it makes managers happy because they only have to deal with one number.

[1] Unit-less measures might be easier in an environment that is more concerned with true productivity than with deadlines, but that is not most environments.


Thanks and interesting. If you have written up your full system anywhere I would be interested to read more.


In terms of mapping story points to time ranges, here's the full list:

1 point -> 2-4 hours

2 points -> 4-8 hours (1 day or less)

3 points -> 1-2 days

5 points -> 3-5 days (up to one week)

8 points -> 5-9 days (up to one two-week sprint[0])

13 points -> 2-3 weeks (1-1.5 sprints)

20 points -> 3-4 weeks (up to two sprints)

40 points -> 1-2 months

Anything beyond 40 needs to be aggressively analyzed and broken down into steps, even if those steps are not deliverable features per se.

[0]For a two-week sprint, you have to account for at least one day each sprint for demos, retrospective, and planning. Thus, you have at most 9 days for implementation and testing.
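The commenter doesn't give code for this, but the mapping above can be sketched in a few lines (the table and function names here are my own illustration; hours assume an 8-hour day and a 5-day week). The useful property is that summing per-story ranges gives a plan estimate that is itself a range, with the width growing as bigger tasks are included:

```python
# Each story point value maps to a (low, high) range in hours,
# following the list above. Bigger points mean wider ranges.
POINT_RANGES = {
    1: (2, 4),
    2: (4, 8),
    3: (8, 16),      # 1-2 days
    5: (24, 40),     # 3-5 days
    8: (40, 72),     # 5-9 days
    13: (80, 120),   # 2-3 weeks
    20: (120, 160),  # 3-4 weeks
    40: (160, 320),  # 1-2 months
}

def plan_range(points):
    """Sum per-story ranges into a best/worst-case total in hours."""
    lows, highs = zip(*(POINT_RANGES[p] for p in points))
    return sum(lows), sum(highs)

# A plan of one 3, one 5, and one 8 comes out as 72-128 hours.
low, high = plan_range([3, 5, 8])
```

Note how the best/worst-case spread widens once 13s and 20s enter the plan, which is exactly the "precision goes down" point above.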

EDIT: Fixed formatting.


The whole "point" of story points is that they are meaningless until after something has been actually accomplished. After working for a few weeks, it's pretty easy to say something along the lines of "the team got 52 points finished and deployed last week" - this is useful to know, and over time you'll develop an intuition that may be useful for forecasting what's likely to get done over the next few weeks or months on this particular project with this particular team. That's it, that's the most you can do. Most of the other stuff people try to do is so noisy that it makes for nice spreadsheets of the "plan" but bears very little resemblance to reality.
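In code, the only defensible use of points described here is a hindsight throughput measure divided into the remaining backlog (a minimal sketch; the function name and numbers are mine, not from the comment):

```python
def forecast_weeks(completed_per_week, backlog_points):
    """Forecast weeks remaining from observed weekly velocity.

    Points are meaningless until completed; this only uses history,
    never up-front "plans," which is the commenter's whole point.
    """
    velocity = sum(completed_per_week) / len(completed_per_week)
    return backlog_points / velocity

# e.g. three observed weeks of 52, 48, and 56 points,
# with 260 points left in the backlog:
weeks = forecast_weeks([52, 48, 56], 260)
```

Anything fancier than this division tends to produce the "nice spreadsheets of the plan" the comment warns about.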


I get that, but when you have that first estimation meeting with the team they inevitably ask what a point is. They get the relative sizing idea but you still need to size the first stories.


> What doesn't work is deciding when you're going to get there.

That would be an ETA, which is quite useful in many cases.

> Or telling salespeople they need to sell 10% more next quarter.

That's a minimum quota. Also useful.

> Or telling school teachers they need to improve their standardized test scores.

Which is not so much a 'goal' as it is a response to the need for kids to actually learn information. It's a goal in the sense that "you should be doing your job to a minimum degree of proficiency" is a goal.

> Or telling engineers they need to launch their project at Whatever Conference (and not a day sooner or a day later).

Which is absolutely an arbitrary goal that doesn't have anything to do with the product, but is also good business sense.

Sometimes you need stupid goals.


What all these wrong-goal examples have in common is that people have tried them over and over, and they never work the way you want. The reason they never work is that they don't specify the method. As one of my favourite Deming quotes says, "If you can accomplish a [numerical quality/goal] goal without a [new] method, then why were you not doing it last year? There is only one possible answer: you were goofing off."

What ends up happening is that people will cheat the system so they can hit the target you enforced, while sacrificing some other critical thing you forgot to enforce. In the educational system, for example, what happens is "teach to the test," which causes improved standardized test scores at the expense of actually learning things. And, of course, you get schools that outright cheat when scoring the tests: https://www.washingtonpost.com/news/answer-sheet/wp/2015/04/...

Re: ETAs, those are predictions, not goals. Predictions are good. Most of the article is about the difference between the two.


The goal is not the problem. The method is not the problem. The problem is the problem.

You will have to increase the educational system's budget. You will have to raise taxes. You will have to build more classrooms. You will have to hire more teachers. You will even have to feed the damned kids, and improve their home life.

There is no "method" that avoids having to do these things. These are the problems you have to solve to reach the intended goal. How you go about solving these problems does not matter.

Use whatever method you want. Sweet talk them. Bribe them. Hire a guy named Vito with a baseball bat. The method doesn't matter. Just solve the problems.


I think we agree here, except on definitions. To me, potential "methods" are the list of things you describe: taxes, classrooms, teachers, home life, baseball bats. And also the ones I described: teach to the test, fraudulent scoring.

A manager that just says "teachers will be evaluated based on their students' standardized test scores" will get nothing useful, because they did not provide a method, and the teachers aren't empowered to solve their problems in a productive way.


> A manager that just says "teachers will be evaluated based on their students' standardized test scores" will get nothing useful, because they did not provide a method

There is no method for a teacher to solve those problems. It's a systemic issue.


He didn't say anything that was contrary to that. Note that it's the "manager" that didn't "provide a method" in his framing, and note that "manager" is a shorthand for "the teacher's boss, the school board that created the policies the teacher's boss follows, the federal/state governing bodies that created the legislation that governs the school board's decisions, etc etc"


"They're sure they can get done in that amount of time, so they take it easy for the first half. Then they get progressively more panicked as the deadline approaches, and they're late anyway, and/or you ship on time ..."

I've found that the more experienced/older I am, the less this happens. We even finished multiple such tasks sooner than was required - of course, a contributing factor was that the deadlines were sane. I somewhat had this tendency when I was younger; experience makes you better at managing time risks. That is actually one issue I have with cookie-cutter agile: everyone is so micromanaged by the system that they don't get the experience needed to learn the above.

There are also contributing cultural factors. In many companies, people who stayed late last week tend to be praised and rewarded over those who worked at a predictable pace and kept an eye on the deadline from day one (not in my current team). Incentives matter, and finishing tasks at the last moment at the cost of evenings/weekends makes you a hero.

I personally know people, including leads of agile teams, who believe that a last-moment desperate effort is somehow necessary - as if you are lazy if you don't make one. That means there will be a last-moment effort, whether it was needed or not. So if the team is going to be on time, additional work is found to be done - or, more often, possible logical steps to make the deadline are not taken (a feature the customer explicitly said was not necessary is done anyway, etc.). I wish this were a joke or an exaggeration, but I have seen it multiple times already. While the details were different each time, it really did boil down to techies essentially organizing that desperate effort for themselves.

It does not matter what the process is, how exactly you estimate etc. As long as not doing the responsible thing is rewarded, people will do that.


I really enjoyed reading this, despite its length! It makes some great points about scheduling that resonated with me, backed up by interesting thought experiments.


TL;DR: Agile is silly but has some lessons to teach. Most people do it wrong though. Make decisions early and stick to them. Talk about effort not time. Make stories about users. Treat bugs differently to stories (they are on average all the same size, are written from a different perspective, etc.). Abandon stand-ups, steal things from SCRUM and Kanban, and don't talk about deadlines, ever.


I don't think a tl;dr like this is appropriate for this one, because it leaves out the best bits. I mean, this is a very long writeup, and you're condensing it to 7 sentences. For example, the interesting part is not that deadlines don't work, but why they don't work. And I feel like the "why" is missing from a lot of advice on methodology: Everybody talks about how, but why are we supposed to do these things in the first place? And maybe we can learn something while trying to find answers to these questions.

I've only read half of it, but I liked it a lot. What about you? Did you like it? Why?


I think a tl;dr is supposed to leave bits out…


Yes, but it is also supposed to communicate the point. Which is not always possible in a tl;dr, because if it were, the author probably would have just written a single paragraph in the first place and saved a whole lot of time for everyone.


I have not read the article (yet; it's bookmarked for later), so I'm speaking generally, here.

> it was [possible] the author probably would have just written a single paragraph in the first place and saved a whole lot of time for everyone.

This does not fit with my experience, even disregarding things like YouTube videos which drag on in order to justify longer ads.

Writing concisely is hard, and often much more time consuming than a lengthy brain dump. It's also tempting to elaborate on every point, however tangential. I have to constantly fight this in my own writing; my desire is to be complete, but really I'm drifting off-topic, diluting my point with irrelevance, making it harder to follow.

Based on the comments above, it seems like the tl;dr missed important nuance, but that isn't always so.


I don't understand. A summary should summarise. An explanation should explain, an argument should convince, a historical overview should retell events.

Obviously, a summary can't be an explanation + an argument + history. It has to be the most salient points presented alone. Think about an abstract for a paper: it should tell you enough that you can understand if you should read the paper now, and enough context so that you can remember to read it later if your situation changes.

If you feel like the tl;dr doesn't pay full respects to the article, that's fine but I feel like not many readers will mind.


A summary is always a compromise, a judgment on which 95% of the text to leave off. Because of this, an ideal summary is hardly possible, and someone may always complain that a salient point was left out. This is not because the summary is "objectively bad", it's because the compression is lossy, and what's seen as salient differs between observers.


I quite liked the tl;dr, except that it left out how most of the article is "...and why". We've all heard the advice, but the article is trying to explain why the advice is true. I think an ideal summary would mention that.


OK, I understand. Though I think the article did a terrible job of explaining why: like, why do deadlines cause engineers to become affected by Student Syndrome? It just mentions it at a superficial level, which is perfectly fine for its purposes.

edit: also thanks for writing that article, it's great.


> Talk about effort not time

I can't understand this. You can measure time, but you can't measure effort directly, unless you're maybe doing some physical work like pulling weights. Also, you can't usually increase a mental effort, except by spending more time (e.g. by working overtime).

I can only see points or other non-time measures as proxies to time intervals (with uncertainty), but we can't know the function points -> time beforehand, we can only measure and map it from experience.

If you could explain "effort" from some other angle, I'd be grateful.


Yeah, this is why I don't prefer the common "effort vs time" distinction. Instead I like to think of "relative size of task," which is a very easy concept for most people to intuitively understand. It's fairly easy to guess that a task that is twice as "big" will take twice the "effort" and thus also twice the "time," all else being equal. And it turns out that's all we need in order to estimate the schedule. See the detailed commentary here: http://apenwarr.ca/log/?m=201712#slide19
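A possible reading of "relative size is all we need" in code (my own sketch; names and numbers are illustrative, not from the article): engineers only rank sizes relative to each other, and the hours-per-size-unit ratio is calibrated afterwards from tasks that actually finished:

```python
def hours_per_unit(done):
    """Calibrate time per size unit from completed tasks.

    done: list of (relative_size, actual_hours) pairs.
    """
    total_size = sum(size for size, _ in done)
    total_hours = sum(hours for _, hours in done)
    return total_hours / total_size

def project(remaining_sizes, done):
    """Project remaining hours from relative sizes alone."""
    return hours_per_unit(done) * sum(remaining_sizes)

# Two finished tasks (size 2 took 10h, size 3 took 20h) calibrate
# the ratio; a remaining size-5 task is then projected from it.
estimate = project([5], [(2, 10), (3, 20)])
```

The point of this framing is that nobody ever estimates in hours directly; time only enters the system through measurement of completed work.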


So .. what is the business substitute for deadlines? Unless you're genuinely a startup with unlimited funding and total freedom, eventually people are going to ask the question "so what are you going to deliver and when?"


That is often the first question I get asked when trying to explain these concepts to business owners.

My answer is that you can plan for work to be done on a deadline but you can't negotiate when it will be done with the people doing the work. It's going to take as long as it takes. It's poor management that sets unrealistic expectations and demands results.

You can plan for work by using data. You get data by tracking effort estimated versus completed and triaging bugs. You prioritize goals instead of setting deadlines. You measure and refine.

It sounds counter-intuitive to business people who think in terms of: I just sold customer X the product and they need it delivered by Y so that we can get the team paid by Z. This is where poor management decisions can sink your team. If Y is decided by the sales or management team with the customer and they didn't consult their engineering team... then they're working on another planet. The goal of processes like this is not to eliminate Y but to set reasonable expectations and objectives.

As I like to remind my business owners: you can have something that works -- it may not be the whole kit -- or you can have nothing at all. Winning is about prioritizing objectives.

One book I've read recently that taught me a lot about management is Extreme Ownership[0]. I think there's a lot of cross-over from this book into Agile methodologies that I think non-technical stakeholders can really understand.

[0] https://www.goodreads.com/book/show/23848190-extreme-ownersh...


>It's going to take as long as it takes. [...] it may not be the whole kit

If I'm interpreting that fragment right, your essay is treating "programmers development time" as a fixed rate of progress (the "9 women can't have a baby in 1 month" meme) so one answer to meet a deadline is to remove features until the deadline can be met.


Precisely.


That's basically what the entire article is about. The (very) short version is it really is possible to predict when things will be done, but if you tell the engineers or negotiate with the engineers (ie. turn it into a deadline), that counterintuitively tends to make the predictions worse instead of better.


Most people get way too wrapped up in jargon and staying inside the Agile box. The method is just an idea, a loose framework, a suggestion. Executives and management will always try to rigidly implement the method (because Peter Principle) and this should be circumvented whenever possible.



