
Please stop breaking the build - luu
http://danluu.com/broken-builds/
======
jmckib
This isn't mentioned until later in the article, but it seemed important.

> The worst thing about regular build failures is that they’re easy to
> prevent. Graydon Hoare literally calls keeping a clean build the “not rocket
> science rule”, and wrote an open source tool (bors) anyone can use to do
> not-rocket-science.

[http://graydon2.dreamwidth.org/1597.html](http://graydon2.dreamwidth.org/1597.html)

~~~
dbaupp
A member of the Rust community, barosl, wrote Homu (bors 2.0) which Rust has
been using in place of the first bors for a few months now. It's even better
than the already-great first version! I'm seriously considering setting it up
for my own projects, since I believe it has support for using Travis CI as a
testing backend, in addition to buildbot (which Rust and Servo both use).

[https://github.com/barosl/homu](https://github.com/barosl/homu)

Example interaction: [https://github.com/rust-lang/rust/pull/23381#issuecomment-80...](https://github.com/rust-lang/rust/pull/23381#issuecomment-80858586)

Docs: [http://buildbot.rust-lang.org/homu/](http://buildbot.rust-lang.org/homu/)

------
hurin
I'd very much like some more information about how the article's author
gathered this data - are these all the master branches of said projects, or
numbered releases?

How does the data account for variation in development procedures (e.g. some
projects use master as their bleeding edge branch)?

~~~
mirashii
Also consider that if it's checked on a commit-by-commit basis that doesn't
somehow collapse merges, then you're actually checking every commit that was
made during incremental work on a different branch. If you're doing TDD or any
test-first methodology, that will often mean you intentionally commit broken
tests in order to fix them.

Also, if it's on a commit-by-commit basis, the percentage-of-time comparison
is completely invalid.

~~~
chrismcb
The author is talking about broken builds, not broken tests.

~~~
sciurus
In this context they're synonymous. A broken build is defined by the tests
failing.

------
aaronbrethorst

        Web programmers are hyper-aware of how
        10ms of extra latency on a web page load
        has a noticeable effect on conversion rate
    

I wish this was true, but I'm really skeptical that the average "web
programmer" even uses the phrase "conversion rate" on a monthly basis.

~~~
nullspace
I know neither your comment nor mine is strictly related to the topic - which
is about not breaking builds - but the latency of a web page makes only a tiny
contribution to conversion rates or other user satisfaction metrics compared
to other factors, like stability and features. Hence, I think it's not
_important_ for most web programmers to be hyper-aware of latency - as long as
it's reasonable.

------
heisenbit
Is 99.9% or 3h/month really the "professional" norm?

I remember a larger project where we struggled for a long time to get anywhere
beyond 20%. Eventually we got a lot better, but nowhere near 99%.

I also wonder whether it is worth it. Are all developers really blocked when
the build fails? Some are, yes, but if I could I would design the production
chain so that most would not be.

Isn't there a trade-off in how much to invest in build tooling and automation
vs. in functionality? Does building an MVP include building a perfect software
production pipeline?

~~~
mbell
> I also wonder whether it is worth it? Are all developers really blocked when
> the build fails?

In our setup, building the deployment artifacts is contingent on the tests
passing, so if the build isn't passing, no one can deploy.
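
Roughly, as a sketch (the test runner and build command here are placeholders,
not our actual pipeline):

    # Refuse to produce deployment artifacts unless the test suite passes.
    import subprocess, sys

    if subprocess.run(["python", "-m", "pytest"]).returncode != 0:
        sys.exit("tests failed: not building deployment artifacts")

    # Only reached when the suite is green.
    subprocess.run(["python", "-m", "build"], check=True)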

~~~
pmahoney
But you could build and test the candidate master branch before it actually
becomes the master branch...

------
Animats
Welcome to distributed programming. Just because it works for one developer
doesn't mean it works for the others, at least not until all the conflicts are
resolved. One could have build systems which didn't permit a change to land on
the master branch until everything compiled and all tests passed, but GitHub
isn't set up that way.

The build may also break because of a change in an external dependency.

~~~
ForHackernews
> The build may also break because of a change in an external dependency.

It shouldn't unless you've been lax in how you specify your dependencies, or
the library maintainers have been sloppy in what they claim for backwards-
compatibility.

Obviously in the real world, both of those things happen.
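
For example, in a hypothetical Python project, exact pins keep an upstream
release from changing what you build, while loose ranges leave that door open:

    # requirements.txt (illustrative, not from any real project)
    requests==2.7.0   # pinned: every build installs exactly this version
    flask>=0.10       # loose: a new upstream release can silently change the build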

~~~
zeendo
Package repositories go down too (Maven, RubyGems, NPM, etc...).

Unless by 'lax' you mean 'didn't check in your dependencies'...I don't think
I'd call that lax, though.

~~~
ForHackernews
At my work, we run a local mirror/proxy for dependencies. It's polite because
we're not hitting those external services as frequently, and it has the
advantage that we aren't reliant on them to produce builds.
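
For a Python shop, for instance, that can be as small as pointing pip at the
internal index (hostname made up):

    # /etc/pip.conf on the build machines
    [global]
    index-url = https://pypi.mirror.internal/simple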

------
hardwaresofton
Am I missing something, or isn't this problem easily fixed by enforcing
testing before a push? As long as the appropriate tests are available, it
seems like this could be solved.
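
For git, the minimal version of this is a pre-push hook that refuses to push
when the suite fails (the test command is just an example; use whatever the
project's suite is):

    #!/usr/bin/env python3
    # .git/hooks/pre-push -- must be executable; a non-zero exit aborts the push.
    import subprocess, sys

    result = subprocess.run(["python", "-m", "pytest"])
    sys.exit(result.returncode)  # 0 lets the push through, anything else blocks it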

~~~
jdlshore
It's not necessarily that simple, but it _is_ simple. You need to do two
things:

1) Integrate the master branch (or whatever your guaranteed-good branch is)
with the code you're about to push. This prevents integration conflicts from
causing the build to fail.

2) Test it on a reference machine. This prevents environment assumptions from
causing the build to fail. (Such as installing new software or setting an
environment variable, but forgetting to make it part of the build.)

These are both easy to do. My preference is to push to a testing branch on the
integration machine, merge in the master branch, run the tests, then merge the
testing branch back into the master branch. (There's a bit more to it than
that, to cover edge cases, but that's the gist.)
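
As a rough sketch of that flow (branch names, remote, and test command are all
placeholders):

    #!/usr/bin/env python3
    # Test the merge result on the integration machine; master only moves if it's green.
    import subprocess, sys

    def git(*args):
        subprocess.run(["git", *args], check=True)

    candidate = sys.argv[1]                      # the branch or SHA that was pushed for testing

    git("fetch", "origin")
    git("checkout", "-B", "testing", candidate)  # build the tree master would become
    git("merge", "--no-edit", "origin/master")

    if subprocess.run(["python", "-m", "pytest"]).returncode != 0:
        sys.exit("tests failed; master is untouched")

    git("push", "origin", "testing:master")      # fast-forward only; rejected if master moved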

Sadly, most teams and CI tools aren't set up to do this--although, as the
article says, it's not rocket science. In fact, I'm surprised it's not obvious
that you should do it this way.

~~~
justincormack
Doesn't help if, e.g., you have multiple reference platforms - many open
source projects support a range of platforms (e.g. OS X, Linux, Windows) -
which is much harder than in-house software with a standard platform. You
really have to stage commits then.

------
icefox
The only way I have been able to get away from Ben's Law is to have an
automatic system that does builds before code can get in. This can be
server-side, user-side, pre-commit hooks, pull-request bots, or anything that
catches build breakages before they can get into the main branch. But without
something automatic in place, the project is destined to fall to Ben's Law.

Ben's Law - When every developer is committing to the same branch, the odds
that a commit will break the build increase as more developers contribute to
the project.

[http://benjamin-meyer.blogspot.com/2014/01/bena-law.html](http://benjamin-meyer.blogspot.com/2014/01/bena-law.html)
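
The shape of it falls out of a back-of-the-envelope calculation: if each
commit independently has some small chance of breaking the build, the chance
that at least one commit breaks it in a given day climbs quickly with the
number of commits (the 2% figure is purely illustrative):

    p = 0.02               # chance any single commit breaks the build (made-up number)
    for n in (5, 20, 50):  # commits landing on the shared branch per day
        print(n, round(1 - (1 - p) ** n, 2))
    # prints: 5 0.1, 20 0.33, 50 0.64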

------
wyldfire
It's not completely clear to me how the build start times were used to
calculate uptime, especially since a popular use case for Travis/GitHub
integration is triggering a build upon each PR. If that triggered build fails,
the PR is marked as such and normally its content wouldn't get merged. This is
designed to prevent the exact problem described, but are these Travis
PR-triggered builds filtered out of the uptime input dataset?
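
If the raw build records say what triggered them, the filtering itself is
trivial; the field name below is an assumption about the shape of the Travis
data, not something the article confirms:

    # Keep only builds triggered by pushes, dropping PR-triggered builds.
    def push_builds(builds):
        return [b for b in builds if b.get("event_type") == "push"]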

------
michaelochurch
I'd love to see the numbers on Haskell. I believe (and would hope) that it
would fare well.

------
scotty79
Why do builds break on our team? Because we have no idea how to add a
pre-commit hook to the Perforce IntelliJ client so that it builds and tests
the whole thing locally before pushing changes to the repo.
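
A stop-gap outside the IDE would be a wrapper that only submits when the local
build and tests pass (the test command and changelist handling below are
placeholders):

    # submit.py -- only hand the change to Perforce if the local build and tests pass.
    import subprocess, sys

    # Replace with the project's real build and test commands.
    if subprocess.run(["python", "-m", "pytest"]).returncode != 0:
        sys.exit("build/tests failed: not submitting")

    subprocess.run(["p4", "submit"] + sys.argv[1:], check=True)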

~~~
detaro
and your team isn't disciplined enough to do it manually ("nothing is stopping
us" is an explanation of why things happen sometimes, but it is a bad excuse)

~~~
johan_larson
As I heard it, one of the late-nineties projects at Microsoft had a big
problem with build breakages. They finally instituted a policy of tracking
down every build breakage, and conducting a little humiliation ritual of
awarding the offender a "big sucker" award at the weekly project meeting. This
award had to be displayed on the developer's office door for a week or
something.

You could dodge the award even if you did break the build by showing that you
had run the full unit test suite before you submitted your code.

~~~
scotty79
Yeah. We considered that, but that's a horrible social solution to a purely
technical problem. It's like beating your child so he won't put his fingers in
the wall socket, instead of childproofing it.

~~~
detaro
One shouldn't make a big deal about it, but something like "whoever breaks the
build on an important branch (and doesn't immediately notice and revert it)
brings cookies/makes a coffee run/... for everybody in the same office" can
work if it happens too often. You are right, occasional mistakes happen, so it
has to be something simple.

