LLVM is one of the most critical pieces of infrastructure out there. And yet you...

ben_bai · on June 4, 2020

It's the most important rule in my opinion. Master must always compile and pass tests.

I'm glad to send patches or look into bugs whenever I find them in OSS, but having to jump through hoops just to get things compiled is a big no-no.

jfkebwjsbx · on June 4, 2020

If you are not even compiling a change, then why would you submit it?

The bot will catch it, yes, but you are still filling the queue of maintainers.

fluffything · on June 4, 2020

> If you are not even compiling a change, then why would you submit it?

For example, some people send doc fixes. Sometimes these doc fixes have "bugs" that actually make compilation fail. Like adding a line break and forgetting to mark the next line as a comment...

Compiling LLVM for such changes takes quite a bit of time, so of course people don't even do it.

jfkebwjsbx · on June 4, 2020

That does not answer the question.

fluffything · on June 9, 2020

Of course it does. Some changes, like doc changes, should not affect the program.

So why would you try to compile your software to verify those instead of just, e.g., generating your documentation?

ben_bai · on June 4, 2020

Not what I wrote. I never send in untested patches, because I don't start writing patches if I can't easily compile the project myself in the first place.

Also, this was not specifically targeted at LLVM but at all open source software, which may not even have a build-bot or published regression checks.

All I'm saying is: Please keep master clean so it always compiles so people can easily pick it up and contribute.

dataflow · on June 4, 2020

Honestly "master doesn't pass tests" is just a naming problem. Solving it is just a matter of renaming master to something else ('dev'? 'potentially-incorrect'?) and making a new version of it that does pass tests.

fluffything · on June 4, 2020

Having a version that doesn't pass tests is just a way of delaying the fixing of tests until some time later, when the authors of the tests might have forgotten about them.

And obviously, that relies on the hope that somebody notices the failure by then, tracks it down to the right commit, and fixes the commit, instead of adding a workaround that avoids having to find the root of the problem.

"master doesn't pass tests" isn't just about naming. It's about culture. It's about agreeing that there is some source of code that everyone agrees should work, and having the culture of agreeing that all modifications to it should work as well, and that it is the responsibility of those doing the modifications to make sure that's the case.

choeger · on June 4, 2020

This.

tom_mellior · on June 4, 2020

That would mean that there would be a default branch with the meaning of "development branch that is as up to date as possible while still guaranteed to build and pass tests". Which would be a change from current practice, since there is currently no such branch under any name. So this is not just a naming problem. It is a problem of maintaining a development branch that is as up to date as possible while still guaranteed to build and pass tests.

dataflow · on June 4, 2020

Aren't there prerelease builds though? The branch of prereleases seems to fall under that bucket.

tom_mellior · on June 4, 2020

From a very quick search I can't seem to find such a branch. Judging from https://github.com/llvm/llvm-project/releases they seem to occasionally branch release candidate branches from master, every two weeks or so. That's very far from a well-tested master updated several times a day.

Continuous integration for compilers is a well-understood problem. There are posts in this thread explaining how Rust handles it. I understand that it's a royal pain to set up and maintain. I am a compiler engineer and am happy to work on compilation stuff but wouldn't want to touch our CI system with a ten-foot pole. But LLVM is a project driven by Google and Apple, there are really no excuses for not finding the people willing to do this.

_ZeD_ · on June 4, 2020

How can you make a prerelease build of non-compiling code?

globular-toast · on June 4, 2020

OK then. Every commit in the dev branch should pass all tests.

If the commits don't even build they might as well be thrown away. It's going to make regression tracking next to impossible otherwise. The only reason to keep a branch around is because it's a small ephemeral branch under active development, or because it's an eternal branch that you can use to track regressions (using git bisect, for example).
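To illustrate why buildable commits matter for regression tracking, here is a minimal sketch of the binary search that a tool like git bisect performs. It is not git's implementation; the commit names and the `passes` callback are hypothetical, and the sketch assumes every commit in the range can actually be built and tested.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], passes: Callable[[str], bool]) -> str:
    """Return the first commit for which passes() is False.

    Assumes commits[0] passes, commits[-1] fails, and every commit in
    between can be built and tested.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] passes, commits[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history in which the regression landed in "c5".
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
broken_since = "c5"
tested = []


def passes(commit: str) -> bool:
    tested.append(commit)
    return history.index(commit) < history.index(broken_since)


culprit = first_bad_commit(history, passes)
print(culprit, len(tested))  # prints "c5 3"
```

If some commits in the middle do not build, `passes` has no meaningful answer for them, and the search has to skip around them, which widens the range of suspects.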

namibj · on June 4, 2020

Yes, which is why bors-ng[0] is a thing. The fact that LLVM doesn't use it (or something similar) is strange.

[0]: https://github.com/bors-ng/bors-ng

notriddle · on June 9, 2020

LLVM doesn't use GitHub pull requests, so obviously bors-ng isn't an option.

egwor · on June 4, 2020

I strongly disagree. If you keep things tidy coming in then you have a known good base.

dataflow · on June 4, 2020

Don't you always have a known good base? It's just the latest build that worked correctly, right?

tom_mellior · on June 4, 2020

> It's just the latest build that worked correctly, right?

But how would you tell which one that is? I guess even within the current system one could set up a bot that monitors master, and whenever the latest commit builds and passes tests, it updates a "known working" branch. But this doesn't seem to be the case. And if such a system is set up, it would make a lot more sense for contributors to push to a "testing" branch instead of master and then only updating master for working builds, rather than the other way round.
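The "known working" bot described above could look roughly like the sketch below. Everything here is an assumption for illustration: the function name, the commit ids, and the `builds_and_passes` callback, which stands in for whatever the real build and test infrastructure would do.

```python
from typing import Callable, Iterable, Optional


def advance_known_good(
    new_commits: Iterable[str],
    current_good: Optional[str],
    builds_and_passes: Callable[[str], bool],
) -> Optional[str]:
    """Return the updated 'known working' pointer.

    new_commits is ordered oldest first. The pointer moves to the most
    recent commit that builds and passes tests; if none does, it stays put.
    """
    good = current_good
    for commit in new_commits:
        if builds_and_passes(commit):
            good = commit
    return good


# Hypothetical commits pushed to master since the last bot run.
pushed = ["a1", "b2", "c3", "d4"]
broken = {"b2", "d4"}

known_good = advance_known_good(pushed, "z0", lambda c: c not in broken)
print(known_good)  # prints "c3"
```

The inverse setup suggested above would run the same check on a "testing" branch and only update master when the check succeeds.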

jstimpfle · on June 4, 2020

How do you even tell whether something is right? That's not always a simple question to answer.

Sure, one way is to have an automatic test suite and an integration system that moves commits that passed it from one branch to another. If you can actually make a test suite that has no false negatives and finds enough negatives to be useful, that's great.

In my (limited) experience test suites are not close to that ideal. I've worked with systems where there was an automatic build test at least, and sometimes that catches a bad commit. But I think for many fast-moving projects there are few useful errors these systems can find, and sometimes they have annoying false negatives. The thing that I can think of that would most improve my productivity right now is a check that nobody added tab characters in the source code.

(I realize my comment might not apply to a situation like LLVM, where there could be a large test suite of programs to compile and run for example).

tom_mellior · on June 5, 2020

I think your comments apply more to other situations. Compilers are very testable: The input is usually human-readable text that can be stored in a file (source code), the final output is usually human-readable text that can be stored in a file (assembly code), intermediate states can usually be dumped to human-readable text that can be stored in a file (in the case of LLVM, the textual form of its IR). There is no nondeterminism, no network, no database state, no particular hardware or system requirements for most tests, etc., to consider during testing.

Compilers are also often very modular. LLVM in particular makes it very easy to take a bitcode file, apply a certain well-defined set of transformations, dump the resulting bitcode file, and check whether it fulfills the expected properties, e.g., "this computation has been optimized to this other computation". LLVM has a tool called lit that runs this kind of test, and there is an extensive test suite that uses it. Of course none of this guarantees that other programs won't behave differently, but in my experience these test suites really are very good and useful.
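The shape of such a test (transform the input, dump the result, check it for expected text) can be shown with a toy sketch. This is not lit or LLVM code: the miniature "IR", the `fold_constants` pass, and the `check_in_order` helper are all made up for illustration.

```python
import re
from typing import List


def fold_constants(ir: str) -> str:
    """Toy pass: replace 'add A, B' with its sum when A and B are integer literals."""
    def fold(match: "re.Match[str]") -> str:
        return str(int(match.group(1)) + int(match.group(2)))

    return re.sub(r"add (-?\d+), (-?\d+)", fold, ir)


def check_in_order(output: str, expected: List[str]) -> bool:
    """Return True if every expected substring occurs in output, in the given order."""
    pos = 0
    for pattern in expected:
        found = output.find(pattern, pos)
        if found == -1:
            return False
        pos = found + len(pattern)
    return True


# Made-up textual IR: the first add has constant operands, the second does not.
source = "x = add 2, 3\ny = add x, 4\nret y\n"
optimized = fold_constants(source)

ok = check_in_order(optimized, ["x = 5", "y = add x, 4", "ret y"])
still_unfolded = check_in_order(optimized, ["add 2, 3"])
print(ok, still_unfolded)  # prints "True False"
```

Because the input and the output are both plain text, a test like this needs no special hardware or external state, which is the property being argued for here.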

The only missing piece in LLVM is consistent enforcement of a policy that tests must pass in order for the repository to reach certain well-defined states.

mrjin · on June 4, 2020

Well, that explains everything...

jhoechtl · on June 4, 2020

GCC is. LLVM is the wannabe.