That said, pull requests aren't a panacea. In particular, as the amount of code under review increases, the utility of reviewing the pull request decreases proportionately. It's important that new code arrives in bite-sized pieces and gets reviewed changeset by changeset. We're still struggling with this, since the review often happens late in a development project, when a developer wants to get code from a research project merged into the public distribution. We have to strike a balance between code quality and not putting up arbitrarily high barriers to getting code merged.
I just had an idea about using deep learning to train a bot to judge the quality of a merge. Does it have tests? Does it pass lint? Is the subset of the grammar used in the patch the same as the rest of the source? If a patch is "clean" it should get automatically merged and only require human intervention to reject it.
Lint checking is handled by basic CI integration (e.g. Travis). Having tests is something a reviewer can trivially check (and many changes don't necessarily warrant test changes anyway).
> No amount of automation could ever address these concerns.
Watch out for when experts say no.
If tests and coverage can be trivially checked by a reviewer, why not let a bot do it? Conversely, if a patch doesn't meet basic standards, why not let a bot ask for changes?
What if I train a refactoring bot to make trivial refactorings? Naming, extract method, extract interface, warnings on over-coupling? If the cost of backing out a patch in the future is near zero, the cost of applying a patch today is also near zero. If the person making the change has a good track record, why not automerge?
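As a sketch of what such a gatekeeper bot might look like, here is a minimal Python version. Everything in it is hypothetical: the rule names, the file-layout conventions, and the "track record" score are invented for illustration, not taken from any real bot or API.

```python
# Hypothetical automerge gatekeeper: a patch merges automatically only if it
# clears a few mechanical checks; otherwise a bot reply or a human takes over.

def gate_patch(changed_files, lint_passed, author_track_record):
    """Return 'automerge', 'request-changes', or 'human-review'.

    changed_files: list of paths touched by the patch (assumed layout:
    tests live under "tests/"). author_track_record: invented trust
    score in [0, 1] based on the author's history.
    """
    touches_source = any(f.endswith(".py") and not f.startswith("tests/")
                         for f in changed_files)
    touches_tests = any(f.startswith("tests/") for f in changed_files)

    if not lint_passed:
        return "request-changes"      # a bot can reject this mechanically
    if touches_source and not touches_tests:
        return "request-changes"      # new code arrived without tests
    if author_track_record >= 0.95:   # hypothetical trust threshold
        return "automerge"
    return "human-review"

print(gate_patch(["src/app.py", "tests/test_app.py"], True, 0.97))
```

The interesting design question is the last branch: everything above it is the cheap, mechanical part the thread agrees a bot can do; the threshold is where "good track record, why not automerge?" gets encoded.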
Further, the cost of backing out or applying a patch isn't zero. A bad patch landing (even one that passed all the analysis at your disposal!) can cause big projects to have to basically shut down their tree as they scramble to figure out what's wrong. This is a huge cost.
The classic example of this is creating a sporadic test failure, causing random patches to bounce.
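To illustrate how such a failure slips past every automated check, here is a hypothetical Python example of the most common variety: a test whose outcome depends on shared mutable state, so it goes green or red depending on which other tests happened to run first.

```python
# Hypothetical order-dependent test: it passes the analysis, passes on the
# author's machine, and then bounces unrelated patches whose test ordering
# differs. Names are invented for illustration.

cache = {}

def warm_cache():
    cache["answer"] = 42          # some *other* test happens to populate this

def answer_is_cached():
    # The "sporadic" test: green only if warm_cache() already ran.
    return cache.get("answer") == 42

print(answer_is_cached())   # run in isolation: the test fails
warm_cache()
print(answer_is_cached())   # run after the warming test: the test passes
```

No lint rule or coverage threshold flags this; only a reviewer (or a test runner that randomizes order) tends to catch it before it starts bouncing random patches.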
I've definitely worked in projects where I'll basically automerge some PRs because I know the person is great and the risks are low, but this is generally conditional on the purpose of the PR. If it's cleanup and refactoring... great, ship it, I don't care. If it's adding some functionality, I need to review if that functionality should be added.
However a big value of PRs is to force us to uncut corners. We all cut corners (or miss obvious details) when we program, and I see human review constantly catching this kind of sloppiness and forcing great programmers (the kind I'd automerge on trivial changes) to do the job right. If you just automerge, you accumulate code debt much more rapidly. There's also the aspect that human review often helps ensure that at least two people understand the decision process behind some piece of code.
What if the cost of backing out a change actually was zero or close to zero? What if the prospective change actually was run on all platforms with all of the historic input the function has ever seen?
What happens when our undo buffer and our log of refactorings get checked into version control? Not a log of pure text, but a log of semantic transformations.
Of course bad patches make it into the tree. Right now all the bad patches require human intervention. I am saying that it might be economically advantageous to let in some of the bad patches automatically so it takes less work.
For most codebases I see in industry, simply rejecting a patch on lint failure, lack of tests or documentation would effectively freeze most trees.
The smart deep learning bot can determine if a branch is "clean", but it surely can't determine if it's correct.
This is a huge mistake.
Efficient and succinct communication of ideas in writing transcends all boundaries. Many of the best programmers I have known were as skilled at writing an essay as at decomposing a problem into functions. An essay can be lyrical, and it can also be computational. Lack of clarity in writing, in general, is the cause of bad code. This crushingly intense focus on STEM is shortsighted. The root of the problem is a lack of clarity in writing, both in English and in code.
I think lack of clarity in writing -- either in code or in natural language -- isn't the root of the problem; it's a symptom of the lack of a good, clear general analytical process. Lots of programmers I've met are great at working out how to implement a very explicitly described thing in code, but aren't great at analyzing domain information to determine how to solve a problem.
I don't think that STEM-focused education is bad for this, but I think that developing generalized analytic skills seems to be helped by having experience applying analysis in varied and not-closely-related contexts, particularly contexts where the problems are often fuzzily defined and you need to do work to discover, or impose, structured order to compose a response.
Collegiate English departments can be an important venue for that, but I don't think uniquely so -- so can Philosophy or social sciences departments. (But it's all highly dependent on the particular institution, professor, and courses chosen.) I'm personally partial to legal writing specifically (often addressed in undergraduate coursework in law-focused political science courses) as a vehicle for developing this skill, but then I'm a programmer who's also a law geek, and who was a political science major, so there's probably some bias there.
Not good, but great.
Their ability to organize ideas to write about them also serves their ability to organize ideas to code about them. I really think we give too little credit to analytical training available from English departments. I would train an English major as happily as a CS major, other factors being roughly similar.
Plenty of notable writers have English degrees; "None (or, equally, 'all', 'most', etc.) of the X that I know" usually introduces a statement that reveals more about the speaker's social bubble than the broader world.
Off the top of my head: Tolkien and C.S. Lewis (the creators of Lord of the Rings and Narnia, respectively) were both Oxbridge professors of English.
And after many late nights fixing documents just before their due dates, I've developed some basic guidelines that aim to increase clarity, including:
* Not using multiple distinct terms for the same thing in different parts of the document
* Making responsibilities clear - especially in multi-party contracts! E.g. saying "X will do Y" rather than "Y will be done"
* Not expecting the reader to guess what the writer means (e.g. by saying "there will be an impact")
* Not using 'this' after a discussion that references multiple distinct things.
The results might not look like the sort of literature I enjoy reading outside work, but they have much more clarity!
An hour and a half later, I had started conversation with the maintainer of an open source project, speccing out a portfolio project with him. It's a bit early to tell, but treating the writing as my only goal, rather than as a supplement to coding work, is doing things to my perspective...there's definitely a need for the humanities in technical environments. A really great piece of writing can do more than instruct clearly, it can motivate creative uses, put the techniques into perspective and lead people away from dangerous ideas. As well, in-depth writing about technology is a great way to review its real utility and add discipline to development that has gone astray.
And there is so much technology that would benefit from doing this better. I think it might be the proper antidote to cargo culting.
Example from the article:
We had three codes to review, they were written in IDL [...]
That is clearly talking about programs to review, but uses "codes" as if it were synonymous with "programs". So annoying!
Is it just me, or is this usage very typical of the scientific community?
It is a well-established tradition in scientific programming to use "code" as a count noun; it's even acknowledged in the Jargon File.
I find it jarring too, despite being a scientific programmer myself, but you learn to go with it. It is a tradition that this linguistic community has established, and there's no reason to rail against it.
Consider it the same as humor versus humour in computing and don't read too much into it. It comes from a much earlier time, from what I can gather -- probably the '60s or even earlier, when codes were run on mainframes.
The author is a native Spanish speaker, and I've seen fluent but non-native speakers of English make similar confusions concerning other mass nouns and irregular plurals.
So could you be more specific by pointing to how a native English speaker uses it in a scientific software context?
http://arxiv.org/pdf/1510.08141.pdf - "The processes are simulated with the pythia and MadGraph 5 Monte Carlo codes" (authors are from Brazil and Switzerland)
http://arxiv.org/pdf/1510.04063.pdf - "They present new challenges to the simulation codes" (author is from Germany)
http://arxiv.org/pdf/1509.00209.pdf - "The codes and setups used for the formulation of these plots are listed below:" (from Italy) (Note the 'and setups'. This is a clear mistake made by a non-native speaker.)
http://arxiv.org/pdf/1508.00589.pdf - "The Cambridge variable MT2 can already be reliably computed with one of several publicly available codes" (3 of the 8 authors have a US address, though only one of those three is a common name for a native English speaker)
http://arxiv.org/pdf/1507.08764.pdf - "a benchmark for the hadronic interaction Monte Carlo simulations codes" (1 of the 10 authors has a US address. That author has a typically English name)
http://arxiv.org/pdf/1507.06706.pdf - "Both codes are available at" (1 US-based author of 5).
I then repeated the same search with 'programs':
http://arxiv.org/pdf/1510.00391.pdf - "Using a fully automated framework based on the FeynRules and MadGraph5.aMC@NLO programs" (1 UK-based author of 5)
http://arxiv.org/pdf/1509.01918.pdf - "The results are compared with the values used in the simulation programs GEANT4 and UNIMOD." (All authors are based in Russia.)
http://arxiv.org/pdf/1508.05895.pdf - "various high-energy physics programs such as Monte Carlo event generators." (1 UK based author of 6)
http://arxiv.org/pdf/1507.00556.pdf - "This requires an interface between the higher-order theory programs and the fast interpolation frameworks" (23 authors, 18 institutions, and 10 institutions in English speaking countries)
http://arxiv.org/pdf/1506.08759.pdf - "The new estimates presented here are based both on simulation programs (GEANT4 libraries) and theoretical calculations" (Authors are from Italy.)
http://arxiv.org/pdf/1504.06469.pdf - "While many fixed-order and parton shower programs allow" (4 authors, 2 based in English speaking countries, one with a common name for a British person)
This is surely not proof of anything, but it does demonstrate that 1) "codes" is used in scientific computing (though it doesn't show the ratio of "codes" to "programs", nor its use in fields other than experimental HEP), 2) there's a suggestion that non-native speakers use "codes" more often, but it's not clear at this level, and 3) there are a lot of non-native speakers publishing papers, while I think the software-industry people who publish in English skew much more heavily towards native English speakers.
A more thorough analysis would also need to check whether the modern scientific sense is historically consistent; perhaps the industry usage is a newer, though more widespread, coinage.
But of course, I'm not a native English speaker.
I don't think it's typical of the scientific community as a whole (certainly not in, say, math), but I don't often interact with computational biologists and such.
In this specific case, I believe well-written tests and good coverage would take priority over code review.
Besides, most codes we wrote are merely prototypes that won't be used again.
I used to work with scientists that said pretty much the same thing. Once their stuff started working, we had to take a huge pile of very badly written code and turn it into a product quickly.
I do make an effort to rewrite what I have hacked together once I have a working approach, but if I wrote everything correctly from the start I would be wasting a huge amount of time and get very little actually done.
Now, I also do some prototyping, but my workplace has a formal software development process, and my code would never leak into a product. For one thing, I use a different language than the programming department. Likewise for the electronics and embedded portions of my prototypes.
Sometimes my code continues to serve a useful purpose, for instance allowing testing of the hardware portions of a product, months before production software is ready to use. And if our codes produce different results, well, let's just say that I've lost a few bets, but have also won a few. ;-)
With all those things said, I'd welcome review of my code in any "critical" situation, including preparation of results for publication.
The best approach I think is to take a hybrid approach of hack and then rewrite. I have worked on problems where I have written 10,000s of lines of very hacky code over many months and once I had a viable solution rewrote the whole module from scratch in a week. The key is to make sure you (or someone else) re-writes the code once it is working if you are going to use again and not pretend that what you have hacked together is acceptable just because it (sometimes) works.
For me, it's less about standards than about tools. I don't think software development is slow because of standards.
I don't use the same tool chain as the developers, so I'm oblivious to their standards. Their code is unreadable to me. I follow what would probably have been considered good practices 15 years ago when programs tended to be smaller. I actually learned good hygiene, but in the context of BASIC and Pascal in the 1980s, not the massive tool chains of today. And my programs probably look like a throwback to that era in terms of their conceptual structure.
I re-write early and often. Transforming a program is a good way to find out if it made sense in the first place.
1. By the time I have finished hacking, the original code is a mess, and it is not a good place to start.
2. I can use the original code and the new code to find bugs I missed the first time around. A good percentage of the time I find bugs in my original code that only a clean rewrite exposes.
Also, I try to incorporate new techniques that I've learned when I write new code.
I am talking about simple things: writing a function instead of copy/pasting the same 500 lines of code 10 times, using STL containers instead of heap-allocated arrays everywhere, meaningful variable names, or (my favorite) using named variables for important numbers instead of putting the number 8 everywhere in the code, so that you can't tell whether a 7 is "8-1" or really just 7.
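To make the last two of those points concrete, here is a small sketch in Python rather than C++ (the constant names and the bit-offset scenario are invented purely for illustration): the number 8 gets a name, the 7 becomes visibly "8 - 1", and the repeated block becomes a function.

```python
# Hypothetical "after" version of magic-number-riddled code. Before, the
# literals 8 and 7 were scattered through the code, so a reader couldn't
# tell whether a given 7 meant "8 - 1" or really was just 7.

BITS_PER_BYTE = 8                  # one named place to define the 8
MSB_INDEX = BITS_PER_BYTE - 1      # clearly "8 - 1", not an unrelated 7

def bit_positions(byte_index, bit_in_byte):
    """The extracted function: replaces a block that used to be copy/pasted."""
    absolute_bit = byte_index * BITS_PER_BYTE + bit_in_byte
    from_msb = MSB_INDEX - bit_in_byte
    return absolute_bit, from_msb

print(bit_positions(2, 3))
```

The payoff is exactly the one described above: if the word size ever changes, there is one constant to edit, and every 7 in the code that survives is now known to really mean 7.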
Some of the scientists I worked with somehow took pride in not following any coding practices because they were scientists, not mere coders.
I believe that in this particular startup, these practices contributed to the eventual failure, since we had hundreds of thousands of lines of prototype code that crashed left and right, and nobody knew exactly what it was doing because there was almost no documentation.
I think a little respect for engineering practices is a good thing for scientists and I don't think it impedes their creativity.
The problem with the startup you describe sounds more like a management problem. Hacked code that sometimes works is not code that should be used more than once, and it should be rewritten before check-in. If the scientists don't know how to do this, then they need to be helped.
Can we compare "screwing around" to "hacked together" (undocumented?) throw-away code, and "writing it down" to "correctly written" (documented?) code?
Aren't you as a scientist supposed to also document what ideas failed?
Now it depends a bit on what you mean exactly by "hacked together" and "correctly written" I guess. I'm all for prototyping and getting to the "heart" of a problem quickly to confirm / reject ideas. But I wonder.
Do e.g. chemists feel they could try more experiments quicker if they didn't have to take notes on what they tried and what happened? Is there a comparable activity to hacking together throw-away code in chemistry (or other older fields of science)?
To answer your other question: in science, yes, there is an equivalent of hacking something together just to see if the idea will work. It is very common to do a quick experiment to see if something works before going back and doing all the controls and working out how robust the experimental conditions are. Science is hard, and almost everything you do fails.