There's certainly a hint as to where to look for bad coupling, but expected "coupling", like tests, needs to be discounted.
It would show some coupling commits, but bug fixes, for example, should (and generally would) be confined to the relevant files, not spread across the system.
The grandparent is making an important point: you can't design a system such that all possible changes you might want to make are localized to one area of the code. Engineering is about trade-offs: if you rigorously separate view from logic from database, you make it harder to add features that must touch all three. Conversely, if you make each feature its own file and add hooks to the view/logic/database layers so they call out to plugins, you make it easy to add new features but very difficult to understand what each layer as a whole is doing.
The best you can do is choose the ideal architecture for your particular project, at the particular point in time that you're working on it. That's why basically every software system needs to be rewritten as it grows up: the ratio of complete rewrites to new features to bugfixes to maintenance refactorings changes as the system matures and the requirements become more precisely known. It's also why we have a software industry; if there were one ideal way to design a system for all domains and all points in time, someone would go design it and be done with it, and none of us would have jobs.
I've found the exact opposite of this to be true.
Systems like PHP + "SELECT * FROM database_table" or the MEAN stack, where you use the same data format for both backend storage and UI and intermingle logic with templates, are significantly faster for getting something workable on the screen that users can try out. I've done a complete MVP in 4 days with PHP; a roughly equivalent app in Django (which has some minimal model/view/database separation) took about 2-3 weeks. The fastest I could launch a feature in Google Search that touched both UI and indexing was roughly 6 months; as you'd expect, that has a very rigorous separation of front and back-ends.
Now, the PHP solution will quickly become unmaintainable - with the aforementioned 4-day project, I no longer wanted to touch the code after about 3 weeks. But this doesn't matter - it's 10 years later, the software is still serving users, and I've long since moved on to bigger and better things. And that's my general point: what's "good" code is context-sensitive. Everybody wants to work on nicely-factored code where you can understand everything, but there is no business to pay you unless you first make something that people want, and oftentimes that takes hundreds of iterations (including many fresh starts where you throw everything away and rebuild from scratch) that are invisible to anyone collecting a paycheck.
Thank goodness for that. This blog post was so much easier to read than a formal paper.
Why are formal papers so tedious to read? Imagine how much time we'd waste as a group if he'd written this as a paper. Many of us would give up before finding the actual information, and those of us who /did/ find it would have invested a lot more time than it took to read this excellent blog post.
Average papers are hard to read because they’re trying to emulate the style of the good papers, but without having enough actual content. The length and register of a blog post are definitely a good fit for the average essay.
Crappy papers are hard to read because they’re crappy, and recasting them into a different format would reveal that they contain no information at all. :)
x : σ ∈ Γ
─────────── [Var]
Γ ⊢ x : σ

Γ ⊢ e₀ : τ → τ′    Γ ⊢ e₁ : τ
───────────────────────────── [App]
Γ ⊢ e₀ e₁ : τ′

Γ, x : τ ⊢ e : τ′
──────────────────── [Abs]
Γ ⊢ λx. e : τ → τ′

Γ ⊢ e₀ : σ    Γ, x : σ ⊢ e₁ : τ
─────────────────────────────── [Let]
Γ ⊢ let x = e₀ in e₁ : τ

Γ ⊢ e : σ′    σ′ ⊑ σ
──────────────────── [Inst]
Γ ⊢ e : σ

Γ ⊢ e : σ    α ∉ free(Γ)
──────────────────────── [Gen]
Γ ⊢ e : ∀α. σ
[Var]: If the context indicates that a variable has a particular type, the variable is inferred to have that type.
[App]: If a function is inferred to have some type, and a value is inferred to have the same type as the parameter of that function, then the application of that function to that argument value is inferred to have the type of the result of the function.
[Abs]: If the body of a function is inferred to have some type, given an arbitrary type for its parameter, then the type of such a function is a function type from that parameter type to the type inferred for the body.
[Let]: If an expression is inferred to have some polymorphic type, and another expression would be inferred to have some monomorphic type given that a particular name were bound to that polymorphic type, then a let-expression binding that name to the former expression within the latter expression would have the same type as inferred for the latter.
[Inst]: If an expression has a polymorphic type, and that type is an instance of some more polymorphic type, then the expression can be said to have the more polymorphic type.
[Gen]: The type of an expression can be generalised over its free variables into a polymorphic type.
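To see the difference between let-bound and lambda-bound variables concretely, here's a small Haskell illustration (my own toy example, not from the post): [Gen] applies at the let, so the binding becomes polymorphic, while [Abs] only assigns the parameter a monotype.

    -- id' is generalised at the let ([Gen]) to ∀a. a → a, and can then
    -- be instantiated ([Inst]) at two different types in the body:
    poly :: (Bool, Char)
    poly = let id' x = x
           in (id' True, id' 'a')

    -- A lambda-bound parameter gets only a monomorphic type ([Abs]),
    -- so the analogous version is rejected by the type checker:
    -- mono = (\id' -> (id' True, id' 'a')) (\x -> x)   -- type error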
But having learned this notation, I can now read and write specifications of type systems with ease, and do so much more quickly and compactly than I could in a more approachable notation. To me, that’s a win.
Of course, I’m the sort of person who finds Java programs hard to read because they seem to take so long to say anything. So opinions are going to vary on this!
So I rummaged around a bit more and found http://akgupta.ca/blog/2013/05/14/so-you-still-dont-understa... which explains the notation in much more readable English text. It's a pure win.
Yes, I find typical Java style too verbose for optimal readability; but I find typical Perl style too terse for optimal readability. The sweet spot is somewhere in the middle.
Nowadays I guess everything boils down to a custom from which we (yes, I am part of the problem) have not been weaned. It looks more scientific to write things densely and more or less cryptically.
Also, we scientists are a bit afraid of publicly (I mean, to the general public) explaining our way of understanding things, because we know it is somewhat blurry, informal, possibly even comical to a lot of people. And we tend to be introverts.
I guess things like arxiv.org are going to create a new, much more interesting and enlightening way of explaining scientific discoveries.
This is a huge problem in the academic field. Papers are written in academese, not because it's a particularly good way of distributing information, but because it's expected. I absolutely agree that less formal language is good for the discipline, for the technical non-academic audience, and for the lay audience.
Steven Pinker recently wrote a book called "The Sense of Style" which talks about academese and writing well about complex subjects (something which he, indeed, has a lot of experience in!). I haven't read it yet, but I've heard it's quite good.
Let me know if you can get your hands on a list of related plug-ins; I am really curious to see them in action.
I think that this sort of repository analysis is going to be standard practice within the next couple of years.
1. It's a tooling artefact; testing systems certainly don't have to mandate splitting code and tests. Rust's test framework allows tests in the same file as the tested code, the testing guide recommends that unit tests live alongside the code they test, and the standard library follows this practice. I'm reasonably sure you can do the same in e.g. py.test.
2. I'm not convinced editing two files slows you down; most editors and window managers will let you put both files side by side and trivially jump between them. Are Java developers slowed down by having to jump between files?
> by marking all python files as "test modules"
OP is referring to this line:
> It also turns out, in both softwares, a majority of the couplings are attributable to Test Driven Design, where a source code is coupled to its test. So these are apparently false-positives I should take care of in the next version of the pipeline.
Not shown: the part where the critical production bug you introduced was caught by your test suite, thus saving you countless hours of agony, angry customers, and lost revenue.
It works both ways, and probably one of the biggest skills in being an effective developer is understanding whether a piece of code you write is likely to be thrown away shortly or whether it's going to live forever and cause umpteen headaches for the maintainer. (This itself is surprisingly counterintuitive: I have seen high-priority code backed directly by an executive thrown away a week after being written because of shifting perspectives within the organization, and I've also seen a one-character typo in a "throwaway" migration script result in restoring a million+ users from tape backup.)
By all means do leverage your type system as much as you can, avoid writing tests for what you know your type system handles, and (if your type system is expressive enough) use property-based testing to further leverage your type system into essentially fuzzing your functions.
But you'll still need to write tests.
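For instance, a minimal QuickCheck sketch (the function and properties here are my own illustration, not the parent's):

    import Test.QuickCheck

    -- Each property is checked against many generated inputs,
    -- effectively fuzzing the function through its type.
    prop_reverseInvolutive :: [Int] -> Bool
    prop_reverseInvolutive xs = reverse (reverse xs) == xs

    prop_reverseAppend :: [Int] -> [Int] -> Bool
    prop_reverseAppend xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs

    main :: IO ()
    main = do
      quickCheck prop_reverseInvolutive
      quickCheck prop_reverseAppend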
How on earth do you write a type, or set of types, for that which actually bears some resemblance to the spec and doesn't simply reflect the details of the implementation?
To simplify it to the point of near-nonsense, how would you write a type which says `append "foo" "bar"` will always result in "foobar", and never "barfoo" or "fboaor"? Or that a theoretical celsiusToFahrenheit always works correctly and implements the correct calculation? If you can't do that, how can you do it for more complex data transforms?
A simpler example is that of lists and the head operation. In most languages, if you try to take the head of an empty list you get a runtime exception. In a language with dependent types you are able to express the length of the list in its type and thus it becomes a type error (caught at compile time) to take the head of an empty list.
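In Haskell you can sketch the same idea with GADTs and type-level naturals (the standard length-indexed vector trick; the names are mine, and for full dependent types you'd reach for Agda or Idris):

    {-# LANGUAGE DataKinds, GADTs, TypeFamilies #-}

    -- Type-level natural numbers.
    data Nat = Z | S Nat

    type family Plus (n :: Nat) (m :: Nat) :: Nat where
      Plus 'Z     m = m
      Plus ('S n) m = 'S (Plus n m)

    -- A list whose length is part of its type.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- vhead only accepts provably non-empty vectors, so "head of
    -- empty list" is a compile-time error rather than a runtime one.
    vhead :: Vec ('S n) a -> a
    vhead (VCons x _) = x

    -- The type of append alone guarantees the output length is n + m.
    vappend :: Vec n a -> Vec m a -> Vec (Plus n m) a
    vappend VNil         ys = ys
    vappend (VCons x xs) ys = VCons x (vappend xs ys)

Note the honest limit, which speaks to the grandparent's append question: this type pins down the result's length, but saying the result is "xs followed by ys" (and never "fboaor") takes a richer proof language.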
Isn't that literally just implementing the program, though? The point of tests is that they're simple enough that you can't really fuck up, and they describe the specification, not the implementation.
If you have to write a formal proof, what's making sure the proof is actually proving what you intend? And what's the actual difference between this proof and the implementation?
The formal proof is the implementation. It is the code you run in production. The proposition is your types. Instead of writing tests, you write types. It's the exact same process you would use with "red-green-refactor" TDD, except it's the compiler checking your implementation instead of the test suite. The advantage of doing it with types is that the compiler can actually infer significant parts of the implementation for you! Types also happen to be a lot more water-tight than tests: you specify a type for a top-level function, and everything inside the body can generally be inferred.
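In miniature, the correspondence looks like this (a toy Haskell rendering of Curry–Howard; the Agda lectures below take it much further):

    -- The type is the proposition "A and B implies B and A".
    -- Any total function inhabiting this type is a proof of it.
    andComm :: (a, b) -> (b, a)
    andComm (x, y) = (y, x)

The compiler rejecting an ill-typed body is exactly the "red" step of red-green-refactor.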
If you're interested, here is a series of lectures demoing dependently-typed programming in Agda by Conor McBride:
The point of tests is that they're supposed to be simple enough that it should be near-impossible for them to contain bugs - they're incomplete, sure, but what they're testing should always be correct - whereas it seems that it would be easy for a proof to prove something subtly different from the spec without anybody being able to tell. If we could write complex programs to spec without error, we wouldn't have tests in the first place.
As for your assert example, how do you know append will work as you expect for all possible strings (including null characters or weird unicode edge cases)? With a proof you will know.
But I could just as easily accidentally write a proof which proves something else, couldn't I? This is complex code expressing complex ideas - a type system would certainly help, but it can't tell me that I'm proving the wrong thing.
Then your proof would be rejected by the compiler. Remember, the types specify your proposition: i.e. what you are intending to prove. The actual proof itself is the function you implement for that type.
As for whether you're proving the right thing or the wrong thing, a type system is no less helpful than a test suite. The advantage of a type system is that it checks whether your types are consistent within the entire program rather than in merely the specific test you're running.
Only for some types of mistake, surely. If that was always the case, we'd have a compiler that could read minds.
> As for whether you're proving the right thing or the wrong thing, a type system is no less helpful than a test suite.
Really? I can write down in my test suite, "assert (add 1 1) == 2". Anyone can come along and look at that and make sure it matches the spec. We can add additional tests for various bounds, and possibly use a quickcheck-like tool in addition, and for 99.99% of use cases be happy and confident that we're at least writing the right thing.
What's the type for that and does it actually have any resemblance to "the add function adds two numbers together", or do I have to read a couple of papers and have a background in university-level math to convince myself that the proof actually does what it says?
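For reference, the test-suite version I have in mind is just a few lines (add is a stand-in for whatever is under test):

    import Test.QuickCheck

    -- Hypothetical function under test.
    add :: Int -> Int -> Int
    add = (+)

    -- The spot check anyone can read against the spec...
    prop_unit :: Bool
    prop_unit = add 1 1 == 2

    -- ...plus quickcheck-style properties for the bounds.
    prop_commutative :: Int -> Int -> Bool
    prop_commutative x y = add x y == add y x

    prop_zeroIdentity :: Int -> Bool
    prop_zeroIdentity x = add 0 x == x

    main :: IO ()
    main = do
      quickCheck prop_unit
      quickCheck prop_commutative
      quickCheck prop_zeroIdentity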
If you're modifying behavior, you will need to update the tests that break because of that, whether you write tests before or after.
This is a classic problem of externalities. You can only quantifiably judge what you are quantifiably measuring.
In this statement, you are only measuring the file couplings. I hope it's obvious that there are many other factors at play.
For example, let's look at a "typical" development iteration for a feature.
1. write the automated test
2. run the test
3. if the test passes, goto 6
4. edit the software
5. goto 2
6. done
With manual testing:
1. edit the software
2. manually test the software
3. if the software does not do what you need it to do, goto 1
TDD gives you a quick feedback loop, since the verification step is automated, at the cost of up-front time spent writing the test.
There's also automated regression testing, which is useful to prevent regressions when you change the software system.
When the software becomes complex:
* regression tests offer a quick feedback loop that scales almost linearly
* manual tests have a slower feedback loop and are often given to the QA staff, which involves communication overhead, scheduling, meetings, etc.
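As a minimal sketch of the automated loop (Haskell with HUnit; the function under test is hypothetical, and note the tests can live in the same file as the code):

    import Test.HUnit

    -- Hypothetical function under test.
    celsiusToFahrenheit :: Double -> Double
    celsiusToFahrenheit c = c * 9 / 5 + 32

    -- Step 1: write the tests. Steps 2-5 are just re-running main
    -- after each edit until the whole suite passes.
    tests :: Test
    tests = TestList
      [ TestCase (assertEqual "freezing point" 32  (celsiusToFahrenheit 0))
      , TestCase (assertEqual "boiling point"  212 (celsiusToFahrenheit 100))
      ]

    main :: IO ()
    main = runTestTT tests >>= print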
The solution there was for future languages to combine the two. Maybe we'll see a future language combine unit tests and source into the same file.
Heh, maybe such a language would refuse to compile if public functions did not have an associated test.
Sounds like my worst nightmare.
If you change an API, you will have to change consumers of the API. Does that mean that your code is bad?
This exists even in low-level examples: if you change a C++ class, you will also need to change the corresponding header file.
Or perhaps, am I misunderstanding the concept?
Now, it's affirming the consequent to say "you change two files, therefore you must have repeated yourself/coupled two things too much", but if you think those are common problems, then it will still be sound to say "you repeatedly changed these files in sync, you should look there to see if you've coupled them too tightly."
When reviewing code, we tend to judge the final output on its aesthetics. There is almost no emphasis on the process of building the software, nor much emphasis on how long it takes & how reliable the software is.
While aesthetics & clarity are important, the notion of "good" or "bad" software depends on the context of the judgement. Is it good/bad for the programmer? Is it good/bad due to the costs of development? Is it good/bad based on its flexibility toward changing requirements? Is it good/bad based on the flaws in the deployed system? Is it good/bad based on the feature velocity?
Why is software productivity so difficult to measure? Software is complex & software is created in complex situations. It is tough to get an "apples to apples" comparison when comparing complex contexts. It's like comparing two people. Is one person better than another? Usually it depends on the context...
Good catch, though. Thanks.
For reference, given some variables, a Markov network is a parsimonious way to express arbitrary covariance matrices in terms of individual interactions between groups of variables (in this case, pairs of variables). Their approach looks very similar to estimating the Maximum Likelihood or MAP graph.
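For the Gaussian case (an assumption on my part; the post may use a different parametrisation), the parsimony is easy to state: the density is governed by the precision matrix Λ = Σ⁻¹, and missing edges are exactly zeros in Λ:

    p(x) \propto \exp\!\Big(-\tfrac{1}{2}\, x^{\top} \Lambda\, x\Big),
    \qquad
    \Lambda_{ij} = 0 \iff x_i \perp\!\!\!\perp x_j \mid x_{\mathrm{rest}}

So a sparse graph describes the full covariance structure with far fewer parameters than a dense Σ, and learning the ML/MAP graph amounts to deciding which entries of Λ to zero out.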
Generally if you have to change a whole bunch of related files when you change one, it's an issue with the design.
There's a tension there with once-and-only-once, otherwise known as Don't-Repeat-Yourself.
Logical coupling crops up in lots of unexpected places as well:
    <%= partial :foo %>
Partials are functions but with no clear argument signature, so they may be used sloppily with no obvious way of determining what (interface, state expectations) they are coupled to.
Something I think is particularly interesting about this approach is that it is language agnostic. It's probably even independent of "programming". It could also be useful for general documents: if I edit wiki page A and always edit B too, should they be the same page instead?
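The core of such an analysis is small, too. A sketch in Haskell (my own; it assumes `git log --name-only --pretty=format:` piped to stdin, and the same shape would work on a wiki's edit history):

    import Data.List (sort, sortOn, tails)
    import qualified Data.Map.Strict as M
    import Data.Ord (Down (..))

    -- Group the blank-line-separated output into one file list per commit.
    commits :: String -> [[FilePath]]
    commits = foldr step [[]] . lines
      where
        step "" acc        = [] : acc
        step f  []         = [[f]]
        step f  (c : rest) = (f : c) : rest

    -- Count how often each unordered pair of files changes together.
    coChanges :: [[FilePath]] -> M.Map (FilePath, FilePath) Int
    coChanges cs = M.fromListWith (+)
      [ ((a, b), 1)
      | c <- cs
      , (a : rest) <- tails (sort c)
      , b <- rest
      ]

    main :: IO ()
    main = do
      input <- getContents
      let ranked = sortOn (Down . snd) (M.toList (coChanges (commits input)))
      mapM_ print (take 20 ranked)    -- the 20 most-coupled pairs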
Couplings between different regions of the files, however, are harder to find and require some more thought in terms of implementation.
One thing I noticed is that the scripts are written to run once and always do everything. It would be better to have a Makefile and dependencies, so that you can run it multiple times and only the changes are recomputed.
I'll see if I can push some fixes to github.
Thanks for trying this out.