Hacker News new | comments | show | ask | jobs | submit login
Github reviews as a way to improve code quality? (antirez.com)
55 points by antirez on Mar 17, 2011 | hide | past | web | favorite | 33 comments

This doesn't sit well with me. We are talking about open source software! Let's assume I have some cool little gem I wrote that works well for me, but maybe not the ideal solution for everyone. However, there is enough utility in it that I want to push it up to github. Someone can take it and make it better, using my base as a start.

I believe the above scenario is common. But now I have to fear being reviewed on something that wasn't meant to be a complete work anyway? You don't like it...branch it and make it better! The community will win! People will end up using your version instead of mine if I don't pull the changes back in. Isn't that the way this is supposed to work?

Maybe I'm misunderstanding this post entirely. It's hard enough for a lot of us to post our work publicly - then to have it scrutinized and attached to my github profile seems unfavorable to me. You don't like my stuff - don't use it! Have I done something incorrectly? Email me and let me know! Github seems to be doing just fine w/o this feature and the good projects somehow rise to the top.

BTW - I'm not opposed to being critiqued on my code. I just don't know if a public rating system is the way to do it! And I do believe I would be more reluctant to release anything if I felt it could be rightly criticized for things I know it lacked. Make sense?

Maybe a simple:

  - tests [5 stars]
  - documentation [5 stars]
  - code quality [5 stars]
  - usefulness [5 stars]
And, rating could simply get enabled/disabled if you don't think it has value for a quicky hacky project.

But I get your point, I've got the real bad habit of not wanting to upload not finished stuff.. and since code is rarely finished, I tend to upload way less project than I should. A rating system might only get the matter worst.

You make a good point. It's very easy for people to take things out of context ... and for that matter having a single rating on a scale of 1-5 implies that everybody sees it the same way and so could wind up penalizing little gems that work well in very specific circumstances.

As somebody who loves to reuse code I do like the basic idea of having a way to find out what others think of it. The simplicity of a 1-5 scale has a lot of appeal, but my gut instinct is the downsides may outweigh the advantages.

The test culture of Ruby helps a bit...

Uh, right. Ruby folks have no idea how little of a test culture they have. Doesn't stop them from complimenting themselves though. What, you say, why do I say this about Ruby? Consider the following facts.

- Ruby itself has fairly bad unit tests.

- By default you don't run tests on gems when you install them.

- Many gems that have tests, have tests that only make sense on the developer's machine and can't easily be run elsewhere.

- There is no automated process for identifying which gems are doing a standard list of good things (have documentation, tests, clear license terms, properly identified dependencies...) and which are not.

- There is no easy way for someone with a spare box and an interesting environment to contribute their machine to tracking down which gems fail their tests in your environment.

If you're a Ruby developer, you may be saying, "This sounds like a nice wish list, but look at everything we do..." Stop right there. Everything on this list, the Perl community has been doing for years. And Perl folks are doing all of the other good things that you were going to talk about.

In fact the state of the art of Ruby's "test culture" does not even match the test culture among good Perl developers during the last millenium. You don't believe me? The standard CPAN client has defaulted to actually running unit tests since Perl 5 was released in the mid-90s. And the core Perl test suite back then was much more comprehensive than Ruby's is now.

But there is good news. It is easier to copy someone else's working thing than it is to write your own in the first place. Perl got this working several years ago, and has had time to iron out the kinks. If you are interested, feel free to copy.

http://test.rubygems.org addresses all your concerns related to RubyGems, except that it’s an opt-in thing — you must configure RubyGems to run tests upon gem installation, and the results are optionally uploaded for use by the gem author.

A pearl from Perl, of course.

That is nice.

It is too bad that it is opt-in. And it doesn't immediately fix the problem that some authors write tests for themselves only. But it is definitely a good step in the right direction.

Have you seen http://www.rubyspec.org/?

I had not. But it doesn't seem to be up right now.

There seems to be a curious lack of history here. I'm not a Perl programmer at all, but even I know that CPAN has had reviews and ratings and this sort of stuff for forever (in Internet terms).

It seems to me that this blog post could be replaced by a single line - and would be better, actually - if that line was 'GitHub, please do that review stuff however CPAN is and has been successfully doing it for years'.

Bonus points if he knew enough to explain how CPAN avoids all the obvious failure modes raised by commenters here.

Instead of making it anonymous, it should just be tied to your github account.

Make it known who rated what project and for how much (and only once per project, obviously). Then the project owner can message those people to change their rating once they fix the documentation or whatever they go the bad rating for.

Yes possibly this is the only possible alternative...

One solution to potential issues with this would be to limit the visibility of the reviews/ratings to just the project owner.

There's two main benefits to reviews/ratings:

1. The project owner would get an assessment of his project

2. Potential users/committers would know the "quality" of the project

Making the reviews/ratings private would eliminate #2, but it would also solve most potential spamming, gaming, and stale issues caused by making them public.

In this case you could also just as well promote the existing message feature that's already a part of Github. Promote conversations instead of a one-way "rate this" system.

Honestly I saw testing as part of the problem from time to time, with the attitude "if it passes tests it can be merged". Testing is useful but not so powerful, unfortunately.

I get into this argument with my coworkers from time to time, and I have to say I disagree a million percent. Tests are the only way you can see if the software is going to work.

After you make a change, how do you make sure the software works? By trying it out? Congratulations! That's testing! Now you just need to invest some mental effort in teaching the computer how to do that test for you, and now you can automatically achieve the same level of safety that you had before.

I'm all for code reviews too, but code reviews do not scale the way that tests do.

I'm not against tests, but passed tests is not enough to merge some code from a programmer you don't know very very well, you really need to review the code in a very careful way, and this possibly may take more time or the same time as writing it from scratch...

Yep. Everything you said makes loads of sense. Seems provably true. But I'd be honestly curious how many people have noticed a correlation between a large unit-test suite and the quality of the software they use? I haven't. Of the projects whose source I've been exposed to it feels like I've dealt with as many bug-riddled projects that had lots of tests as I have solid projects that lacked them.

I think projects get off course... teams can develop a culture of 'having unit tests' instead of a culture of 'writing high quality software'. The way the TDD/XP camp talk about software bums me out. As if the test they decided (key word) to write set the software on some Manifest Destiny vector to perfection.

Again, I agree with you. Unit tests are awesome, and the benefit as scale is huge.. but I took antirez's comments as a shot across a different bow.

> I think projects get off course... teams can develop a culture of 'having unit tests' instead of a culture of 'writing high quality software'. The way the TDD/XP camp talk about software bums me out. As if the test they decided (key word) to write set the software on some Manifest Destiny vector to perfection.

The tricky part is that anything can become a cargo cult rallying point. Mediocre programmers can drive anything into the ground. Even a seeming no-brainer like the Law of Demeter can be misapplied to create byzantine systems if someone knows how write code, but isn't very good at it. The situation is ten times worse with test code because it's meta—there's no objective measure of quality for test code. At least with application code it's obvious for any user whether it works or not, but with test code it's all about how much manual testing time was saved, or how much smoother refactorings went.

I've long been skeptical of the claim that TDD or whatever makes you write better code. The way I see it, TDD forces you to put in the time to learn how to write automated tests. Once you are good at it, tests help you move faster by covering regressions and documenting your intentions. In Ruby tests can give a nice sanity check that is built into strongly-typed compiled languages. I truly believe in TATFT, but at the end of the day tests are still meta, and you can't polish a turd.

Really well put, and I should start off by saying I really like TDD. A lot. (also.. for anyone like me who had to look it up.. TATFT = Test All The Fucking Time)

>Mediocre programmers can drive anything into the ground

Sure, totally. But the XP/TDD cult gives them easy to follow instructions for doing so. There's almost a lowest-common-denominator talent-smoothing mantra behind some of these cultures. I think plenty of projects, teams, and people don't end up finding their potential as developers because they're obsessed with a religion that favors obedience to a silly process over intelligent, clever, elegance.

Okay. At this point we're probably drifting steadily off topic :)

Whoops, just forgot to add that new option into the docs.

That's the kind of thing he's talking about, not just the code itself behaving according to strictly defined unit tests, but the project as a whole, and its health.

1) Does it work 2) Is it being updated 3) Are the docs good enough to get my job done as a user 4) Can I tweak it to my specific use case? Options? Source code patch?

The big problem here is that once you address the issues you have to get a bunch of people to change their old reviews.

Why? Just let the project owner hit a button that says "This issue has been addressed" or "This issue is being worked on", adding an optional note, which marks the review accordingly somehow. Then you can look and see how proactive the project owner is in addressing his feedback.

1,392 people rated my project as having 1 star for documentation. I've made documentation better. A few people have now rated it four stars while some still believe it's one star and don't see the point of changing their votes from one star to one star.

I think it's a great idea to show users what parts of a project are weak (it'd help them understand how they can help), but I have the same problem with reviews on Apple's app stores, Android Market, Amazon, etc...

Some of those are handled a bit by the complaints being about previous versions, but stuff on github isn't necessarily versioned.

I guess I am skeptical that a "star" rating could ever be meaningful for Github projects. Frankly, the whole thing seems like a bust regardless if you have a thousand reviews; what kind of meaningful information could you get from a thousand reviews all talking about different versions, different use cases, different kinds of "quality"?

However, for the majority of long tail projects that might have a couple pages worth of reviews, it would be useful feedback, and the project owner could address each review individually with ease.

It may make sense to provide much more weight to the latest N reviews, and a lot less to older, but still to add them into the mix. So your past will weight as well but you can correct your problems and get a better average review.

I like this idea. Software changes over time, so I always have to double-check the date when reading other people's reviews. Especially if the software is developed at a very fast rate, such as Redis.

Another option would be to provide a history of average star ratings, similar to the 52-week activity graph. Then people would be able to see, for example, that most reviews have turned positive since the v2.5 release, etc. Tie reviews to versions and/or commits, and allow people to post another review after X weeks, so that the same person can post different reviews of different versions of the same program.

There could also be something like a time-to-live for a review; say after half a year the mark disappears and you can/should rate the project anew at based on current state.

I like this idea. I've often found projects that are useful or interesting, and clicking a "Follow" button doesn't seem like a fitting action. Ratings would let the community reward good developers and let the not-so-good developers know what they're doing wrong.

To give more "weight" to the system, GitHub users could have their average project ratings displayed on their profiles, and rankings / leaderboards could be generated. Users could receive badges, maybe: "Documenter of the Month — March 2011", etc.

One-to-five rating systems have many problems--gaming, relevance decay, shifting the economic incentive to focus on getting the rating/karma up which may or may not have a correlation to creating value. The only star rating system that I've seen that's actually useful to a consumer is Amazon--but Amazon deals with finished, immutable works and products, not evolving, living software, where every single individual rating becomes stale with the next commit.

To address what antirez needs--an at-a-glance, summary assessment of the quality of the project--github could provide a standard rating of a number of dimensions. Project followers, recentness of commits, forks, merge history, code analytics, presence of tests, volume and activity on documentation, etc. This is still subject to gaming, but here's how you fix that: allow each github user to express their own values/constants for the various measures, and allow you to assess any project based on what that user values: "What would antirez say about a project with no documentation, lots of downstream forks, great testing coverage?" Now you've introduced a trust metric, correlating values of people you trust with projects you might be interested in.

Think OKCupid meets GitHub, with bulk rating of projects rather than potential mates.

How about letting us 'un-watch' a project until a specified bug is resolved?

I think this makes a lot of sense, but it'd be a bit complicated to do it well.

One thing I'd want is the for the older votes to carry less weight. For example, if one of my projects gets voted way down due to lack of documentation, I'd want a documentation effort to be able to counter those votes that may come from people who don't come back to adjust them.

I wouldn't mind if there was a way for people to review my code. I might not need star ratings, but maybe a way for someone to make a case why doing something different or the same way would be a good idea. In terms of value I would see something more along the lines of something of more of a 'discussion'.

It's also interesting for people to be talking about testing on this thread. There might be some good points made but you might miss the point/concept that was posed in by the OP.

I'd rather that Ohloh implemented something like this.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact