Analyze Github commits to know which convention is popular (sideeffect.kr)
63 points by sethbannon 1431 days ago | 44 comments

Maybe this is intended to be obvious, but it is not to me. Is this actually counting the number of commits that show a particular convention, or is it showing the number of occurrences of said convention?

The web site states "number of commits" (without really discussing methodology) which is, frankly, not at all interesting to me. Is a commit of 100,000 LOC weighted equally to a commit of one LOC? Are these projects internally consistent or no? Do projects eventually become internally consistent?

Number of occurrences, by the looks of it. Each match (per line) increments its counter by one.
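To make the difference concrete, here is a minimal sketch of per-line counting, assuming each line of a commit is tested against one regular expression per convention. The patterns and the `count_conventions` name are made up for illustration and are not taken from the site's code.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration; not the site's actual rules.
PATTERNS = {
    "tab": re.compile(r"^\t+\S"),
    "space": re.compile(r"^ +\S"),
}

def count_conventions(lines):
    """Increment a convention's counter once for every line that matches it."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

sample = ["\tfoo()", "    bar()", "  baz()", "qux()"]
print(count_conventions(sample))
```

Counted this way, a 100,000-line commit moves the totals far more than a one-line commit, so the numbers describe lines rather than commits or programmers.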


Thanks, I was wondering the same thing. I feel much better knowing that 93% of lines are kept to 80 characters than 93% of programmers never use more than 80.

The site seems to stop after having one of the language icons selected. There's a "Loading..." graphic, then it disappears, then nothing. Is anyone experiencing the same issue?

Perhaps some kind soul can put up screenshots of whatever should be seen? (At least for Python & Javascript, which is what I'm interested in.)

I can finally see the charts. Whoever made the necessary tweak, thanks!

What is the methodology for 'Line length is over 80 characters'? Most lines of code have fewer than 80 characters, no matter what the conventions are. What would be interesting is the percentage of files that have not a single line over 80 characters, compared with the rest. That would make it more obvious who doesn't care about the 80-character limit.
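A rough sketch of that file-level metric, assuming the full file contents are available as a mapping from file name to text (the `share_of_compliant_files` name and the sample data are hypothetical):

```python
def share_of_compliant_files(files, limit=80):
    """Return the fraction of files in which no line exceeds `limit` characters.

    `files` maps a file name to its text; both are hypothetical inputs.
    """
    if not files:
        return 0.0
    compliant = sum(
        1 for text in files.values()
        if all(len(line) <= limit for line in text.splitlines())
    )
    return compliant / len(files)

files = {
    "short.py": "x = 1\ny = 2\n",
    "long.py": "x = 1\n" + "a" * 81 + "\n",
}
print(share_of_compliant_files(files))  # 0.5
```

A single long line disqualifies a whole file here, which is what separates projects that enforce the limit from projects that merely tend to write short lines.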

For Python, it seems to count each tab as 4 spaces and then take the character count of the line. https://github.com/outsideris/popularconvention/blob/master/...
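In Python terms, that check would look roughly like the sketch below. It assumes each tab is simply replaced by 4 spaces (rather than advancing to the next tab stop, as `str.expandtabs` would); the function names are illustrative and not from the linked repository.

```python
TAB_WIDTH = 4
LIMIT = 80

def effective_length(line, tab_width=TAB_WIDTH):
    """Length of a line with every tab counted as `tab_width` spaces."""
    return len(line.rstrip("\n").replace("\t", " " * tab_width))

def is_over_limit(line, limit=LIMIT):
    """True when the tab-expanded line is longer than `limit` characters."""
    return effective_length(line) > limit

print(effective_length("\tx = 1"))  # 9
```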

It seems to only look at commits (so it doesn't consider the entire file, though that's certainly possible to do).

Nice charts. I'd be curious about number of spaces, particularly for JS: 2 spaces seems more common in the Node community, but I see a lot of 4 as well. Perhaps it has to do with whether the author is more comfortable with Ruby (usually 2) or Python (4).

My current company goes against the grain by using snake_case instead of camelCase in JavaScript code. It weirded me out at first, but considering our backend is Python, it's nice not to switch conventions.

I've come to really enjoy using the conventions of the language to keep things clear when writing css-rules that are intermixed with javaScriptVariables talking to a ruby_backend emitting jsonEncodedObjects.

Keeping things js-named helps keep my mind in the path of JavaScript conventions instead of Ruby ones in situations like array_of_things.pop and arrayOfThings.pop()

Yeah, I completely agree. It starts to get a bit confusing when you have a BackEndInCSharp that goes through a javaWebServer before reaching iOS. We tend to have a hard time being consistent once we get to iOS, but that's due also to the fact that we've just merged three different teams (each without a specific standard...)

Yeah, that's true too. Just looking at the positive side of a convention that's not mine to change :)

Douglas Crockford conventioned 4 spaces for indentation so that's what I do.


Cool idea. I would love to see more languages and conventions added. I will say the background image at the top (the code with the focus explosion effect) gives me a headache though. The site looks better without the image.

Python's PEP8 recommends underscores for functions, method names and instance vars, but the stats are showing a break down of camelCase with and without first capital letter (which shouldn't really be relevant).
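For reference, telling the three styles apart takes only a few regular expressions. This is a sketch with illustrative rules, not the site's classifier; single-word names such as `count` are ambiguous and deliberately fall into "other".

```python
import re

def classify_name(name):
    """Classify an identifier by naming style (illustrative rules only)."""
    if re.fullmatch(r"[a-z][a-z0-9]*(_[a-z0-9]+)+", name):
        return "snake_case"
    if re.fullmatch(r"[A-Z][a-z0-9]+([A-Z][a-z0-9]*)+", name):
        return "PascalCase"
    if re.fullmatch(r"[a-z][a-z0-9]*([A-Z][a-z0-9]*)+", name):
        return "camelCase"
    return "other"

for name in ["array_of_things", "arrayOfThings", "ArrayOfThings", "count"]:
    print(name, classify_name(name))
```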

Are you sure you are looking at the Python page? I don't see a category for variable names.

Clicking on Python returns the Scala page here. I cannot get to the Python page.

It's wrong for Java. "final" is not what is usually called a constant there, "static final" is. Also, who ever uses special prefixes for statics?!

I feel bad for tab-lovers. A crushing defeat :)

The problem is that a lot of "tab-lovers" mean "pressing Tab", which in most editors will insert spaces.


I think most tab lovers mean "tabs to indent, spaces to align". The 'unholy third option' here: http://www.codinghorror.com/blog/2009/04/death-to-the-space-...

Delightfully this is what Go has settled on.

(Who actually presses their tab key anymore? Even vim does that for you.)

It's not that bad in Go though because you should just run gofmt every time you save. It's bad when you have to do it by hand.

I don't see why it should be bad to do by hand. I have never experienced any difficulties with this style when using any other language in Vim.

Many people do not enjoy the rote task of administering whitespace by hand, and consider it a problem better left to computers. I'm sure there were some people who thought maintaining ledgers was no problem before spreadsheets. Or, not to put too fine a point on it, who thought that auto-indenting editors were a pointless luxury. That doesn't make your opinion wrong, but I think you'll find it's rare in a few years.

It is a task perfectly suited for automation though... I'm not sure why you think it should be any harder to automate than any other style. Find your base indentation level, insert the proper number of tabs. Find your beautification alignment level (already done by most autoindenters. Any autoindenter that doesn't do this is leaving this task to the user anyway regardless of the scheme being used...), insert the proper number of spaces.
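The two steps can be sketched in a few lines, assuming the indentation level and the alignment column have already been worked out by the editor; `format_line` is a hypothetical helper, not any editor's API.

```python
def format_line(code, indent_level, align_columns=0):
    """Tabs for the indentation level, then spaces for alignment."""
    return "\t" * indent_level + " " * align_columns + code.strip()

# A call whose continuation line is aligned under the first argument.
first = format_line("result = compute(alpha,", indent_level=1)
# Alignment width = position of the first argument within the code text.
align = len("result = compute(")
second = format_line("beta)", indent_level=1, align_columns=align)
print(repr(first))
print(repr(second))
```

Because the tab count encodes only the nesting depth, the two lines stay aligned whatever tab width the reader's editor uses.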



IntelliJ also has this capability, and likely other editors as well.

I prefer to do it by hand, but that is nothing but my personal preference. It has nothing to do with the scheme itself. If someone is opposed to doing things by hand, they don't need to do this by hand.

I was thinking the same thing. Not necessarily as easy as counting spaces...

Why should it matter what people think? There is no defeat. Why does it even matter what happens? We shall prevail, and it is inevitable.

I'd like to see this metric for C. I expect much higher tab-usage there.

I think there is a problem with "Type of Parameters Definition" for Javascript.

  Followed by space
    def add(a:Int, b:Int) = a + b
  Using space in before/after
    def add(a:Int, b:Int) = a + b
  No space
    def add(a:Int, b:Int) = a + b
I'm not seeing any difference between those three.

Very cool idea, always interesting to see this kind of data.

My only feedback would be to put the language names as well as logos. I had no idea what those last two languages were, and waiting for the hover title text is a bit tedious (not to mention some people may not realise you can do that)

Interesting, though at no point does the site actually say what language it's analyzing. I recognize the JavaScript and Java icons, and I'm guessing that since the third one is a snake, it's Python, but I have no idea what the last one is.

It's Scala. The icons have a title attribute.

Yeah, that seems like a problem to me, too. (In case you're interested, the last one is Scala.)

Am I the only one who couldn't recognize the Scala logo till I saw the HTML source?

Mystery meat navigation! I couldn't identify Python or Scala.

Good idea. Needs more conventions. And more languages of course.

Agreed, very cool idea. Would love to see it expanded upon.

If your favorite language isn't here, another trick you could use is adding 'site:github.com' to your Google searches.

There is a typo, for Java the last item says "Use special prefix fot staticvar"

Strikes me as odd that it doesn't offer Ruby as a choice, when GitHub started life and became popular catering to Ruby/Rails devs.

The stats are the same for Scala and Python ...

And for JS. Whoops :P
