Show HN: Simple way to access various statistics in Git repository (github.com/arzzen)
74 points by arzzen 7 months ago | hide | past | favorite | 26 comments



eeeeewwwww no no no no no. lines of code/number of commits/number of changelog items != productivity. If you are a lead or a manager, read this message.

Software development progress is anything but linear, and any attempt to impose a metric that does not reflect actual results will achieve nothing except success at that pointless metric.


You don't have to use it as a measure of productivity. I find running these kinds of stats on my code bases fascinating. This tool looks really neat imho, great work OP.


> You don't have to use it as a measure of productivity

What I've found is this is well understood by technical people, but not so much by non-technical people.

I won't go into too much detail, but to make a long story short, Amazon, Google, Apple, and other tech giants are scaring the shit out of non-traditional tech companies in the financial industry, health care, etc. The reason is that the tech giants are moving into different verticals, and the non-tech companies realize they need to develop software faster, so they are desperately looking for ways to overcome their inability to manage and attract tech talent.

So for them, when they see GitPrime, GitHub Insights and other similar solutions, they get very excited, as they believe they have found the magical translator that can help them better manage developers. This is why software metrics are rarely adopted by tech companies but are in high demand at non-tech companies, of which there are MANY. For example, I was talking to a developer at a food storage company that develops in-house software to manage its automation systems. It is these non-traditional tech companies that really want to be able to quantify developer productivity, which is why lines of code, commits, and the like can be so dangerous.


> help them better manage developers.

The only way to better manage developers is to have a manager who used to be a developer.

If that's impossible, the best option left would be to just trust the developers on their reports.

All of the other options (which, I would say, are what these traditional companies take 99% of the time) inevitably result in a mess.


They do realize this is exactly the type of behavior that is going to push overperforming (10x) developers to switch to real tech companies, right?


The ones who are likely to believe that metric-watching will dramatically improve things are unlikely to believe in the existence of 10x developers (and they’ll have no first-hand evidence to contradict that belief).


You're making very wild and invalid projections here.


How so?

When I took time off from my first startup (which is the foundation for my current startup), I ended up working for a fintech with close to 1 trillion dollars in assets and my job was to help them transform into a technology company. And they were literally rewarding people (gave them gift certificates) for having the most commits, without even asking what the commits were for. Note, the company that I was working for wasn't the only large institution that is afraid of tech giants like Amazon and others.

Also Pluralsight paying 180M for GitPrime and GitHub paying a decent amount for Gitalytics signals that there is demand for software metrics.


Good software isn’t built this way though, so it won’t stick around. Analytics like these can be really useful, but the companies with the best developers won’t get away with abusing them.


This is typically the pattern. To quote a tech manager "I like software metrics but if I implement it, my engineers will revolt" when talking about one of the most popular software metrics solutions today.

GitHub Insights has turned into a hot topic for GitHub internally and if you use wayback to look at GitHub's enterprise sales page from a year and a half ago when they acquired Gitalytics to today, you will find they are pretty much sweeping GitHub Insights under the rug because they know how developers feel about software metrics.

Software metrics can be useful for everybody (which is what I am working on), but right now, too many non-technical managers just want to "tell if an employee is dicking around or not". Until these non-traditional tech companies lose talent, they will assume everything is okay. And based on what I've seen at the fintech that I worked for, many employees will just accept things, since it's not like Apple, Google, Facebook and others will fight for them.


You should not use it as a measure of productivity. I wrote my message to warn people who might get that idea.


Disclaimer: I'm the founder of GitSense which is working to make software metrics a good thing for both leaders and developers, so assume I have a bias.

> lines of code/number of commits/number of changelog items != productivity

This is the biggest challenge that I'm faced with right now, as GitPrime (now Pluralsight Flow), GitHub Insights (formerly Gitalytics), Waydev and others have been very effective at giving the impression that you can easily roll up metrics to quantify developer productivity. You would be surprised by how many times people want a simple score or a simple graph to summarize how productive a developer is.

What really bugs me about lines of code/number of commits/etc. metrics is that they are actually really useful for developers, but because they are used incorrectly, it takes a lot of effort to explain to people that context matters. For example, if I want to know which developers contributed to the src/vs/base directory in the vscode project in the last 30 days, lines of code/number of commits/etc. are quite useful. See the following for an example of what I mean:

https://public-001.gitsense.com/insights/github/repos?q=path...
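For anyone who wants to try that kind of path-scoped query with plain git, here is a minimal sketch in Python. It is not how GitSense does it. It assumes a local clone and only uses `git log` with `--since`, `--numstat` and `--format`. The function names `parse_author_stats` and `path_stats`, and the `@` marker used to tag author lines, are made up for this illustration.

```python
import subprocess
from collections import defaultdict


def parse_author_stats(log_text):
    """Aggregate commits and added/deleted lines per author.

    Expects the output of `git log --numstat --format=@%an`, where each
    commit starts with a line "@<author name>" (the "@" marker is our own
    convention) followed by tab-separated "added<TAB>deleted<TAB>path" lines.
    """
    stats = defaultdict(lambda: {"commits": 0, "added": 0, "deleted": 0})
    author = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            author = line[1:]
            stats[author]["commits"] += 1
            continue
        parts = line.split("\t")
        if author is None or len(parts) != 3:
            continue  # blank separator lines and anything unexpected
        added, deleted, _path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-\t-\tpath"
        stats[author]["added"] += int(added)
        stats[author]["deleted"] += int(deleted)
    return dict(stats)


def path_stats(repo_dir, path, since="30 days ago"):
    """Run git in repo_dir and summarize who touched `path` since `since`."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", f"--since={since}",
         "--numstat", "--format=@%an", "--", path],
        capture_output=True, text=True, check=True,
    )
    return parse_author_stats(result.stdout)
```

Calling something like `path_stats("vscode", "src/vs/base")` on a local clone would give a per-author dictionary for that directory. The numbers only say who touched the path and how much. They say nothing about why, which is the context part.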

If you switch to the impact view, you can quickly tell based on code churn, commits and other metrics who had the greatest impact. I talk a bit more about impact in the following post, so I won't repeat things here:

https://news.ycombinator.com/item?id=26457072

Right now what I'm working on is making a connected efforts graph that should help management better understand how much effort it takes to develop code, which I'm hoping will displace the notion that you can quantify productivity with low hanging code metrics. For example, when you look at a code change, you should be able to see that it is connected to x number of meetings, x number of emails, x number of code review comments, and so forth.

When it is all over, tracking every line change is a good thing as it benefits everybody, but how it is used today out of convenience is certainly presenting challenges for what I'm working on.

Edit: Do not install my tool as the docker image has an out of date license that I need to update.


It's funny because this week I'm at -2000 lines of code after refactoring, removing redundant slow tests code and implementing some small new features. Guess I'm fired.


Not if the metrics are done right! Code churn is always positive since it's "lines added + changed + deleted". Honestly, your example is exactly why software metrics are a good thing for developers.

If you run blame across the file, it won't be obvious that you deleted a bunch of code, but the code churn history will make it obvious.
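To make the churn versus net difference concrete, here is a rough sketch in Python. It assumes you feed it `git log --numstat` (or `git diff --numstat`) output. Git's numstat has no separate "changed" count, since a modified line shows up as one deletion plus one addition, so this approximates churn as added + deleted. The function name `churn_and_net` and the file names in the usage note are made up.

```python
def churn_and_net(numstat_text):
    """Return churn (added + deleted) and net (added - deleted) line counts.

    Each input line is expected to look like "added<TAB>deleted<TAB>path".
    Binary files ("-<TAB>-<TAB>path") and any other lines are skipped.
    """
    added = 0
    deleted = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        if not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        added += int(parts[0])
        deleted += int(parts[1])
    return {"churn": added + deleted, "net": added - deleted}
```

For a week with 200 lines added and 2200 deleted (say `150\t2100\ttests/slow_test.py` and `50\t100\tsrc/feature.py`), net is -2000 but churn is 2400, so the refactoring shows up as work instead of as negative output.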


eeeeewwwww no no no no no. Why does every cute stat need to be converted into a metric to enslave people? Yes, I know, "because that's what managers do". But then I would argue it's on "us" (the programmers) to not let it happen. Yes, I know, "you cannot do that in big corp". Well then, there's no point in complaining about anything either, since we already assume that it's a de facto situation, take it for granted, and believe nobody can do anything about it.

Still, as a counterpoint of how this can be fun: when a fellow programmer left the company where I work, the CTO created a small video of the evolution of the services/code based on his commits over time. You could see the different repos popping up and showing his contributions to each as they grew. Also, since he is (or was) a smoker, he got a Zippo engraved with the number of commits and lines of code he contributed. Does it mean he was good? He was bad? He did a lot? He did nothing? Noup. Of course, he worked for 5 years at the company, so it was a way to show him that he contributed a lot and was part of its growth/existence, not whether he was the most valuable programmer on the team. If you ask me, maybe many of the lines he committed don't exist anymore in the current code; still, it was a snapshot of what he was for the team and company at that point. And the value was not in the code, but in being part of the team and the process of a company growing.

You can do whatever you want with stats; it's on you whether it's for good or not.


It dumps branch info too.

Also a good example of how to write a large-ish shell script with a simple user interface. I may use this to combine some smaller related scripts that our team uses.


Eh, there's a point where commits _could_ be used for metrics, assuming those being monitored are unaware. I dislike the LARP of "oh yeah, it takes me 7 hours to think deeply about the problem of changing this SQL statement, and then one hour and one commit to implement it."


If you are writing a SQL statement that has a high impact on the business, take 7 hours to think about it if you need to.

If you are suggesting someone is tooling around for 7 hours and then working for 1... sure, if it is a pattern in the person's behavior, but don't use time-to-commit as the measurement. The issue is not time to complete. The issue is a lack of engagement, and that data can be gathered in less harmful ways.

Additionally, I would refrain from getting so granular about how someone spends each hour of every day. That trope is sure to deter talented people.


Also try "gource" on the CLI for a great live video: https://gource.io/


I do like the burndown chart showing code as layers over time: https://github.com/src-d/hercules#project-burndown

Like other stats, it is not to be taken too seriously on early projects where re-linting or moving lines around may show as dropping all old code...


I didn’t see code velocity, which is a good stat to use for teams. I haven’t tried it but this repo seems interesting on a similar note:

https://github.com/jph98/github-pr-stats


Nice!

It would also be cool to have a multi-repository mode so that you can get aggregated stats for multiple repos.


Collecting git statistics is fun and not easy. I miss the equivalent of `hg churn`.


Obligatory post for every thread about code metrics: https://www.folklore.org/StoryView.py?story=Negative_2000_Li...


Another: “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” - Bill Gates


This is good information. I will add it to my favorites.

(Only a few older nerds will get that)

It is nice. I think it will be useful. Thanks!



