Hacker News

For a scientific paper, you are correct.

For practical or business purposes, this is a nice bit of incomplete information to help make a decision. I want to take a serious, time-invested dive into a new statically compiled language, but which one should I pick? An old die-hard or the new-hotness? I could make a guess from reading the docs and such, but I'd also want to know community activity and support. This is a handy chart for getting a sense of that.

Or, I'm a business owner who just hired my first engineer. He's saying the backend should definitely be built in Groovy, or maybe he'd be willing to do Scala or something else, but definitely Groovy, yeah, Groovy man. I might be able to get a better idea of which would be beneficial for my long-term business prospects (hiring more engineers, etc.) by checking out a chart like this, as I might not have time to do real in-depth research.

As a scientist you require complete, sound and accurate statistical data. As a business practice (this site is about start-ups, no?) you need to be comfortable making serious and important decisions based on incomplete and possibly inaccurate information, because making fast decisions is often paramount. You can and will always make other fast decisions later, and decide whether it's worth the effort to course-correct when new and more accurate information calls for it.

This is maybe too deep an analysis of a fun little infographic, but as a former professional poker player who made a living off of incomplete information, you got my hackles up.




> "For practical or business purposes, this is a nice bit of incomplete information to help make a decision. I want to take a serious, time-invested dive into a new statically compiled language, but which one should I pick? An old die-hard or the new-hotness? I could make a guess from reading the docs and such, but I'd also want to know community activity and support. This is a handy chart for getting a sense of that."

No, not really, because you've no idea what assumptions are baked into the data. For some decisions, you can make fast, gut-based ones. For others, you need to take a much more considered and scientific approach. The difference can be defined by the ability to course-correct after the fact (the harder it is to course-correct, the more stringent the decision-making process should be). There's an entire academic (and military) discipline around decision-making processes, and with good reason. People want to make good decisions as well as quick ones.

Anyone making business-critical decisions based on this chart, without doing the extra work to understand the data, is basically lying to themselves. That's why vanity metrics and data-porn should be handled with extreme caution.


> I'd also want to know community activity and support. This is a handy chart for getting a sense of that.

The entire chart? Wouldn't the first column be sufficient? Number of repositories gives you some idea about language popularity.

Well, kind of: there's a hype bias here. Obviously, the choices behind open-source projects on GitHub aren't representative of the industry. It's the software world's avant-garde, if anything.

And even so, that's just one parameter out of five, and it can very well be considered in isolation from all the rest.

I wouldn't make business decisions based, for instance, on the average number of open issues. Because it's an outcome of many different variables. So how would you know what it means? Is high good? Is high bad?

Interrelations between data - shown by this clever chart - are even more mysterious.

TeX has a very high number of pushes per repository (second best), while there are fairly few repositories, and they are rarely forked.

At the same time, R has a low number of pushes (second lowest of all), whereas it wins in the "new forks per repository" category (#1).

What do you make of that, businesswise?


I'm talking a piece of the puzzle, not the entire pie.


> For practical or business purposes, this is a nice bit of incomplete information to help make a decision

Titbits of incomplete information are often placed as a result of publicity campaigns. In the specific case of the GitHub source info for this graphic, the languages near the bottom of the list can easily have that information manipulated by their backers scamming the stats. All it takes is one change pushed during the measured period for a project to be registered as active, a data point which I know is being scammed for at least one language near the bottom of the list.


Oh, which one?


> For practical or business purposes, this is a nice bit of incomplete information to help make a decision.

Is it? Does the fact that people open lots of issues in C++, Rust and Scala make you more or less inclined to pick one of those for your new project? Why?

I'm all for making the best use we can out of incomplete or noisy data, but that stat doesn't tell you anything; it's just a number.


> Does the fact that people open lots of issues in C++, Rust and Scala make you more or less inclined to pick one of those for your new project? Why?

I have seen "There are two thousand open issues, do they ever fix any bugs?" on a few occasions.



