Hacker News
The Biggest and Weirdest Commits in Linux Kernel Git History (destroyallsoftware.com)
80 points by gary_bernhardt 1 hour ago | 11 comments





There is a mention of the 66-parent merge from Linus himself:

http://marc.info/?l=linux-kernel&m=139033182525831



>Anyway, I'd suggest you try to limit octopus merges to ~15 parents or less to make the visualization tools not go crazy. Maybe aim for just 10 or so in most cases.

I'll file this under "problems I'm glad I don't have".



> "Christ, that's not an octopus, that's a Cthulhu merge"



Whoa, you're right! He referenced it neither by 7-character short hash nor by full hash, and I didn't think to check intermediate hash lengths. I've updated the post to reference that email.



I think Gary's commit counts are off:

  $ git log | wc -l

This should count the number of lines in the entire git log, including metadata (not just commits). I think he means this:

  $ git log --oneline | wc -l

The number of commits for Rails should be closer to 61,000.
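
To make the difference between the two counts concrete, here is a minimal Python sketch. The log excerpt is hand-written and hypothetical (it only mimics the shape of default `git log` output), and the helper names are made up for illustration.

```python
# Hypothetical, hand-written excerpt in the shape of default `git log` output.
SAMPLE_LOG = """\
commit 1111111111111111111111111111111111111111
Author: A. Developer <a@example.com>
Date:   Mon Jan 1 00:00:00 2018 +0000

    First change

commit 2222222222222222222222222222222222222222
Author: B. Developer <b@example.com>
Date:   Tue Jan 2 00:00:00 2018 +0000

    Second change
"""


def count_lines(log_text: str) -> int:
    """Roughly what `git log | wc -l` reports: every line, metadata included."""
    return len(log_text.splitlines())


def count_commits(log_text: str) -> int:
    """Count only the unindented header lines that start with 'commit '.

    Commit messages are indented in the default format, so they cannot
    be mistaken for headers here.
    """
    return sum(1 for line in log_text.splitlines() if line.startswith("commit "))


print("lines:", count_lines(SAMPLE_LOG))
print("commits:", count_commits(SAMPLE_LOG))
```

Each commit contributes several lines of metadata and message, which is why the raw line count lands far above the commit count.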



Clauset, Shalizi, and Newman (2007) have not-nice things to say about the classic physicist's idiot trick of fitting power law distributions by drawing a straight line on a log-log graph: it's hugely biased. https://arxiv.org/abs/0706.1062

However, the other difficult thing about power law distributions is that the amount of data needed to properly establish that something is a power law is sometimes enormous. So their critique is very strong, given the comparative lack of data. Often even computer systems, with their overflowing reams of data, don't have enough. Note that the paper I cited up there suggests MLE and then a Kolmogorov-Smirnov test, so it'll say a lot of things aren't power laws that could well be.
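
As a rough sketch of the MLE-then-KS idea, here is the continuous-case estimator in Python on synthetic data. The big assumption is that `xmin` is known in advance; the paper also estimates the cutoff and computes a goodness-of-fit p-value by resampling, and none of that is shown here. The function names, sample size, and seed are made up for illustration.

```python
import math
import random


def sample_power_law(n, alpha, xmin, rng):
    """Draw n values from a continuous power law p(x) ~ x^(-alpha), x >= xmin,
    using inverse-transform sampling."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha_mle(xs, xmin):
    """Continuous maximum-likelihood estimate:
    alpha_hat = 1 + n / sum(ln(x / xmin)), over the tail x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def ks_distance(xs, alpha, xmin):
    """Largest gap between the empirical CDF of the tail and the
    fitted power-law CDF, F(x) = 1 - (x / xmin)^(1 - alpha)."""
    tail = sorted(x for x in xs if x >= xmin)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model = 1.0 - (x / xmin) ** (1.0 - alpha)
        # Compare against the empirical CDF just before and just after x.
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = sample_power_law(50_000, alpha=2.5, xmin=1.0, rng=rng)
alpha_hat = fit_alpha_mle(data, xmin=1.0)
print("alpha_hat:", alpha_hat)
print("KS distance:", ks_distance(data, alpha_hat, xmin=1.0))
```

With tens of thousands of clean samples the estimate sits close to the true exponent. The data-size worry above is about what happens when the tail holds only a handful of points, as with very large octopus merges.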

Another way to look at it is from a more geometric point of view. The metric entropy of a system is tied to its positive Lyapunov exponents: under Pesin's identity it equals their sum, and as an "entropy" that quantity does have a lot in common with the other entropies. But to have positive Lyapunov exponents is often to have chaotic dynamics, so one could conjecture that the time series of commits and octopus merge sizes in kernel git history is chaotic, and so the evolution of the time series will be fractal in nature.
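
For a feel of what a positive Lyapunov exponent means, here is a small Python sketch on the logistic map. The map is a textbook stand-in chosen because its derivative is known in closed form; it says nothing about git history, where no such formula exists. The function name and parameter values are assumptions for illustration.

```python
import math


def logistic_lyapunov(r, x0=0.4, burn_in=1_000, steps=100_000):
    """Estimate the Lyapunov exponent of f(x) = r * x * (1 - x) as the
    orbit average of ln|f'(x)|, where f'(x) = r * (1 - 2x)."""
    x = x0
    for _ in range(burn_in):  # discard the transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(steps):
        # The tiny floor avoids log(0) if x ever lands exactly on 0.5.
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
        x = r * x * (1.0 - x)
    return total / steps


print("r = 3.2 (periodic):", logistic_lyapunov(3.2))
print("r = 3.9 (chaotic): ", logistic_lyapunov(3.9))
```

The periodic case gives a negative exponent and the chaotic case a positive one. Doing the same for a measured series, with no formula for the derivative, is the hard part.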

But it's also really fucking hard to confirm or deny that one, because there are varied and strange definitions of chaos itself and the methods that have been suggested to measure Lyapunov exponents in real systems are arcane and difficult. You could try some synchronization methods, but they remain arcane and crap. Fractal measurement methods are also shitty and full of dark magic.

One neat little trick might be to discretize the series, symbolic dynamics-style (it's already discretized, but discretize further, into like percentiles or something) and run it through one of the dynamical machine learning dealies to see if there are patterns. Not too much literature on that, but it's a thing that some randoes in like 2004 or something did.
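
A minimal Python sketch of just the discretization step: bin the series by quantiles, turn it into a symbol sequence, and count symbol-to-symbol transitions. The series is hypothetical (made-up merge parent counts), the function names are invented, and the transition table is only a crude stand-in for whatever learning method would come next.

```python
from bisect import bisect_left
from collections import Counter


def quantile_edges(values, n_bins):
    """Interior cut points taken from the sorted data, aiming for bins
    of roughly equal size (heavy ties will make them unequal)."""
    ordered = sorted(values)
    return [ordered[(k * len(ordered)) // n_bins] for k in range(1, n_bins)]


def symbolize(values, edges):
    """Map each value to a bin index in 0 .. len(edges).
    A value equal to a cut point goes into the lower bin."""
    return [bisect_left(edges, v) for v in values]


def transition_counts(symbols):
    """Count how often symbol a is immediately followed by symbol b."""
    return Counter(zip(symbols, symbols[1:]))


# Hypothetical series, e.g. parent counts of successive merges.
series = [2, 2, 3, 2, 8, 2, 2, 4, 2, 66, 2, 3]
edges = quantile_edges(series, n_bins=3)
symbols = symbolize(series, edges)

print("edges:", edges)
print("symbols:", symbols)
print("transitions:", dict(transition_counts(symbols)))
```

Because most merges have two parents, the lowest bin swallows most of the series, so the choice of bins matters a lot for data this lopsided.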



I removed the "power law" language from the post, other than a quick "this is often called a power law, which probably isn't correct" note. I don't want to get bogged down in statistics when all I really care about here is "it's fat and one sided".



GitHub's logo always reminds me of the octopus merge; not sure if it was chosen for this reason, but I think it's quite suitable.



Slight article nitpick: a distribution that 'looks like a straight line' in a log-log plot is often not power-law distributed.

One could say that the distribution has a fat one-sided tail though.



I thought about trying to clarify this, but "power law" is the term that's been thrown around to describe this effect in software systems for many years. Really, I only care about the fat-one-sided-ness (for its practical implications); I don't care so much about the precise mathematical formulation.



Has anyone asked Laxman Dewangan what he was up to with that initial commit and merge thing?




