However, the other difficult thing about power law distributions is that you often need an enormous dataset to establish that something really is a power law. So their critique is very strong, given the comparative lack of data. Even computer systems, with their overflowing reams of data, often don't produce enough. Note that the paper I cited up there suggests MLE followed by a Kolmogorov-Smirnov test, so it'll say a lot of things aren't power laws that could well be.
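To make the "MLE, then Kolmogorov-Smirnov" recipe concrete, here is a minimal sketch. It assumes a continuous power law p(x) ~ x^-alpha above a known cutoff `xmin`; the function names and the synthetic data are mine, not from the cited paper, and it computes only the KS distance, not a p-value.

```python
import math
import random


def fit_power_law(xs, xmin):
    """MLE of alpha for p(x) ~ x^-alpha on x >= xmin, plus the KS distance."""
    tail = sorted(x for x in xs if x >= xmin)
    n = len(tail)
    # Closed-form maximum-likelihood estimate for the continuous case.
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    # KS distance: largest gap between the empirical and the fitted CDF.
    d = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        d = max(d, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return alpha, d


def sample_power_law(n, alpha, xmin, rng):
    """Inverse-transform sampling from the same continuous power law."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


rng = random.Random(0)
for n in (50, 20000):
    alpha_hat, ks = fit_power_law(sample_power_law(n, 2.5, 1.0, rng), xmin=1.0)
    print(f"n={n:6d}  alpha_hat={alpha_hat:.3f}  KS distance={ks:.4f}")
```

The standard error of the estimate shrinks only like (alpha - 1) / sqrt(n), which is one way to see why small samples make the question so hard to settle.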
Another way to look at it is from a more geometric point of view. The metric entropy of a system equals the sum of its positive Lyapunov exponents (Pesin's identity, under suitable smoothness conditions; in general that sum is only an upper bound), and as an "entropy" that quantity does have a lot in common with the other entropies. But to have positive Lyapunov exponents is usually to have chaotic dynamics, so one could conjecture that the time series of commits and octopus merge sizes in the kernel's git history is chaotic, and so the evolution of the time series will be fractal in nature.
But it's also really fucking hard to confirm or deny that one, because there are varied and strange definitions of chaos itself, and the methods that have been suggested for measuring Lyapunov exponents in real systems are arcane and difficult. You could try some synchronization methods, but they remain arcane and crap. Fractal measurement methods are also shitty and full of dark magic.
One neat little trick might be to discretize the series, symbolic-dynamics style (it's already discrete, but discretize it further, into percentiles or something) and run it through one of the dynamical machine learning dealies to see if there are patterns. There's not much literature on that, but it's a thing that some randos did in 2004 or so.
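The discretization step could look something like this. It's only a sketch: the bin count, the toy series, and the helper names are all made up for illustration, and the bigram count stands in for whatever learning method you'd actually feed the symbols to.

```python
import bisect
import statistics
from collections import Counter


def symbolize(series, n_bins=4):
    """Map each value to the index of its quantile bin (0 .. n_bins - 1)."""
    cuts = statistics.quantiles(series, n=n_bins)  # n_bins - 1 cut points
    return [bisect.bisect_right(cuts, x) for x in series]


def bigram_counts(symbols):
    """Count symbol-to-symbol transitions, the simplest 'pattern' statistic."""
    return Counter(zip(symbols, symbols[1:]))


# Hypothetical merge sizes alternating between small and large.
sizes = [1, 50, 2, 40, 3, 60, 4, 70]
symbols = symbolize(sizes, n_bins=2)
print(symbols)
print(bigram_counts(symbols))
```

Quantile bins keep every symbol roughly equally frequent, so any structure that shows up in the transition counts comes from ordering rather than from the heavy tail of the raw values.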
I've not been fired yet for saying that.
It was also published in Science. Sean also founded Quid.
I thought I was taking crazy pills when I saw it. I wondered how could this guy not know better and how could no one else around him know either.
Lot of people who don't like reading in complex systems, I find. Also a lot of people who aren't sufficiently cynical.
Mean of a Pareto (power law) distribution whose shape parameter is at most 1: infinity.
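For reference, the divergence falls straight out of the Pareto density. A short derivation, assuming the standard parametrization with scale x_m > 0 and shape alpha (my symbols, not the original comment's):

```latex
f(x) = \frac{\alpha x_m^{\alpha}}{x^{\alpha+1}}, \quad x \ge x_m
\qquad\Longrightarrow\qquad
\mathbb{E}[X] = \int_{x_m}^{\infty} x\, f(x)\, dx
             = \alpha x_m^{\alpha} \int_{x_m}^{\infty} x^{-\alpha}\, dx
             = \begin{cases}
                 \dfrac{\alpha x_m}{\alpha - 1}, & \alpha > 1, \\[1ex]
                 \infty, & \alpha \le 1.
               \end{cases}
```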
For example, if you have this sort of multiplicative central limit theorem with no lower bound, where you imagine independent variables being multiplied together - the attractor of that dynamical process, the result of that universal phenomenon, is lognormal.
Add a lower bound, bam, power law (Champernowne 1953, "A Model of Income Distribution"). So you're really making different claims about reality.
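A toy simulation shows the difference. This is a sketch under my own assumptions (Gaussian log-steps with a slightly negative drift, a hard floor at 1.0, arbitrary walker and step counts), not Champernowne's actual model.

```python
import math
import random


def multiplicative_walk(steps, rng, mu=-0.05, sigma=0.3, floor=None):
    """Multiply together `steps` independent lognormal factors, starting from 1.0.

    If `floor` is given, the running product is pushed back up to it whenever
    it drops below, which plays the role of the lower bound.
    """
    x = 1.0
    for _ in range(steps):
        x *= math.exp(rng.gauss(mu, sigma))
        if floor is not None and x < floor:
            x = floor
    return x


rng = random.Random(1)
WALKERS, STEPS = 1000, 300

# No lower bound: log(x) is a sum of i.i.d. terms, so x is lognormal.
free = [multiplicative_walk(STEPS, rng) for _ in range(WALKERS)]
# With a lower bound: values pile up near the floor with a long upper tail.
floored = [multiplicative_walk(STEPS, rng, floor=1.0) for _ in range(WALKERS)]

mean_log_free = sum(math.log(x) for x in free) / WALKERS
print("unbounded: mean log(x) =", round(mean_log_free, 2))
print("floored:   min =", min(floored), " max =", round(max(floored), 2))
```

Without the floor the logs just drift off as a Gaussian sum. With it, the process settles into a stationary distribution whose upper tail is approximately Pareto, so the two setups really are different claims about the mechanism.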
I'll file this under "problems I'm glad I don't have".
Merge: 4332bdd 88d7bd8 88d7bd8
Author: David Woodhouse <email@example.com>
Date: Sun May 8 13:23:54 2005 +0100
Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Here's how I would do it:
time git log -m --first-parent --shortstat --pretty="%H" --min-parents=2 |
grep -v '^$\|3e1dd193edefd2a806a0ba6cf0879cf1a95217da' |
sed 's/.* file.* changed,//' |
sed 's/insertion.*,/+/' |
sed 's/deletion.*//' |
sed 's/insertion.*//' |
sed 's/^\ \(.*\)\ $/\$\(\(\1\)\)/' |
xargs -d '\n' -L 2 echo echo |
sh |
sort -k 2,2 -g
Of course "--first-parent" doesn't guarantee that we're walking the mainline (see: https://developer.atlassian.com/blog/2016/04/stop-foxtrots-n... ), but it usually does.
On my laptop it takes 3 mins 30 seconds. Here are the 5 biggest merges by this definition:
099bfbfc7fbb 2015-06-26T13:18:51-07:00 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
3f17ea6dea8b 2014-06-08T11:31:16-07:00 Merge branch 'next' (accumulated 3.16 merge window patches) into master
ce519e2327bf 2009-01-06T17:04:29-08:00 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
7ea61767e41e 2009-09-16T08:11:54-07:00 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
f063a0c0c995 2010-10-28T12:13:00-07:00 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
Whereas OP's definition counts every line ever created on both sides (e.g., counts every line of code in the original project as well as the orphan at the moment of the merge).
git diff master...topic
I find the three-dot diff notation very useful, but confusing, and always have to type "git help diff" first and stare at the docs a bit to find this bit:
"git diff A...B" is equivalent to
"git diff $(git-merge-base A B) B"
Perhaps git should throw a warning when you try to do an octopus merge with more parents than an octopus has legs. If you really want to proceed, add the --cthulhu option. The default behavior would be --no-cthulhu.
Showing 126 changed files with 14,128 additions and 20,617 deletions.
(ok, I'm pretty proud of reducing code size by 6k+ lines while improving lots of stuff, but the commit is a shitshow)
$ git log | wc -l
$ git log --oneline | wc -l
git rev-list --count HEAD
titan:~/src/linux geofft$ git log --oneline 566cf87 | wc -l
The etymologically correct plural is octopodes. (Some people accuse "octopodes" of being pedantic, but as I see it "pedantic" is just a euphemism for "correct in a way I don't like".)
The book "Octopus: The Ocean's Intelligent Invertebrate" which I have here says this on the matter: "By the way, the plural of octopus isn't octopi, because the word is Greek -- octopus to be exact -- not Latin. The Greek plural would be octopodes, but we call them octopuses."
You're not being pedantic as it pertains to this word in the English language, you're just wrong.
The word "octopus" comes from Greek, just like the word "corpus" comes from Latin (and has the etymological plural "corpora").
And if it's prescriptive, how far do you want to go, etymologically? According to Wiktionary, the term originates from a Proto-Indo-European language (but doesn't give plural forms for those roots).
And isn't it also the case that once a word gets accepted into a language, it becomes a word in that language, no matter where it comes from? There are _several_ examples of such words, in all the European languages. The word "common", for example, dates back to Latin "communis", and I'm pretty sure the adverbial form for that isn't "communisly", so why the exception for Octopus?
Here's the German conjugation of "mailen" (writing an e-mail), borrowed from "to mail":
Ich maile, du mailst, er/sie/es mailt, wir mailen, sie mailen.
I don't know any loanwords that break English pluralization rules in German, but for the reverse: the correct plural for "Kindergarten" would be "Kindergärten" (not "kindergartens"), which I imagine some English speakers would have problems with. And "Autobahnen" is rather unintuitive compared to "autobahns".
The plural of 'Baby' is 'Babys' (instead of 'Babies' – though some people also use that form) and 'Computer' doesn't change.
On the other hand, both 'Indices' and 'Indexe' are used and for 'Tempus' the only plural is 'Tempora'.
The English word "journal" comes from French, but if you're talking in English about two systemd log files, they're "journals", not "journaux". (If you're talking in French, they are in fact "deux journaux".)
One could say that the distribution has a fat one-sided tail though.
I built a small web UI where developers could select and unselect development branches, and it would octopus-merge all selected branches into the master branch and force-push that state onto the QA branch (and deploy it to QA, of course). So QA would always be master + all development branches that were currently being verified. Using a GitHub webhook, it would update the QA system whenever master or one of the branches being verified was pushed to. I'm not on that team anymore, but I think that deployment tool is still humming along nicely.
My guess is that Nvidia has some internal Git hosting tool or a private GitHub account, and this developer clicked the most obvious "create a repo" button and tried to push their local git clone of Linux to it. That push was rejected because it would have clobbered the auto-generated commit, so they did a `git pull` and a `git push`, i.e., they merged in the auto-generated commit.