Curiosity as a Service – Literally (medium.com)
55 points by KentBeck 17 days ago | 5 comments

"An hour-long project ensued extracting data from our code review tool and correlating number of reviewers with time-to-review. Turned out the optimal number of reviewers to minimize time-to-review was between 1 and 2. Any more reviewers and delay grew substantially."

This is off topic from the point he's making, but it bothered me that he assumes correlation implies causation, haha. In my experience, it's when a review is already stalled that people start tossing on lots of reviewers.

Regardless, the article runs through some fun thoughts :)

Yeah; there are a bunch of other variables I'd love to see thrown into a broader multivariate analysis, if somebody was going to really dig into that topic in a real (i.e.: more than 2 hours) way.

"Diffusion of Responsibility" provides a rational, well-understood mechanism by which setting more reviewers might result in longer review times, as the post suggests. But we're still talking about relatively small numbers of people here; who knows how strong the DoR effect is in this case, relative to other variables, such as the complexity of the PR? My intuition is that number of reviewers is more likely to turn out to be a multiplier on other factors affecting code review duration; that having lots of reviewers on a tiny trivial change might make the review go even faster (since it's easy, and more eyes means it might be spotted earlier), whereas having lots of reviewers on a massive complicated change might slow it way down (since everyone puts it off, hoping that someone else will handle it).

It would be really interesting to crank up a big statistical analysis with lots of extra columns of data, to see the strength and amount of correlations between other factors. My money is on "bigger changes lead to longer code reviews and also independently lead to more reviewers being set on the pull request". Which.. would be kind of a boring result which everyone already intuitively knew, admittedly. But at least then we'd know for sure.

Or at least, for p<0.05 sure.
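As a hedged sketch of what that multivariate analysis might look like (on entirely made-up synthetic data, with illustrative variable names and coefficients): regress review duration on both reviewer count and change size at once, so the reviewer-count coefficient is estimated with change size controlled for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data for illustration only: change size drives BOTH the
# number of reviewers assigned and the review duration.
change_size = rng.exponential(200, n)                   # lines changed
reviewers = 1 + (change_size / 300).astype(int)         # bigger changes get more reviewers
hours = 2 + 0.01 * change_size + rng.normal(0, 1, n)    # duration depends on size only

# Fit duration against both predictors simultaneously via least squares.
X = np.column_stack([np.ones(n), reviewers, change_size])
coef, *_ = np.linalg.lstsq(X, hours, rcond=None)

# With change_size in the model, we'd expect the reviewer coefficient
# (coef[1]) to shrink toward zero, even though reviewer count and
# duration are correlated when looked at on their own.
```

If the reviewer coefficient survives after controlling for change size, that would be (weak, observational) evidence for a real effect rather than the boring confound.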

Had the same thought! Perhaps he should hold a randomized trial instead -- for every review draw from [1, 6] and add that many reviewers. Although he'd also have to keep his "pinging" behavior consistent...
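The assignment rule described above is simple enough to sketch; here's a minimal version, assuming a hypothetical `assign_reviewers` helper and a list of candidate reviewer names (none of this is from the article).

```python
import random

def assign_reviewers(candidates):
    """Randomized-trial assignment: draw a reviewer count uniformly
    from [1, 6] and pick that many distinct reviewers at random."""
    k = random.randint(1, 6)        # uniform draw from [1, 6], inclusive
    k = min(k, len(candidates))     # can't assign more reviewers than exist
    return random.sample(candidates, k)
```

Each pull request would get its reviewer count by chance rather than by the author's judgment, which is what breaks the size-of-change confound.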

I'm doing this right now. I got tired of the limitations with GitLab CI's caching, so I spent this weekend writing a custom caching layer that uses my own Google Cloud Storage bucket [1]. There are plenty of more valuable things that I could be working on, but this has been really fun.

[1] https://gist.github.com/ndbroadbent/d394f8a6890eddcaeafe9223...
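The gist has the actual implementation; purely as a hedged sketch of the general shape (assuming `gsutil` is available in the job image and `$CACHE_BUCKET` names a GCS bucket — both illustrative, not the author's actual setup), a hand-rolled cache in `.gitlab-ci.yml` might look like:

```yaml
# Hypothetical .gitlab-ci.yml fragment: pull a cached node_modules from a
# GCS bucket before the build, push it back afterwards. Bucket name and
# paths are made up for illustration.
build:
  before_script:
    - gsutil -m cp "gs://$CACHE_BUCKET/node_modules.tar.gz" . || true
    - tar -xzf node_modules.tar.gz || true
  script:
    - npm install
    - npm run build
  after_script:
    - tar -czf node_modules.tar.gz node_modules
    - gsutil -m cp node_modules.tar.gz "gs://$CACHE_BUCKET/node_modules.tar.gz"
```

The `|| true` guards let the job proceed on a cold cache instead of failing.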

I'd be happy to be one of the curiosity cogs... if there's an autonomous fuel tank for this, I haven't found it yet. More seriously, I think that 80% of my work is exploratory and driven to a large degree by curiosity. I get paid really well for the other 20% because the quality of that output is driven by the investigation done outside of production.
