Hacker News new | more | comments | ask | show | jobs | submit login
CodeFlow: Improving the Code Review Process at Microsoft (2018) (acm.org)
166 points by rammy1234 16 days ago | hide | past | web | favorite | 81 comments

(former Microsoft employee here)

CodeFlow's workflow was nice, but everything else about it was _awful_. Capturing iterations, comments, threads, etc was the good stuff. The UI - eye-bleeding and amateurish. Network operations - terrible.

Despite all that, the workflow was so good that I'd take it over github CRs in a heartbeat.

As a contributor to a F/OSS project with some serious infrastructure (KDE), Github's uncontested success feels weird. It's a really anemic "solution" to the collaboration problem, especially with projects using it as their only collaboration platform. Most larger ones at least run mailing lists and a wiki or so. Some have a more sophisticated bugtracker, but most seem to stick with Github's... placeholder for one. Probably due to its integration with code review.

Github’s success stems from its simplicity imo.

Pre-Github, projects would need to fire-up their own Bugzilla or Fogbugz instances, set-up all the project tracking criteria (areas, iterations, severity, etc).

But unstructured label tags seem to work surprisingly well in practice: projects that would have otherwise demanded 20+ fields for work-item tracking now seem fine with issue labels.

> Github’s success stems from its simplicity imo.

Bingo. It surprises a lot of people how little you need to be successful if everyone's on the same page. I've worked places that had thousands (yeah, thousands) of apps, dozens or hundreds of millions line of code, and had little to no issues with collaboration (oh, they thought they did, but compared to almost any other company, they really didn't), and process/tooling involved github issues, markdown files for docs and little else. Sometimes they'd toss in trello in the mix and that's about it.

You definitely can do better, but as a baseline, it's more than enough.

If you need to put 20+ tags it's more work, because you have to figure out what tag for what field, etc. Filling 20+ select boxes is easier.

Needing to put 20+ tags IMO is a hint that something about the process is wrong ... I'd love to hear some ideas about what could _possibly_ be enabled by having 20+ tags on a given work item. What workflow could possibly need that level of detail?

I agree, but that relates to the original complains about Bugzilla users who required 20+ fields. I encountered 3-5 fields maximum and usually they were optional and I would prefer form with 3-5 proper fields to arbitrary tag string.

> Pre-Github, projects would need to fire-up their own Bugzilla or Fogbugz instances, set-up all the project tracking criteria (areas, iterations, severity, etc).

So a ready-made setup at an existing service open to multiple projects would work. I guess that's what Sourceforge was.

Sourceforge stopped innovating after they started offering SVN as an alternative to CVS (!!), everything since then was about monetising their audience (e.g. spyware injected into binary installers) and wrecking the quality of the user-experience with aggressive advertising.

Sourceforge could have been github, but when a company turns on its users while failing to deliver value it won’t last.

Because git is decentralised it means all the git-hosting services become commodity, and we’ve learned that having the best UX keeps people around - even if your service is commodity, so I expect this basis will ensure GitHub (now part of Microsoft) will stay in the community’s good-graces.

Ironically github is the centralized version of a decentralized system.

Git is distributed, not decentralized

GitHub != git

Around six years ago I switched from the Microsoft platform/stack to a fully open source stack. I remember thinking about Github as something magical, I wanted to learn Git so that I could have a repo on Github, not the other way around. The funny/sad thing is that Microsoft now controls most of the stack I switched to ...

Well GitHub isn't open source at all, so it makes sense that you still wound up with some corporate interest controlling that part of the stack.

I can recommend reviewable.io for github.

Don't look now as they're having trouble with dropping offline ...

All fixed! (And it only affected people signing in from two different browsers at the same time, which was actually a pretty small proportion -- which is why it took so long to find and fix the issue.)

>Despite all that, the workflow was so good that I'd take it over github CRs in a heartbeat.

I have to agree, here, wholeheartedly. Put a bug in that would break the build? No worries, as soon as you kicked-off a CodeFlow, it would start a prod and smoke-test build for you. Everyone could also see those breaks and block the review from completing, until you fixed them in a new iteration.

You would think that, with everyone moving (or having moved) from Source Depot to VSO/Git, CodeFlow would've falled into the deprecated products pile but because it was such a worthwhile utility, teams for entire products carried it forward, rather than relying on GitHub's CRs[1].

[1] - It became more of an amalgamation between the two [Source Depot and Git-basd-VSO] but CodeFlow was your one-stop shop to view it all.

> No worries, as soon as you kicked-off a CodeFlow, it would start a prod and smoke-test build for you. Everyone could also see those breaks and block the review from completing, until you fixed them in a new iteration.

This sounds exactly like PRs with status checks for protected branches (usually master)? Edit: Link with UI screenshots: https://help.github.com/articles/about-required-status-check...

For all of the people reading the article and trying to visualize the tool, can you summarize what about the workflow was so good?

Contrasting to Github PR its a better process but really only a marginal impact on developer productivity.

When i was there - Codeflow had the following feature

- Multiple reviewer states ... accepted/reviewing/awaiting changes/decline to review/reject change. - Multiple comment states ... pending, won't fix, fixed etc. - Iterations based code review flow - after you review - you come back easily view the next iteration (relative diff) - you can really compare back and forth and track comments through iterations - github tends to loose comments through commits - Support for bots to comment like stylecop etc

Codeflow workflow felt like an MS-Office application like Word or Outlook compared to Github - Heavy UI based workflow with no API/commandline counterpart sometimes - Desktop app - not website - Lots of knobs/states - Very easy to work back and forth in the UX

I agree with an earlier comment however - the UX was pretty ugly - the workflow account for lots of states Github doesn't but there is a nice linear flow to the progression of a github conversion that (at least at the time i was there) . IIRC it was a managed .NET - WPF application which are kindy clunky - not as nice as a website with clear links etc.

At least as of November 2018 (when I left), everything you said was still true. However!

CodeFlow was also integrated into VSTS, whose review tooling is very much inspired by CodeFlow. Reviewers could use their web browser or the client app, and it all fed into the same system.

Most of the features that set it apart from Github initially have been copied by now - comment threads with resolutions, individual reviewer states, etc, have all been recently introduced; the gap between CF and GH has shrunk substantially.

That said, CF still does a better job of retaining comment threads and associating them with lines of code. Maybe it's just different UX decisions by the product teams, but it makes a difference when reviewing complicated patches.

Differential is a pretty great open source code review tool: https://www.phacility.com/phabricator/differential/

That is sort of underselling it, Differential has the following features in common with CodeFlow:

- Easy to see the changes an author makes over time to a review

- Integrates with the Package and Ownership system to automatically assign reviewers

- Easy work on changes that touch many files

- Inline lint and code coverage results

- Built in display of unit test and CI results and overall coverage %

- Draft mode to delay review until tests & lint are clean

- Comments move properly as you make changes

- Easy to mark comments as done and find outstanding ones

- Comment on multiple lines instead of a single one

Things that are missing:

- No ability to comment on single characters

- Cannot mark certain files as important

- No way to explicitly mark a change as "simple" (like a rename)

- No web UI to make quick fixes (compared to GitHub is what is most missed)

I'm sure I'm not the only person who wishes VS Code had capability like this (which isn't to say everyone needs it or wants it, just that when it's needed and wanted it sounds amazing)

You could have VS Code plug into a system like this, but there is infrastructure that is needed to run CI, track statuses, manage comment data, etc.

Theoretically if there were a standard for managing all that data within the git source graph, you could implement this solely as a VS Code plugin.

The GitHub Pull Requests extension for VS Code is really neat. It inlines a lot of the PR workflow into VS Code.


The extension is open source and the extension points it uses are all relatively well documented. I'm almost surprised there aren't more extensions doing similar things. (I don't believe that even the Azure Repos extension currently supports an inline PR experience like this.)

Good article. I liked the analysis of how code reviews are performed at Microsoft (yes, computer science is a science!).

My team is fairly happy with github enterprise and the PR workflow but some things about CodeFlow sound better: being able to comment on a single character in a line or share comments between disparate code; and, having a local (non-web) client that runs faster.

What y'all really want is Google Critique. You just don't know it yet, because you haven't experienced it, as it's internal to Google. The closest public product that I could find is http://reviewable.io (no affiliation) which, while it looks weird/amateurish, nails most of the points which make Critique so amazing. There's also Gerritt, but it's too invasive in terms of workflow, whereas Reviewable integrates with GitHub in a more lightweight fashion.

I have worked in both Google and Microsoft and even though I see how people can be efficient in Critique, I must say I prefer CodeFlow.

In CodeFlow any comment is a floating object with a line connecting it to whatever selection it was created for and you can move it around as you like. You also can easily filter out visible comments by state, author, participation in them.. Features around comparing iterations, marking already reviewed files and similar are the same as far as I remember.

Another reason for my opinion might also be that CodeFlow is probably better optimized for large screens, but this might be caused by the weird Google standard of 80 column limit for source code which I will just never understand in the 21st century where everyone has 24" and up full HD to 4k screens.

I am comparing state of both from 2015 where I worked with both in the same year.

> this might be caused by the weird Google standard of 80 column limit for source code which I will just never understand in the 21st century where everyone has 24" and up full HD to 4k screens

even if you have a big screen, it's still useful to be able to fit two or three files side by side on a single screen. 80 chars might not make sense as a hard requirement for, eg, a c++ codebase, but it's still a worthy ideal to shoot for.

Yeah, I agree, but do 120, not hard 80 limit for every language. I remember C++ and JS code where you wouldn't fit even the for() definition in one line and that just becomes crazy

Reviewable author here -- thanks for the compliment! Redoing the UI is on the to-do list, once the migration from Angular to Vue is done.

Thanks for redoing the UI. While evaluating, I really loathed the childish style, and I had problems finding the buttons I need when I needed them. The UI is cluttered and not self explanatory. It was a non-starter for me. That said, reviewable has fans in my immediate surroundings, so the problems are not with the system design.

Thanks, I'll take that criticism to heart. :) Can you expand on "childish style", though? My design philosophy is best summarized as "minimalistic but with touches of whimsy", but perhaps that comes across as childish?

Thank you for putting it together! I switched folks over and we've been paying customers for well over a year now. I can overlook the UI deficiencies when I have proper review workflow, but the problem is that people who don't know the Critique workflow might be put off by the UI. Personally, I encourage them to give Reviewable a good try anyway. Anything larger than a few dozen lines and/or needing multiple rounds of feedback is just untenable in plain GitHub. I can say with certainty, our code would be worse if it wasn't for Reviewable.

Thanks again, I might have to quote you on the home page. :) And yes, the UI is often the sticking point for adoption. I don't know if it's just the style / toolkit (an old version of Semantic UI) and switching to something like Material Design would take care of the problem, or if it's a deeper design / UX issue. A good onboarding flow would likely help as well but I haven't figure out what that should look like yet either. More feedback and advice is always appreciated -- and if you happen to be (or know) a UX expert who knows how to fix it I do have some budget I could throw that way. :)

From my brief play with your demo it's a bit of both.

The toolkit is very small and lacks good definition of elements (both visually in that it lacks contrast, and practically in that I have no idea what is clickable or editable).

The page structure isn't that clear either, where are the column headings? And on my widescreen browser window the isn't a limit on horizontal stretch - I have to play with the browser to find a good width. Also, I can expand code, but how do I shrink/hide it?

Thanks! Sounds like switching toolkits should address many of these issues.

And no, you can't shrink/hide code, other than by re-diffing. What can I say -- it's surprisingly hard to implement and didn't seem worth it.

Will you guys support bitbucket ? Some of your older comments/support bugs have refused to support anything other than GitHub.

Sorry, not going to happen as things stand. A big part of Reviewable's attraction is the tight GitHub integration, so supporting other platforms (BitBucket, GitLab) would essentially require a fork or some pretty heavy-duty abstractions. Either way, future maintenance would become much harder, so I can't build a business case for it. Perhaps if BitBucket or GitLab were larger / growing faster, or if Reviewable was a much larger company with many developers to throw at the problem...

BitBucket doesn't have a large OSS community, but it does have a lot of larger business that need Jira and Confluence integration. So it could be worth it, just the user base isn't as visible as GH.

Definitely something to think about then, since I think larger businesses are the sweet spot for Reviewable and that's where most of the growth has been coming from in the last couple years. Still, forking the codebase _really_ doesn't sound appealing...

Abstractions are sometimes really good for understanding the business. But yeah comes with a price tag and potential mismatch to the real niche (like GitHub add on)

BTW, does anyone where to get better-than-anecdotal data on userbase size and composition for GitHub, BitBucket, and GitLab? And, ideally, growth trends?

So answering myself, with numbers pulled from https://expandedramblings.com:

GitHub: 31M users

BitBucket: 6M users, 1M orgs

GitLab: 100K orgs

The GitLab number tallies with Emily's answer, which gives me some moderate confidence that these numbers are in the right neighborhood. I'm also going to assume that the users-vs-teams numbers are all in roughly the same proportion.

Based one these numbers it doesn't make much sense to target BitBucket unless either 1) they're growing much faster than GitHub (unlikely) or 2) their customers are a much better fit for Reviewable (possible, but it would have to be a truly significant difference in userbase composition). Targeting GitLab doesn't appear to make sense at all (as a business), though perhaps they're growing fast.

I'd go off of the BitBucket plugin install counts and compare them to what you're getting on GH:


It's hard to compare, since Reviewable is an OAuth app and hence needs an install per-user rather than per-org. However, on the page you linked I see a couple apps topping out at 6-7k installs, and most of the rest are in the low hundreds. This doesn't look particularly impressive to me...

Community Advocate for GitLab here. We can't share that data publicly (also because it's hard to track since many are on open source core), but we do have over 100k orgs using GitLab

"edoing the UI is on the to-do list, once the migration from Angular to Vue is done."

Is what makes me uncomfortable about modern webdev. Valuable product work constantly deprioritized by migrations to new frameworks. Magpieing has always been a risk in hobby projects, but in the past there was a smaller torrent of nee things to always be migrating to.

Eh, I don't think it's as bad as you make it sound. It's not "constantly", it's the first major migration effort for Reviewable. And honestly, I probably could've stayed with Angular 1.x for a few more years (it's still being developed, after all), I just got tired of working around its performance issues. There's also 4 years of learning how best to architect an application like this that led to the creation of a Firebase-centric model layer where Vue happened to be a good platform to build on. And even if we didn't switch platforms, a thorough refactoring would've been necessary after many years of accumulated technical debt.

That doesn't seem particularly different that desktop software. Pretend s/he said that once they got to off of WinForms and onto WPF they could iterate on the UI design. or maybe once they upgraded GTK, or whatever.

Do you've experience with Google critique? Did you work there in past or something? How did you copy their features without seeing their product?

I worked at Google ~2006-2012. Most of my experience was with Mondrian, but Critique was coming online shortly before I left. I used it long enough that the general "feeling" of using it stayed with me, but not long enough that I actually remembered specific features to copy -- especially since I didn't start even thinking about Reviewable until ~2014. So really it's more of a "spiritual ancestor". :)

As a Xoogler, I really miss Critique.

Amazing tool. Great integration with version control, code search, continuous integration.

I’m confused, how is this significantly different than Amazons CRUX tool? Anybody been at both know?

I've worked at AWS and Google. CRUX and Critique have similar functionalities. In general there's mostly a 1:1 mapping of equivalent internal tools between the two companies, albeit each with their own flavor.

Disclaimer: I work for Alphabet.

Having only ever used CRUX (and ReviewBoard prior to it), I'm honestly surprised that most of what is discussed in the article is not common outside of Amazon.

Having used CRUX (and ReviewBoard prior to it), I'm honestly surprised it's considered an okay solution? Maybe 10 years ago it was fine, but the productivity it zaps and bad habits it encourages are a staggering waste of developer time. GitHub/GitLab is light years ahead, to the point where even small shops (can) have better code review process IMO. (Gerrit is decent, too, but the weak UI and interesting process cancels out other advantages is has over GitHub/GitLab.)

What were the bad habits in code review that you were thinking about? I'm stuck with TFVC and the code review in TFS is very poor but I used ReviewBoard about 8 years ago and thought it was the best F/OSS available at the time.

mainly for CRUX and the surrounding ecosystem: it's very hard to protect the main branch from pushes that don't meet approval, that's done afterwards during deployments. this is obviously and horrendously dumb, you really want to catch stuff before it goes in master/mainline, and having to fix this stuff before deploying it also a great use of time /s.

but it gets worse. the UI can't or won't do trivial rebases. so either you upload a new, rebased revision, and get the reviewers to re-review (because stale reviews get wiped), or just rebase, push, and mark as merged. i get you are supposed to review rebases, but this can grind a team to a halt. you're waiting for approvals, but by the time you get them someone else has pushed a new change... a colossal waste of time (edit: especially if you require 2 approvals, which i do think is good practice, and some AWS teams do it).

this also means people are trained to ignore the merge button or expect it to be greyed out. but it might be greyed out for other reason though. in GitHub, you have big green check-marks or red icons to indicate which conditions have and haven't been met (reviewers, CI, merge conflicts). in CRUX, hovering over the greyed out merge button gets you one cause at best (there is a page/tab that shows more detail, but almost nobody looks at it).

obviously it depends on your team, but it requires near constant vigilance to maintain high quality with CRUX, especially with a big team or less experienced devs contributing. process is supposed to make quality easier, not harder!

there are many other smaller issues, but that's the big one IMO. as for ReviewBoard, personally i prefer Gerrit, which itself doesn't have a great UI, but makes up for it in the workflow.

But you actually can setup rules to prevent merges unless certain people/groups have approved. It sounds like you and your team are suffering more from bad code review habits than a bad tool. Devs should review what they are publishing to their peers at least once before actually hitting the publish button. It’s a courtesy to them and their time

I personally love the Crux tool and some of its add ons.

> But you actually can setup rules to prevent merges unless certain people/groups have approved.

As detailed, that only prevents merging via the UX, not pushes to the main branch. Maybe the team does suck, but I still consider it a shortcoming of CRUX et al.

Those complaints sound like CRUX issues not ReviewBoard. Looking at multiple branches is not in scope of ReviewBoard at all.

Not to hijack the thread too much (I'm also very curious about the answer -- I'm one of the Review Board founders and these discussions always interest me), but we support TFS these days in Review Board by way of our Power Pack add-on product, if you'd like an alternative solution. I can talk to you more about this off this thread if it's at all interesting to you.

What makes Google Critique better than Codeflow?

How would I know, CodeFlow is internal to Microsoft. As far as I can tell from public sources, it's trying to solve some of the same problems (keeping track of structured, possibly multi-party and multi-round conversations over revisions), but it's a Windows app, and Critique is really Google's V2 or even V3 of this kind of thing that runs in the browser and is deeply integrated with the rest of their engineering infrastructure. I hope it eventually makes it into Google Cloud, it really is that good.

> How would I know, CodeFlow is internal to Microsoft.

Lots of people have worked for both Google and Microsoft. The article said CodeFlow was good. You said no, Critique is better! But I guess you don’t actually know one way or the other.

Which is too bad. Because it’d be great to hear thoughts from people who’ve tried both.

I have worked for both and prefer CodeFlow, my more detailed comment is above.

I see this arrogance inside Google lots.

The way we do things is the right way because that's what we do. Any disagreement is ignorance.

>> You said no, Critique is better!

Nope, I did not. I just said that people really want Critique, which they do. Even CodeFlow itself is evidence of that.

Maybe what I really want is CodeFlow. Maybe Critique is evidence that I need CodeFlow in my life! It could be that you want CodeFlow in your life, you just don't know it yet.

Seeing that I haven't willingly used Windows in the last decade and wrote zero code for it, I highly doubt it. I would, however, pay for Critique+Blaze+Forge if Google offered it on Google Cloud. It's an incredible force multiplier.

Have you used Codeflow? It seems pretty amazing as described. I'd like to see input from someone who worked at both Microsoft and Google to chime in.

I've worked at both. My memories of CodeFlow are a bit faded but I don't remember it doing anything that Critique doesn't. Critique has the advantage that it's hosted on the web, while CodeFlow was a janky Win32 application.

I worked at both, but I left MS before CodeFlow was created. An earlier version of what is now Critique (called Mondrian) was already in use at Google when I joined, and predated me by several years. There might have been something similar before it, I don't know.

I have worked for both and prefer CodeFlow, my more detailed comment is above.

What were the points that made Critique amazing?

Since this thing is coming from Microsoft, I really doubt the tool or the whole process can work outside of Microsoft. It's likely very much tailored to Microsoft needs and full of idiosyncrasies of the Microsoft development workflow.

Here's an example of how things usually are at MS. I worked at MS awhile ago. At that time Git was just being developed, and MS used it's own VCS based on Perforce (which is pretty good, I should say for a centralized VCS). Code reviews were done over email. Kind of like Linux kernel development. If you wanted somebody to review your code, you used a tool to pack your changes into a *.cmd file which contained a diff itself and a Perl (!) script to unpack and view the diff. Then you sent this file to a reviewer over email, and he just ran it on his machine to see the changes you had done. This tool was simple, limited in features, and brilliant. It was hacked by somebody on the Windows NT team back in the 90s. Even if you couldn't unpack a diff into your repository checkout, you could still open the pack as a text file and review the changes, or manually apply them, since it was just a freaking text file.

Come the time MS decided to migrate everything to .NET. Some team at MS similar to TSE responsible for development tools decided that the old Perl tool was inadequate. You don't need no stinking Perl scripts! They developed a replacement in C#. Full of features you can't live without and maybe even an integration with the dev environment (I don't remember). The new tool was distributed in the binary form. The code changes for review were packed into a binary format with plenty of metadata in addition to the actual diff. The problem? Only this tool knew how to unpack this binary format. It worked most of the time and when it worked it was great. But when it didn't work, you were screwed. If it couldn't unpack your code change for any reason, it would throw some C# exception. You couldn't fix the exception and rebuild the tool (you didn't have access to the tool source code or knew how to build it). You couldn't extract the change yourself, since it was packed in a binary format only the tool could extract. You pretty much lost your change if you didn't store it elsewhere.

And this is generally how things were (and still are?) done at MS. They have teams of project managers analyzing and spec'ing out every bit of the development process. Then they create extensive tools to support it with elaborate UIs and automation of everything they can think of. But if you need to do something just a little bit differently, you're left out in the cold because those tools are too close and too brittle to adapt to your specific needs.

> (and still are?) done at MS

It sounds like the DevOps winds have shifted in the last few years at Microsoft. Even Windows (!) is now in git and properly dogfooding Azure DevOps (Boards, Repos, et al), according to blog posts and BUILD talks.

Yup. I needed to search some Windows code so I simply went to DevOps and typed my query. Results were lightning fast and it handled a complex query; I like DevOps for the search if nothing else.

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact