
Kiln Harmony is complicated, and has lots of edge cases it needs to handle. We'll be publishing a whole series of blog posts that explore exactly how Kiln Harmony works. In the meantime, though, just because I think it might spawn some interesting discussion, here is a non-exhaustive list of edge cases we translate:

  * Git octopus merges
  * Mercurial descriptions that are not UTF-8
  * Git commits whose messages aren't in their nominal encoding
  * Mercurial and Git having invalid timestamps
  * Git having a different author from committer
  * Git and Mercurial commits and changesets with extra metadata
  * Mercurial usernames that are not valid Git usernames
  * Mercurial bookmarks that are not valid Git refs
  * Mercurial named branches
  * Git annotated tags (requires a not-quite-yet-released extension to fully round-trip; the non-annotated part of the tag works today and will be forwards-compatible)
  * Mercurial changesets/manifests/filelogs with bad parent data
  * Git trees that are just flat-out invalid
  * Subrepos and submodules (100% preservation, but we can only
    translate Git submodules/subrepos cleanly to/from each other,
    since Git submodules have to be Git)
There's more, but hopefully that gives you some idea what we were up against. Again, I'll be publishing a series of articles, beginning really soon, that explore how we handle all of the above edge cases.
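
To make one item from the list concrete (Mercurial bookmarks that are not valid Git refs), here is a minimal sketch of a validity check. It approximates the rules documented for `git check-ref-format`; the function name `is_valid_git_ref_name` is made up for illustration, and nothing here is taken from Kiln Harmony's actual translation code.

```python
import re

# Characters rejected anywhere in a ref name: ASCII control characters,
# space, DEL, and the punctuation ~ ^ : ? * [ \
_FORBIDDEN = re.compile(r'[\x00-\x20\x7f~^:?*\[\\]')


def is_valid_git_ref_name(name: str) -> bool:
    """Approximate the documented Git rules for a branch-style ref name.

    Hypothetical helper for illustration; not Kiln Harmony's implementation.
    """
    if not name or name == "@":
        return False
    if _FORBIDDEN.search(name):
        return False
    if ".." in name or "@{" in name:
        return False
    if name.endswith(".") or name.startswith("/") or name.endswith("/"):
        return False
    if "//" in name:
        return False
    # Each slash-separated component has its own restrictions.
    for component in name.split("/"):
        if component.startswith(".") or component.endswith(".lock"):
            return False
    return True


if __name__ == "__main__":
    for candidate in ["feature/login", "my bookmark", "release..1", "fix.lock"]:
        print(repr(candidate), is_valid_git_ref_name(candidate))
```

A bookmark name that fails a check like this (one containing a space, for example) cannot be used as a Git branch name as-is, so a translator presumably has to map it to something valid in a way that can be reversed later.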

I'm really looking forward to these posts - it seems like a lot of the problems you would have faced are specific examples of more generalized data integration issues. Accordingly, the approaches used may have more general applications, which would be cool to tease out from your implementations.

This sounds like a great list of the technical challenges for Kiln implementors (applause) and an awesome list of reasons for me to rationally convince my team (if I had one) to use a single DVCS.

Surely anyone intelligent enough to use git or hg should be pragmatic enough to learn the other one for the benefit of a smooth workflow, when presented with this list?

I'm actually really happy with the workflow we came up with; that's something we've been using for months now at Fog Creek with no real issues. The above is a list of data format issues, which your team doesn't have to deal with.

One nit in the article:

"Git master branch in sync with the Mercurial repository’s tip transparently. So far, so good."

Did you mean "default" instead of "tip"?

What test cases do you use? Oddly, I do not know what cases the git / hg devs use, so I would be interested in seeing test cases plus comments as much as anything.

We have a fairly robust test suite that covers all the corner cases we were able to find by trawling old/major Git repos, and then we have a Jenkins-powered server that does functional tests on a pile of major projects (Python, Ruby, PyPy, Vim, Git and Mercurial themselves, some other stuff I'm spacing) to make sure that what goes in also comes out. The test suite is kept tiny enough that we can sanely run it locally; the functional test suite takes much longer, so it just runs in batches every hour or so, I believe. I'll double-check with QA and update this if I've got that off in any meaningful sense.
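
For readers who want a concrete picture of a "what goes in also comes out" check, here is a minimal sketch of a round-trip test on commit-message bytes that may not be valid UTF-8. It relies on Python's standard `surrogateescape` error handler; the helper names and the sample data are assumptions made up for illustration, and this is not Fog Creek's test suite.

```python
def decode_lossless(raw: bytes) -> str:
    # Invalid UTF-8 bytes become lone surrogates instead of raising an error.
    return raw.decode("utf-8", errors="surrogateescape")


def encode_lossless(text: str) -> bytes:
    # The surrogates produced above are turned back into the original bytes.
    return text.encode("utf-8", errors="surrogateescape")


def failed_round_trips(samples):
    """Return every sample whose bytes change after decode then encode."""
    return [
        raw for raw in samples
        if encode_lossless(decode_lossless(raw)) != raw
    ]


# Hypothetical sample messages: valid UTF-8, Latin-1 bytes, and plain garbage.
SAMPLES = [
    b"plain ascii message",
    "caf\u00e9".encode("utf-8"),
    "caf\u00e9".encode("latin-1"),
    b"\xff\xfe not valid utf-8",
]

if __name__ == "__main__":
    failures = failed_round_trips(SAMPLES)
    print("failures:", failures)
```

The same shape scales up to whole repositories: convert in one direction, convert back, and compare against the original, reporting anything that differs.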

Super stoked to hear about those upcoming posts. What is the guiding strategy for dealing with these complexities? Hold on to everything and only present the relevant ones to the connecting client?

How about git rebasing, or force pushes in general, which rewrite history for both git and mercurial users?

(This is hard to handle for normal git users as well, but it can be nice in some edge cases)
