
Mercurial 4.0 Sprint Notes - steveklabnik
https://groups.google.com/forum/#!topic/mozilla.dev.version-control/nh4fITFlEMk
======
agentgt
I know this is slightly tangential, but I'm always a little shocked that
Facebook (and I think Google to some extent) have massive mono repositories.

The benefits of having one repository do not seem to be worth the serious
performance issues, the potential coupling that can happen with a gigantic
code base, and the added difficulty of open-sourcing certain parts.

e.g. why doesn't FB use dependency management (or binary package management
such as maven, npm, etc.)? That is, have multiple repositories and cut
releases, and build tools to help cut and manage the releases and the
dependency graph between them.

There are even plugins and shell scripts that will make Mercurial act like a
mono repository for many small repositories (I use it for our own Java Maven
code base).

I must be missing some killer features and would love to see it in action (FB
major repository).

~~~
jcranmer
Anyone who is old enough to remember the pain of VCSes like CVS and RCS will
note that the number one feature touted to move to any other VCS is invariably
"atomic changesets"--all the necessary changes to files are listed in a
single changeset. Monorepos are nothing less than remembering the value of
those atomic changesets.

One example of the utility of monorepos is automation--if you change,
for example, how you publish packages, and you need to maintain multiple
stable branches, it is immensely useful to keep the automation steps in the
same repository as code. If you don't, your automation repository then looks
like

    
    
        if version < 31:
            step_a()
            step_b()
            step_c()
        elif version < 35:
            step_a()
            step_bv2()
            step_d()
        else:
            ...
    

which quickly grows unmaintainable at scale.
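
A minimal sketch of the alternative, assuming each branch carries its own step
list in a hypothetical `automation/steps.txt` file (the file name, the step
names, and the registry are all made up for illustration). The driver runs
whatever the checked-out branch lists, so it needs no version checks:

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical location of the per-branch step list inside the checkout.
STEPS_FILE = Path("automation") / "steps.txt"


def load_steps(checkout: Path) -> List[str]:
    """Read step names, one per line, skipping blanks and '#' comments."""
    text = (checkout / STEPS_FILE).read_text()
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def run_steps(checkout: Path, registry: Dict[str, Callable[[], None]]) -> List[str]:
    """Run each step the branch names, in order; return the names that ran."""
    ran = []
    for name in load_steps(checkout):
        registry[name]()  # a KeyError means the branch names an unknown step
        ran.append(name)
    return ran
```

Changing the publishing steps then becomes an ordinary commit on the branch
that needs it, landing atomically with the code it applies to.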

~~~
brazzledazzle
I can only imagine that the only way to scale that would be to make whoever
owns the repo responsible for automation as well. A dedicated team can provide
the tooling/frameworks (basically a "golden path"), but whether and how to use
them would need to be up to each team. I think in a general sense this is the
approach Netflix takes to services.

That said I have no idea how Netflix does source control, only that they use
BitBucket/Stash on-prem and that supports both mercurial and git.

------
the_duke
I had no idea Google and FB were dabbling with Mercurial.

I checked it out years ago, but pretty much settled on Git.

What are the advantages?

~~~
kyrra
1) The .git directory doesn't play nicely with mono-repos. Since all files are
just hashed files that live in the .git dir, knowing which files in there are
part of a subtree is hard. On the other side, Mercurial .hg dir uses a tree
structure to track files, so you can do things like NarrowHG[0].

2) As well, Git has multiple client implementations (like git, egit, jgit,
etc.). Adding new features is a bit more complicated, as all the
implementations need to add them before they can be more widely used.
Mercurial has one implementation that everyone uses, so new features are
easier to add.

3) The .git structure is simple, which is great, but it's become the API for
git in a way, whereas Mercurial explicitly says you should never rely on the
structure of the .hg directory. If you want to interact with the .hg dir from
other software, you should either issue 'hg' commands or start up a command
server[1] to talk with it. So it creates a cleaner API barrier. Because of
this, the Mercurial team can make changes to the .hg dir to better serve
different needs (like those of a mono-repo), without breaking the world.

[0]
[https://bitbucket.org/Google/narrowhg](https://bitbucket.org/Google/narrowhg)

[1] [https://www.mercurial-scm.org/wiki/CommandServer](https://www.mercurial-scm.org/wiki/CommandServer)
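
As a rough sketch of the "issue 'hg' commands" route from point 3, the snippet
below shells out to `hg` instead of reading anything under .hg. The helper
names are made up for illustration, and it assumes `hg` is on the PATH;
`HGPLAIN` is set so that user configuration does not alter the output:

```python
import os
import subprocess
from typing import List


def hg_command(repo_path: str, *args: str) -> List[str]:
    """Build an argv list that runs hg against the given repository."""
    # -R/--repository points hg at a repo without changing directory.
    return ["hg", "-R", repo_path, *args]


def current_node(repo_path: str) -> str:
    """Ask hg for the full hash of the working directory's parent."""
    argv = hg_command(repo_path, "log", "-r", ".", "--template", "{node}")
    env = dict(os.environ, HGPLAIN="1")  # stable, script-friendly output
    result = subprocess.run(
        argv, check=True, capture_output=True, text=True, env=env
    )
    return result.stdout.strip()
```

Spawning a process per call is slow for chatty tools, which is the case the
command server is meant for.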

~~~
avar
The .git file format has also changed multiple times: packed refs, multiple
pack file formats, etc. There are even WIP ref backends now which store the
whole thing in some embedded database format.

~~~
alblue
There is no ".git" file. It's a directory with a lot of files. Some of the
packing has been storage optimisations but the logical model (objects
identified by hash) has remained the same throughout.

The nice thing about this is you can present the same logical model while
being flexible about the way that model is persisted, unlike Mercurial which
has a fixed file format upon which operations are based.
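
To make "objects identified by hash" concrete, here is a small sketch that
computes the ID git assigns to a file's contents, assuming the classic SHA-1
object format (the function name is made up for illustration). The ID depends
only on the content, not on whether the object is stored loose or in a pack:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git would assign to a blob with this content."""
    # A blob is hashed as: "blob <size in bytes>\0" followed by the raw content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known ID.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

This should match what `git hash-object` reports for the same bytes.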

------
drewg123
What about speed?

Our team just moved from hg to git for an enormous project that has ~25 years
of history (CVS -> SVN -> hg|git). The biggest improvement to my daily life is
that a git pull takes seconds, while an hg pull takes minutes (or even large
fractions of hours when I've spent a week or two away from work).

~~~
PretzelFisch
You don't have much history in git at the moment.

~~~
AbacusAvenger
Didn't he say he imported a repository with 25 years of history?

------
shoover
That was a busy meeting and the developer mailing list is very busy. It's
great to see continued investment from so many interested parties.

Judging by the notes in the wiki [1], however, the purveyors of my preferred
server, Kiln, are not so engaged lately:

> Available hosting solutions: Bitbucket, Kallithea (self-hosted), Kiln (still
> exists?)

I believe it is maintained, and even if it were not it would continue to work
for ages, but I suspect we will be held to 3.x for a while.

[1]: [https://www.mercurial-scm.org/wiki/4.0sprint](https://www.mercurial-scm.org/wiki/4.0sprint)

~~~
marcinkuzminski
What's your best Kiln feature? I wonder if we can adopt it at RhodeCode

~~~
johnjuuljensen
Kiln's biggest omission, which is what made me choose Bitbucket, is the lack of
branch pull requests. Kiln's commit-based PRs are useless in a feature branch
workflow.

Bitbucket's code review implementation doesn't handle big changes gracefully
though, so I primarily inspect the changes in Beyond Compare.

~~~
gecko
What's hilarious/sad to me is that Kiln started out (as the prototype that won
Django Dash) doing _only_ pull requests. That was literally all it did.
Pushing directly to a repo would instead, behind-the-scenes, make a branch
repo and put your commits _there_. When you accepted the review, it'd
automatically get merged. Kiln would even tell you whether you could safely do
the merge without conflicts. This was all back in 2008, and I believe predated
GitHub launching entirely, but certainly predated PRs being common.

Hilariously, we concluded internally that doing things that way was too
complicated/weird for people to use, while GitHub concluded the exact
opposite, and the rest is what you see.

------
cm3
hg absorb sounds very useful. I'd like to know details, since I can imagine
undesired results it might produce.

~~~
mkj
Looks like it uses annotate information, seems pretty handy.

[https://bitbucket.org/facebook/hg-experimental/src/default/h...](https://bitbucket.org/facebook/hg-experimental/src/default/hgext3rd/absorb.py)
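
A toy sketch of the general idea, not the extension's actual algorithm: assume
we already have annotate output as (changeset, line) pairs, plus an edited copy
of the file with the same number of lines. Each modified line is attributed to
the changeset that last touched it. All names here are hypothetical:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_edits_by_changeset(
    annotated: List[Tuple[str, str]], edited: List[str]
) -> Dict[str, List[Tuple[int, str]]]:
    """Map each changeset to the (line index, new text) edits it would receive."""
    if len(annotated) != len(edited):
        # Inserted or deleted lines have no single obvious owner.
        raise ValueError("toy sketch only handles in-place line edits")
    targets = defaultdict(list)
    for index, ((changeset, old), new) in enumerate(zip(annotated, edited)):
        if old != new:
            targets[changeset].append((index, new))
    return dict(targets)
```

The sketch simply refuses the ambiguous cases (added or removed lines), which
is where I'd expect the undesired results to come from.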

------
ksec
So is this a comeback of Mercurial?

I know Subversion is pretty much dead, Mercurial was sidelined and Git seems
to have conquered the world. The only thing I think would shake things up a
little would be Perforce going Open Source. But I don't see that happening, as
they seem to be very comfortable in their niche.

------
bedros
Does anyone know if Mercurial's subrepo feature is used by big corporations?

------
steveklabnik
Relevant quote:

    
    
      > Facebook is writing a Mercurial server in Rust. It will be distributed and
      > will support pluggable key-value stores for storage (meaning that we could
      > move hg.mozilla.org to be backed by Amazon S3 or some such). The primary
      > author also has aspirations for supporting the Git wire protocol on the
      > server and enabling sub-directories to be git cloned independently of a
      > large repo. This means you could use Mercurial to back your monorepo while
      > still providing the illusion of multiple "sub-repos" to Mercurial or Git
      > clients. The author is also interested in things like GraphQL to query repo
      > data. Facebook engineers are crazy... in a good way.

~~~
Stratoscope
A suggestion if I may... Don't use monospaced formatting (two-space prefix) as
a way to put a '>' on every line of a long quote, as it becomes unreadable on
mobile. One '>' at the beginning of each paragraph is good enough (with blank
lines to separate paragraphs), or italics are fine too.

Here's a copy that should be readable on any device:

> Facebook is writing a Mercurial server in Rust. It will be distributed and
> will support pluggable key-value stores for storage (meaning that we could
> move hg.mozilla.org to be backed by Amazon S3 or some such). The primary
> author also has aspirations for supporting the Git wire protocol on the
> server and enabling sub-directories to be git cloned independently of a
> large repo. This means you could use Mercurial to back your monorepo while
> still providing the illusion of multiple "sub-repos" to Mercurial or Git
> clients. The author is also interested in things like GraphQL to query repo
> data. Facebook engineers are crazy... in a good way.

~~~
steveklabnik
This is offtopic so I'll leave it to just this one reply, but thanks! Lack of
quoting in HN's markdown is the only part of it I have actual frustration
with; I usually use the two-space + wrap format because it makes it much
clearer that it's a quote; with your version I find it hard to tell. But that
said, I checked on my phone, and I see what you're saying about mobile. Ugh.

~~~
pohl
HN markdown _does support italics_, so italicizing entire paragraphs is a
viable alternative to differentiate what you're quoting from your own
comments. I've seen folks do it now & then.

