
History of version control - 10 astonishments - frabcus
http://www.flourish.org/blog/?p=397
======
tedunangst
I would say it's missing 7.5, 1996 public anonymous cvs. Before that, you were
either one of the developers with a login or you waited until a source tarball
was released. Which makes it really hard for outsiders to contribute because
they're aiming for a moving target they can't see.

~~~
gwern
That was one of the major hallmarks, I seem to recall, of OpenBSD - though we
all remember it for security right now.

~~~
tedunangst
Yes, anoncvs was developed specifically to make OpenBSD open.

------
_delirium
Some version of #2 might actually not be a horrible idea; companies seem
fairly adept at losing old source code, and physical archives run by competent
librarians are a little bit more durable and organized. Reminds me of a recent
story from engineering (<http://wrttn.in/04af1a>), but there are plenty in
tech as well, where e.g. someone resorts to emailing a former contractor
asking if they have a copy of the source code they wrote on their contract job
three years ago, because the company has somehow lost it.

~~~
tomjen3
I can relate to your link, since I now have a number of hard-drives with a ton
of stuff on them that has been copied back and forth and mixed up so many
times I am not quite sure what is what and which directories are complete (I
have a small suspicion that none of them are). I can't imagine how a large
company can possibly figure this out.

But your solution isn't going to help much. In twenty years, will we even have a
computer that is capable of reading the old floppy disks? Most computers these
days don't come with a floppy drive and the new mac mini doesn't even have a
CD drive. Sure USB may be around, but twenty years ago you would have said the
same thing about the 3.5inch floppy.

Really, if you want to store source code like that, you would have to print it
on physical paper and store it in a massive archive. And how would you handle
changes?

Do you want to print everything each time you do a svn commit? Or just the
diff (yeah, that is going to be fun to type in)?

A central server, properly organized and upgraded, would probably be the best,
but even so it is never going to be very good. In a world where the price of
data is very close to nothing, good metadata seems increasingly expensive.

~~~
mikeash
If you have a dedicated archive with dedicated librarians managing it, they
can be in charge of migrating the archived data forward whenever a particular
storage technology threatens to become obsolete.

~~~
xer0
And they can illuminate the cover titles in calligraphy with little
illustrations of cherubs and such.

<https://en.wikipedia.org/wiki/Illuminated_manuscript#Gallery>

------
rythie
It misses out BitKeeper, which was launched 5 years earlier and inspired both
Git and Mercurial.

~~~
rst
It gives very short shrift to commercial source control systems generally. Two
(VSS, and SCCS --- the latter part of the original AT&T Unix distributions, on
the same closed-source terms) get mentioned in passing, but neither gets
treated as a milestone.

That would include BitKeeper --- copies were available for a few years at no
cost to Linux kernel developers, but only on increasingly restrictive terms,
which McVoy ultimately wound up revoking altogether. IIRC, Linus released the
first embryonic version of git within weeks after McVoy withdrew the
free-as-in-beer version of BitKeeper.

~~~
Create
Misses also GOOG's dearly Perforce.

~~~
cpeterso
_Google_ 's Perforce?

~~~
TomasSedovic
I read the "GOOG's dearly" part as "dear to Google" not "developed by Google".
Google did use Perforce in the past and they still do, probably.

I think that the GP used the phrase correctly, but not being a native English
speaker myself, I'm not entirely sure.

------
quinndupont
There are actually a number of historians looking at version/source control
(although there still remains a dearth of study, given the importance of the
issue). Michael Mahoney, Michael Cusumano, and N.L. Ensmenger are three of the
more important. For a contextualization of the history, see my short
presentation (at UCLA):
<http://www.iqdupont.com/networked-modes-of-production/>

~~~
keithpeter
Scholarship of this kind is important as it gathers secondary references for
future works of synthesis as well as preventing the repetition of mistakes!

Is anyone doing a decent work of _synthesis_ for the history of computing in
general? Something like Judt/Postwar or Taruskin/History of Western Music?

------
sehugg
Ugh, SourceSafe. I worked for a company that used it. They had terrible
intermittent file corruption issues. Long story short, we tracked down the
root cause -- an Ethernet cable wrapped too tightly around a power supply
brick, which caused network errors over SMB which led to file corruption. Ugh.

~~~
wglb
I used SourceSafe for the better part of 10 years. The underpinning technology
very much resembled RCS.

But all-in-all, it was quite serviceable for a 75,000 line C++ project. Just
don't try to do branches.

~~~
tikhonj
That's a little bit like saying "All-in-all, this car is quite serviceable.
Just don't try to use third gear."

~~~
wglb
Well, that made me chuckle, and I can't disagree. But to the other comment's
point, it never lost anything for us.

Thank goodness we didn't need to do any branching.

So perhaps we can think of RCS as first gear, VSS as second gear, CVS/SVN as
third gear, and git as fourth gear?

------
latchkey
It is interesting to me that Subversion barely gets a mention. There should be
a 7.5 which is along the lines of:

cvs was great and all, but we couldn't version our directories, branching and
merging were a mess, the wire protocol was hard to use, it had a ton of
security holes and the storage format took up too much space. So, Subversion
was created as a way to do a better cvs, without thinking about the larger
intrinsic issues with the current state of version control. Thus, missing the
whole 'distributed' boat and letting Linus eat our lunch.

Note: I worked at CollabNet during this time and watched a lot of the
discussions around Subversion. I have great respect for the Subversion
developers. Karl is an awesome and brilliant guy. It was a bubble, technology
wasn't anywhere near where it is today, and I think we were all misguided at that
time. We all made a lot of mistakes.

~~~
technomancy
The article is about astonishment. Everything SVN did was fairly obvious.

~~~
latchkey
Ha! Definitely can't argue that point. ;-)

------
sendos
_It makes me wonder, what is next? What new astonishing thing will happen in
version control?_

I think what's needed is an intelligent (as in AI) merge mechanism. Right now,
if two people are adding two different features to a set of files, then
merging those changes is error-prone and requires a lot of manual work.

If this ever gets perfected and automated, it will be a huge milestone.

~~~
viraptor
This should be much simpler once we stop using those silly text files and
start storing everything as a proper representation of AST. Then again, at
that point we can get rid of the silly text-diff-based systems and just store
everything as versioned trees.

~~~
gbog
This sounds like a bad idea. Code is text. If an AST helps merging diffs, why not
use it for analysis in case of conflict and keep code as text?

~~~
tikhonj
I think it's most accurate to say that code is a textual representation of an
AST. Saying that it's just text is just like saying it's just a bunch of
numbers--both technically true but missing the bigger picture.

One potential reason not to store code as text is that there are many
equivalent programs that differ only in inconsequential text. A perfect
example is trailing whitespace.

There are also some benefits of storing code as an AST. For one, it would make
it trivial to identify commits that did not change the actual code--things
like updated comments. This would help you filter out commits when looking for
bugs. Another benefit would be better organized historical data: in a perfect
system, you would be able to look at the progress of a function even if it got
renamed part of the way through.
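
To make the "commits that did not change the actual code" idea concrete, here is a minimal sketch using Python's standard `ast` module. The `same_code` helper and the sample sources are made up for illustration, and this only covers Python source; it is not how any existing version control system works.

```python
import ast


def same_code(old_source: str, new_source: str) -> bool:
    """Return True if both sources parse to the same AST (hypothetical helper)."""
    # ast.parse drops comments and insignificant whitespace, and ast.dump
    # omits line/column attributes by default, so only structural changes differ.
    return ast.dump(ast.parse(old_source)) == ast.dump(ast.parse(new_source))


before = "def area(w, h):\n    return w * h\n"
# Same function with trailing whitespace and an added comment.
comment_only = "def area(w, h):   \n    # width times height\n    return w * h\n"
# Same function with an actual behaviour change.
real_change = "def area(w, h):\n    return w * h * 2\n"

print(same_code(before, comment_only))
print(same_code(before, real_change))
```

One caveat: docstrings are part of the Python AST, so a docstring edit would still count as a change under this comparison.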

~~~
adambyrtek
But then you end up with a version control system that is not generic, but
dependent on a particular language. The story of Smalltalk suggests that the
added value might not be worth the coupling and complexity it requires.

~~~
tikhonj
You should be able to write a generic version control system like this where
you can just plug the appropriate parser in and it would work for that
language. For backup, you could have it still keep some files as text.
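
A rough sketch of what "plug the appropriate parser in" could look like, with plain text as the fallback. Everything here (`NORMALIZERS`, `register`, `canonical_form`, `changed`) is a hypothetical name invented for the example, and only a Python parser is plugged in.

```python
import ast
from typing import Callable, Dict

# Hypothetical registry: file extension -> function that turns source text
# into a canonical string the version control system would compare.
Normalizer = Callable[[str], str]
NORMALIZERS: Dict[str, Normalizer] = {}


def register(extension: str, normalizer: Normalizer) -> None:
    """Plug in a parser-backed normalizer for one file extension."""
    NORMALIZERS[extension] = normalizer


def canonical_form(filename: str, source: str) -> str:
    """Use the registered parser if there is one, else keep the file as text."""
    for extension, normalizer in NORMALIZERS.items():
        if filename.endswith(extension):
            return normalizer(source)
    return source  # fallback: plain text comparison


def changed(filename: str, old: str, new: str) -> bool:
    """Report whether two versions differ once normalized."""
    return canonical_form(filename, old) != canonical_form(filename, new)


# Plug in a Python parser; other languages would register their own.
register(".py", lambda src: ast.dump(ast.parse(src)))

print(changed("calc.py", "x = 1\n", "x = 1  # one\n"))    # AST comparison
print(changed("notes.txt", "x = 1\n", "x = 1  # one\n"))  # text comparison
```

The core stays language-agnostic; the coupling adambyrtek worries about is pushed into the per-language plugins.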

------
ScottBurson
I don't know what the date would be, but I think somewhere before "you can
keep lots of versions in one file" should go "you can keep lots of versions
_of_ one file" -- a versioning file system. Not sure when this was introduced;
Wikipedia thinks it may have been in ITS.

I used to work on a large Lisp system where our entire source control system
was provided by Emacs versions and locking on a central NFS server, with some
explicit branching support in the build code, and with version freezing done
by copying directories. I can hear you gagging, dear reader, but actually it
didn't work that badly, except that it didn't handle distributed development.

------
cynwoody
2. Humans can manually keep track of versions of code! (1960s)

As everything, to begin with there was no software.

    “At my first job, we had a Source Control department. When you had your code ready to go, you took your floppy disks to the nice ladies in Source Control, they would take your disks, duly update the library, and build the customer-ready product from the officially reposed source.” (Miles Duke)

Balderdash! Floppies were first available in 1971. They had decks in the
sixties. They had paper tape. But they didn't have floppies! So, using the
terminology of decks, the author scores a validity check!

~~~
frabcus
Yes, well spotted - I had trouble dating that comment, as I assumed the
reference to changing to RCS just after it was about 1972.

You're right that it talks about floppies, so it can't have been.

Enjoying the few more comments the article has generated with memories of
those earlier days. Would love to see the earlier astonishments written up
more precisely.

------
InclinedPlane
It's easy to forget that we are living in a golden age of version control
systems. The market is rife with many fairly decent commercial systems and
some of the best, state-of-the-art systems are completely free.

~~~
gbog
Sorry not to share your optimism. git seems worshipped here. I have never used it,
but I have used cvs and mercurial a lot, and we are very far from what
real versioning should be: completely transparent.

cd dir should propose to update it.

save file should commit it and push it in tmp branch.

~~~
InclinedPlane
I think you might be letting perfect be the enemy of good. Certainly there's a
lot of improvement remaining for version control. Ideally everything (files,
database records, etc.) should be versioned automatically, but that shouldn't
stop us from appreciating the good state we are in today compared to the dark
years when version control for anything was difficult or impossible.

~~~
gbog
Well, I have used three version control systems: cvs, svn, mercurial. While I
agree that the latest is better, I feel it is a bit of an exaggeration to say
that before git/mercurial, versioning was difficult or impossible. Many
people say git and github are the best invention of recent years, but I
don't see that it has really changed the industry compared to good ol' cvs.

