
Eric Sink on version control - J3L2404
http://spin.atomicobject.com/2010/01/27/january-softwaregr-meeting-eric-sink-on-version-control
======
blasdel
_Eric cited Cisco, with 15 terabytes in source control, and 50 developers
writing perl scripts on top of ClearCase to serve 6000 users. That scale of
problem is something the open source, distributed tools aren’t able to
handle._

Obviously ClearCase can't handle it either if they need 50 developers to
manage it. As much as I despise ClearCase, I don't think the tools, their
development process, or implementation idiom are the problem here. Keeping
15TB of objects in a differential system is a _category error_ (if more than a
few percent of that isn't generated output or data, they've got bigger problems).

~~~
daemin
Not necessarily. Cisco is a hardware/networking company, so I would imagine
that a lot of that data is VHDL (or some other hardware description language),
schematics, and even compiled binary images. They would also be storing the
tools they use to build the products themselves in there (compiler, linker,
assembler, etc).

I would also wager that a lot of the 15TB of data is historical in nature, a
lot of that for products that aren't being made or sold anymore but for which
the source code, compiled images, etc. are still stored there.

Version control is not necessarily just for source code or text files, and
what better place to store these extra version-specific artefacts than in the
version control system?

~~~
blasdel
But there's no reason to store those artifact objects as if they were
diff/patch/merge-able -- store them separately as archived read-only blobs.

A lot of people use a standard hierarchical filesystem that gets rsynced
around. A newer option would be to store them in something like S3 and keep
their addresses as references in source control.
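
To make the "keep their addresses as references" idea concrete, here is a rough sketch, not anything Cisco or any particular tool actually does. It assumes a plain local directory standing in for the blob store (S3 or otherwise) and uses the SHA-256 of the file as its address. The names `archive_artifact`, `fetch_artifact`, and the `.ref` pointer format are all made up for illustration.

```python
import hashlib
import json
import shutil
from pathlib import Path


def archive_artifact(artifact: Path, store: Path, repo: Path) -> Path:
    """Copy a binary artifact into a content-addressed store and write a
    small pointer file into the repo. Only the pointer gets versioned."""
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    blob = store / digest[:2] / digest  # hypothetical layout: <store>/<2 hex chars>/<hash>
    blob.parent.mkdir(parents=True, exist_ok=True)
    if not blob.exists():  # identical content is stored only once
        shutil.copyfile(artifact, blob)
        blob.chmod(0o444)  # archived blobs are read-only
    pointer = repo / (artifact.name + ".ref")
    ref = {"sha256": digest, "size": artifact.stat().st_size}
    pointer.write_text(json.dumps(ref, indent=2) + "\n")
    return pointer


def fetch_artifact(pointer: Path, store: Path) -> bytes:
    """Resolve a pointer file back to the blob and verify its hash."""
    ref = json.loads(pointer.read_text())
    data = (store / ref["sha256"][:2] / ref["sha256"]).read_bytes()
    if hashlib.sha256(data).hexdigest() != ref["sha256"]:
        raise ValueError("blob does not match pointer")
    return data
```

The repo then only ever diffs and merges the tiny `.ref` files, while the blobs themselves can be rsynced, mirrored, or pushed to an object store without the version control system having to treat them as diffable. A real setup would stream large files instead of reading them into memory.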

