So congratulations to Larry and crew -- and damn, were you ever right in 1993! ;)
 Seriously, read this: http://www.landley.net/history/mirror/unix/srcos.html
 The citation here is, in that greatest of all academic euphemisms, "Personal communication."
"Sourceware" because it predates the term "open source".
It's been an interesting ride and if nothing else, BK was the inspiration for Git and Hg, that's a contribution to the field. And maybe, just maybe, people will look at the SCCS weave and realize that Tichy pulled the wool over our eyes. SCCS is profoundly better.
Was BitKeeper the first version control system to "think distributed" ?
My buddies in the kernel group were actually starting to quit because they were forced to use the NSE and it made them dramatically less productive. Nerds hate being slowed down.
Once the whole SCM thing crossed my radar screen I was hooked. Someone had a design for how you could have two SCCS files with a common ancestry and they could be put back together. I wrote something called smoosh that basically zippered them together.
Nobody cared. So I looked harder at the NSE and realized it was SCCS under the covers. I built a pile of perl that gave birth to the clone/pull/push model (though I bundled all of that into one command called resync). It wasn't truly distributed in that the "protocol" was NFS, I just didn't do that part, but the model was the git model you are used to now minus changesets.
I made all that work with the NSE, you could bridge in and out and one by one the kernel guys gave up on NSE and moved to nselite. This was during the Solaris 5.0 bringup.
I still have the readme here: http://mcvoy.com/lm/nselite/README
and here are some stats from the 2000th resync inside of Sun:
I was forced to stop developing nselite by the VP of the tools group because by this time Sun knew that nselite won and NSE lost so they ramped up an 8 person team to rewrite my perl in C++ (Evan later wrote a paper basically saying that was an awful idea). They took smoosh.c and never modified it, just stripped my history off (yeah, some bad blood).
Their stuff wasn't ready so I kept working but that made them look bad, one guy with some perl scripts outpacing 8 people with a supposedly better language. So their VP came over and said "Larry, this went all the way up to Scooter, if you do one more release you're fired" and set back SCM development almost a decade, that was ~1991 and I didn't start BitKeeper until 1998. There is no doubt in my mind that if they had left me alone they would have the first DVCS.
Fun times, I went off and did clusters in the hardware part of the company.
Also, even if privately... you need to name that VP. ;)
What about shell scripts?
Is this paper available online? Thanks.
And for the record, Evan was somewhat justified in not saying I had anything to do with Teamware since I made his team look like idiots, ran circles around them. On the other hand, taking smoosh.c and removing my name from the history was dishonest and a douche move. Especially since not one person on that team was capable of rewriting it.
The fact remains that Teamware is just a productized version of NSElite which was written entirely by me.
If I sound grumpy, I am. Politics shouldn't mess with history but they always do.
Sun could, at least, make a profit building workstations and servers and licensing chips.
It's actually very sad they don't build those SPARC desktops anymore.
I liked the desktops but there's no money in them. Market always rejects it. So does FOSS despite it being the only open ISA with mainstream, high-performance implementations.
There does not need to be demand: Steve Jobs (in)famously said that where there was no market, "create one".
I for one would absolutely love to be able to buy an illumos-powered A4-sized tablet which ran a SPARC V9 instruction set, plugged into a docking station, and worked with a wireless keyboard and mouse to be used as a workstation when I'm not walking around with my UNIX server in hand. Very much akin to Apple Computer's iPad Pro (or whatever they call it, I don't remember, nor is that really relevant).
But the most important point was, and still is, and always will be: it has to cost as much as the competition, or less. Sun Microsystems would just not accept that, no matter how much I tried to explain and reason with people there: "talk to the pricing committee". What does that even mean?!? Was the pricing committee composed of mute, deaf and blind people who were not capable of seeing that PC-buckets were eating Sun's lunch, or what?
What people forget is that Steve Jobs was a repeated failure at doing that, got fired, did soul-searching, succeeded with NeXT, got acquired, and then started doing what you describe. Even he failed more than he succeeded at that stuff. A startup trying to one-off create a market just for a non-competitive chip is going to face the dreaded 90+% failure rate.
"But the most important point was, and still is, and always will be: it has to cost as much as the competition, or less."
That's why the high-security stuff never makes it. It takes at least 30% premium on average per component. I totally believe your words fell on deaf ears at Sun. I'd have bought SunBlades myself if I could afford them. I could afford nice PC's. So, I bought nice PC's. Amazing that echo chamber was so loud in there that they couldn't make that connection.
"I for one would absolutely love to be able to buy an illumos-powered A4-sized tablet which ran a SPARC V9 instruction set"
That's actually feasible given the one I promote is a 4-core, 1+GHz embedded chip that should be low power on a decent process node.
The main issue is the ecosystem and components like browsers with JIT's that must be ported to SPARC. One company managed to port Android to MIPS but that was a lot of work. Such things could probably be done for SPARC as well. The trick is implementing the ASIC, implementing the product, porting critical software, and then charging enough to recover that but not more than competition whose work is already done for them. Tricky, tricky.
Raptor's Talos Workstation, if people buy it, will provide one model for how this might happen. Could get ASIC's on 45-65nm really quick, use SMP given per-chip cost is $10-30, port Solaris/Linux w/ containers, put in a shitload of RAM, and sell it for $3,000-6,000 for VM-based use and development. It would still take thousands of units to recover cost. Might need government sales.
Basically, you want something to suit you and a small number of other people, but you won't pay for the cost of having something that is "tailor made". You will only pay for the high-volume, lower-cost, more general product. So... that's all you get.
As long as the code for those processors remains free, and a license to implement a SPARC ISA compliant processor only costs $50, the SPARC will never really, truly be gone, especially not for those people capable of synthesizing their own FPGA's, or even building their own hardware.
Some people did exactly that, a while back. Too bad they didn't turn their designs into ready-to-buy servers.
That was exciting. It could do well even on a 200MHz FPGA given threading performance. Then, use eASIC's Nextreme's to convert it to structured ASIC for better speed, power-usage, and security from dynamic attacks on FPGA. That it's Oracle and they're sue-happy concerns me. I'd read that "GPL" license very carefully just in case they tweaked it. If it's safe, then drop one of those badboys (yes, the T2) on the best node we can afford with key I/O. Can use microcontrollers on the board for the rest as they're dirt cheap. Same model as Raptor's as I explained in another comment.
Alternatively, use Gaisler as Leon3 is GPL and designed for customization. Simple, too. Leon4 is probably inexpensive compared to ARM, etc.
"SPARC ISA compliant processor only costs $50, the SPARC will never really, truly be gone, especially not for those people capable of synthesizing their own FPGA's,"
"Too bad they didn't turn their designs into ready-to-buy servers."
Not quite a server but available and illustrates your point:
Btw, I found this accidentally while looking for a production version of OpenSPARC:
Yeah, buddy, they're like magic hardware. Even if they aren't ASIC-competitive for the best ASIC's. Still magic with a market share and diverse applications that shows it. :)
FPGA's are cheap and good enough for prototyping; once one has a working VHDL / Verilog code, it's tapeout time.
Now, for remote attacks, you can embed RF circuitry in them that listens to any of that. You can embed circuits that receive incoming command, then dump its SRAM contents. You might modify the I/O circuitry to recognize a trapdoor command that runs incoming data as privileged instructions. You can put a microcontroller in there connected to PCI to do the same for host PC attacks. I know, that would be first option but I was having too much fun with RTL and transistor level. :)
High-end chips only need to compete with other high-end chips. And low-end SPARC will not take off now that x86 has taken over.
You can't just compete directly in that market: you have to convince them yours is worth buying for less performance at higher price and watts. Itanium tried with reliability & security advantages. Failed. Fortunately, Dover is about to try with RISC-V combined with SAFE architecture (crash-safe.org) for embedded stuff. We'll see what happens there.
ref: https://en.wikipedia.org/wiki/Revision_Control_System, https://en.wikipedia.org/wiki/Walter_F._Tichy
Just as an example, RCS could have been faster than SCCS if they had stored at the top of the file the offset to where the tip revision starts. You read the first block, then seek past all the stuff you don't need, start reading where the tip is.
But RCS doesn't do that, it reads all the data until it gets to the tip. Which means it is reading as much data as SCCS but only has sorta good performance for the tip. SCCS is more compact and is as fast or faster for any rev.
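The "offset at the top of the file" idea can be sketched roughly as follows. This is a toy layout invented for illustration, not the actual RCS file format: `HEADER`, `write_history` and `read_tip` are hypothetical names, and the 8-byte little-endian offset is an assumption.

```python
import io
import struct

# Hypothetical header: one 8-byte little-endian integer holding the
# byte offset at which the tip revision starts.
HEADER = struct.Struct("<Q")

def write_history(buf, older_revisions: bytes, tip: bytes) -> None:
    """Write header, then the older data, then the tip revision."""
    tip_offset = HEADER.size + len(older_revisions)
    buf.write(HEADER.pack(tip_offset))
    buf.write(older_revisions)
    buf.write(tip)

def read_tip(buf) -> bytes:
    """Read only the header, seek past the older data, read the tip."""
    buf.seek(0)
    (tip_offset,) = HEADER.unpack(buf.read(HEADER.size))
    buf.seek(tip_offset)  # skip everything that is not needed
    return buf.read()

buf = io.BytesIO()
write_history(buf, b"old delta data " * 1000, b"tip revision text\n")
tip = read_tip(buf)
```

On a real file the seek would let the reader avoid pulling the older data off the disk, which is the saving the comment is describing.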
And BK blows them both away, we lz4 compress everything which means we do less I/O.
RCS sucked but had good marketing. We're here to say that SCCS was a better design.
Why does bitmover have the guts now?
SCCS is a "weave". The time to get the tip is the same as the time to get the first version or any version. The file format looks like
this is the first line in the first version.
Now let's say you added another line in version 2:
this is the first line in the first version.
this is the line that was added in the second version
So any version is the same time. And that time is fast, the largest file in our source base is slib.c, 18K lines, checks out in 20 milliseconds.
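A minimal sketch of how a checkout from a weave could work, assuming a simplified notation: `("I", n)` opens a block inserted in version n, `("D", n)` opens a block deleted in version n, and `("E", n)` closes the block for version n. These tuples and the `checkout` function are illustrative assumptions, not the real SCCS on-disk markers or its actual bookkeeping. The example also wraps a delete in version 3 around the version 2 insert.

```python
def checkout(weave, wanted):
    """Return the lines visible for the set of versions in `wanted`.

    One sweep over the whole weave, whichever version is requested.
    Assumes `wanted` contains a version together with all its ancestors.
    """
    open_blocks = {}  # version -> "I" or "D" for blocks currently open
    out = []
    for item in weave:
        if isinstance(item, tuple):
            op, version = item
            if op == "E":
                del open_blocks[version]
            else:
                open_blocks[version] = op
            continue
        # A text line is visible when every enclosing insert belongs to
        # the wanted set and no enclosing delete does.
        inserted = all(v in wanted for v, op in open_blocks.items() if op == "I")
        deleted = any(v in wanted for v, op in open_blocks.items() if op == "D")
        if inserted and not deleted:
            out.append(item)
    return out

WEAVE = [
    ("I", 1),
    "this is the first line in the first version.",
    ("E", 1),
    ("D", 3),  # version 3 deletes the line below: the delete wraps the insert
    ("I", 2),
    "this is the line that was added in the second version",
    ("E", 2),
    ("E", 3),
]

v1 = checkout(WEAVE, {1})
v2 = checkout(WEAVE, {1, 2})
v3 = checkout(WEAVE, {1, 2, 3})
```

Every call walks the same list once, which is why the cost does not depend on which version is asked for.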
Where RCS or indeed git would have handled this reasonably well (indeed the xdelta used for git packfiles would have eaten it for lunch with no trouble), in SCCS, or anything weave-based, it was an utter disaster. Every checkin doubled the number of weaves in the file, an exponential growth without end which soon led to multigigabyte files which xdelta could have represented as megabytes at most. Every one-byte addition or removal doubled up everything from that point on.
And here's where the terribleness of the 'every version takes the same time' decision becomes clear. In a version control system, you want the history of later versions (or of tips of branches) overwhelmingly often: anything that optimizes access time for things elsewhere in the history at the expense of this is the wrong decision.
When I left, years before someone more courageous than me transitioned the whole appalling mess to git, our largest file was 14GiB and took more than half an hour to check out.
The SCCS weave is terrible. (It's exactly as good a format as you'd expect for the time, since it is essentially an ed script with different characters. It was a sensible decision for back then, but we really should put the bloody thing out of its misery, and ours.)
I don't agree that the weave is horrible, it's fantastic for text. Try git blame on a file in a repo with a lot of history then try the same thing in BK. Orders and orders of magnitude faster.
And go understand smerge.c and the weave lightbulb will come on.
The fact that the largest file you mention is frankly tiny shows why your performance was good: we had ~50,000 line text files (yeah, I know, damn copy-and-paste coders) with a thousand-odd revisions and a resulting SCCS filesize exceeding three million lines, and every one of those lines had to be read on every checkout: dozens to hundreds of megabytes, and of course the cache would hardly ever be hot where that much data was concerned, so it all had to come off the disk and/or across NFS, taking tens of seconds or more in many cases. RCS could have avoided reading all but 50,000 of them in the common case of checkouts of most recent changes. (git would have reduced read volume even more because although it is deltified the chains are of finite length, unlike the weave, and all the data is compressed.)
50K lines is not even 3x bigger than the file I mentioned. Which we check out in 20 milliseconds.
As for optimizing blame, you are missing the point, it's not blame, it's merge, it's copy by reference rather than copy by value.
(And yes, optimizing merge matters too, indeed it was a huge part of git's raison d'etre -- but, again, one usually merges with the stuff at the tip of tree: merging against something you did five years ago is rare, even if it's at a branch tip, and even rarer otherwise. Having to rewrite all the unmodified ancient stuff in the weave merely because of a merge at the tip seems wrong.)
(Now I'm tempted to go and import the Linux kernel or all of the GCC SVN repo into SCCS just to see how big the largest weave is. I have clearly gone insane from the summer heat. Stop me before I ci again!)
Doesn't seem bad to me. The weave is big for binaries, we imported 20 years of Solaris stuff once and the history was 1.1x the size of the checked out files.
this is the first line in the first version.
this is the line that was added in the second version
The delete needs to be an envelope around the insert so you get
this is the first line in the first version.
this is the line that was added in the second version
this is the first line in the first version.
this is the line that was added in the second version
The thing bzr didn't care about, sadly, is performance. An engineer at Intel once said to me, firmly, "Performance is a feature".
Performance as a feature, OTOH, is one of Linus's three tenets of VCS. To quote him, "If you aren't distributed, you're not worth using. If you're not fast, you're not worth using. And if you can't guarantee that the bits I get out are the exact same bits I put in, you're not worth using."
Performance indeed killed bzr. Git was good enough and much faster, so people just got used to its weirdness.
And boy is git weird! In Mercurial, I can mess with the file all day long after scheduling it for a commit, but one can forget that in git: marking a file for addition actually snapshots a file at addition time, and I have read that that is actually considered a feature. It's like I already committed the file, except that I didn't. This is the #1 reason why I haven't migrated from Mercurial to git yet, and now with Bitkeeper free and open source, chances are good I never will have to move to git. W00t!!!
I just do not get it... what exactly does snapshotting a file before a commit buy me?
I often amend my latest commit as a way to build a set of changes without losing my latest functional change.
OTOH, I find that behavior weird as I regularly add files to the index as I work. If a test breaks and I fix it, I can review the changes via git diff (compares the index to the working copy) and then the changes in total via git diff HEAD (compares the HEAD commit to the working copy).
Really old school revision control systems, like CDC's MODIFY and Cray's clone UPDATE, were kind of like SCCS. Each line (actually card image!) was tagged with the ids of the mods that created and (if no longer active) deleted it.
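The per-line tagging described here can be sketched like this. It is only a model of the idea as stated in the comment, not the actual MODIFY or UPDATE deck format; `Card`, `active_lines` and the mod ids `BASE` and `MOD1` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Card:
    text: str
    created_by: str                   # id of the mod that added this line
    deleted_by: Optional[str] = None  # id of the mod that deleted it, if any

def active_lines(deck, applied_mods):
    """Return the text of the cards that are live once `applied_mods` are applied."""
    return [
        card.text
        for card in deck
        if card.created_by in applied_mods
        and (card.deleted_by is None or card.deleted_by not in applied_mods)
    ]

deck = [
    Card("LINE A", created_by="BASE"),
    Card("LINE B", created_by="BASE", deleted_by="MOD1"),
    Card("LINE B2", created_by="MOD1"),
]

base_only = active_lines(deck, {"BASE"})
with_mod1 = active_lines(deck, {"BASE", "MOD1"})
```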
Do you have references? I've heard of these but haven't come across details after much creative searching since they are common words.
I remember looking at IT jobs back then, and seeing a business world covered in Windows NT machines; I even got my MCSE (alongside some UNIX certifications that I was more excited about), because of it. Looking at jobs now, the difference is remarkable, to say the least. Nearly every core technology a system administrator needs to know is Open Source and almost certainly running on Linux.
And, the funny thing is that the general prescription (make a great Open Source UNIX) is exactly what it took to save UNIX. It just didn't involve any of the big UNIX vendors in a significant way (the ones spending a gazillion dollars on UNIX development at the time). Linux got better faster than Sun got smarter, and ate everybody's lunch, including Microsoft. Innovator's Dilemma strikes again.
Apple is an interesting blip on the UNIX history radar, too...though, they're likely to lose to the same market forces in the end, as phones become commodities. I'm a bit concerned that it's going to be Android, however, that wins the mobile world since Android is nowhere near the ideal OS from an Open Source and ethical perspective; but, I guess they got the bits right that Larry was suggesting needed to be right.
Anyway, it was a weird flashback to read that article. Things change, and on a scale that seems slow, until you look back on it, and see it's "only" been a couple of decades. In the grand scheme of things, and compared to the motion of technology prior to the 1900s, that's a blink of an eye.
But yeah, it was weird that everybody thought NT was going to be the future. And now, MS has opensourced a good deal of infrastructure, is working with Node, has announced an integrated POSIX environment for Windows. And since it's in corporate, it might even be able to fix the fork(2) performance problems.
I sincerely consider Linux a great UNIX. Probably the best UNIX that's existed, thus far. There are warts, sure. Technically, Solaris had (and still has, in IllumOS and SmartOS) a small handful of superior features and capabilities (at this point one can list them on one hand, and one could also list some "almost-there" similar techs on Linux). But, I assume you've used Solaris (or some other commercial UNIX) enough to have an opinion...can you honestly say you enjoyed working on it more than Linux? The userspace on Solaris was always drastically worse than Linux, unless you installed a ton of GNU utilities, a better desktop, etc. But, Linux brought us a UNIX we could realistically use anywhere, and at a price anyone could afford. That's a miracle for a kid that grew up lusting after an Amiga 3000UX (because it was the closest thing to an SGI Indy I could imagine being able to afford).
Oh, and don't flame me for speaking in ignorance. I've been a Linux user for half a decade at least now, and I CAN say I see problems with it. I can also say, as a person who is a programmer, that some of the things that Cantrill pointed out are actually evil. Note, however, that I don't claim Solaris, or any other OS is better. Every UNIX is utterly fucked in some respect. I just know Linux's flaws the best.
By the way, I've been trying to get Amiga emulation working for a while. It basically works at this point, but the *UAEs are a misery on UNIX systems. Without any kind of loader, you have to spend 10 minutes editing the config every time you want to play a game. But if you're in any way interested in the history you lived through, check out youtube.com/watch?v=Tv6aJRGpz_A
But now my beard is showing and I'm ranting. My point is this: it took something from completely outside of the commercial UNIX ecosystem, so far out in left field that none of the UNIX bosses (or Microsoft) saw it as a threat until it was far too late...and it took something that was good, really good in at least some regards, that it would have passionate fans even very early in. Linux did that. And, compared to everything else (pretty much everything else that's ever existed, IMHO), it's great.
And, I'm on board the retro computing bandwagon. I have a real live Commodore 64 and an Atari 130xe. I'd like to one day find an Amiga 1200 in good shape, but because I live in an RV and travel fulltime, I don't have a lot of room to spare. But I do like to tinker and reminisce.
Truer words have ne'er been spoken. My first big boy job involved building and maintaining a large open source stack on top of AIX. These days I occasionally experience hiccups related to OpenBSD not being Linux. Problems aren't even in the same league. That said, the thrill of getting stuff to work on AIX was certainly greater (and purchased with more human suffering).
And I'd love to have some real retro computers, but I've got no money, and most of the really interesting ones are from the UK. Ah well...
Okay then, SmartOS. Why is an exercise left for the reader, because it would just take too much and too long to list and explain all the things it does better, faster and cheaper than Linux in server space; that's material rife for an entire book.
> can you honestly say you enjoyed working on it more than Linux?
Enjoyed it?!? Love it, I love working with Solaris 10 and SmartOS! It's such a joy not having a broken OS which actually does what it is supposed to do (run fast, be efficient, protect my data, is designed to be correct). When I am forced to work with Linux (which I am, at work, 100% of the time, and I hate it), it feels like I am on an operating system from the past century: ext3 / ext4 (no XFS for us yet, and even that is ancient compared to ZFS!), memory overcommit, data corruption, no backward compatibility, navigating the minefield of GNU libc and userland misfeatures and "enhancements". It's horrible. I hate it.
> The userspace on Solaris was always drastically worse than Linux,
Are you kidding me? System V is great, it's grep -R and tar -z that I hate, because it only works on GNU! Horrid!!!
> But, Linux brought us a UNIX we could realistically use anywhere, and at a price anyone could afford.
You do realize that if you take an illumos derived OS like SmartOS and Linux, and run the same workload on the same cheap intel hardware, SmartOS is usually going to be faster, and if you are virtualizing, more efficient too, because it uses zones, right? Right?
It's like this: when I run SmartOS, it's like I'm gliding around in an ultramodern, powerful, economical mazda6 diesel (the 175 HP / 6 speed Euro sportwagon version); I slam the gas pedal and I'm doing 220 km/h without even feeling it and look, I'm in Salzburg already! When I'm on Linux, I'm in that idiotic Prius abomination again: not only do I not have any power, but I end up using more fuel too, even though it's a hybrid, and I'm on I-80 somewhere in Iowa. That's how I'd compare SmartOS to Linux.
"Tout ce qui est excessif est insignifiant"
Edit: Bryan Cantrill, spelled it wrong.
I think the assumption you're making is that people choose Linux out of ignorance (and, I think the ignorance goes both ways; folks using Solaris have been so accustomed to Zones, ZFS, and dtrace being the unique characteristic of Solaris for so long that they aren't aware of Linux' progress in all of those areas). But, there are folks who know Solaris (and its children) who still choose Linux based on its merits. We support zones in our products/projects (because Sun paid for the support, and Joyent supported us in making Solaris-related enhancements), and until a few years ago it was, hands-down, the best container technology going.
Linux has a reasonable container story now; the fact that you don't like how some people are using it (I think Docker is a mess, and I assume you agree) doesn't mean Linux doesn't have the technology for doing it well built in. LXC can be used extremely similarly to Zones, and there's a wide variety of tools out there to make it easy to manage (I work on a GUI that treats Zones and LXC very similarly, and you can do roughly the same things in the same ways).
"Are you kidding me? System V is great, it's grep -R and tar -z that I hate, because it only works on GNU! Horrid!!!"
Are you really complaining about being able to gzip and tar something in one command? Is that a thing that's actually happening in this conversation?
I'll just say I've never sat down at a production Sun system that didn't already have the GNU utilities installed by some prior administrator. It's been a while since I've sat down at a Sun system, but it was standard practice in the 90s to install GNU from the get go. Free compiler that worked on every OS and for building everything? Hell yes. Better grep? Sure, bring it. People went out of their way to install GNU because it was better than the system standard, and because it opened doors to a whole world of free, source-available, software.
"You do realize that if you take an illumos derived OS like SmartOS and Linux, and run the same workload on the same cheap intel hardware, SmartOS is usually going to be faster"
Citation needed. Some workloads will be faster on SmartOS. Others will be faster on Linux. Given that almost everything is developed and deployed on Linux first and most frequently, I wouldn't be surprised to see Linux perform better in the majority of cases; but, I doubt it's more than a few percent difference in any common case. The cost of having or training staff to handle two operating systems (because you're going to have to have some Linux boxes, no matter what) probably outweighs buying an extra server or two.
"and if you are virtualizing, more efficient too, because it uses zones, right? Right?"
Citation needed, again. Zones are great. I like Zones a lot. But, Linux has containers; LXC is not virtualization, it is a container, just like Zones. Zones has some smarts for interacting with ZFS filesystems and that's cool and all, but a lot of the same capabilities exist with LVM and LXC.
I feel like you're arguing against a straw man in a lot of cases here.
Why do you believe LXC (or other namespace-based containers on Linux) are inherently inefficient, compared to Zones, which uses a very similar technique to implement?
And, it's not Linux' fault the systems you manage are stuck on ext4. There are other filesystems for Linux; XFS+LVM is great. A little more complex to manage than ZFS, but not by a horrifying amount. So, you have to read two manpages instead of one. Not a big deal. And, there's valid reasons the volume management and filesystem features are kept independent in the kernel (I dunno if you remember the discussions about ZFS inclusion in Linux; separate VM and FS was a decision made many years ago, based on a lot of discussion). Almost any filesystem on Linux has LVM, so, filesystems on Linux get snapshots and tons of other features practically for free. That's pretty neat.
Anyway, I think SmartOS is cool. I tinker with it every now and then, and have even considered doing something serious with it. But, I just don't find it compellingly superior to Linux. Certainly not enough to give up all of the benefits Linux provides that SmartOS does not (better package management, vast ecosystem and community, better userland even now, vastly better hardware support even on servers, etc.).
All that said, Sun had an ethos of caring. In the early days of bitmover amy had some quote about the sun man pages versus the linux man pages, if someone can find that, it's awesome. We keep a sun machine in our cluster just so we can go read sane man pages about sed or ed or awk or whatever. Linux man pages suck.
Sun got shoved into having to care about System V and it sucked. I hated it and left, so did a bunch of other people. But Sun carried on and the ethos of caring carried on and Bryan and crew were a big part of that. My _guess_ is that Solaris and its follow ons are actually pleasant. I'll be pissed if I install it and it doesn't have all the GNU goodness. If that's the case then you are right, they don't get it.
What I expect to see is goodness plus careful curating. That's the Sun way.
SmartOS is nice. I've always thought so and I have a lot of respect for the folks working on it. But, it isn't nice enough to overcome the negatives of being a tiny niche system. Linux has orders of magnitude more people working on it (and many of those people are also very smart). That's hard to beat.
How about simple logic instead? I know zones work, because they have been in use in the enterprises since 2006, and they are easy to work with and reason about; if I have the same body of software available on a system with the original lightweight virtualization as I do on Linux, and my goal is data integrity, self-healing, and operational stability, what is my incentive to running a conceptual knock-off copy of zones, LXC? To me, the choice is obvious: design the solution on top of the tried and tested, original substrate, rather than use a knock-off, especially since the acquisition cost of both is zero, and I already know from experience that investing in zones pays profit and dividends down the road, because I ran them before in production environments. I like profits, and the only thing I like better than engineering profits are engineering profits with dividends. That, and sleeping through my nights without being pulled into emergency conference calls about some idiotic priority 1 incident. Incident which could have easily been avoided altogether, if I had been running on SmartOS with ZFS and zones. Based on multiple true stories, and don't even get me started on the dismal redhat "support", where redhat support often ends up in a shootout with customers, rather than fixing customer's problems, or being honest and admitting they do not have a clue what is broken where, nor how to fix it.
> And, it's not Linux' fault the systems you manage are stuck on ext4. There are other filesystems for Linux; XFS+LVM is great.
Did you know that LVM is an incomplete knock-off of HP-UX's LVM, which in turn is a licensed fork of Veritas' VxVM? Again, why would I waste my precious time, and run up financial engineering costs running a knock-off, when I can just run SmartOS and have ZFS built in? The logic does not check out, and financial aspects even less so.
On top of that, did you know that not all versions of the Linux kernel provide LVM write barrier support? And did you know that not all versions of the Linux kernel provide XFS write barrier support (XFS at least will report that, while LVM will do nothing and lie that the I/O made it to stable storage, when it might still be in transit)? And did you know that to have both XFS and LVM support write barriers, one needs a particular kernel version, which is not supported in all versions of RHEL? And did you know that not all versions of LVM correctly support mirroring, and that for versions which do not require a separate logging device, the log is in memory, so if the kernel crashes, one experiences data corruption? And did you know that XFS, as awesome as it is, does not provide data integrity checksums?
And we haven't even touched upon the systemd knock-off of SMF, nor have we touched upon the lack of a fault management architecture, nor have we touched upon how insane bonding of interfaces is in Linux, nor have we touched upon how easy it is to create virtual switches, routers and aggregations (trunks in Cisco parlance) using Crossbow in Solaris/illumos/SmartOS... when I wrote that there is enough material for a book, I was not trying to be funny.
It sounds like a Linux fan ranting circa 1995 because that is precisely what it is: first came the rants. Then a small, underdog company named "redhat" started providing regular builds and support, while Linux was easily accessible, and subversively smuggled into enterprises. Almost 20 years later, Linux is now everywhere.
Where once there was Linux, there is now SmartOS; where once there was redhat, there is now Joyent. Where once one had to download and install Linux to run it, one now has but to plug in a USB stick, or boot SmartOS off of the network, without installing anything. Recognize the patterns?
One thing is different: while Linux has not matured yet, as evidenced, for example, by GNU libc, or by GNU binutils, or the startup subsystem perturbations, SmartOS is based on a 37-year-old code base which matured and reached operational stability about 15 years ago. The engineering required for running the code base in the biggest enterprises and government organizations has been conditioned by large and very large customers having problems running massive, mission critical infrastructure. That is why for instance there are extensive, comprehensive post-mortem analysis and debugging tools, and the mentality permeates the system design: for example, ctfconvert runs on every single binary and injects the source code and extra debugging information during the build; no performance penalty, but if you are running massive real-time trading, a database or a web cloud, when the going gets tough, one appreciates having the tools and the telemetry. For Linux that level of system introspection is utter science fiction, 20 years later, in enterprise environments, in spite of attempts to the contrary. (Try Systemtap or DTrace on Linux; try doing a post-mortem debug on the kernel, or landing into a crashed kernel, inspecting system state, patching it on the fly, and continuing execution; go ahead. I'll wait.) All that engineering that went into Solaris and then illumos and now SmartOS has passed the worst trials by fire at the biggest enterprises, and I should know, because I was there, at ground zero, and lived through it all.
All that hard, up-front engineering work that was put into it since the early '90s is now paying off, with a big fat dividend on top of the profits: it is trivial to pull down a pre-made image with imgadm(1M), feed a .JSON file to vmadm(1M), and have a fully working yet completely isolated UNIX server running at the speed of bare metal, in 25 seconds or less. Also, let us not forget the almost 14,000 software packages available, most of which are the exact same software available on Linux. If writing shell code and the command line isn't your cup of tea, there is always Joyent's free, open source SmartDC web application for running the entire cloud from a GUI.
Therefore, my hope is that it will take SmartOS less than the 18 years it took Linux to become king, especially since cloud is the new reality, and SmartOS has been designed from the ground up to power massive cloud infrastructure.
> I think the assumption you're making is that people choose Linux out of ignorance
That is not an assumption, but rather my very painful and frustrating experience for the last 20 years. Most of those would-be system administrators came from Windows and lack the mentoring and UNIX insights.
> (and, I think the ignorance goes both ways; folks using Solaris have been so accustomed to Zones, ZFS, and dtrace being the unique characteristic of Solaris for so long that they aren't aware of Linux' progress in all of those areas).
I actually did lots and lots of system engineering on Linux (RHEL and CentOS, to be precise) and I am acutely aware of the limitations when compared to what Solaris based operating systems like SmartOS can do: not even the latest and greatest CentOS nor RHEL can even guarantee me basic data integrity, let alone backwards compatibility. Were we in the '80's right now, I would be understanding, but if after 20 years a massive, massive army of would-be developers is incapable of getting the basic things like data integrity, scheduling, startup/shutdown or init subsystem working correctly, in the 21st century, I have zero understanding and zero mercy. After all, my time as a programmer and as an engineer is valuable, and there is also financial cost involved, that not being negligible either.
> Linux has a reasonable container story now; the fact that you don't like how some people are using it (I think Docker is a mess, and I assume you agree)
Yes, I agree. The way I see it, and I've deployed very large datacenters where the focus was operational stability and data correctness, Docker is a web 2.0 developer's attempt to solve those problems, and they are flapping. Dumping files into pre-made images did not compensate for lack of experience in lifecycle management, or lack of experience in process design. No technology can compensate for lack of a good process, and good process requires experience working in very large datacenters where operational stability and data integrity are primary goals. Working in the financial industry where tons of money are at stake by the second can be incredibly instructive and insightful when it comes to designing operationally correct, data-protecting, highly available and secure cloud based applications, but the other way around does not hold.
> Are you really complaining about being able to gzip and tar something in one command? Is that a thing that's actually happening in this conversation?
Let's talk system engineering:
gzip -dc archive.tar.gz | tar xf -
will work everywhere; I do not have to think whether I am on GNU/Linux, or HP-UX, or Solaris, or SmartOS, and if I have the above non-GNU invocation in my code, I can guarantee you, in writing, that it will work everywhere without modification. If on the other hand I use:
tar xzf archive.tar.gz
I cannot guarantee that it will work on every UNIX-like system, and I know from experience I would have to fix the code to use the first method. Therefore, only one of these methods is correct and portable, and the other one is a really bad idea. If I understand this, then why do I need GNU? I do not need it, nor do I want it. Except for a few very specific cases like GNU Make, GNU tools are actually a liability. This is on GNU/Linux, to wit:
% gcc -g hello.c -o hello
% gdb hello hello.c
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-45.el5)
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /home/user/hello]
"/home/user/hello.c" is not a core dump: File format not recognized
On top of that, on HP-UX and Solaris I have POSIX tools, so for example POSIX extended regular expressions are guaranteed to work, and the behavior of POSIX-compliant tools is well documented, well understood, and guaranteed. When one is engineering a system, especially a large distributed system which must provide data integrity and operational stability, such concerns become paramount, not to mention that the non-GNU approach is cheaper because no code must be fixed afterwards.
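For anyone who wants to try the pipe form discussed above, here is a minimal round-trip sketch. The scratch directory layout and the file name `hello.txt` are made up for illustration, and it only assumes a POSIX-style `sh`, `tar` and `gzip` on the PATH; it never uses tar's `z` flag.

```shell
#!/bin/sh
# Round-trip a tiny archive using only the pipe form (no tar z flag).
# The directory and file names below are hypothetical.
set -eu

workdir="${TMPDIR:-/tmp}/portable-tar.$$"
mkdir -p "$workdir/src" "$workdir/out"
printf 'hello\n' > "$workdir/src/hello.txt"

# Create: tar writes the archive to stdout, gzip compresses the stream.
(cd "$workdir/src" && tar cf - hello.txt) | gzip -c > "$workdir/archive.tar.gz"

# Extract: gzip decompresses to stdout, tar reads the archive from stdin.
(cd "$workdir/out" && gzip -dc "$workdir/archive.tar.gz" | tar xf -)

# Read the extracted file back, then clean up the scratch directory.
extracted=$(cat "$workdir/out/hello.txt")
rm -rf "$workdir"
echo "$extracted"
```

The same two pipelines are what you would put in a deployment script that has to run unmodified on more than one UNIX.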
Personally, I run linux on my desktop. Insane, I know, but I can't afford mac, and the OSX posix environment just keeps getting worse. Jeez. At this rate, Cygwin and the forthcoming POSIX environment from MS will be better. But anyways, I'm not switching my system to BSD or Illumos anytime soon, despite thinking that they are Pretty Cool (TM). Why? The binary footprint. Mostly Steam. Okay. Pretty much just Steam. Insane, I know, but I'm not running (just) a server.
So in summary, all software sucks, and some may suck more than others, but I'm not gonna care until Illumos and BSD get some love from NVIDIA.
And why do you care what a crazy person thinks? Oh, you don't. By all means continue the holy war. Grab some performance statistics, and hook in the BSDs. I'll be over heating up the popcorn...
> And the fact that KVM is from linux.
Actually that's great that it's from Linux, because one major point of embarrassment for Linux is that KVM runs faster and better on SmartOS than it does on Linux, because Joyent engineers systematically used DTrace during the porting effort:
> Insane, I know, but I can't afford mac
Once you're able to afford one, you won't care about the desktop ever again, because your desktop will JustWork(SM).
> but I'm not gonna care until Illumos and BSD get some love from NVIDIA.
NVIDIA provides regular driver updates for both Solaris and BSD. I bought an NVIDIA GX980TX, downloaded the latest SVR4 package for my Solaris 10 desktop, and one pkgrm && pkgadd later, I was running accelerated 3D graphics on the then latest-and-greatest accelerator NVIDIA had for sale.
> By all means continue the holy war. Grab some performance statistics, and hook in the BSDs.
They take our code, and we take theirs; they help us with our bugs, and we help them with theirs. The BSDs are actually great. The BSDs have smart, capable, and competent engineers who care about engineering correct systems and writing high quality code. We love our BSD brethren.
But good to know that the Solaris and BSD NVIDIA drivers work. If they work with lx-branding, I might actually consider running the thing.
> Once you're able to afford one, you won't care about the desktop ever again, because your desktop will JustWork(SM).
Yeah, no. It seems like OSX is making increasingly radical changes that make it increasingly hard for applications expecting standard POSIX to run. By the time I get the cash, Nothing will work right.
They've already won; Apple isn't coming back (did they "go thermonuclear" in the end or did that nonsense die with Mr Magical Thinking?).
Don't confuse Android with Google; you can grab the source and do what you want with it, like Cyanogenmod have, or like millions of hobbyists are doing themselves. It's all available with bog standard open source licenses - no need to worry about ethics.
In case of software, it is never too late: as you once put it so eloquently, software does not suddenly stop working and does not have an expiration date.
If this software works and works well, then Paul Graham's revolutionary idea applies: when you choose technology, you have to ignore what other people are doing, and consider only what will work the best. (Common sense really, but apparently not to the rest of our industry.)
If this software will work the best, and do exactly what I want and need it to do, I have enough experience to know not to care that everyone else runs something like git just because that is trendy right now. (A lesson appreciated by those who run SmartOS because it is the best available technology for virtualization, cloud, and performance, instead of running Linux and Docker.)
xscreensaver does. (-:
Yes, you're absolutely right that there are a ton of startups built on opensolaris (who have proprietary code they haven't and don't intend to ever give back to the community), and there is smartos/omnios/illumos as well. But none of those projects would have in any way contributed to the health of Sun Microsystems, nor provided the funding to get Solaris to where it is today. ZFS may have never seen the light of day if Solaris were open sourced in 1995.
It depends on how that would have affected Jeff Bonwick. If it kept him from deciding that Sun ought to develop a new filesystem, promising Matthew Ahrens a job writing one out of college and working together with Matt on it, ZFS would never have existed.
I also told him that he'd go way farther at Sun than I did and I was right, I think he made DE, I didn't. He played the game better. Smart guy. Him, Bryan, Bill Moore, those guys were the new Sun in my mind.
You may have done more by open sourcing whatever it is that you have done; if so congrats.
At work, many of my newer colleagues have backgrounds in closed source software development. We are developing software that has no exact analog to existing software, and we hope that it will have a big impact. If it becomes as important as we think it could be, then bit keeper vs git is a fantastic example of why our work should be open source from the start.
Linus brought DVCS to the masses, but to pretend there wasn't more at play than simply open sourcing a project and hoping it all works out is complete rubbish. People have families to feed. Closed source is not inherently evil.
It takes a unique situation to produce something like git, whose product is beyond the sum of the project itself.
Open sourcing those patches rather than keeping them to myself made a difference that was greater than anything I imagined. Similarly, the impact of making ZFS open source far surpassed the expectations of the original team at Sun. I think that making any worthwhile piece of software open source will lead to adoption beyond the scope of what its authors envisioned. All it takes is people looking for something better than what previously existed.
As for closed source being inherently evil (your words, not mine), how do you fix bugs in closed source software that a vendor is not willing to fix? How do you catch things like a hard coded password that gives root privileges? How do you know that the software is really as good as they say? It is far easier with open source software than with closed source software. Closed source software is a bad idea.
I grew this to a place where I could pay salaries. Doing so was super super hard. I had a lot of scary nights where I thought I couldn't make payroll. Just building up to a place where the next payroll was OK was a big deal for us.
So open source it? When you finally got to the point where you can pay people without worrying all night?
I get that you see that open source is the answer, and it is for some stuff. For me, jumping on that years ago was asking too much.
After Andrew Tridgell (Samba, among other projects) reverse-engineered the BitKeeper protocol in order to create his own client, the license was rescinded for everyone.
As a result, Linus wrote git.
And mpm wrote hg, never forget:
Now, I know that that's in essence an admission of defeat! I'm not pretending that it's anything less than that. However, this is also the easiest explanation of, and model for, branching that anyone has ever created. The explanation of branching which the nontechnical user immediately understood was, in fact, just having a couple of these repository-folders sitting side by side with different names, `current_version` and `new_feature`. It is a model of branching that is so innocent and pure and unsullied by the world that a nontechnical person got it with only a couple of questions.
Like I said, I'm actually employed at an SVN shop, where branches are other folders in the root folder and the workflow is less "push this to the testing server" and more "commit to the repository, SSH into the testing server, and then update the testing server." But that Hg model resonated with someone who doesn't know computers. To me, that was a moment of amazement.
I'm not even saying which one is better really; I like Git branches too! It's just that I'm astonished that the more confusing DVCS is winning. Most people's approach to Git is "I am just going to learn a couple of fundamentals and ask an expert to set up something useful for me." I would guess that most Git users don't branch much; they never learned that aspect of it. I'm really surprised that software developers aren't more the sort to really say "why am I doing this?" and to prefer systems which make it easier to answer those questions with pretty pictures.
The network effect should explain it.
That said, using pictures to answer questions is fairly sadistic when those asking them are blind. I know a blind developer and I never use pictures when talking to him in IRC.
Disclaimer: it's been years since I used hg as my primary DVCS, so some of my thoughts here might be out of date or misremembered.
> branching in hg is pretty much broken*
It really isn't. It's absolutely not suited for the task that many people want to use it for, but it's totally fitting with the intended use case and the "history is immutable" philosophy of mercurial.
Using mercurial branches for anything resembling feature branching is a bad idea. But mercurial branches are perfect for things like ongoing lines of development. So, for a project like PostgreSQL, you'd have a "master" (default) branch for the head of development and then once a release goes into maintenance mode you create a new branch for "postgres-9.4" and any fixes that need to be applied to that release will be made to the maintenance branch.
Following hg's "immutable history" policy the fact that the commit was performed on a maintenance branch is tracked forever. And it should be because the purpose of your source control is to track those kinds of things: "This is the branch we used for maintenance releases of version X.Y.Z, it is now closed since we no longer support that version"
The issues with mercurial's branches are:
- For a long time they were the only concept in hg that had a simple name and looked like "multiple lines of development". Even though hg supported multiple heads and multiple lightweight clones, neither of those had commands or features with a clear and simple name, so people turned to "branches" expecting them to do what they wanted even when they were a bad fit.
- "branch" is a very general name that is often used (quite rightly) to refer to a bunch of slightly different ways of working with multiple concurrent versions. In general use it might refer simply to having 2 developers who both produce independent changes from the same parent. Or to intentionally having multiple short-lived lines of development based around features. Or to splitting off development right before a release so that the "release branch" is stable. Etc. Yet the feature in hg that is called "branch" is useful for only some of those things. It would have been better to call it a "development line" or something like that.
- It took far too long for hg to get a builtin way to refer to named heads (bookmarks). The model assumed that each repository (clone) would only ever want to have 1 head on each branch (development line) and that producing multiple heads was a problem that ought to be resolved as soon as possible. There's a lot of history behind that approach (almost every CVS and SVN team I ever worked with did that), but DVCS tools made it easier to move away from that, yet official hg support lagged.
So even today, the "branch" concept in hg is only useful for a small number of cases, and the "bookmarks" concept is what most people want, but they're separate things with names that don't align with expectations.
For the rest: https://stevebennett.me/2012/02/24/10-things-i-hate-about-gi...
And it doesn't help that the primary hosted repository system is Bitbucket and Bitbucket didn't support pull requests from bookmarks last time I checked.
Bookmarks (today, and for several years) work just fine. So "branching" in the general sense isn't broken even though the combined feature set is a bit haphazard.
That bitbucket doesn't work well with bookmarks is a sign of how little Atlassian cares about hg, rather than an hg issue.
If you're arguing that the hosting options for hg are limited and fall far below the git options, then I'm not going to disagree.
Mercurial also supports an autonaming of revisions which automatically applies to all child revisions, too: these are meant to be independent lines of development, each with its own head revision, and are called 'named branches'; that is what `hg branch` does. The problem that you're identifying (and that I agree is counterintuitive!) is that these names become part of the commits themselves and therefore public knowledge. Mercurial warns you when you `hg branch` that this is happening and says "did you want a bookmark?" but does not tell you, e.g., "to undo what you just did, type `hg branch default`."
Are they talking about hg or git here? Because that flow in git is:
No. That fails with the following semi-helpful error message:
remote: error: refusing to update checked out branch: refs/heads/master
remote: error: By default, updating the current branch in a non-bare repository
remote: error: is denied, because it will make the index and work tree inconsistent
remote: error: with what you pushed, and will require 'git reset --hard' to match
remote: error: the work tree to HEAD.
remote: error: You can set 'receive.denyCurrentBranch' configuration variable to
remote: error: 'ignore' or 'warn' in the remote repository to allow pushing into
remote: error: its current branch; however, this is not recommended unless you
remote: error: arranged to update its work tree to match what you pushed in some
remote: error: other way.
remote: error: To squelch this message and still keep the default behaviour, set
remote: error: 'receive.denyCurrentBranch' configuration variable to 'refuse'.
To get to talk about `push` you then need to introduce the SVN-style "bare repository" in the diagram, a folder-box with the cylinder now drawn large inside it, and explain that this folder exists only to contain the .git subfolder and act as an SVN-style repository. You can then draw `pull` arrows down from it and `push` arrows up to it.
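To make the bare-repository picture concrete, here is a rough sketch of that flow, assuming `git` is installed. The directory names `shared.git`, `dev` and `testing`, the file `notes.txt` and the demo identity are all invented for the example; `shared.git` plays the central box, and the other two are ordinary clones with working trees.

```shell
#!/bin/sh
# Sketch: push "up" to a bare repository, then get the commit "down" elsewhere.
# All repository, file and user names here are hypothetical.
set -eu

work="${TMPDIR:-/tmp}/bare-demo.$$"
mkdir -p "$work"
cd "$work"

# The shared repository: no working tree, only what normally lives in .git.
git init -q --bare shared.git

# A developer's clone, with a working tree.
git clone -q shared.git dev
cd dev
git config user.name "Demo User"
git config user.email "demo@example.invalid"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
branch=$(git symbolic-ref --short HEAD)   # "master" or "main", per local config

# The push is accepted: a bare repository has no checked-out branch
# that could fall out of sync with what was pushed.
git push -q origin "$branch"

# A second clone (say, the testing server) receives the same commit.
cd "$work"
git clone -q -b "$branch" shared.git testing
pulled=$(cat testing/notes.txt)

cd /
rm -rf "$work"
echo "$pulled"
```

Pushing straight from `dev` into `testing` instead is what produces the `receive.denyCurrentBranch` refusal quoted above, since `testing` has that branch checked out.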
Then the workflow is more SVN-style:
error: Your local changes to the following files would be overwritten by checkout:
Please, commit your changes or stash them before you can switch branches.
Now, you're missing the point if you think "God, drostie is really pedantically getting on my case for missing the remote-repository-update and the git stash here! Anyone will learn that workflow eventually!"
The point was not any such thing, the point was clean diagrams when explaining the idea to a fellow developer -- in fact a diagram so clean that a nontechnical user asked about it and accidentally learned enough to get some new vocabulary about how a developer's life works, so that they could more effectively communicate what they want to the developer.
It is my contention that the git diagram, as opposed to the git workflow, is sufficiently messy that a nontechnical eye will lose curiosity and most certainly will not get the idea of "make a branch, push the branch to the shared repository, then update the testing repository, then switch to that branch, then discard that branch if things don't work out." That strikes me as too in-depth for nontechnical casual users to express.
Our job is often to think up new things. It's really hard to come up with new abstractions when your thinking is muddled by all kinds of incidental complexity.
Significantly less training than is required to know the internals of how it operates though.
The core problem with Git is that it was designed to serve the needs of the Linux kernel developers. Very, very, very few projects have SCM problems of similar complexity, so why do so many people try to use a tool that solves problems they don't have? Much of that internal complexity extends up into the Git interface, so you're paying for complexity you don't need.
Others in this thread have praised hg and bzr for their relative simplicity for a DVCS. I'd also like to point out Fossil.
In the normal course of daily use, Fossil is as simple to use as svn.
About the only time where Fossil is more complex is the clone step before checking out a version from a remote repository.
Other than that, the daily use of Fossil is very nearly command-for-command the same as with svn. Sometimes the subcommands are different (e.g. fossil finfo instead of svn status for per-file info in the current checkout) but muscle memory works that out fairly quickly.
Most of that simplicity comes down to Fossil's autosync feature, which means that local checkout changes are automatically propagated back to the server you cloned from, so Fossil doesn't normally have separate checkin and push steps, as with Git. But if you want a 2-step commit, Fossil will let you turn off autosync.
(But you shouldn't. Local-only checkins with rare pushes is a manifestation of "the guy in the room" problem which we were warned against back in 1971 by Gerald Weinberg. Thus, Fossil fosters teamwork with better defaults than Git.)
Branching is a lot saner in Fossil than svn:
1. Fossil branches automatically include all files in a particular revision, whereas svn's branches are built on top of the per-file copy operation, so you could have a branch containing only one file. This is one of those kinds of flexibility that ends up causing problems, because you can end up with branches that don't parallel one another, making patches and merges difficult. Fossil strongly encourages you to keep related branches parallel. Automatic merges tend to succeed more often in Fossil than svn as a result.
2. Fossil has a built-in web UI with a graphical timeline, so you can see the structure of your branches. You have to install a separate GUI tool to get that with most other VCSes. The fact that you can always get a graphical view of things means that if you ever get confused about the state of a Fossil checkout tree, you'll likely spend less time confused, because you're likely also using its fully-integrated web UI.
3. Whereas svn makes you branch before you start work on a change, Fossil lets you put that off until you're ready to commit. It's at that point that you're ready to decide, "Does this change make sense on the current branch, or do I need a new one?"
Fossil's handling of branches is also a lot simpler than Git's, primarily because the local Fossil repository clone is separate from the checkout tree. Thus, it is easy to have multiple Fossil checkouts from a given local repo clone, whereas the standard Git workflow is to switch among branches in a single tree, making branch switches inexpensive.
(And yes, I'm aware that there is a way to have one local Git checkout refer to another so you can have multiple branches checked out locally without two complete repo clones. The point is that Git has yet again added unnecessary complexity to something that should be simple.)
The only people I've ever seen "get Git naturally" were developers starting from the implementation details and working their way up (#0).
Everybody else either worked very hard at it(#1) or just rote-learned a list of commands(#2) that pretty much do what they want from which they don't deviate lest the wrath of the Git Gods fall upon them and they have to call upon the resident (#1) or heavens forbid the resident (#0) who'll usually start by berating them for failing to understand the git storage model.
> Well, Git is nothing like SVN and you'll always be missing something if you try to understand Git through SVN concepts.
Mercurial is also nothing like SVN, the problem is not the underlying concepts and storage model, it's that Git's "high-level UI" is a giant abstraction leak so you can't make sense of Git without understanding the underlying concepts and storage model, while you can easily do so for SVN or Mercurial.
 because the porcelain sort of makes sense in the context of the plumbing aka the storage model and implementation details
 because the porcelain in isolation is an incoherent mess with garbage man pages
> But, I mean, close enough. It's `git stash branch <newbranch>` and it generates an ugly error message but it does exactly what you want it to do, so you can ignore that error message and hack away.
git checkout -b new_branch
git commit -a
And as for the push problem:
1. You aren't going to encounter it in git when pushing a newly created branch, but yes, you then have to ssh in and check it out.
2. I wonder how Mercurial handles pushing to repository with uncommitted changes, does it just nuke them?
hg push ssh://testing-server/
Of course, you can also commit your changes and then `git branch`, which sounds insane (that commit is also now on the master branch!) until you remember that branches in git are just Mercurial's pointers-to-heads. This means that you can, on the master branch, just `git reset --hard HEAD~4` or so (if you want the last 4 local commits to be on the new branch and you haven't pushed any of them to the central repo), and your repository is in the state you want it in, as well. (And you'll need that last step even if you `git checkout -b`, I think.)
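Here is a small sketch of that "commit first, branch afterwards" trick, assuming `git` is installed and that none of the commits have been pushed anywhere. The branch name `new_feature`, the file name and the demo identity are invented for the example, and it rewinds two commits rather than four.

```shell
#!/bin/sh
# Sketch: commits landed on the trunk by mistake are moved onto a new branch
# by creating the branch pointer first and then rewinding the trunk pointer.
# All names here are hypothetical.
set -eu

work="${TMPDIR:-/tmp}/branch-after-commit.$$"
mkdir -p "$work"
cd "$work"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)
trunk=$(git symbolic-ref --short HEAD)   # "master" or "main", per local config

# Two commits made directly on the trunk.
echo one >> file.txt
git commit -q -a -m "feature step 1"
echo two >> file.txt
git commit -q -a -m "feature step 2"
feature_tip=$(git rev-parse HEAD)

# A branch is just a pointer: create it at the current commit...
git branch new_feature
# ...then move the trunk pointer back past the two unpushed commits.
git reset -q --hard HEAD~2

trunk_tip=$(git rev-parse "$trunk")
branch_tip=$(git rev-parse new_feature)

cd /
rm -rf "$work"
echo "trunk=$trunk_tip feature=$branch_tip"
```

After the reset the trunk is back at the "base" commit and `new_feature` still points at the second feature commit, so nothing is lost. Note that `reset --hard` also discards uncommitted changes in the working tree, so it is only safe on a clean checkout.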
Regarding your second one, Mercurial's simplified model is actually really smart. You have to understand that Git complects two different things into `pull`: updating the repository in .git/ and updating the working copy from the repository. In Mercurial these are two separate operations: you update/commit between the working copy and the repository; you push/pull between two repositories. The working copies are not part of a push/pull at all. So if you push to a repository with uncommitted changes in its working copy, that's fine. The working copy isn't affected by a push/pull no matter what.
With that said, if that foreign repository has committed those changes, Hg will object to your push on the grounds that it 'creates a new head', and it will ask you to pull those commits into your copy and merge before you can push to the foreign repository. (The manpages also say that you can -f force it, but warn you that this creates Confusion in a team environment. Just to clarify: a 'head' is any revision that has no child revisions. In the directed acyclic graph that is the repository history, heads are any of the pokey bits at the end. You can always ask for a list of these with `hg heads`.)
"OK," you say, "but let's throw some updates into the mix, what happens? Does it nuke my changes?" And the answer is "no, but notice who has the agency now." Let's call our repositories' owners Alice and Bob. Alice pushes some change to Bob's repository. Nothing has changed in Bob's working folder.
Now if Alice tells Bob about the new revision, Bob can run an update, if he wants. Bob has the agency here. So when the update says, "hey, those updates conflict, I'm triggering merge resolution" (if they do indeed conflict), he's present to deal with the crisis. Git's problem was precisely "oh, we can't push to that repository because we might have to mess with the working copy without Bob's knowledge," and it's a totally unnecessary problem.
Bob can also keep committing, blithely unaware of Alice's branch, if Alice doesn't tell him about it. The repository will tell him that there are 'multiple heads' when he creates a new one by committing, so in theory he'll find out about her commits -- though if you're in a rush of course you might not notice.
Bob can keep working on his head with no problem, but can no longer push to Alice (if he was ever allowed to in the first place), because his pushes are not allowed to create new heads either. In fact he'll get a warning if he tries to push anywhere with multiple heads, because by default it will try to push all of the heads. However he can certainly push his active head to anyone who has not received Alice's branch, just by asking Hg to only push the latest commit via `hg push -r tip` -- this only sends the commits needed to understand the last commit, and as long as that doesn't create new heads Bob is good to push.
PTSD? :) Use local topic branches for everything to avoid unpleasant surprise merges. Once you are ready to merge, pull the shared branch, merge/rebase onto that and push/submit/whatever.
I sometimes keep a separate branch for each thing that I intend to become a master commit. This way I can use as many small and ugly commits and swearwords as I please and later squash them for publication after all bugs are ironed out.
This helps with remembering why particular commits look the way they do, especially in high latency code review environments where it can take days or weeks and several revisions to get something accepted.
> Git's problem was precisely "oh, we can't push to that repository because we might have to mess with the working copy without Bob's knowledge," and it's a totally unnecessary problem.
Actually Bob's working copy isn't modified, it's just that if his branch was allowed to suddenly stop matching his working copy, he would probably have some fun committing (not sure what exactly, never tried).
Is the benefit clearly obvious? If you actually adhere 100% to Stallman's code I'm not so sure.
In fact, it was the reverse - he felt like he was being locked out of kernel development because he didn't want to align his moral code with those who used BK.
So, he tried to find a way to hold true to his code without forcing the rest of the kernel team to give up BK.
You say that like it's a bad thing.
telnet bk-server 5000
and typed "help".
What changed? Is BitKeeper still an ongoing business with some other model, or is that, as they say... it? I hope not.
Too late? Maybe. But we had a viable business that was pulling in millions/year. The path to giving away our stuff seemed like:
step 1: give it away
step 2: ???
step 3: profit!
Will it work? No idea. We have a couple of years to find out. If nothing pans out, open sourcing it seemed like a better answer than selling it off.
I haven't checked out Bitbucket because last time I evaluated (2+ years ago) they didn't have good on-prem options.
Bitbucket has been rotting since Atlassian bought them, and now there's really no "killer app" for Mercurial hosting. There are Mercurial hosting services out there, but nothing anywhere close to GitHub/GitLab.
Actually, since BK is now open source, we might think of adding a BK backend to RhodeCode and our VCS abstraction layer, which already supports Git, Mercurial and Subversion.
Looking forward to some hopefully differentiated features.
I'm also interested in how open-sourcing BK will improve the other systems, too.
#mercurial on Freenode right now is monitoring this thread, very relevant to our interests.
Someone at Facebook in #mercurial right now is trying it on some Facebook repo, to compare performance.
How / why did you decide to use the Apache license rather than the GPL?
(It seems like a viral license might protect you a little bit, if you want to prevent your competitors from forking and improving your code base and then using it to compete against you.)
As to why that license, I think it was because LLVM or clang or both had recently picked that and all the lawyers at all the big companies liked that one. We don't particularly care; if everyone yells that it should have been GPL, we'll fork it and relicense it under the GPL. Our thought was that Apache is well respected and even more liberal than the GPL, but we can be convinced otherwise.
Also, replying to something a little higher in this thread, I wouldn't say that Apache 2 defines all contributions as Apache 2. That section of the license starts with the words "5. Submission of Contributions. Unless You explicitly state otherwise, ..."
And so Apache 2 just becomes the assumed default license on contributions, but it's not at all forced or required that contributions come in under Apache 2.
I'm just not super-fond of relying on Bitbucket, reliable though they've been, for hosting my stuff.
But a package I could toss on my own VPS? I'd toss some money at that. Wouldn't even need it to be open-source, but I'm no zealot.
Care to be more specific?
I'll grant that Fossil's wiki is not a competitor to MediaWiki, but that doesn't make it "awful." It just makes it less featureful. So, what feature do you need in a wiki that Fossil's wiki does not provide?
As for the ticketing system, again, it isn't going to replace the big boys out of the box, but it also doesn't have to match them feature-for-feature to be useful. Also, the Fossil ticketing system's behavior is not fixed: it can be modified to some extent to behave more like you need. Did you even try modifying its behavior, or are you just complaining about its out-of-the-box defaults? Be specific!
> unexpected behavior for basic commands like "fossil rm".
If you mean that you want fossil rm to also delete the checkout copy of the file in addition to removing it from the tip of the current branch, and you want fossil mv to rename the checkout copy in addition to renaming it in the repository, then you can get that by building Fossil with the --with-legacy-mv-rm flag, then setting the mv-rm-files repository option. You can enable it for all local Fossil repositories with "fossil set mv-rm-files 1".
Alternately, you can give the --hard flag to fossil mv and fossil rm. That works even with a stock binary build of Fossil.
> I'm just not super-fond of relying on Bitbucket
For some of us, relying on a cloud service just isn't an option. We're willing to give up many features in order to keep control of our private repositories.
> a package I could toss on my own VPS? I'd toss some money at that.
Fossil runs great on a VPS, even a very small one, due to its small footprint. I wrote a HOWTO for setting it up behind an nginx TLS proxy using Let's Encrypt here:
I also ditched it all 3 or 4 years ago, so my memory's not great, but what got me about the ticketing was that, for whatever reason, I could not sit any non-technical user in front of it and have it make sense to them. No amount of tweaking ever made it make sense for anyone but the devs. I know that's not specific, but this is all in the pretty distant past for me, and that's the takeaway I had from it.
Fossil's lack of any built-in emailing was also lousy. I'm aware that some people rig up some hokey RSS-to-email system to accommodate that, but really, come on, that's awful!
Hey, if fossil actually serves your needs, that's great. I like the value proposition--one file is your repo, wiki, tickets, the whole ball of wax, it's cross-platform, it's just that the execution of the idea didn't work for me.
(As an aside, another thing I didn't like about fossil was its community--tending toward defensiveness and "it's supposed to work that way" instead of "hey, maybe you, the user, are onto something".)
So name a bug tracker with equivalent or greater flexibility to Fossil's that non-technical users do understand.
I've only used one bug tracker that's simpler than Fossil's, and that's because it had far fewer user-facing features.
Every other bug tracker I've had to use requires some training once you get past the "submit ticket" form. And a few required training even to successfully fill that out!
> Fossil's lack of any built-in emailing was also lousy.
Email is hard. Seriously hard. RFCs 821 and 822 are only the tip of an enormous iceberg. If Fossil only did the basics, it would fail for a whole lot of real-world use-cases, and it'll only get worse as email servers get tightened down more and more, to combat spam, email fraud, domain hijacking, etc.
I, too, would like Fossil to mail out commit tickets and such, but I'm not sure I want the build time for the binary and the binary size to double just because of all the protocol handlers it would need to do this properly. Keep in mind also that Fossil generally doesn't link out to third-party libraries. There are exceptions, but then, I'm not aware of a widely-available full-stack SMTP library, so it would probably have to reimplement all of it internally.
Now, if you want to talk about adding a simple gateway that would allow it to interface with an external MTA, that would be different. I suspect the only thing wanting there is for someone to get around to writing the code. I don't want it bad enough to do it myself.
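To make the "simple gateway to an external MTA" idea concrete, here is a rough sketch using only Python's standard library. It is not Fossil code (Fossil is not written in Python), and everything in it is an assumption for illustration: the function names, the addresses, the subject format, and the idea that an MTA is listening on localhost port 25. The point is that the tool only formats a message and hands it off, leaving delivery, retries and spam policy to the MTA.

```python
# Sketch of a commit-notification gateway that delegates delivery to an
# external MTA. All names, addresses and the subject format are hypothetical.
from email.message import EmailMessage
import smtplib

def build_commit_mail(sender, recipients, commit_id, author, comment):
    """Format one commit notification; no network access happens here."""
    first_line = comment.splitlines()[0] if comment.strip() else "(no comment)"
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"[commit {commit_id[:10]}] {first_line}"
    msg.set_content(f"Author: {author}\nCommit: {commit_id}\n\n{comment}\n")
    return msg

def hand_to_mta(msg, host="localhost", port=25):
    """Pass the message to a local MTA, which does the actual delivery."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)

msg = build_commit_mail(
    "scm@example.org",
    ["dev@example.org", "qa@example.org"],
    "abcdef0123456789",
    "alice",
    "Fix ticket form\n\nLonger explanation of the change.",
)
print(msg["Subject"])
# hand_to_mta(msg)  # left commented out: needs a real MTA on localhost
```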
> "it's supposed to work that way" instead of "hey, maybe you, the user, are onto something".)
If you propose something that goes against the philosophy of Fossil, then of course the idea will be rejected. We keep seeing git users ask about various sorts of history rewriting features for example. Not gonna happen. No sense having a philosophy if a user request can change it.
If you're talking about a Fossil behavior that isn't tied to its philosophy, but it just works the way it does for some reason, logic and persuasion are a lot less effective than working code. The Fossil core developers accept patches.
 I mean something you can expect to be in all the major package repositories, and in binary form for Windows.
I still think the Fossil value proposition--one file with all your project ephemera--is a good one. It'd be neat if BitMover produced something similar, e.g. maybe not a file, maybe a directory, but same idea, etc.
It's a convenient excuse to claim that open source broke your business, and that open source could not save you from dying...
Well clearly in retrospect, step two should have been renaming it "Dawson's Creek Trapper Keeper Ultra Keeper Futura S 2000", adding incredibly advanced computerized features including a television, a music player with voice recognition, OnStar and the ability to automatically hybrid itself to any electronic peripheral device, absorbing the secret military computer at Cheyenne Mountain, and taking over the world.
> Spending a lot of time dealing with manual and bad auto-merges? BitKeeper merges better than most other tools, and you will quickly develop confidence in the quality of the merges, meaning no more reviewing auto-merged code.
Do you have examples of merge-scenarios that are a Conflict for git but resolve for BK?
> BitKeeper’s raw speed for large projects is simply much faster than competing solutions for most common commercial configurations and operations… especially ones that include remote teams, large binary assets, and NFS file systems.
Is there a rule of thumb for what size of repos benefits from BK? (And I suppose size could either be the size of a current commit or the total size of the repo.)
Are there any companies like github or bitbucket that support BitKeeper repos?
As for size, it's csets * files. As that gets big, Git slows down faster than linear; we're pretty linear.
I haven't read much about bk so far, so forgive my lazy web question: does/can bk operate over standard ssh as git/hg/svn can, or does it require a dedicated listening server to connect to?
Edit: answering my own question, yes it does support ssh as a transport
From the "Why" page:
BitKeeper’s Binary Asset Manager (BAM) preserves resources and keeps access fast by providing local storage as needed.
BAM is great for any organization that handles:
* Office files
* CAD files
* Any large binary files
On the commercial site there is a link to some BAM paper, take a look at that and maybe ask in the forum or irc if this gets lost.
For those who commented that way, please reconsider this winner-takes-all approach to your outlook on the world. The world is better because of choice, and it's in everybody's best interest to have more distributed version control systems.
The argument that diversity is good is not so simply true. There are tons of benefits to diversity, but there is a cost to it too: fragmentation makes it harder to find talent and support, slows down bug fixes, puts fewer eyes on each project, gives developers headaches supporting competing standards, and so on...
>Why use BitKeeper when there are lots of great alternatives?
>For many projects, the answer is: you shouldn’t.
BitKeeper itself is a collection of repositories. Download an install image, install, and clone it:
$ bk clone http://bkbits.net/u/bk/bugfix
$ cd bugfix
$ bk here
$ bk comps -m
Play with it, it's very different from Git, the subrepo binding is just like file bindings. Everything works together and obeys the same timeline.
About the only difference is that git prefers the whole history and you cannot yet set a per-submodule shallow clone policy. It wouldn't even be too hard to add that.
Regarding versioning, you can always version any file by storing every revision and compressing them as best you can. I believe this is what Perforce does. Repo size can of course become an issue, and git doesn't do a great job with that, since it stores everything, and stores it locally. Perforce can at least discard old revisions and lets you select history depth on a per-file or file type basis.
The more serious problem with git in my view is that there are no good automated merging tools for many types of files, nor are any likely to arise. And more importantly, most people working on your average game aren't interested in forming in-depth mental models of how their tools work, and certainly don't want to have to pick up the pieces when they go wrong. So for most files, you need an exclusive lock (check in/check out, lock/unlock, etc.) model, or similar. That works quite well. But for obvious reasons, git just doesn't support this model at all, and I believe Mercurial is the same - and no amount of transparent/magic large file storage backends or whatever are going to fix that.
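For anyone who hasn't used one, the exclusive lock model mentioned above boils down to a table of "who holds which file". The sketch below is a minimal in-memory illustration with hypothetical names (`LockTable`, `checkout`, `checkin`); it is not the API of Perforce, git, Mercurial or any other tool, and a real lock manager would also need a central server, persistence and a way to break stale locks.

```python
# Minimal in-memory sketch of an exclusive check-out/check-in lock table.
# Class and method names are hypothetical, not any real VCS's API.

class LockTable:
    def __init__(self):
        self._owners = {}  # path -> user currently holding the lock

    def checkout(self, path, user):
        """Try to take the lock; True if user now holds (or already held) it."""
        owner = self._owners.setdefault(path, user)
        return owner == user

    def checkin(self, path, user):
        """Release the lock; only the current holder may do so."""
        if self._owners.get(path) != user:
            raise PermissionError(f"{user} does not hold the lock on {path}")
        del self._owners[path]

    def owner(self, path):
        """Return the current lock holder, or None if the file is unlocked."""
        return self._owners.get(path)

locks = LockTable()
print(locks.checkout("art/level1.psd", "artist1"))  # True: lock acquired
print(locks.checkout("art/level1.psd", "artist2"))  # False: artist1 holds it
locks.checkin("art/level1.psd", "artist1")
print(locks.checkout("art/level1.psd", "artist2"))  # True: lock is free again
```

Because only one person can hold a binary file at a time, no merge of that file is ever needed, which is the whole appeal for files that cannot be merged automatically.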
$ bk clone http://bkbits.net/u/bk/bugfix/
Asking because I'm looking for a Gogs/GitLab-like server-side solution for a project under development. However, it needs to handle binary data well, which Git-based solutions don't.
$ bk clone bk://bkbits.net/bkdemo/bk_demo
$ cd bk_demo
# edit files using your favorite editor
$ bk -Ux new
$ bk commit -y"Comments"
$ bk push
* The -U option to bk tells it to operate on "user files", that is, files that are not part of the BitKeeper metadata
* The modifier x corresponds to "extras", files which BitKeeper doesn't know about (changed files is c)
* `new` adds files to the repository
* [on commit] the -y option is for changeset comments (~commit messages)
So `bk -Ux new` is `git add <untracked files>` and `bk commit -y"thing"` is `git commit -m "thing"`
 aka `git add $(git ls-files -o --exclude-standard)` or `git add -i<RET> <a> <*> <q>`
At the very least, Bryan Cantrill will be happy :-D.
You can have a cloud of servers so the binaries are "close" (think China, India, US).
It is unclear to me whether the BAM server is part of this open-sourcing or not. The page talks about a 90-day trial.
Also, it is common in other (usually non-D)VCS workflows to lock binary files while working on them, since concurrent changes can't be merged the way text files allow. Does BK support anything like this?
We have not done the centralized lock manager, we didn't get commercial demand for that (yeah, surprised me too). We could do it though, it's not that hard.
BitKeeper doesn't support locking binary files.