Hacker News
Ask HN: Why don't companies open-source the source code of their old products?
76 points by blubbi2 on July 7, 2013 | 55 comments
I'm not talking about products like the first iPhone, I'm talking about products (like gameboys) that are still cool and interesting, but completely irrelevant for the company that invented them.

Why don't these companies open-source their old products?

It may sound trite, but the real reason is -- why would they?

The engineers are all busy working on new things. What product manager is going to say, hey, Fred, take off two weeks and take this random product from 1997 and see if it still builds at all, find the old dependencies, clean it up, talk to legal to see if we still own the rights to (x, y, z), and throw it up on GitHub -- just to expose some of our shoddy programming, last-minute hacks, vulgar comments cursing out other programmers, and maybe even some proprietary company information that we had no idea was in that code base.

There's no upside, but possibly plenty of unknown downside.

Yeah, you can see that by looking for when companies do open-source old code. It's usually a case where someone has some specific reason they want it open source, and is willing to push for it.

id Software games: John Carmack wants to get old code out there, partly so students/others have some real-world examples to peruse and hack on rather than only toy games from textbooks, and partly because he's proud of a lot of his clever hacks and eventually wants them to see the light of day.

Axiom (computer algebra system): Someone starts a proprietary software company based on academic research, and it fails. One of the original researchers has a personal interest in getting the code released, so the project can continue.

Blender: Company goes bankrupt, and the original programmers want it to be open-sourced so their work doesn't disappear. They raise €100k to buy it from the creditors in bankruptcy.

This is so, so true. So often when people ask questions like this, they seem to be overestimating the value of having access to the code that they're interested in, and underestimating the expense and risk of open sourcing it.

Very few companies would be willing to just throw the code as is to the world. Usually they're going to want to go over that old code line by line to make sure it's fit for public consumption and doesn't contain intellectual property that they don't have a right to open source. All of that stuff costs money and engineering time and probably legal time as well.

They would also have to have some reason for doing it in the first place, something more than that it could be hypothetically interesting to hypothetical people. That's a leap of faith that few companies are willing to make, and the larger the company, the more people have to make that leap and sign on to it.

Having said that, I also wish that more companies would open source their old code (for academic reasons if nothing else). I just think the reason it doesn't happen more often is for really practical and predictable reasons, and not because those companies are totally oblivious to the idea or because they're evil or anything.

A potentially big upside is the preservation of our culture for future generations.

Companies mostly care about making money, not about being beneficial to the human race. (Not passing judgement, just observing.)

It's funny you say this...

Back when the idea of a corporation was first formed (I do not remember the time frame) one of the requirements to complete the process of incorporation was to prove what you were going to do for the community around you.

I wonder how this idea got lost in the process...

Companies are usually improving the community in some way (that's why they make money in the first place). Grocers sell food to the community, hardware stores sell tools and supplies for making things, even the dreaded oil companies have sold the primary source of fuel for a century (and made money hand-over-fist for it). Random acts of kindness (e.g. open sourcing projects for no reason other than "We could"), while cool, are tangential to the purpose and specialty of the company.

This is how it should be - sadly both ends (doing good and making money) are not always aligned with each other. From my personal observation I would say that they are less and less aligned the bigger the company is.

I seem to remember sending the Library of Congress a cd with source code on it when I did a copyright registration.

I guess we'll see what they've done with it when the first computer program copyrights expire in around 50 years (assuming Disney doesn't buy any more extensions).

I think the biggest issue is that software/technology isn't "trickling down" the way that many other resources do. It's proprietary, region restricted, and short-lived.

Maybe a solution is to find some unpaid interns or volunteers, NDA them, and unleash them on the technical challenges (and scrubbing the obscenities from the comments). That leaves only the legal question for the company.

It could be a huge PR boost. EA games got some good mileage out of releasing their older titles like Tiberian Sun as freeware. Imagine if they unleashed them as open source, and thus made them into a learning tool (like a textbook) for future generations.

This is exactly why we are not publishing our source code.

A number of reasons come to mind:

1. They might not have the source any more. There's a depressingly large amount of software that was created, that no-one has the source code to. It might have been lost in a server crash, it might have been stored on a medium which is now unreadable, or people might have just not cared about it after the product was shipped and who knows where it is now?

2. There might be legal reasons preventing them from releasing it. For example, there might be a software patent covering part of the code which they licensed for their commercial use, but which prevents them from releasing it as open-source. An example here would be Doom 3's shadow volume code - the open-source release didn't actually include the code that shipped with the game, instead it included a slower (yet not patent-encumbered) algorithm. Unless you built absolutely everything in-house and licensed nothing from anyone else, it can be tricky to work out whether you're actually allowed to release the source to your product.

3. It might reveal other things that should be private to the company. Often source code has overly-tight integration with things like your in-house version control, or your build system, or other things of that nature. Perhaps your debug build automatically uploads crash logs to your bug tracker, and includes credentials for doing that which were stripped out of the release build? In order to do an open-source release, you need to look through your entire code to figure out if there is anything sensitive there, and remove it (in a way that doesn't stop the product working).

And finally...

4. What benefit does it get them? Gathering up, scrubbing, testing, and releasing an open-source product takes a substantial amount of effort, and hence a substantial amount of money. And you'd be hard-pressed to see a return on that investment in anything except vague, largely-meaningless "community good will".
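To make point 3 a bit more concrete: a first pass at finding embedded credentials can be automated, though it is no substitute for a human reading the code. A minimal sketch, assuming two illustrative regex patterns (the patterns, the sample snippet and the `find_suspect_lines` name are all made up for this example, not taken from any real tool):

```python
import re

# Illustrative patterns only; a real audit would need a much longer list.
SUSPECT_PATTERNS = [
    # Assignments such as password = "..." or api_key = '...'
    re.compile(r"(password|passwd|secret|api_key)\s*=\s*[\"'][^\"']+[\"']", re.IGNORECASE),
    # Credentials embedded in a URL, e.g. scheme://user:pass@host
    re.compile(r"[a-z]+://[^\s/:@]+:[^\s/:@]+@"),
]


def find_suspect_lines(source_text):
    """Return (line_number, line) pairs that match any suspect pattern."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SUSPECT_PATTERNS):
            hits.append((number, line.strip()))
    return hits


# Hypothetical source snippet used only to exercise the scanner.
sample = '''
int main() {
    const char *password = "hunter2";
    upload("ftp://builder:s3cret@bugs.example.com/crashes");
    return 0;
}
'''

for number, line in find_suspect_lines(sample):
    print(number, line)
```

Something like this only flags the obvious cases. Internal hostnames, build-system hooks and licensed third-party files still have to be found by someone who knows the code base.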

I would be seriously shocked if #1 was a valid reason. Products for which the source is completely lost after they are shipped? Can it happen before the release? Why were there no backups?

This has happened to a product I worked on. There were 2 companies working on it, and when we were done there were 2 backups that I made after we'd finished it.

Backup #1 (USB HD, kept by development company A) was lost in a flood. Of course, this sort of thing is why you have 2+ backups...

Backup #2 (USB HD, kept by development company B, the one I worked for) was lent to the client after they wanted to take a copy of it themselves. Then fast forward 6 months, and:

    US: Please would you return our backup drive of product XYZ?
    THEM: [paraphrased] Who are you?

This product was a one-off, and not key to company B, so nobody chased this up (not that they could do much anyway because the backup drive was presumed lost forever). We only found out about the loss of backup #1 when company A asked us for our backup, because they needed the code for a project they were thinking of doing...

Happens all the time, dude. I worked for a financial institution a couple of years ago and was involved in a data center move. The IT guys took care of moving/replicating the servers and data, and it was my job to move the applications and services. You have no idea how many console apps and Windows services I had to decompile in order to change the connection strings (they were hardcoded previously ... well before I started working there). Thankfully, they were written in .NET, so decompiling, changing and recompiling was trivial ... But this kind of thing is par for the course in some large and old organizations.
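For contrast, the usual way to avoid that situation is to read the connection string from configuration at startup instead of compiling it in. A minimal sketch in Python rather than .NET, assuming a made-up environment variable name (`APP_DB_CONNECTION`) and a made-up development default:

```python
import os

# Hypothetical variable name and fallback value, chosen for this example only.
DEFAULT_CONNECTION = "Server=localhost;Database=dev"


def get_connection_string(environ=os.environ):
    """Read the connection string from the environment, falling back to a dev default."""
    return environ.get("APP_DB_CONNECTION", DEFAULT_CONNECTION)


# After a data center move, only the environment changes; the code stays the same.
print(get_connection_string({"APP_DB_CONNECTION": "Server=new-dc-01;Database=prod"}))
print(get_connection_string({}))
```

With this shape, moving servers means editing a setting, not decompiling a binary whose source nobody can find.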

It doesn't have to be "completely" lost. If it doesn't build on the current version of the compiler, and the compiler that was originally used to build only runs on an OS version for which you can't buy the hardware any more, then it might as well be lost, even if you have it right there in front of you. I have stuff on my Mac right now (I keep migrating my home directory forwards from machine to machine) that I keep basically as a souvenir, I know perfectly well that while it built on a NeXT in 1998 it won't have a chance in hell of compiling now.

Let's say you find the backups. They're likely in some obsolete tape format, and you would have to reconstruct an ancient Novell/NT/Unix environment to recover them. That's assuming you even know the source code is on a tape labeled something like "Server03".

I worked at 2 places in the '90s where this happened and it involved 3 products and 1 lawsuit.

Never mind that it was no longer possible to buy the tools that were used to build the products...

Maybe it got backed up onto magnetic tape and then archived whoknowswhere along with the equipment necessary to read it.

Maybe they thought it was being backed up but the backup script started failing silently 2 years ago.
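A silent failure like that can often be caught by a separate check that looks at the backup output itself rather than trusting the script. A minimal sketch, assuming the backup is a single file and that "healthy" just means it exists, is non-empty and is recent (the `check_backup` name and the 24-hour threshold are assumptions for illustration):

```python
import os
import time


def check_backup(path, max_age_seconds=24 * 3600, now=None):
    """Return a list of problems with the backup file at `path` (empty if it looks fine)."""
    now = time.time() if now is None else now
    if not os.path.exists(path):
        return ["backup file is missing"]
    problems = []
    info = os.stat(path)
    if info.st_size == 0:
        problems.append("backup file is empty")
    if now - info.st_mtime > max_age_seconds:
        problems.append("backup file is older than expected")
    return problems


# Hypothetical path; a real check would alert someone instead of printing.
print(check_backup("/nonexistent/backups/latest.tar.gz"))
```

This does not prove the backup can actually be restored. Only a periodic test restore does that, which is also where a forgotten encryption password would show up.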

I worked for a company in the 90s where the source repository was wiped because the idiot admin tried to run a backup in the wrong place at the wrong time and nuked the repo. Or something like that -- I never really understood his excuse for what had happened.

We lost all the history, but (after they escorted the guy from the building) we were able to restart with source trees pulled off dev workstations.

No surprise that I got an email from one of the VPs of the nearly-defunct company a few years later asking if I might know the master password to the source control database. So yeah, they had a backup, but it was useless because of course it had been encrypted to prevent theft but nobody thought of recording the password somewhere secure.

It is surprisingly common in my experience.

At On2, we open-sourced Theora (initially called VP3) partly because I (the CTO at the time) worried that if the company failed, all our IP would be locked up forever due to complex legal obligations to investors and customers.

The other reason we did it -- the reason that convinced the CEO and board -- was that we figured as we developed better products (VP4, VP6 etc) it would be preferable for us business-wise if the N-1 product was free to everyone. Otherwise, inferior competitors would continue to profit by undercutting us on price. It's impossible to undercut free.

I guess I answered the opposite question -- why do companies EVER open source their old products? The culture of VC-backed and public companies is not especially conducive to making the decision to do so, even with old products.

One argument you will hear against doing so is simple: old, free products might compete with new, profitable ones. There is no real business incentive to take that risk.

To OP:

How long have you been in development, specifically corporate development?

Imagine code with no modern tools or development process and a limited number of people that might understand how and why it was written. Code written under pressure or by overly clever geniuses. Code not under source control with no or little documentation. Code written for companies that don't exist or have changed hands numerous times. Archived to media that no one has computers or drives or OSes for...

Imagine, all of this is standard.

I don't work for a big company, so this is purely speculation, but there could be many reasons why it's not done.

* Documentation Time

If you've got a big project, it could take weeks, even months to properly document some software to a position where it could be used by someone outside of the company. Sure you have your internal documentation but it can often be incomplete, or make assumptions that the person reading it knows about other bits of the company.

* Deployment

Big projects will often use very specialized hardware, software and environments, to the point where it could be nearly impossible to deploy outside of the company. It could depend on internal services that can't be open sourced because they're still used, or are an important part of the business. Take Google Reader: yes, it would be nice if it was open sourced, but internally it probably uses services, databases and APIs specialized just for Google, and it's probably been optimized to work on Google's hardware, with their webserver, with their OS build etc.

Reddit is another example of this. Reddit's code is open source, and while it can be deployed, it's not easy. This seems mostly because it's been built to work on a very specialized set of software versions, and in a very specific environment. Larger open source projects tend to be tested on a multitude of environments; with applications only deployed or built internally, there's no point because you can very accurately control your environment.

* Some of the code is still used

Some, or even big chunks of the code could still be being used in current software. If you've got a library that's particularly useful, you might keep using it. If it works there's no point re-writing it just for a new project.

* The code is very bad

We all know it happens: a project contains terrible code, bad bugs and maybe even security issues that never got fixed because they were never noticed. Given the opportunity to look through the code, people might pick up on these issues, and it would reflect badly on the company.

* Open source is complicated

Open source seems to come along with a whole host of fun things to deal with: GitHub issues, ranty blog posts, forks, copyright and licenses can all get a bit complicated. Even if it's old software that isn't used anymore, you probably need to do some degree of management before things get too out of hand. Even a single tweet can have a big impact on a company's or a product's reputation, so particularly at larger companies they'd probably want it managed in some way.

Anyone else got anything to add?

Dependencies on commercial software (e.g., that package for the sound system, purchased as source and modified, without which the product won't even compile).

Dependencies on specialized build tools; porting to something free would not be easy.

Exposure of security holes in existing deployments by revealing bad security practices.

People tend to believe the source code is more useful than it actually is.

A good example was when Netscape open sourced Navigator v4. People couldn't get it to build and it was missing some proprietary components. So even though the open source world was desperate for a web engine, nothing was really done with it. In the end it was decided to start over from scratch with Mozilla.


People often underestimate the amount of effort that open sourcing (in a responsible manner) a large project requires.

Royalties to some components.

You can't really opensource a project if it includes, say, a movie playback component or 3d engine or audio code with royalty-based licencing; the code won't even compile with it, and it may be customized/integrated to an extent where it would be a lot of work to even identify which of your source code files are "contaminated" by licenced stuff that isn't yours to publish.

Some of the code contains someone else's proprietary code. IBM have said they can't release OS/2 for this reason.

If you're talking about games then the answer is licensing. There are a zillion licenses around the audio stack, the video stack, perhaps the game art or the game characters etc etc. None of that licensing allows you to release the source code (and thus the source material) to the public.

For non-games it can also be about licensing (or as others have mentioned providing evidence that you used code in an unlicensed way) or about support (nobody wants to answer questions about the code).

- Legal problems (licensing, liability)

- Proprietary technology (e.g. Google Reader hooks into the Google backend extensively)

- Cost of getting it ready for open source (usually a large sunk cost already)

- They may still make money out of it in the future (e.g. selling old games with emulation)

Also, for individuals there are a lot of positives in that prospective employers can see their work. For a company there may be far fewer incentives.

I wish Google could treat Google Reader the way they treated Google Wave.

Software products usually have some 3rd party components in there, and they can't release the code for them because they don't own the copyright. This is the #1 reason why OS/2 can't be open-sourced.

Probably because projects which were not designed to be open source rely on too many internal systems to be worth open sourcing.

The Google feedback tool springs to mind. The Google engineer who made it is working towards open sourcing it, but because it wasn't created with that in mind, it is taking ages to remove its dependency on loads of internal Google stuff.

Exactly this. I haven't looked at the Reader architecture myself, but take a hypothetical "Hey, let's open-source Reader" project at Google:

Reader probably uses the Googlebot crawl infrastructure to retrieve feeds. We're not going to open-source the crawler, so now we need to rip that out and replace it.

Let's say Reader used Megastore to store feed data (the Megastore paper says "hundreds of applications" use it, so this is not an unreasonable idea). Megastore is a nice chunk of intellectual property in and of itself, plus it's built on top of Bigtable.

Throw in Chubby for locking/coordination and some MapReduce jobs for bulk processing, and you're basically down to either rewriting the entire core of Reader or open-sourcing the "crown jewels" of Google's infrastructure. Not to mention any number of underlying libraries (e.g. core C++ libraries) that have hundreds of engineer-hours of investment.

Because it is not how open source movement works.

The big idea about open source is that if you're making something useful for many other people and if it is sane, readable, manageable, then you will get lots of feedback, testing and even patches with bug fixes and improvements.

This is the story of nginx: when it was open sourced, people found it useful for themselves and started using it, which led to extensive testing, fixes, patches, etc. Now it has several forks, including tengine.

The other story is the story of RedHat at the time of RHEL4. They decided that they would maintain their own set of patches for the kernel and glibc. Eventually the kernel's src.rpm contained about a hundred patches. This was a mistake, because they should have sent those patches to the main tree instead, and, if the patches proved to be correct and useful, they would have gotten feedback, testing and code reviews for free. As far as I remember, they did so with RHEL5.

There are other important stories about how the community ceased to improve open source projects after they were acquired by big companies; no one wants to improve other people's property. MySQL and Xen are the most well-known examples.

So there is no use in open-sourcing anything that no one needs, except maybe some hobbyists and people on the margins.

These are all very good reasons posted below, but I believe that the most important reason is corporate mindset. The powers that be, mostly old hands, view software as physical "property" and are unwilling to let go. Sharing it is seen as sacrilege! It is also seen as giving competitors a peek under the kimono and handing them an advantage.

It is the same reason that many pharmas don't release the data of their failed trials. How much better the world would have been had we had a commons of the failed drug trials. sigh

Software is written from scratch, while products like the GameBoy are made from dozens of closed source components. Sharp owns the IP on the CPU. What could they open source in a GameBoy?

Why would they? There's plenty of downside and not much upside?

In the case of something like the Gameboy, Nintendo still profits from selling old software for that platform, emulated on their newer consoles.

The more interesting question is in the cases of businesses that fail. Though I can't imagine this happening, it would be wonderful if the law specified that if a bankrupt company is liquidating its assets and cannot find a buyer for its source code, that it be open sourced.

I hope I don't sound like a goofy idealist but... All of these issues (rightly) point out that the major barrier to open-sourcing is cost. Doesn't this seem like a very effective use of Kickstarter? However, I can see that just negotiating with a vendor of legacy software could be a lot of work and require a good bit of technical know-how for a volunteer on Kickstarter.

I agree with this. I think that even products that aren't core to a business could be open-sourced. Twitter is a great example of open-sourcing code that is not critical to their core business. I think it's a great way to give back as well as helping your company get notoriety.

There is only one success story I can think of where open source revived a faltering product: Eclipse.

I suppose the conclusions from that can be quite open:

a) It doesn't happen often enough to know how effective it would be

b) The product has to be useful enough to attract a community to maintain it, which might be a rare condition

There are quite a lot of reasons, but these come to mind first:

There's a shockingly large number of projects out there that use "borrowed" code. This is in the sense that they may use code from open source projects (without distributing the code back as stipulated in the license), code that they've inherited from other companies that may or may not have been part of a legal acquisition, or code obtained through the much-heard-of "Corporate Espionage", which sounds sexier than it really is.

Usually, it's just an intern or someone else who finds out the default password to source control and/or CVS server and sends out a few feelers to someone interested. Next thing you know, he plugs in a thumb drive and now he's a free agent with a plane ticket and fancier watch.

Early games especially were like this since being first often meant being successful at courting investors (this was before the video game crash of '83).

Source: I once did a freelance job ages ago where the dev team manager, who used to work at a gaming company, freely admitted that it happened quite a lot in the industry. Maybe it's still happening.

Then there are cases where the source is outright appalling in the number of hacks, re-hacks, back-hacks and any amount of duct tape and hope holding libraries together. No one of sound mind would ever let anyone else see that mess, let alone give the impression that some of that code survives in some form in modern software.

Can you imagine how many vulnerabilities exist? For all we know, it may be possible to do a lot more than install Linux on a Gamecube. http://www.gc-linux.org/wiki/Main_Page

Of course there are also the times where "works of art", as it were, are lost to history. As jbri said, there are times where large projects often are destroyed for silly reasons like crashes or even recycling.

I bought several used SCSI hard drives back in the day on eBay that may have belonged to one such company (SCSI was expensive, yo!). I have no idea what the project was or what I may have in my hands, but it looked like a whole lot of assembly, a bunch of esoteric C (which might as well be assembly) and some proprietary language code that resembles a cross between Pascal and Python to my untrained eye, and I don't think anyone outside knew how or where to compile it.

Now did they copy this stuff before selling it off? Who knows. This was after the dot com crash, so I have no idea if even the original owners were involved in the sale since many of those companies' assets were liquidated in a very short period of time. And we know how delicate they can be when money is at stake.

Then there are issues with licensing (kinda related to the first point) and already mentioned by people here. Open source, as samarudge said, is complicated. If your legalese isn't kosher, you can expect to lose a buttload of cash in a lawsuit(s) even if you win. Especially with the threat of patent trolls around, you can bet fewer companies are willing to risk OS-ing their code.

Hardware too! We should have the original hardware for our childhood game consoles running in FPGAs! One FPGA with NES, SNES, Master System, Sega Genesis, Atari...

Because someone else could update that old product and start competing with the newer version.

Google Reader may come to mind..

Why don't you start a web-service to help companies opensource products?

Because it doesn't solve any of the problems mentioned in this thread? The essential parts are a stable community and 'buildability', neither of which is solved by a webservice. Unless you have a great idea for this, then let's talk ;)

I think I have some ideas ;) What's your mail?


They're morons not to.

Word of mouth in appreciation of a good deed or good product is the most compelling sales pitch you'll ever hear.
