Apple Acquires FoundationDB (techcrunch.com)
483 points by ea016 on March 24, 2015 | 376 comments



Anyone running FDB in production, good luck, downloads are removed from https://foundationdb.com/

I don't understand why anyone would run a closed source database, especially with the open source options available.


There is a species of parasite that infects ants, making them climb blades of grass so they are eaten by wandering bovines.

The cow was the target all along, not the ant.


The even better parasite story is the one that enters into a rat and crosses its wires so that it is sexually attracted to cat pee. It goes into heat when it sees the cat and naturally gets eaten - the goal was the cat brain the whole time. As heard on NPR http://www.wbur.org/npr/9560048.


Things like these make me question evolutionary theory a bit. It almost seems too planned to have happened through random mutations.


It's being run backwards and looks great that way. Imagine playing a couple of pool games and then singling out that one really cool shot you made, and start by saying, "that pool shot is how I play pool."

The parasites did their thing just fine for each individual organism. But that one random one that happened to cross wires in just the right way ended up doing orders of magnitude better. A rare occurrence, but very advantageous once it happens.


There's actually a relatively simple explanation when you look at the neurobiology. First of all, toxo is a parasite that can infect and damage the brain, so it makes sense that it could affect behavior. Secondly, the response of fear to cat urine or attraction to female mice is mediated by the same pathways in the brain (the limbic system, which governs emotion, and is closely tied to olfactory input, especially in mice). Making the mice feel attracted to cat urine is actually not that far-fetched when you stop to think about it and notice that it's just a matter of "flipping a bit" somewhere in the wiring. And since this adaptation leads to a large fitness increase, it makes sense that it would be strongly selected for and that the modern version of the parasite would have very specialized mechanisms to take advantage of this quirk in the rat brain.


Mutation is only one of several mechanisms of evolution, though.


What else are you thinking of that would contribute to such changes? These almost look planned.


I get it. The sequence of events seems very analogous to a Rube Goldberg machine, and one would conclude if the sequence of events didn't go as planned the parasite would have been doomed.

My theory is the parasite community for that species can survive other ways just fine, but has stumbled across a scenario that serendipitously gives it an extra boost at the chance of reproducing.

I would imagine that most species that benefit from these complex interactions tend to have left-over backups for survival and reproduction. When species over-adapt to specific scenarios, it seems likely they'd become too susceptible to extinction or their populations would be periodically thinned out.

That's one guess.


That sounds fairly sensible.


Apologies for the late followup on this. I was specifically thinking about natural selection.

Berkeley's Understanding Evolution website has a nice overview of all the mechanisms of evolution. http://evolution.berkeley.edu/evolibrary/article/evo_14


Who's who?


FoundationDB is the parasite, wanting to be eaten by the cow (Apple). It was eaten up by ants (companies that need NoSQL, ACID databases) and they brought its attention to the cow.


The parasite is the DB company, the ants are the users and the bovine is Apple


All of them are a steaming pile of cowshit?


Hey, isn't that a quote from a book? Could have sworn I heard it ages ago in a book I enjoyed very much!

I think it was Peeps.


The life cycle of the parasite is also illustrated in a great Oatmeal comic:

http://theoatmeal.com/comics/captain_higgins



Peeps, IIRC, was trematodes[1]. After the opening action sequence, the narrator rambles about how trematodes infect snails to get to the birds that eat the snails. Still involves the intermediate host[2] concept, though.

[1] http://en.wikipedia.org/wiki/Trematoda

[2] http://en.wikipedia.org/wiki/Intermediate_host



You can find some more info here (not about the book): http://www.damninteresting.com/a-fluke-of-nature/


I first heard this incredible story about the lancet fluke from Daniel Dennett.

http://www.ted.com/talks/dan_dennett_on_dangerous_memes?lang...

http://www.salon.com/2006/02/08/dennett/



The parasite is money, isn't it? The ant is FDB and the bovine is of course Apple


This is actually kinda ridiculous. I feel so bad for companies that have invested in this product. Just pulling the downloads like that... wow. I'm hoping that paying customers at least got some heads up, data migration is the hardest thing to get right.

Good luck to all the former FDB'ers.


Contracts for support just don't go poof.


No, but the support itself does. Then it becomes a legal issue, with you taking Apple to court, and you still don't have database updates or support.


Contracts for support are mostly useless in any case when your service is down and you need to get a vendor to respond in a timely fashion. If they are in breach of contract, you have no mechanism to force them into compliance in a short enough timeframe. I've been in situations several times where a vendor with a paid support contract just couldn't fix a problem fast enough and I was forced to dig into their product myself to figure out a fix.

This is why everything you build your own product on and that can't be replaced in a matter of hours by a competing product should come with source. Worst-case scenario, you fix the issue in source yourself.


This isn't entirely true. If you pay Oracle millions of dollars a year, their support is actually fantastic, like really, really good!


If you're into the last month or so of said contract and then your vendor is bought out before getting a chance to renew then one of two things happen:

1. the new owners aren't interested because they want the product to cannibalise for something else they're doing. Original product will never see the light of day again, no new bug fixes and possibly no more security patches.

2. the support contract cost is hiked up from maybe a few hundred dollars a year to several thousands

or the possibility of:

3. one or other of above but with the product's support staff reduced, dispersed or fired.

#1+3 above happened to us after Oracle bought out a bunch of stuff we relied on heavily for our hosting platform. Was a very painful time.


I am aware of that. My comment isn't about physical contracts, it's about expectations of a database company.


Also relevant: https://www.youtube.com/watch?v=Xe1TZaElTAs (About support contracts vs. community support)


The acquire-hire-kill cycle for startups that target developers has become really frustrating. I'd love to try a lot of these new products, but it's hard to know if it's worth my time.


Here's a simple rule you can apply - if it's closed source, then it's not worth your time.

Like, don't build on top of freaking closed source platforms. After 17 years of OSI, 24 years of GPLv2 and myriad open-source alternatives available, you'd think people would learn.


"if it's closed source, then it's not worth your time."

Of course. Because Gimp does all the things Photoshop does

Blender does all that Maya does I'm sure as well.

SCADA systems for industrial control? Just download the first one that shows up on Google.

EDA systems? Sure, there's a GPL alternative. People use Mentor (as an example) because they don't know better.


> Of course. Because Gimp does all the things Photoshop does

It actually does, for the majority of the population. Which doesn't matter. Any Photoshop alternative must be a clone of Photoshop.


Not for pro users, which I thought is what the topic of discussion was. Lack of color matching is the first feature that comes to mind, and that's an absolute deal breaker for any print design people.


Startups don't make another photoshop, they make another gimp and hope to grow it into another photoshop over time, or hope to get bought out by a company that makes another photoshop and wants their talent and technology.


> Gimp does all the things Photoshop does

For most people, yes. Unfortunately Photoshop is pirated, so it isn't being used at face value.

> Blender does all that Maya does I'm sure as well

If you want to build your own stuff, which isn't uncommon at all and if you have a movie budget of 100 million dollars, you might be better off forking Blender. Also, Autodesk's Maya is not a startup and it isn't being used necessarily as a platform, so that's building a straw-man on your part ;-)

> SCADA systems for industrial control?

You mean Eclipse SCADA?

Funny thing is that I'm working on a software system using it right now. My mission is to steer and monitor power plants and nothing on the market was suitable for my needs. Building on top of an open source stack helped tremendously.


Except of course if there is no oss alternative and you can't conceivably build it on your own.

But yeah - Pretty much agree. The fact that most oss products are best-of-class makes the choice even easier.


If there isn't:

a) look at building your own. What's inconceivable to one person may not be to another.

b) Build support for at least two closed-source alternatives, and make sure you always have at least two current alternatives.
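
Option (b) can be sketched roughly as follows. This is only an illustration of keeping two interchangeable backends behind one interface; `KeyValueStore`, `VendorAStore`, `VendorBStore` and `check_backend` are hypothetical names, and the in-memory bodies stand in for whatever client libraries real vendors would ship.

```python
from abc import ABC, abstractmethod
from typing import Optional


class KeyValueStore(ABC):
    """The only storage interface application code is allowed to see."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        ...

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        ...


class VendorAStore(KeyValueStore):
    """Hypothetical adapter; a real one would wrap vendor A's client."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value


class VendorBStore(KeyValueStore):
    """Hypothetical second adapter, kept working so a switch stays possible."""

    def __init__(self) -> None:
        self._rows = []

    def get(self, key: str) -> Optional[bytes]:
        # The latest write wins, so scan from the newest row backwards.
        for stored_key, value in reversed(self._rows):
            if stored_key == key:
                return value
        return None

    def put(self, key: str, value: bytes) -> None:
        self._rows.append((key, value))


def check_backend(store: KeyValueStore) -> bool:
    """Contract check that every supported backend must pass."""
    store.put("user:1", b"alice")
    store.put("user:1", b"bob")
    return store.get("user:1") == b"bob" and store.get("missing") is None
```

Running the same `check_backend` against both adapters in CI is what keeps the second alternative "current" rather than theoretical.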


Tragedy of the commons in action. Every VC-backed startup is incentivized to sell, but in the process continues to kill goodwill towards dev startups. I'd be super hesitant of relying on any other company for core tech with high switching cost for this reason.


This doesn't have to be the case.

What if startups with high switching costs (DBMS, OS, pretty much anything else infrastructural) added a guarantee to their sales contracts?

For example, it could be that, if they get acquired, they'll open-source the technology, sell it to another for-profit entity that will maintain it, or provide a migration tool.

Even for something like FoundationDB, it'd hardly be any skin off Apple's back to have a few employees spend a few months ensuring that previous customers have some sort of support.


>What if startups with high switching costs (DBMS, OS, pretty much anything else infrastructural) added a guarantee to their sales contracts? For example, it could be that, if they get acquired, they'll open-source the technology, sell it to another for-profit entity that will maintain it, or provide a migration tool.

That would be like them putting a huge paper hat on their head, saying "I'm not a good target for acquisition".

Companies buy them so they (also) get their product/IP etc. If those startups have promised to give it away in such a case, then they are not that good of a buy.


Those are just contingencies if the acquirer doesn't want to sell the product anymore. Basically it's a guarantee that, even if they get acquired, their customers won't be punished.

That scenario won't hurt companies that are being acquired with the intent of keeping the product running, for obvious reasons. It also shouldn't hurt acqui-hire scenarios.

The only time it might hurt is when the acquirer wants to use the technology internally, but not offer it to anyone else. That's fairly rare, and the downside (potentially scaring away that tiny fraction of acquirers) is much smaller than the upside (making potential customers feel safe).


It's hard to really guarantee effectively without putting up serious money.

The typical large closed-source codebase is full of undocumented things and hidden dependencies on random chunks of proprietary environment. Making it usable as open source is a ton of work.

And you can't guarantee that work will happen without putting up some kind of bond or buying insurance, given that the vendor could simply go bankrupt and not honor the contract.


This is simply the risk you take being an early adopter. If you are uncomfortable with the risk then choose something from IBM/Oracle/et al.


Or choose free software. If it's free software, even if the original developer goes away, you can hire someone to keep working on it, or someone else can decide to make a business supporting it, or the like.


Or you just get left with a broken OSS project whose contributors moved on to shinier things, and nobody really much cares about continuing it, and you don't have the knowledge/time/resources to hire someone to do it.


But if the product is already working, chances are that you get away long enough (to do a graceful migration) by just doing small bug fixes, changes to support OS updates, etc.

This is what we are now doing for a project with Berkeley DB XML, which hadn't seen updates for five years. When there finally was an update, it was buggy and moved to the Affero GPL 3, which conflicts with other open source licenses used in the project. So, we continue to use the five year old iteration with a small set of patches.

(Lesson learned: once a product is owned by Oracle, prepare your evacuation plans.)


I was agreeing with the previous posters until I realized this has kind of happened to me. Tastypie was the go-to REST package for Django 3 years ago. Now it has kind of been abandoned and everyone is ranting about Django-REST-Framework.

Fortunately it is stable (for my use cases), and it doesn't actually seem to be that big a deal that it isn't being worked on. Django-REST-Framework is a lot nicer though.


One of the problems you get is that once something is stable, the maintainers generally don't have a huge incentive to dedicate already-over-committed time to work on features they don't personally need. Just triaging tickets on a popular project can be a major time commitment, particularly since a fair percentage of them will be helping people understand the API or trouble-shoot something in their project which is causing failures in your code.

In the case of tastypie, I think all of the maintainers have switched jobs at least once in the last few years and at the same time the general Django community has been moving in the direction of simplicity rather than complex generic frameworks. Daniel's list of things he's not interested in implementing in restless is a good list of things which have been painful in tastypie: https://github.com/toastdriven/restless#anti-features

I've only been a minor contributor but I've increasingly found myself favoring really simple views – roughly https://docs.djangoproject.com/en/1.7/topics/class-based-vie... – since I work almost entirely on read-only public data.

If you or someone you know would like to work on tastypie, we're looking for new maintainers:

http://toastdriven.com/blog/2014/may/23/state-tastypie/

If you have an urgent patch, let me know & I'll see about merging it.


Like I say, it is stable for what I am using it for just now.

I have started using REST framework for other parts of the project, and it seems a lot more consistent with Django's other components (serializers are similar to forms, APIViews are similar to the generic views). In the end that just makes things easier. It's less context switching essentially, which is really useful when I don't touch that part of the project for a few months, then need to update something.

Thanks for the offer though (and thanks for the framework, it has been useful). And unfortunately I don't have the time to help with maintenance.


I do choose established providers. The problem is for the community as a whole -- each company that this happens to is another hit to the credibility of early startups.


Even if you get a few months of reliable support, you still have to switch which will cost you dearly.

Often what is done is source escrow. The source is given to a third party, and if anything goes wrong the third party releases the source to the one who purchased the product.


Even this sounds like a bad outcome for users of the tool. One minute you're building a product, the next you're stuck maintaining a closed source database.


I've been wondering if there's a sort of Nash equilibrium to these things. How many startups will a restaurant let auction its open tables before losing faith in all of them?


Can you (or anyone else) explain why these occur?

Is it simply a case of the acquirer wanting the technology and maybe the personnel while the acquired company is losing money hand over fist so they just shut the company down?


Sometimes it's because the product is on the way to failure in the market. It might be a great product, but that doesn't mean it'll ever be profitable.

The acquirer may love the technology and use it internally, but it's expensive to keep it in a public marketplace. It requires support staff, marketing, sales, etc.

There are also acquisitions that are purely about customers, so the startup's product is shut down or rolled into the acquirer's existing products.

So there are a lot of good reasons you'd want to acquire a company other than its product, but we can't always tell which reason it is right away.


It's a question of whether the acquirer and/or the acquiree have a good faith social contract with the early adopters who take the risk and believe in them.


There is no transfer pricing mechanism for programmers like there is for footballers. Worse, programmers are often loyal to a particular technology or concept. So in order to extract the valuable staff from a company it's necessary to destroy that thing so the staff can be reallocated profitably.

"Technology" is rarely re-used per se.


"So in order to extract the valuable staff from a company it's necessary to destroy that thing so the staff can be reallocated profitably."

I wonder how often this actually works. In most cases where I've seen this (or heard of it happening first hand from people I know) the vast majority of the people you'd want to keep were out the door of the new place pretty close to as soon as possible (meaning, as soon as the contracts allow, or the golden handcuffs are mostly off, or whatever is relevant to the specific situation).

On the inside there tends to be a pretty predictable path: New management tells everyone nothing will change materially, everything inevitably changes very quickly, people get disgruntled and take off for other opportunities (at a quickly accelerating pace as the old guard sees all their former colleagues from the old place leaving the new).


It's Apple. They don't like having lots of public projects, especially ones tangential to their mission. They probably want to weld it firmly to their in-house usage and have it as a competitive advantage which nobody else can buy.


This is really about closed source, proprietary infrastructure technology (not all kinds of software in general ), offered by startups, not established companies.

That is to say, people use Oracle, SQL Server, Teradata, etc and they are not worried they will go out of business or be sold or otherwise that they will be left in the dark by a sudden shift in business practices and product availability.

The problem is almost entirely with startups, which are an easy target for bigger companies interested in their technology and team's skill set. This is even more so an issue because the majority of startups are VC funded and are under pressure to sell or comply with VCs' interests.

So in practice you have three ‘safe’ ways to build your infrastructure, which are not mutually exclusive: choose OSS software, buy from companies such as Oracle, IBM, HANA, etc, or build it yourself. Depending on the expertise of your team and funds available for purchases, as well as the qualities of the available solutions, OSS is probably the safest way, followed by purchasing from large corps. Rolling your own infrastructure is expensive, time consuming, requires committing your developers to building infrastructure instead of, well, building a product, and may not work in the end at all.

Even the companies that can afford to do everything in house (Facebook, Yahoo, Twitter, etc) choose OSS for the majority of their needs and build on top of that. Google is Google. However, if that infrastructure is your selling point and what differentiates you from the rest, and/or you have very specific needs and it makes more sense to do it this way, doing it yourself can be a great alternative.

We rarely use OSS here, and we don’t use any proprietary infrastructure technology. We have built everything ourselves, and it has worked out great so far, but we have put a lot more effort and resources into that, whereas we could have instead invested on the Product. If we had to make a choice again, in retrospect, we’d most likely have gone the OSS way. It’s all about tradeoffs.


Established companies are no guarantee. If you rely on Visual Basic or FoxPro, then good luck to you. Oregon hardly had a good experience with Oracle. Much of IBM is doomed.

FOSS or proprietary to yourself are the only valid options.


> We have built everything ourselves, and it has worked out great so far, but we have put a lot more effort and resources into that, whereas we could have instead invested on the Product.

So, you don't use a libc, or a web framework, or any library that implements common logic that other programmers have spent ages refining?

If yes, erk, I don't want to be anywhere near that codebase. If not, I'm reminded of https://www.youtube.com/watch?v=YKjPI6no5ng


Obviously, I thought I didn't need to mention that; by "everything ourselves" I meant our core infrastructure. We didn't write an OS, a compiler toolchain, an editor, a standard library etc. Though we did write our own javascript syntax/semantics based language/compiler/runtime for server side needs. Like I said, if we were to start over, we'd use OSS, but, all things considered, that decision didn't hurt us and gave us leverage, know-how, flexibility and that helped us be where we are now.


There are two reasons why many companies run a closed source database. For some types of databases, there is no good equivalent in open source. For some types of workloads, the closed source implementations are orders of magnitude more efficient, faster, or scalable than anything in open source, so open source cannot reasonably support the workload.

It happens more often than you'd think. There are still many things in the database world that open source does relatively poorly compared to alternatives.


What is an example of something done much better by a closed source database (compared to open source)?


Many banks and hedge funds use kdb+/q for time series databases. This (very expensive) software is literally unheard of outside of these niche domains. I've been using it for close to 5 years for high frequency data, and honestly nothing out there comes close to the awesomeness of kdb+


I would be interested to hear what you consider high volume (writes/second). I am supporting a manufacturing system and it sits at the moment at 7 000 writes/second (it is a normal time series, ie id,time,value,quality).
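
For a sense of scale, the row shape described above (id, time, value, quality) can be sketched against an in-memory SQLite database. This is a rough sketch under loud assumptions: SQLite is a stand-in rather than the system the commenter runs, the `samples` table and the `make_db`, `write_batch` and `measure` names are invented for illustration, and the measured rate says nothing about a durable, networked production setup.

```python
import sqlite3
import time


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a (id, time, value, quality) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples (id INTEGER, time REAL, value REAL, quality INTEGER)"
    )
    return conn


def write_batch(conn: sqlite3.Connection, rows: list) -> int:
    """Insert all rows in a single transaction and return how many were sent."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def measure(n_rows: int = 7000):
    """Write n_rows synthetic samples; return (stored row count, rows/second)."""
    conn = make_db()
    # Synthetic data: 100 sensor ids, a fake timestamp, a value, a quality flag.
    rows = [(i % 100, float(i), i * 0.5, 1) for i in range(n_rows)]
    start = time.perf_counter()
    written = write_batch(conn, rows)
    elapsed = time.perf_counter() - start
    stored = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
    rate = written / elapsed if elapsed > 0 else float("inf")
    return stored, rate
```

Batching the inserts into one transaction is the design choice that matters here; committing each row individually would measure transaction overhead rather than write throughput.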


7000 writes per second is pretty low for many high-volume time series needs.

Though it's not usually write throughput that most of these technologies are worried about. It's usually compression using dsp methods, aggregate stream folding computations, etc... that matters.


Still waiting to hear what is considered high. Compression etc are not part of the question.


Just queried my db, and 300k a second is a typical peak in CME futures (across all products). I'd imagine that options could be a lot more than that.


If I may ask, what are you using as db?


kdb


I worked in adserving. On peak moments we were doing around 350,000 updates a second.


Chill dude, I was being conversive. Your attitude makes me not even want to provide you with my own data now.


That took some restraint. Thanks for being nice. I got your point originally but some people think everything needs to be an argument or worse a fight.


Engineers wanting data isn't exactly surprising. I would not consider it attitude.


The systems I deal with start at millions of writes per second and go up from there. I have heard of systems that do over a billion writes per second, though I have not breached that threshold personally (yet).

From an IoT or sensor network standpoint, 7000 writes/second is an idle server.


what db are you using?


I don't have numbers handy. The real power of kdb+/q comes from the column oriented architecture and the extremely powerful vector functional language q. The language q is inspired by APL. I highly recommend you to check out this article to get a sense of the APL family of languages https://scottlocklin.wordpress.com/2013/07/28/ruins-of-forgo...

If you want a database for blazing fast data-storage and retrieval, there are many options available. You start seeing the real benefits of kdb+/q when you use q to simplify very complex operations that aren't easily done in SQL. Also, the high level operators that q provides make your code extremely terse. I've written complex backtesting systems that perform data mining on massive datasets - all in one page of very tight q code!
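
The column-oriented idea mentioned above can be illustrated in plain Python. This is only a sketch of the layout and of whole-column operations, not q and not how kdb+ is implemented; the trade data, the `table` dictionary and the `vwap` function are made up for the example.

```python
from operator import mul

# Row-oriented view: one record per trade (made-up data).
rows = [
    {"sym": "ES", "price": 100.0, "size": 2},
    {"sym": "ES", "price": 101.0, "size": 1},
    {"sym": "NQ", "price": 50.0, "size": 4},
    {"sym": "ES", "price": 102.0, "size": 1},
]

# Column-oriented view: one list per field, all lists the same length.
table = {name: [row[name] for row in rows] for name in ("sym", "price", "size")}


def vwap(table: dict, sym: str) -> float:
    """Volume-weighted average price for one symbol, computed column by column."""
    mask = [s == sym for s in table["sym"]]
    prices = [p for p, keep in zip(table["price"], mask) if keep]
    sizes = [q for q, keep in zip(table["size"], mask) if keep]
    # Multiply the two columns element-wise, then divide by total size.
    return sum(map(mul, prices, sizes)) / sum(sizes)


print(vwap(table, "ES"))  # 100.75
```

Each step works on a whole column at once, which is the style that vector languages make terse; a query only touches the columns it needs.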


Are you using SQL or ?


ISAM


How much would the price be?


There is a free 32-bit version of kdb available (http://kx.com/software-download.php). For the commercial version, pricing information is not publicly available.


The zip files on that page contain the source for kdb, I thought I'd take a look at how it works.... nope! That is impenetrable


It's impenetrable to you, in the same way Mandarin is impenetrable at first glance to someone used to reading Latin languages.


The thing with KDB is almost all uses of it are in memory deployments. It isn't hard to make something that has little persistence or relegates persistence to a 2nd class citizen to run quickly.


Well, but it does not exist as open source.


no, but std::map does


Certainly, but it does not provide the same benefits as the complete package.


Not even J?


Datomic is closed source and has features that no open source database currently offers. In particular, it's a time series database of immutable/append-only facts, so its horizontal read scalability is excellent, but it's still ACID and supports joins.
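
The "immutable/append-only facts" idea can be sketched in a few lines. This is a toy model for illustration only, not Datomic's API or storage design; `Fact`, `FactLog`, `assert_fact` and `value_as_of` are hypothetical names.

```python
from typing import Any, NamedTuple, Optional


class Fact(NamedTuple):
    tx: int          # transaction number that asserted the fact
    entity: str
    attribute: str
    value: Any


class FactLog:
    """Append-only log: facts are added, never updated or deleted."""

    def __init__(self) -> None:
        self._facts = []
        self._next_tx = 1

    def assert_fact(self, entity: str, attribute: str, value: Any) -> int:
        """Append a new fact and return its transaction number."""
        tx = self._next_tx
        self._facts.append(Fact(tx, entity, attribute, value))
        self._next_tx += 1
        return tx

    def value_as_of(self, entity: str, attribute: str,
                    tx: Optional[int] = None) -> Any:
        """Latest value at or before transaction tx (or overall if tx is None)."""
        result = None
        for fact in self._facts:  # facts are stored in tx order
            if tx is not None and fact.tx > tx:
                break
            if fact.entity == entity and fact.attribute == attribute:
                result = fact.value
        return result


log = FactLog()
first = log.assert_fact("acct-1", "balance", 100)
log.assert_fact("acct-1", "balance", 250)
print(log.value_as_of("acct-1", "balance"))            # 250
print(log.value_as_of("acct-1", "balance", tx=first))  # 100
```

Because old facts are never rewritten, any number of readers can scan or cache the log without coordinating with the writer, which is where the read scalability argument comes from.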


If I remember correctly, Datomic is more of a data modeling layer / transaction manager, and less of a database.


It's definitely a database.


It's definitely a very slow database. You have to be extremely fortunate to have a problem that fits into its niche neatly. I'd sooner figure out a historical insert-only schema for PostgreSQL in future. They're not great about fixing problems with Datomic either, it feels like an afterthought. Means of overflowing labor not currently allocated to a Cognitect contract gig, not a priority in its own right.

-- sad production ex-user of Datomic


I think they've improved a lot WRT fixing problems--we had a chat with them after some issues with Datomic in production, and since then (6 months ago) we've had every problem we've discovered get fixed very promptly, and Datomic's continued to scale for us.


coolsunglasses - Why did Datomic seem slow to you? Can you describe the problems you had in detail? I'm not from Cognitect, just someone who is developing some products that currently use Datomic among a few other databases.

Would love to hear some honest feedback. Maybe your struggles were because of the tech, earlier versions, bad hardware config, or mis-applied use case?


MS SQL Server: who else has such easy and varied ways to cluster/mirror/replicate?


MongoDB!


Stardog[0], a semantic/ontological[1] database, is probably best in class, and is closed source. Anyone interested in writing an open source triplestore, email me ;)

[0] http://stardog.com/ [1] They've started calling it a graph database, though I think triplestore is the most correct name


When you're best in class you can afford to be proprietary.

Clark & Parsia had a history of open source (eg. Pellet) which was the best in-memory reasoner for a long time IMO... but not a lot of luck getting sustainable business subscription revenue. This led to the switch to dual-license AGPL in 2008 and now closed-source Stardog...


The nice thing if you were using Stardog and this happened: you could easily move to any of its competitors which implement the same standards, including open source ones too. I.e. you might miss a feature, but at least your queries will still run and you won't need to redo your whole app again.

SPARQL should really be everyone's first technology to investigate before heading off to anything else. I.e. when you are still pivoting every week you should have the most generic database tech possible. Only when you scale should you specialize.
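
For readers who have not used a triplestore: the data is a set of (subject, predicate, object) triples, and SPARQL queries are built from triple patterns with variables. A minimal sketch of that matching step in plain Python, with `None` playing the role of a variable; the `match` function and the sample triples are invented for illustration and are not part of any real triplestore's API.

```python
def match(triples, pattern):
    """Return the triples that fit a (subject, predicate, object) pattern.

    A None in the pattern acts as a wildcard, like a variable in a query.
    """
    return [
        triple for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# Made-up data for the example.
triples = [
    ("apple", "acquired", "foundationdb"),
    ("foundationdb", "type", "database"),
    ("stardog", "type", "database"),
]

# "Which things are databases?"
print(match(triples, (None, "type", "database")))
# [('foundationdb', 'type', 'database'), ('stardog', 'type', 'database')]
```

Because the data model and query patterns are this generic, the same queries can in principle be pointed at a different store that implements the standard, which is the portability argument made above.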


One could be "interested in writing an open source triplestore", but why would you go down that path rather than, say, optimizing the heck out of Neo4J?


If you want to help "optimizing the heck out of Neo4j", we are hiring http://neo4j.com/jobs/


Well, first, I don't have a high opinion of Neo4j. Secondly, SPARQL queries are pretty distinct, and while, yes, they can be translated into generic graph queries, I'm pretty sure there are some fun optimizations to be had if you focus on their patterns 100%. Thirdly, because it'd be a hell of a lot of fun! Why else would anyone write a database...


> Thirdly, because it'd be a hell of a lot of fun! Why else would anyone write a database...

Presumably, because you have a business, which has a product, which has a nascent feature, which requires some particular set of time-and-space-and-distribution guarantees that no current database on the market makes. This is why, for example, Cassandra was developed.


Do you mean forking the codebase or layering something like N3 over it? (btw, last I checked the Neo4j community version could only scale up and the distributed version was commercial.)


While there are some OSS column store DBs, Vertica is a very well put together solution. It's very fast, it scales reasonably well, it has support for a wide range of analytic functions, and good support.


FoundationDB has ACID transactions over the entire db, over the cluster and over multiple keys. And fast. I've looked over so many open source alternatives, and they claim ACID but it's a deception based on some narrow interpretation of ACID. It's very annoying to spend time researching to discover the truth between the lines.

I would love to find a fast, scalable open source db that implements FoundationDB's features.
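
To make "transactions over multiple keys" concrete, here is a single-process toy that commits several keys atomically and rejects a transaction whose reads went stale. It uses optimistic concurrency control as one common technique for illustration; it is not FoundationDB's API or implementation, it ignores distribution and durability entirely, and `Store`, `Transaction` and `ConflictError` are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when a key read by the transaction was changed by another commit."""


class Store:
    def __init__(self) -> None:
        self._data = {}       # key -> current value
        self._versions = {}   # key -> commit version of the last write
        self._version = 0     # global commit counter

    def begin(self) -> "Transaction":
        return Transaction(self)


class Transaction:
    def __init__(self, store: Store) -> None:
        self._store = store
        self._read_version = store._version  # commits after this may conflict
        self._reads = set()
        self._writes = {}                    # buffered until commit

    def get(self, key, default=None):
        if key in self._writes:              # read your own writes
            return self._writes[key]
        self._reads.add(key)
        return self._store._data.get(key, default)

    def set(self, key, value) -> None:
        self._writes[key] = value

    def commit(self) -> None:
        store = self._store
        # Validate: nothing we read may have been written since we began.
        for key in self._reads:
            if store._versions.get(key, 0) > self._read_version:
                raise ConflictError(key)
        # Apply every buffered write under one new commit version.
        store._version += 1
        for key, value in self._writes.items():
            store._data[key] = value
            store._versions[key] = store._version
```

A transfer between two accounts would read both keys, `set` both new balances, and `commit`; if another transaction committed a write to either key in between, the commit raises `ConflictError` and neither balance changes.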


> What is an example of something done much better by a closed source database (compared to open source)?

How about FoundationDB? ;-)


That's not the question. A correct answer would be saying what FoundationDB does.


Okay Mr. Pedantic, high-performance distributed ACID transactions with an optional SQL layer on top.


While I very much don't like Oracle as a company, I'm not aware of there being another DB with something like flashback.

EDIT: Also, I'm not sure there's a production-ready free software column-store DB.


> I'm not aware of there being another DB with something like flashback.

PostgreSQL had it long before Oracle, but it was dropped as being too much of a hassle to maintain somewhere in the 7->8 transition IIRC.


I haven't really found an open source Vertica-style columnar store either, and I find this mystifying.


Clustered Indexes. Data in-index. Restoring your database should not be measured in hours for just millions of rows. Statement generation for backups? If you had clustered indexes you'd never finish restoring.


Oracle. Who else can give me half the database for ten times the cost?


The issue isn't with closed source databases. The issue is with trusting smaller startups who don't have a customer base large enough to avoid dropping.

Oracle and Teradata for example are proven databases with official support available worldwide and a talent pool you can draw from almost immediately. You don't get that with most open source databases (at least those that don't have a parent e.g. Datastax, Mongo, MySQL).


> Oracle and Teradata for example are proven databases with official support available worldwide and a talent pool you can draw from almost immediately.

I've watched Oracle try to strangle more than one company once they were dependent on their database. With one they succeeded; one managed to migrate to PostgreSQL just in time. If you build your business to be dependent on Oracle they have you by the balls; don't think they're not going to squeeze. And IME the worldwide talent pool is much more available and... well, talented, for PostgreSQL or MySQL or any of the major open-source options.


If no one allowed small-startup technology into production, then the industry stagnation would be tremendous, so I disagree that is the issue. However, once you do rely on a small-startup database, it better be open source, so I disagree: the issue is the closed-source part, not the small-startup part.


The fact is that the world depends on closed source databases. So yes I continue to disagree that you should never use one just because it is closed source.

And I never said that people shouldn't use small startup technologies. Only that when you do you take the risk of the company not being around in a few years. And the people who will take that risk are really other startups or early adopters.


>> I disagree: the issue is the closed-source part, not the small-startup part.

Correct. But you are betting the success of your nascent startup on another nascent startup. This is straight up wrong.

For a large company it's different. They have all the resources to go into months-long migration projects. As a startup, you can't afford time for migrations when you are busy doing the real work.


This is why the world needs early adopters. There needs to be people interested enough in new technologies for their own sake to invest time in them. That's a very different motivation than P&G or Unilever.


Exactly. There is room in this world for closed and open source products.

Enterprise companies will trade technology for stability and supportability. Most of us will flip the other way.


I oversimplify it by saying "Companies will pay cash and accept closed source in return for good documentation and someone to answer the phone." Most non-IT buyers don't bring up open vs closed source in purchasing discussions.


Please don't downvote because you disagree. Write a reply. Downvote what is inappropriate and doesn't add to the conversation.

I strongly disagree with your stance on Oracle in particular. I had two arrays at a medium-size academic library 8 years ago. Anything I had to call about regarding the database meant a line item for business review due to cost if it wasn't covered by Oracle's service agreement.

PostgreSQL is amazing and I would much rather work with that and hire whoever I want for what I want to do, on either per-instance or annual contracts.


It's funny you mention that... but actually hiring a part-time PostgreSQL DBA is all but impossible. I reached out to most of the support companies listed on the North American website... mainly I wanted someone to set up a small (3-node) replica set of the most recent version of Postgres with plv8 and some sane backup scripts, and pretty much nobody replied... EnterpriseDB won't talk to you without laying out at least $10k to start, and I would rather pay that to a person (or small company) I can call to get things running... more if it kept running well.

I didn't have the time to delve through all the options out there for this purpose, and evaluate each of them, when there are out of the box solutions that were closer to my needs, though not strictly SQL based (Mongo, Rethink, ElasticSearch, Cassandra all come to mind). There is ~$6k/month allocated to hosting costs, and ~$40k/month to the handful of people in the IT team... there isn't much wiggle room there for a small company, and everyone wears a couple of hats. The current application is using MS-SQL (hosted in Azure without redundancy) and MongoDB mirrored data for searching against... licensing to get a replicated MS-SQL setup for better availability would be more than our entire next generation hosting budget... If we could have actually talked to someone at EnterpriseDB who wasn't a salesperson and could do more than send a PDF sheet targeted at managers, that might have swayed me.

Sorry, will end my mini rant... in the end, what support I do have from MongoDB (using their backup service), and my experience with actually just using ElasticSearch and Cassandra, has been far better: setting up something resembling a high-availability/distributed configuration has been easier than even getting a proof-of-concept PostgreSQL setup working.

I really hope that PostgreSQL gets it together within the next year or so. It would have been my first choice had I been able to actually get some support within a reasonable budget for my needs, or if I actually had the equivalent of a DBA's salary or more to throw at the problem, which I didn't/don't.


This is part of the horrible brokenness of IT labor.

I don't know the features of PostgreSQL that you want to use, but I'm totally willing to learn if somebody is going to cover my living costs. But I'm not even going to respond to your job ad if you put "PostgreSQL plv8 REQUIRED" in it.

For that matter, if you think it's simple enough for a part-time DBA, then why don't you just assign one of your existing IT people to learning and implementing the RDBMS that you need? Surely not all of them want to do the exact same job forever. PostgreSQL has excellent documentation.


Because our resources (time) are already stretched pretty thin wrt maintenance as well as our next generation version. PostgreSQL has several unsupported options, and one commercial option, for replication. Unfortunately you need a support contract to even talk about getting the commercial/supported version, and there's ongoing development towards bringing it in the box. I already expended enough time trying to get up to speed and have something reasonable working, and it was less time to look elsewhere for the features I needed in another database that had redundancy/scale features in the box.

If I was hiring a full time DBA, I would have put POSTGRESQL DBA as the job title, and made plv8 a feature requirement that I needed/wanted. As it is, there's no budget for that.


This is probably why a lot of companies like to use closed source solutions. I have mainly been using SQL Server and there are a lot of consultants who know how it works. In a few years I think there will be more database products with good support from 3rd parties but currently it is hard to know what to choose.


Totally agree: All too often I see anonymous downvotes on answers that are factually correct and/or helpful.


The problem is, downvotes due to disagreement are codified as acceptable.

Which still sucks, since downvotes can lead to shadowbans.

PS. Just double checked the Guidelines, and this codification is no longer there. However, there's also no guidance to suggest an appropriate reason to downvote.


agree. same experience. so I rarely comment on HN anymore. dominated by very very narrow-minded assholes. my time better spent elsewhere.


Where do you spend it, out of interest? What comparable places are there for chatting on nerdish things like this?


I was hoping lobste.rs would be the place but it is invite only so the audience is very small in comparison.

I believe I have a few invites.


If you see this: How can one get an invite without posting a mail address in public forums?


Reddit, esp. certain subs. It's not perfect, and it's hard to find the same density of tech-smart folks as HN. However, it's more relaxed and friendly, doesn't punish creativity, and doesn't have the same "that opinion or statement is not allowed here" effect as I see on HN. It does have a kind of liberal/PC groupthink on some issues. But they're issues I don't like to talk about anyway. Reddit's discussion forum UI is also much friendlier and more sophisticated than HN's. And they have lots of areas that focus on non-HN topics, while also being better at allowing one to avoid politics and "startup Foo raised/valued-at $X" posts, yet-another-framework posts, etc. Again, you lose some compared to HN, but also gain a lot. Luckily everything on the web is a tab away and we can vote with our eyeballs.


I guess he hasn't learned the house rules.


> I don't understand why anyone would run a closed source database, especially with the open source options available.

I generally share this opinion. However, in this case no open source solution came close to the features offered by FoundationDB. There are a couple of attempts (like CockroachDB) which could achieve something similar in the future.

One more reason for me to hate Apple now.


ActorDB is quite similar


Not really. Architecture-wise it is very different.


https://github.com/FoundationDB/ looks to be emptied out

This organization has no public repositories.


Because HA, simple to use solutions are very expensive, closed source, or operationally costly.

If you think you've got the secret sauce, and you've actually had to put it into action and still hit five nines, awesome. But IME doing that with something like PostgreSQL is non-trivial (read: costly). That's why FoundationDB looked so appealing (to me).


Biased opinion here (Aerospike CTO and Founder), but you might want to check out open-source Aerospike DB. Like Foundation, it's clustered by default, very very good at SSD / Flash, runs great in cloud deployments, etc.

It's been used in production by big ad houses like AppNexus as well as retailers like SnapDeal. Lots of miles on the code.

Was closed source for years, but went open source about 9 months ago.


No clean SQL layer like FoundationDB though.


Does Aerospike offer transactions?


It seems they offer per-key transactions.

VoltDB offers full transactions / open-source. Lots of differences between us and Aerospike / FDB.


No high availability on open-source. Meant to be used with partitioning in mind so if you need cross-partition transactions most of the time, it's slow.


VoltDB has fantastic HA, but yes, not in the OSS version.

You might be surprised how many apps deploy on VoltDB with cross-partition transactions making up a solid chunk of their workload. Yes, they're slower than partitioned operations, but they're still faster than MySQL much of the time.

Most of the apps we see partition very well, especially for writes. The fact that they can run 10k distributed aggregates a second to get a global view is something few other systems can touch.


AGPL would make VoltDB unattractive to most, I'd imagine.


Seems to work for Mongo, but yeah.

You pay me and the licensing question goes away. You still get to poke around the source.


If you don't need SQL, there are a few pretty compelling options out there, each not without their faults... just the same.

MongoDB, RethinkDB, ElasticSearch, Cassandra and I'm sure a number of others.. each of them have HA options (though RethinkDB is still a few months out for auto-failover iirc), and Cassandra has pretty close to linear scalability in production for some very large data loads. I really like each for different reasons, and would lean towards one or another depending on load.

Not to mention, my speculation is PostgreSQL will likely have an in-the-box replication with failover and/or multi-master solution in place within a few versions.


PostgreSQL has been speculated to get "it'll be all better RSN" failover for as long as I've been using it (around 8.4 IIRC). ;-)

But yeah, what made FoundationDB's SQL-Layer exciting (for me) was:

- For small clusters it was free.

- Automatic HA

- Operationally Inexpensive (talking admin time/effort/training; far cheaper than PostgreSQL)

- Horizontal Scalability

The things that didn't matter for the 99%:

- It wasn't very fast.

I don't have "big data" problems (I could invent some). Most small shops (I suspect) don't.

The problem I do have is 3AM pagers, availability, wearing too many hats, putting dozens of hours into learning and experimentation to get PostgreSQL to use the hardware it's put on effectively, coming up with complex CARP+REPLICATION+FAILOVER plans, ZFS snapshotting because PostgreSQL still can't match the backup/restore process any commercial database had nailed down two decades ago, backing up the snapshots, figuring out how to partition clients into different table-spaces, blah blah blah.

You sacrifice some single-client performance with FoundationDB, but you solve almost every other problem you've got. And you now have the option of deploying a couple extra nodes to exceed your previously fairly intractable TPS milestones.

It's so easy in fact, you can now autoscale your database with your application servers.

And for your 80% of smaller clients it's absolutely free.

Such a bummer. :-(


Your description of what matters to many customers doesn't get enough appreciation. "It's faster" too often trumps "it lets me sleep at night" in the battle for attention.


I think the biggest loss is that all the listed solutions aside from PG lack true multi-key transactions.


VoltDB. Multi-key transactions. Serializable C in ACID. CP in CAP. Open source.


ActorDB has them


Looks very different in terms of the sort of performance you'd likely expect out of it, though. My initial reading is that if you want acceptable performance you're likely forced into thinking about sharding - and even then you're going to be punished by SQLite's poor concurrency support.


SQLite's concurrency support is irrelevant. ActorDB is a sharded by default type of database. The point of ActorDB is to split your dataset into "actors". Actor is an individual unit of storage (a sqlite database). As such your entire database is scalable across any number of nodes.

A natural fit is something like dropbox, evernote, feedly, etc. Every user has his own actor and it is a full SQL database. Which makes the kind of queries dropbox would need very easy to do.

It also has a KV store option. This is basically a large table across all your servers. This table can have foreign key sub tables. From my understanding of FoundationDB this part is closer to what they were offering.
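To make the "one actor per user" idea concrete, here is a rough sketch. It is not ActorDB's API; `ActorStore`, its methods, and the `files` table are names made up for illustration, and it uses Python's built-in `sqlite3` with in-memory databases where a real system would place each actor on some node in the cluster.

```python
import sqlite3


class ActorStore:
    """Toy model: one independent SQLite database per actor (here, per user)."""

    def __init__(self):
        self._actors = {}  # actor name -> sqlite3 connection

    def actor(self, name):
        # Create the actor's private database on first use.
        if name not in self._actors:
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, size INTEGER)")
            self._actors[name] = conn
        return self._actors[name]

    def add_file(self, user, path, size):
        conn = self.actor(user)
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO files VALUES (?, ?)", (path, size))

    def total_size(self, user):
        # Plain SQL, scoped to a single user's database.
        row = self.actor(user).execute(
            "SELECT COALESCE(SUM(size), 0) FROM files"
        ).fetchone()
        return row[0]


store = ActorStore()
store.add_file("alice", "/a.txt", 10)
store.add_file("alice", "/b.txt", 5)
store.add_file("bob", "/c.txt", 7)
print(store.total_size("alice"), store.total_size("bob"))
```

Each user's queries touch only that user's database, which is why the per-user case scales out so naturally. Anything spanning several users would need a cross-actor operation, which this sketch leaves out.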


It's not irrelevant. It means your ability to perform concurrent transactions depends entirely upon your ability to decompose your data into a very large number of different actors, otherwise you're bound to be hampered by SQLite's global lock. If you decompose too far you'll end up doing cross-actor transactions, which per the documentation has a substantial performance impact.

This is not to rubbish it - I've not used it after all - but the claims being made for ActorDB are pretty far away from the claims made for Foundation.


All distributed databases shard data. If you hammer at only a specific shard area, performance will be limited to the speed of that shard.

ActorDB fits a specific data model extremely well. Some less so. But that is the case with all databases.


> All distributed databases shard data. If you hammer at only a specific shard area, performance will be limited to the speed of that shard.

Agreed. And what I'm saying is that it appears that ActorDB's per-shard area concurrency is limited to one writer. And that means that SQLite's concurrency support is (contrary to your earlier post) extremely relevant: not just in terms of pure performance, but also ability to perform concurrent operations. If you need more concurrency, your only choice is to shard extremely heavily (which might mean you require more cross-shard operations, which are apparently slow).

As you say, some data models fit the actor model well, but this is still a far cry from the capabilities that were promised by FoundationDB.


FoundationDB was single process. They had no per-node writer concurrency.

The reason why I said sqlite concurrency support is irrelevant is because ActorDB serializes requests anyway. It must do so for safe replication.


> FoundationDB was single process. They had no per-node writer concurrency.

Interesting - I didn't know that. Even so, it depends on what kind of writer concurrency we're talking about, I guess - I presume that ActorDB is limited not just by having to run requests one at a time per-process (which is a legitimate tactic to avoid latching overheads and so on), but by also not being able to run any new transactions against an actor that's received a write until that write commits?

> The reason why I said sqlite concurrency support is irrelevant is because ActorDB serializes requests anyway. It must do so for safe replication.

Do you mean by this that the entire cluster can only perform one request at a time? Or am I misreading you?


Individual actors are individual units of replication. What one actor is writing is concurrent to what another is doing.

Read/write calls to an actor are sequential. I'm quite sure this is how other KV stores like Riak do it as well. They have X units per server and those process requests sequentially. Their actual concurrency is basically how many units per server are running. They may interleave reads/writes per node or they may not.

ActorDB does not allow reads while a write is in progress. It is quite possible we will optimize this part in the future as it is quite doable.
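A rough way to picture that execution model (only an illustration, not ActorDB's actual implementation; `ActorRunner` and `call` are hypothetical names): every call to a given actor takes that actor's lock, so reads and writes to one actor never overlap, while calls to different actors proceed in parallel.

```python
import threading
from collections import defaultdict


class ActorRunner:
    """Toy model: calls to one actor run one at a time; actors are independent."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the lock table
        self._locks = defaultdict(threading.Lock)  # one lock per actor

    def call(self, actor, fn):
        with self._guard:
            actor_lock = self._locks[actor]
        with actor_lock:  # reads and writes alike wait here
            return fn()


runner = ActorRunner()
counters = {"alice": 0, "bob": 0}


def bump(name):
    counters[name] += 1


def worker(name):
    for _ in range(1000):
        runner.call(name, lambda: bump(name))


# Four threads hammer each actor; per-actor locking keeps the counts exact.
threads = [
    threading.Thread(target=worker, args=(name,))
    for name in ("alice", "bob")
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counters)
```

In this model the achievable concurrency is roughly the number of distinct actors being touched at once, which is the tradeoff being argued about in this subthread.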


In FoundationDB you never had to think about splitting your dataset into something like your "actors". All transactions are independent and parallelizable, unless they touch a common key - in which case one of the transactions is retried (optimistic concurrency).
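For anyone unfamiliar with the pattern, here is a minimal single-process sketch of optimistic concurrency with retry. It is not FoundationDB's API or its implementation; `VersionedStore`, `Transaction`, `Conflict`, and `run` are invented for illustration. The idea: a transaction remembers which keys it read, and at commit time it fails if any of those keys was written after it started, at which point the caller simply runs it again.

```python
class Conflict(Exception):
    """Raised at commit when a key this transaction read has since changed."""


class VersionedStore:
    def __init__(self):
        self.data = {}      # key -> value
        self.versions = {}  # key -> commit number of the last write
        self.commit_no = 0

    def begin(self):
        return Transaction(self, self.commit_no)


class Transaction:
    def __init__(self, store, read_version):
        self.store = store
        self.read_version = read_version
        self.reads = set()
        self.writes = {}  # buffered until commit

    def get(self, key, default=None):
        self.reads.add(key)
        if key in self.writes:
            return self.writes[key]
        return self.store.data.get(key, default)

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        store = self.store
        # Conflict if any key we read was written after we started.
        for key in self.reads:
            if store.versions.get(key, 0) > self.read_version:
                raise Conflict(key)
        store.commit_no += 1
        for key, value in self.writes.items():
            store.data[key] = value
            store.versions[key] = store.commit_no


def run(store, fn, max_retries=10):
    """Run fn inside a transaction, retrying on conflict."""
    for _ in range(max_retries):
        tx = store.begin()
        result = fn(tx)
        try:
            tx.commit()
            return result
        except Conflict:
            continue
    raise RuntimeError("too many conflicts")


store = VersionedStore()

# Two transactions touching the same key: the second commit must fail.
t1, t2 = store.begin(), store.begin()
t1.set("a", t1.get("a", 0) + 1)
t2.set("a", t2.get("a", 0) + 1)
t1.commit()
try:
    t2.commit()
    conflicted = False
except Conflict:
    conflicted = True


# The usual way to use it: wrap the logic so a conflict just means a retry.
def increment(tx):
    tx.set("a", tx.get("a", 0) + 1)


run(store, increment)
print(conflicted, store.data)
```

Transactions on disjoint keys never see a conflict here, which is the "independent and parallelizable" part; only overlapping ones pay the retry cost.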


Or get a 5-10 year support contract for things critical to your company's function?


All sorts of ways around that, so it's unlikely you would get a contract that actually prevents you from ending up out of support before the term is up.

As a thought experiment, if the company just went bankrupt and had no one working for it any more, your contract won't help, right? Second part, what happens if someone just buys the assets (IP) and closes the company, then hires the old team? There are many possible variants here.


Or pays out the contract and the value of the payout doesn't cover the cost to your business?


Anyone have a cache of the downloads for OSX and Linux?


It's fairly common to have contract terms requiring the SaaS provider to provide the source code to their (typically closed-source) product in case they go under. Of course, that only helps if you buy the product or support for it and not just rely on free downloads.


I think it's reasonable to expect any company doing anything vaguely important to do configuration management at least to the level of keeping copies of their production software installation binaries.


I do.


Nooooo!!!! FoundationDB is too special to relegate to the iCloud back-end. There's nothing else quite like it out there that's publicly available, either commercial or open-source. This just set the industry back several years. Given that Apple has virtually zero interest in server-side development tools, I highly doubt us civilians will ever see this amazing technology again. :-(


Worth following the development of cockroachdb https://github.com/cockroachdb/cockroach

The long term goals of that project seem to align.


Also have a look into HyperDex: http://hyperdex.org/


Aren't they doing some kind of goofy open source/proprietary differentiation, just like FoundationDB was doing? It looks like "the full copy of HyperDex Warp" (whatever that means) is what people are expected to pay for.


HyperDex is the result of their ongoing research at Cornell University. They have a commercial spinoff which sells HyperDex Warp, which adds full ACID transactions on top of HyperDex. So if you don't need full ACID transactions, you can use the OSS version; if you want the extra services you have to pay.

It's the same with all software, really: to be able to do a large scale project you have to have funding from somewhere, as you need developers full time working on it to fix/write the stuff no-one wants to fix/write. Some OSS projects get this funding indirectly by sponsored developers who work at company X and write OSS code all day for the project (this is what Linux uses). Other projects are funded by VC money, licenses, support contracts or ads. It's not common for a large scale OSS project to be successful and stay successful without any funding from the outside.

Thus, if a piece of software gets its funding by selling licenses (like with Hyperdex Warp), it's the same as with a company getting funding by sponsoring: if the cashflow stops, the show ends. In the case of OSS you can grab the sourcecode at least, but in the end, to successfully maintain that it takes a lot of effort most of the time, as the projects are often large, complex and the internals unknown to the user.


Cockroach is to Spanner as FoundationDB is to F1


Cockroach and FDB's KV core are very similar. F1 is similar to FDB's SQL layer.


Thanks for the pointer to cockroachdb. The last I'd checked (over a year ago), FoundationDB was the only reasonable option for distributed ACID transactions, so it's good to see something else with the same goals.


Yes, we have very similar goals with CockroachDB as FoundationDB had: a distributed, transactional data store (ACID, CP, etc). We're starting with the key/value layer, although we have ambitions to also build more structured interfaces on top of it. It's not quite ready for use yet but we're approaching our first alpha release.


Great news, guys. Keep it up. I'll be watching the progress.


It is a sad example of what capitalism eventually leads to. Instead of having "modular" companies which can interface freely with each other, we end up having a few opaque MegaCorps with an internal economy.


Well, if we're in an ideal capitalist society, and what FoundationDB created was really of such tremendous value, somebody will come along and do the same thing.


And no one will buy the product, they'll expect awesome for free, and the company will go for a song. Why expect a different outcome the second time?


Because people are not rational animals that follow predictable mathematical models. Too many variables to say it'll be the same again.


Not if they were able to encumber it with patents.


And then even more work will be pointlessly duplicated.


But capitalism is also what channels the human instinct to survive into productivity. Sure some subset of geeks can be motivated purely by creativity under a communist system, but I'd be dubious that it could overall match the velocity of Silicon Valley's innovation engine even granting the sizable implied waste reduction.


Free people not being able to match the pace of people desperately trying to free themselves from wage slavery doesn't strike me as a great justification of wage slavery.

Not that I agree with the premise that capitalism is what drives people to produce things to begin with.


Are these people trying to desperately free themselves from wage slavery the same people who spend most of their time watching tv, playing games or caring for children nobody forced them to have? If so they don't seem so desperate to me.

Generally the people I've met who work hard to develop themselves and develop skills society needs end up doing quite well for themselves, although I admit that as an Australian my experience is probably different from the US. Here we have more socialised education and healthcare, so anyone with motivation can go to college.


It's interesting that you suppose the AU and US results are likely to be different. As a USian I do not perceive that the system very often fails smart, motivated, hardworking people, healthcare and education regimes notwithstanding.


Then how do you explain the low socio-economic mobility in the US?

http://en.wikipedia.org/wiki/Socio-economic_mobility_in_the_...

The good news is everyone is getting richer (rising tide), the bad news is inequality is increasing and low mobility is nothing to be proud of. Like logicchains I'm Aussie and we fare better at the moment, though we're also starting to head in the wrong direction by these measures.


Isn't your government incredibly conservative at the moment?


Yes it is, and if they get their way I believe it will be a disaster for both equality and long term growth here. They are aiming to move education and health to more of a user pays model (which makes no sense given the evidence worldwide) and continue their party's privatisation-by-stealth in both areas.

Fortunately they are not very competent and do not have full control of the parliament, so most of their bills have been blocked by minor parties.


The scary thing is that the largest players in the private market here financially back (or lobby for) a lot of the shenanigans that happen in the capital...denial of climate change, privatization (or abolishment) of public services/utilities, slashing education, etc etc.

The not very competent people who wave snowballs around in congress, send bitchy letters to Iran and Israel, and squawk about #OBAMANET have full corporate sponsorship and the propaganda machine that is our media keeps getting these imbeciles elected.

The US is definitely treading water right now. I remain hopeful, but not optimistic.


From this forum, which is probably my largest source of interaction with Americans, I get the impression that most Americans think life is extremely hard for the lower class there.


Except you have to be in a privileged position to become smart, motivated and hardworking.


Most are born into this society and never consider the notion that things could ever be different.


* 5 million people dead in the Congo, stealing rare minerals and putting them in your iPhone.

* Destroying our earth with a pathetically broad plan for stopping (read: we won't).

* We're creating technological systems that pose the greatest threat against freedom ever.

* We're killing hundreds of thousands in the middle east because they pose threats to our Saudi oil fields.

Yay, productivity.

What I'm saying here, is that it doesn't have to be this way. Capitalism drives a population to productivity in the same way meth does: destructively.


There are far closer capitalist systems that can be used beyond the corporatist based one we have now. Socialism/Communism simply won't work on a large scale, just because of human nature.. the same way that unchecked capitalism doesn't.

The problem is breaking down the "rights" of corporations.

* eliminate corporate taxes

* establish structures that allow corporations to hold on to underutilized/unutilized assets (follow through on this)

* reduce intellectual property rights assigned to corporations

* create a non-living entity legal classification with explicitly reduced rights

* remove speech rights from corporate entities (employees, shareholders, etc still have those rights, companies don't)

* restrict any propaganda spending by corporations

With those checks in place corporations can still exist, but would be geared towards growth (like Amazon) with continuous reinvestment, or towards paying dividends to those shareholders who are paying taxes.

With those checks in place, a basic/living wage and flat tax could be put in place, no loopholes, no tiered taxation.. everyone is taxed at 50%, everyone gets the same base wage check... the revenue is split between federal govts and state.. 25% to base wage, 25% to federal govt, 35% to states based on population, 15% to states based on land mass (preserve public lands).

Beyond any of this, there are ways to utilize capitalism to serve the public interest.. just because this hasn't been done doesn't mean it can't be... and with appropriate checks in place (mainly in political finance, which requires the first steps outlined), it stands a far better chance of succeeding than any alternative that has been tried.


Wouldn't every advertisement qualify as "propaganda spending"?


alternative?


<actual-advice>

I have no good answers for you. If the atrocities above bother you, you can do your part and opt out from the sides of society that require you to be a part of it.

We just need a cultural shift to stop being such consumers. Stop buying a new phone every year, your current one can easily suit you for the next 10 years. Don't buy a new laptop. Start being cognizant of the influences brand names have on you and try to resist them where possible. Most importantly, we need to strengthen unions and support our local coops.

Start being aware of where the money you spend ultimately ends up.

</actual-advice>

<shill> I'm a libertarian, and I use that in the non-US definition of the word, which is to say I'm an anarcho-syndicalist. Unfortunately anarchism is widely regarded as unrealistic, but if it weren't for the Soviets mucking around in Spain in the 30's, it might be a very different story.

What is anarchism? Here's how Noam Chomsky, (the same Chomsky you know from your compilers / CS theory course) describes it:

""" Well, anarchism is, in my view, basically a kind of tendency in human thought which shows up in different forms in different circumstances, and has some leading characteristics.

Primarily it is a tendency that is suspicious and skeptical of domination, authority, and hierarchy. It seeks structures of hierarchy and domination in human life over the whole range, extending from, say, patriarchal families to, say, imperial systems, and it asks whether those systems are justified. It assumes that the burden of proof for anyone in a position of power and authority lies on them. Their authority is not self-justifying. They have to give a reason for it, a justification. And if they can’t justify that authority and power and control, which is the usual case, then the authority ought to be dismantled and replaced by something more free and just. And, as I understand it, anarchy is just that tendency. It takes different forms at different times.

Anarcho-syndicalism is a particular variety of anarchism which was concerned primarily, though not solely, but primarily with control over work, over the work place, over production. It took for granted that working people ought to control their own work, its conditions, [that] they ought to control the enterprises in which they work, along with communities, so they should be associated with one another in free associations, and … democracy of that kind should be the foundational elements of a more general free society. And then, you know, ideas are worked out about how exactly that should manifest itself, but I think that is the core of anarcho-syndicalist thinking. I mean it’s not at all the general image that you described — people running around the streets, you know, breaking store windows — but [anarcho-syndicalism] is a conception of a very organized society, but organized from below by direct participation at every level, with as little control and domination as is feasible, maybe none. """

One big misconception is that anarchism means that there should be no laws, and that murderers should be allowed to wander the streets. What a lot of people don't know is that anarchism, just like communism, was also a victim of the propaganda machine that we now call the red scare.

I think that a lot of people in tech, who can directly see how open source killed proprietary software, are the people who are most open to the idea this shift can happen.

Anyway, that's just my $0.02.

If you made it this far into my comment, give this a read: http://www.alternet.org/civil-liberties/noam-chomsky-kind-an...

And also read On Anarchism by Chomsky, it's fantastic. </shill>


> If you made it this far into my comment, ...

That's a tacit acceptance that the solutions you're proposing have absolutely no chance of ever happening at a scale that will ever matter. Politics, and economics, are the art of the possible.

Capitalism coupled to representative democracy is the best shot we've got at developing a fair, balanced and sustainable economic system. What we need to do is correctly and rigorously price environmental costs into the financial costs of our economic activities. Otherwise you get Soviet Russia laying waste to vast swathes of territory with misconceived development programs, or China polluting its own country and population to death due to zero political accountability. The problem with Anarcho-syndicalism is that at scale people will syndicalise back into special-interest blocs and you'll be back where you started.


> Capitalism coupled to representative democracy is the best shot we've got at developing a fair, balanced and sustainable economic system.

Tend to agree, but that only addresses the economy. From what I've seen, capitalism cares very little for the advancement of society itself, and quite often works against it (eg, oil companies and global warming). You mentioned pricing in environmental costs, which obviously I would agree with, however I think the only way to achieve this is by banning companies from having any sort of political free speech and this is a tricky line to walk. In our current state, the environmental costs of our activities are a large point of contention because the people doing the damage are able to buy a large portion of "democratic" mindshare through propaganda. How do you regulate this? It's a hard problem.

Also, take something like consumerism (as in, buying a new phone every year). It doesn't make us happier, it doesn't make our lives better, nor does it do the planet we live on much good. However, consumerism and capitalism have grown into a feedback loop. It's an area where we spend enormous amounts of energy to derive very little benefit. The free market here does us no good, and in fact enables what I would consider a bad societal habit. Not that I think there's an easy fix or we should try to control people, but it's an example of capitalism working well but providing little value. It's self-referential existence.

> The problem with Anarcho-syndicalism is that at scale people will syndicalise back into special-interest blocs

I am an anarchist, but I do believe this is true. Humans are not capable of this self-organizing yet.


Honestly, if you hadn't called yourself a libertarian in the start of your post, I never would have guessed it. Ideology-wise, I'm an anarcho-communist. Did a lot of research regarding it in my youth and still believe in it today. We share a lot of the same beliefs.

I think the goal is unattainable at humanity's current level of spiritual development (which is, to be blunt, maybe a few millimeters further along than our ape cousins). That doesn't mean that the concepts can't be applied in everyday life, however.

Living in the US, it's hard not to be completely disgusted by what passes as a "libertarian." I've grown to hate the word, and avoid most people who parrot it. The "less regulation!" "small government!" "free market!" drum gets beat all too often without addressing the elephant in the room: Corporate America is a wild beast running amok over the entire globe. The last thing it needs is less regulation! All the innovation and progress in the world won't be worth a damn if we're all breathing in toxic air and birthing flipper babies.

A lot of this is driven by the American culture's need for the new (as you pointed out). The sickness of our culture is the fuel of our economy, which is now built upon the backs of third world nations (which, by the way, are starting to equalize...soon there will be no more backs to climb on, what then?). We don't produce anything anymore, we just consume.

Even in SV, where people are constantly crowing about how innovative everything is, there's very little real, actual change happening. An app that deletes photos you send to someone after 15s. Amazingly innovative. Another chat app. Useful? Sure. Innovative? No. In fact, tying this back into the parent comments, I'd argue that almost all of the innovation I've seen comes from open source. 99% of private companies in the valley are doing something that has been done 1000x before, but just slightly better. It makes money, sure, but it doesn't help the world or advance society. Capitalism at its best.

I think there's a strong balance that needs to be struck between what actually advances society as a whole and what allows the individual to prosper. In the US, at least currently, the two seem pretty mutually exclusive.

/rant


Well, that's the nature of capitalism. Without government intervention, it naturally and inevitably leads to monopoly. That's the primary reason for regulation in capitalism-oriented societies.


As opposed to communism, which leads to everything being produced/owned by one entity, the state? I'm sure there are reasonable arguments for communism but I've never heard it asserted that encouraging market competition is one of them.


False dichotomy. Communism (or whatever approximations of it we've seen) isn't the only alternative.

The inherent problem in any case is concentration of power, whether it be in the state or in megacorps.


What other alternatives are there? Socialism? There are plenty of socialist countries, but I'm not aware of any that would prevent Apple from buying a relatively small company like that behind Foundation.

Economic freedom is a scale. On one side, there's 100% freedom of exchange/ownership, pure propertarian capitalism, which doesn't exist on Earth right now. On the other side, there's pure communal communism, which also doesn't exist on Earth right now. If neither extreme nor minimal economic freedom would address the issue, how could some intermediate level do any better?


> If neither extreme nor minimal economic freedom would address the issue, how could some intermediate level do any better?

Simple. Both extremes concentrate power eventually (in the state or in megacorps). Somewhere in the middle, you have an open and capitalist market, but the government keeps large corporations in check, and the population keeps the government in check.

It can be argued that the EU strives to follow this model (although imperfectly). E.g. by enforcing net neutrality, being relatively active at breaking down cartels, regulating roaming costs (since the industry kept them artificially high), etc.

Of course, it's never perfect, because the circumstances are never perfect. So, you have to finetune and adapt.


>Both extremes concentrate power eventually (in the state or in megacorps). Somewhere in the middle, you have an open and capitalist market, but the government keeps large corporations in check, and the population keeps the government in check.

What in this middle ground would prevent Apple from buying FoundationDB? I can't see it being prevented in any middle-ground economy like those in Europe.


That's not an ideal question because it assumes that Apple and FDB already exist as technology owning/creating entities.

What if there were no large corporations at all? What if IP and status/cash-flow were set up as the property of individuals and/or small teams who collaborated on a per-project basis?

You could have a system where IP was still shared in a completely open way, remaining free for non-commercial use, but commercial use would require a per-use payment, and commercial modification would attract a revenue share of its own if it was useful to a market.

This might not be ideal - it doesn't solve the problem of actually making stuff, for example. (There are possible answers to that, but they're even weirder.)

But it shows it's at least possible to begin to think about systems that don't have dinosaur corporations stomping around the ecosystem predating anyone and anything who's small and interesting.

And it specifically solves the problem of useful IP being removed or suppressed just because it can be.


The post you responded to pointed to a problem with capitalism, but it hardly follows that this means they'd advocate for communism. More rationally you might suppose that there is some other way to curb failures of capitalism without jumping to other extreme. It is a scale, yes, but there is no reason to expect the best solution is at either end.

Let alone the fact that we are only talking about economic freedom now, which is only one of many interacting facets of politics that can't really be isolated.


Whether or not a company is allowed to buy another company is a matter of economic freedom. All other things being equal, the more freedom companies have to buy others, the greater the level of economic freedom, as freedom to buy is a component of economic freedom.

Allowing companies to buy any companies has issues. Not allowing companies to buy other companies also has issues. So how could any intermediate situation not have issues? If it forbids in some cases, it will have some of the problems associated with forbidding. If it allows in some cases, it will have some of the problems associated with lenience.


> Allowing companies to buy any companies has issues. Not allowing companies to buy other companies also has issues.

You seem to reason in really absolutist terms. You can have an open economy, where a government can still intervene if the current market situation has an extremely negative effect on society.

E.g. breaking cartels, monopolies or oligopolies where they seriously hurt a population does not throw away all the benefits of capitalism.

The disadvantage of one extreme is that you cannot have free enterprise; the disadvantage of the other is that you might end up with a few megacorps who control the market and ultimately society. In the middle you have a situation where there is free enterprise, but as a corporation you also have to play by the rules that were set up to maintain fair competition and avoid centralisation of power.


>In the middle you have a situation where there is free enterprise, but as a corporation you also have to play by the rules that were set up to maintain fair competition and avoid centralisation of power.

The middle situation faces the potential of http://en.wikipedia.org/wiki/Regulatory_capture. This state-enforced monopoly is a small-scale manifestation of the complete state monopoly associated with 100% state ownership. The only way to completely avoid regulatory capture is to have no regulatory agencies, but this of course brings troubles of its own. There's no perfect middle.


Sure. First of all, in all systems adverse effects can happen. Secondly, regulatory capture is countered by democracy and transparency. If politicians become corrupted, you vote them away. In capitalism without regulation, there is no good way for citizens to intervene (except through violence).


How's voting the corrupt politicians away working in practice in the US? From what I hear, not very well.


Your argument now amounts to "there will always be issues", which is correct but not useful. What matters is what those issues are, how important, and how severe at each point on the scale. We're now talking at such an abstract level that it's not really meaningful...


>Your argument now amounts to "there will always be issues", which is correct but not useful.

The argument I responded to was essentially asserting the existence of a non-capitalist solution that would have prevented Apple from buying FoundationDB without negative consequences. If I successfully argue that there are no perfect solutions, then that undermines the argument to which I was replying, and forces the poster to engage with the issue in more detail than just "capitalism is bad".


I think Aerospike has a very similar featureset. It is ACID + NoSQL, replicates nicely across multiple data centers, and definitely has a similar/better latency and throughput profile than FoundationDB.


ACID on a single "record". I am glad they document this openly, so many other DBs try to make it seem like they support ACID transactions in general. From my understanding with FoundationDB, you did something of the form of "begin transaction" "do lots of stuff" "end transaction", and it just worked. I've yet to find another option that can do this.
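To make the "begin transaction, do lots of stuff, end transaction" shape concrete, here is a minimal sketch in Python. It is a toy in-memory store, not FoundationDB's API: `TinyStore`, `transaction` and `transfer` are names invented for illustration, and a single global lock stands in for whatever a real distributed system does.

```python
import threading
from contextlib import contextmanager


class TinyStore:
    """Toy in-memory key-value store with all-or-nothing multi-key transactions."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    @contextmanager
    def transaction(self):
        # One global lock serializes transactions: simple, but not scalable.
        with self._lock:
            staged = dict(self._data)   # work on a private copy
            yield staged                # the caller does "lots of stuff" here
            self._data = staged         # commit: reached only if no exception

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def transfer(store, src, dst, amount):
    """Touch two keys in one transaction; an exception discards both writes."""
    with store.transaction() as tr:
        tr[src] -= amount
        tr[dst] += amount
        if tr[src] < 0:
            raise ValueError("insufficient funds")


store = TinyStore()
with store.transaction() as tr:
    tr["alice"] = 100
    tr["bob"] = 0

transfer(store, "alice", "bob", 30)

try:
    transfer(store, "alice", "bob", 500)   # raises, so nothing is committed
except ValueError:
    pass

balances = (store.get("alice"), store.get("bob"))
```

The second transfer modifies both staged keys before failing, yet neither change becomes visible. That is the property single-record ACID cannot give you.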


How does influxdb compare?


It doesn't? Time series data is a pretty specific use case.


What really differentiated it was the fact that multi-key transactions allowed you to reasonably build any number of logical data models on top of it in a linearly scalable way. It was all built to an extremely high degree of polish with an extremely good testing and simulation harness and a high degree of predictability in performance. It was basically Spanner for the rest of us, without atomic clocks (and they also shipped an F1, their SQL layer on top). As others have mentioned, the closest cousin at the moment is probably Cockroach, which relies on wall clocks; that will probably lead to problems in certain cases, but it gets an easier way to scale writes.

Here's the architecture diagram for FDB, it's pretty fun to read: https://foundationdb.com/files/Architecture.pdf


Except it turns out no, the layer thing is practically hard to pull off.

See my other comment. https://news.ycombinator.com/item?id=9262673


I think the closest cousin is ActorDB http://www.actordb.com/


Cockroach uses hybrid logical clocks and should generally tolerate reasonable amounts of clock skew. Atomic clocks can improve performance in some cases by putting a tighter bound on clock skew, but they're not necessary for correctness.
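For readers unfamiliar with hybrid logical clocks, here is a minimal sketch of the general technique in Python. It is not Cockroach's code: the class and method names are invented for illustration, and the physical clock is injectable so that skew between two nodes can be simulated.

```python
import time


class HybridLogicalClock:
    """Sketch of a hybrid logical clock: (wall time, logical counter) pairs."""

    def __init__(self, physical_clock=time.time_ns):
        self._now = physical_clock   # injectable so skew can be simulated
        self.wall = 0
        self.logical = 0

    def tick(self):
        """Timestamp a local event or an outgoing message."""
        pt = self._now()
        if pt > self.wall:
            self.wall, self.logical = pt, 0
        else:
            # Physical time has not advanced past what we have already seen.
            self.logical += 1
        return (self.wall, self.logical)

    def observe(self, remote):
        """Merge a timestamp received from another node."""
        r_wall, r_logical = remote
        pt = self._now()
        new_wall = max(self.wall, r_wall, pt)
        if new_wall == self.wall and new_wall == r_wall:
            self.logical = max(self.logical, r_logical) + 1
        elif new_wall == self.wall:
            self.logical += 1
        elif new_wall == r_wall:
            self.logical = r_logical + 1
        else:
            self.logical = 0
        self.wall = new_wall
        return (self.wall, self.logical)


# Node B's wall clock is 50 units behind node A's.
a = HybridLogicalClock(physical_clock=lambda: 1000)
b = HybridLogicalClock(physical_clock=lambda: 950)

sent = a.tick()              # event on A
received = b.observe(sent)   # B receives A's message
later = b.tick()             # next local event on B
```

Even though B's wall clock lags, the timestamps compare as `sent < received < later`, so causally related events stay ordered. Skew makes the logical counter work harder rather than breaking the ordering.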


All the company's github repositories have been pulled:

https://github.com/FoundationDB/


Ok, this is a big problem. We've started a project that uses their fdb-sql-parser, which was an Apache-licensed fork of the Derby sql parser. I did not make a clone of it before they pulled it. Does anyone have a copy?

Fortunately, they can't pull the artifacts from Maven Central.

Pulling an open-source project upon which people may depend is total jerk behavior.


According to Google's cache, the latest version on Maven Central (v1.6.1) was only four commits behind master.

https://webcache.googleusercontent.com/search?q=cache:TllYak...

Of those four commits, one was a version bump, one a README tweak, and two were merge commits.

https://webcache.googleusercontent.com/search?q=cache:IzAL7H...

The source code is there on Maven Central, alongside the build artifacts, but the git history is gone.

http://search.maven.org/#browse|-1374863701


Apache Calcite http://incubator.apache.org/projects/calcite.html contains a SQL parser and the source is in much better shape. I made a bunch of contributions to fix things in fdb-sql-parser and discovered it's a real mess.


I've found a live fork with the latest changes.

https://github.com/rimig/sql-parser/tree/fix-docs-javarun


Try https://github.com/hudak/sql-parser

It's out of date - only goes as far as 1.5, but at the very least you can import the 1.6.1 source as one big commit.



If someone has a recent clone and the licence was compatible, is there a reason a fork can't be started?


Apple has incentives for engineers to file patents.

Always something to be mindful of.


This is quite frustrating. There's a lesson here in keeping mirrors of projects in multiple places. That things can be 'disappeared' from GitHub is because not enough people do this anymore.


I have a copy of the SQL layer from March 1st, 2015. It's up on github: https://github.com/jaytaylor/sql-layer


Put it on Google Code. Oh, hang on....


Their PyPI packages also seem to be pulled out[1], but thankfully, their Twitter account is still alive![2]

Also note that there seems to be a mirror for the PyPI packages.[3]

[1] https://pypi.python.org/pypi/foundationdb

[2] https://twitter.com/FoundationDB

[3] http://download.gocept.com/pypi/foundationdb/


If someone is desperate enough, they can recreate the repositories using google cache. See http://webcache.googleusercontent.com/search?q=cache:YiBFZmB...


Apple is the worst acquirer in the industry, at least for users of the acquired company's products. Nobody else would dare kill a _database_ with no warning, explanation, or migration plan - not even a goodbye blog post!

Whenever Apple acquires anything that runs on a competing company's platform, that version is immediately killed (see any of their mobile app acquisitions).

Thanks for making things that much harder for every other database startup.


You can also assign some blame to the database startup that chose not to prioritize its customers in the acquisition. It'd be nice if this pushed more startups to make a legally binding declaration of how customers would be affected by an acquisition, or it pushed more customers to demand it (instead of just giving up on startups). But I'm guessing no one wants to have their hands tied like that, especially when acquisition is a more likely outcome.


I think they're making it much easier for every other database start up: just pick up on the technical grounds left by FoundationDB (see their Architecture PDF) and knock yourself out. You have no competitors right now.


If you need evidence of Apple's ability to acquire and senselessly destroy viable products, look no further than Shake.

If you need evidence of Apple's ability to acquire companies while keeping product lines independent, look no further than Beats.

If you need evidence of Apple's ability to acquire industry leading technology to ensure exclusive advantage, look no further than AuthenTec (Touch ID).

If you need evidence of Apple's ability to sincerely invest in open source for the benefit of all, look no further than CUPS. Or LLVM. Or WebKit.

Apple has no pattern.


You could add "acquire and turn minimally viable products into world-spanning domination, look no further than Touchstream"


True, although said replacement will probably have to be open source to pick up any serious usage.


Do any of you know if Apple has acquired companies which are partially open-source (FDB SQL layer) with a substantial technologist user-base before?

Knowing their past behavior would be an interesting indicator of what is most likely to follow here.

According to the TC article (and their website), FDB is no longer available for download and there is a "goodbye"-esque type of message on their community site [0]. Their github [1] repos have all now been made private. This seems most unfortunate for anyone who's included FDB in their tech stack.

For reference, FoundationDB has been compared to Google's F1 database [2] [3], so this is Apple purchasing a pretty shiny piece of technology.

[0] http://community.foundationdb.com/index.html

[1] https://github.com/FoundationDB/

[2] http://blog.foundationdb.com/7-things-that-make-google-f1-an...

[3] F1: A Distributed SQL Database That Scales - http://research.google.com/pubs/pub41344.html


Apple acquired CUPS (Common Unix Printing System) which is still used by a lot of Linux distros (like Ubuntu and Fedora).

CUPS was bought by Apple in 2007 and continues to have open source releases and continues to be used by major Linux distros.

For FoundationDB, there isn't really the same market. Most sites don't need a distributed database. People who run sites are already a niche, and people who run sites with enough traffic to require a distributed database are an even smaller one. I use HBase and Kafka a lot and there are definitely high-profile users of the software, but it's nowhere near the number of users of MySQL or PostgreSQL.

Maybe Apple will look at FoundationDB as something where open-source in no way hurts them and gets them free development. Google won't be replacing F1/Spanner with it so it's not much help to what a lot of people would consider Apple's largest competition. Plus, open-source could help make it better.


CUPS was fully open source when it was acquired.


Technically, NeXT was partially open-source, and they maintained Darwin for a while, but it's pretty much dead now.

Edit: More recently, they killed off OpenNI about a year after buying PrimeSense


Not an open-source project, and not a big user-base, but Apple bought P.A. Semi and canceled the PWRficient line of POWER-architecture chips.[1] Presumably, that was inconvenient for P.A. Semi's customers.

[1] http://www.eetimes.com/document.asp?doc_id=1168406


Looks like an acquihire and product kill scenario.


So, database experts, what was special about FoundationDB? Why did Apple choose this one? There is plenty of [a-bA-B]+DB's out there. Why this one? Sorry, I don't know anything about databases, but I would love to learn about them.


Distributed ACID transactions. Very good testing regime. Good performance. Basically the NewSQL dream.


Mmm... they were just doing shared-data distributed MVCC. Nothing new there. My takeaway from spending some time with the documentation was that their SQL layer was a hack.


This, and there is (or was) nothing else like it in the marketplace.


Well if it truly mattered out there right now there is a company that has a void that they need filling and all they have to do is speak up and help fund a replacement.


That won't do much good when all the rare talent needed to build it is locked up at Apple and Google.


There's other rare talent coming online all the time.


Well in that case, this problem should solve itself any day now. /s


Except the SQL part didn't work well.

If you want those same features, consider VoltDB. It makes different tradeoffs on what kinds of ops are fast, but has all of those features.

Essentially arbitrary two-key transactions are slower. Partition-friendly writes and global reads are faster. Global reads with real SQL can be orders of magnitude faster.


Based on a cursory glance at the properties of ACID, it seems like something that the financial world, especially the stock exchanges, have had to deal with when dealing with transactions.

I wonder if a similar architecture could be used for building a distributed database that could rival what was lost with foundation.


Financial services companies have three different issues:

* Banks don't do multi-record SQL (see Brewer's article on the topic)
* They don't do many transactions
* Nobody got fired for buying Oracle or DB2

An Oracle replacement won't be built for financial services companies.


Can you link to Brewer's article please...


ActorDB http://www.actordb.com/ is an open source distributed SQL database with ACID transactions. Unlike in FoundationDB, SQL is a fundamental component and not an add-on.


Oh, we will build it, I have no doubt. There's a demand, so I'm sure in 5 years Apache will have some project that quenches our thirst.

The key with transactions is that they make life easy for the client. AP systems are easy for database engineers to write (and yet they still manage to screw them up :P), but systems that support ACID constraints are easy for application developers to use. That's why FoundationDB was so special; they promised the best of both worlds; the horizontal scaling of traditional NoSQL systems, with the ease of use (w.r.t. reasoning about concurrency) of SQL/ACID systems.


Just have a look at how they do testing (Simulation/Flow). It's amazing.


That's exactly the one thing I remember about them. I saw the video about their testing method and knew they had to go somewhere eventually.


> There is plenty of [a-bA-B]+DB's out there

You probably meant [a-zA-Z]+DB!
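A quick sketch in Python of the difference between the two character classes. The list of names is just sample input, and `abBADB` is a made-up string that happens to fit the narrow pattern.

```python
import re

names = ["FoundationDB", "MongoDB", "abBADB", "VoltDB"]

# [a-bA-B] only admits the letters a, b, A and B.
narrow = re.compile(r"[a-bA-B]+DB")
# [a-zA-Z] admits any ASCII letter.
broad = re.compile(r"[a-zA-Z]+DB")

narrow_hits = [n for n in names if narrow.fullmatch(n)]
broad_hits = [n for n in names if broad.fullmatch(n)]
```

Only the made-up name survives the narrow pattern, while the broad one accepts all four.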


I interviewed with FoundationDB and came very close to working there. They were about as smart as you would expect the people behind that kind of technology would be--scalable distributed transactions at speeds many zeroes higher than what was thought possible--and I'm glad to see them succeed, but wondering what's going to happen to their software. Whatever secret sauce made their software I'm not convinced the market can replicate anytime soon.

I also can't help but wonder how much my options would have been worth.


I interviewed with FoundationDB last August or so and got a very paltry offer. On top of that, they wanted me to pay my relocation and cover the cost of the signing bonus clawback. My stock options (had they fully vested at the time of the Apple sale) would have been worth under $14,000. I don't think I missed much.

edit: scratch that, $23 million is the amount they raised, not the amount of their sale. Regardless, unless they sold for 10 figures, I don't have any regrets.


This just lines up with what we've seen in the KV space over the last 5 years. Mutating data and key-lookup are all well and good, but without a powerful query language and real index support, it's much less interesting.

Quoting the Google F1 Paper: "Features like indexes and ad hoc query are not just nice to have, but absolute requirements for our business."

Cassandra got ahead of this with CQL. FoundationDB saw this coming and bought Akiban to add a SQL layer. But bolting SQL onto a KV store, even a really good one, isn't trivial to do. I'm not sure it ever delivered on the promise of a real query layer.

Still, I hope this is a good exit for the FDB team. The KV layer is pretty cool stuff.

Full disclosure: VoltDB engineer here.


> But bolting SQL onto a KV store, even a really good one, isn't trivial to do.

Sure, but isn't that basically what everyone does? All of the relational databases I know use ordered key-value storage engines. FoundationDB is the same, except it's distributed.

The point is to use FDB as a foundation. I heard how one of the FDB founders replaced SQLite's B-tree storage engine with FoundationDB, which turned SQLite into a distributed SQL database. It's much more difficult to make something like that if you had to tackle the hard distributed systems problems.

(I interned for FoundationDB a couple of years ago.)


> Sure, but isn't that basically what everyone does? All of the relational databases I know use ordered key-value storage engines. FoundationDB is the same, except it's distributed.

> The point is to use FDB as a foundation. I heard how one of the FDB founders replaced SQLite's B-tree storage engine with FoundationDB, which turned SQLite into a distributed SQL database. It's much more difficult to make something like that if you had to tackle the hard distributed systems problems.

So this is a really interesting point and I can see why it makes sense if you’ve not built a SQL engine. So why is building SQL on top of a KV store the wrong call? What’s the difference between MySQL’s or Postgres’s or VoltDB’s “storage engine” and what FDB had built?

First, I’m not claiming that putting SQL on top of a KV store like FDB is impossible, just that you’re going to have to either compromise the purity of the KV engine substantially, or you’re going to get slow SQL for anything other than single row CRUD.

It starts with metadata. FDB-SQL stores metadata in the KV store itself. This is great from a distributed correctness point-of-view. It’s also great from a simplicity point-of-view. If I trust the underlying system to be safe and consistent, then my metadata is also safe and consistent. But now I need to do a ton of reads before I run my SQL to know how to run my SQL. Where’s my data, for example? Compare this to VoltDB which replicates metadata to all processing sites, and basically has a second state-machine to ensure that each processing site has the right metadata at the right time. Updating metadata is more expensive, but the fast-path of using the metadata is several orders of magnitude faster.

Then you get to locking. I believe FDB-KV uses per-key read-write-sets to manage concurrency. This choice makes their two-key transaction benchmarks look great, but it scales poorly as the number of keys per transaction grows. SQL basically begs users to write operations that read and write to lots of keys. To make scans fast (even common partial-index-scans), you’re going to need more granular locks, or you’re going to need to relax consistency (ACID, not CAP).
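To illustrate why per-key read sets get more painful as transactions touch more keys, here is a minimal Python sketch of optimistic validation. It is an assumption-laden toy, not FDB's implementation: `VersionedStore`, `Txn` and `ConflictError` are hypothetical names, and reads see current values rather than a true snapshot.

```python
class ConflictError(Exception):
    pass


class VersionedStore:
    """Toy store that remembers the commit version that last wrote each key."""

    def __init__(self):
        self.values = {}
        self.last_write = {}   # key -> commit version of its latest write
        self.version = 0

    def begin(self):
        return Txn(self, self.version)


class Txn:
    def __init__(self, store, read_version):
        self.store = store
        self.read_version = read_version
        self.read_set = set()
        self.writes = {}

    def get(self, key):
        self.read_set.add(key)   # every read is tracked for validation
        if key in self.writes:
            return self.writes[key]
        return self.store.values.get(key)

    def set(self, key, value):
        self.writes[key] = value   # buffered until commit

    def commit(self):
        s = self.store
        # Validation: abort if any key we read was overwritten after we began.
        for key in self.read_set:
            if s.last_write.get(key, 0) > self.read_version:
                raise ConflictError(key)
        s.version += 1
        for key, value in self.writes.items():
            s.values[key] = value
            s.last_write[key] = s.version


store = VersionedStore()
setup = store.begin()
for i in range(100):
    setup.set(i, 0)
setup.commit()

point = store.begin()    # will touch a single key
scan = store.begin()     # will read every key, like a SQL scan
total = sum(scan.get(i) for i in range(100))

point.set(7, point.get(7) + 1)
point.commit()           # succeeds: nothing it read has changed

scan.set("total", total)
try:
    scan.commit()
    scan_committed = True
except ConflictError:
    scan_committed = False
```

The single-key transaction commits cleanly, while the 100-key scan is aborted by one unrelated-looking write. The more keys a transaction reads, the more likely some concurrent writer invalidates it.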

Now we get to storage efficiency. If I have a SQL relation with a primary key and other columns, do I split it so the primary key is in the “key” and the other columns are in the “value” in the KV store? Do I store the whole record in the “value” and lose some efficiency? What about relations with no primary key? Say I’ve got a table that facilitates a many-to-many relationship and it’s just a set of integer pairs. How do I store that efficiently in a KV store? And what about that pesky metadata? Does the key need to include the identifier of the SQL relation (table)? Of course it does.
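One possible answer to those layout questions, sketched in Python. This is an assumed encoding for illustration only, not the format FDB-SQL used: the table ids, the fixed-width fields and the `encode_key`/`decode_key` helpers are all hypothetical.

```python
import struct


def encode_key(table_id, *columns):
    """Pack a table id and unsigned integer key columns into sortable bytes."""
    # Big-endian fixed-width fields make byte order match numeric order.
    return struct.pack(">H", table_id) + b"".join(
        struct.pack(">Q", c) for c in columns
    )


def decode_key(key):
    table_id = struct.unpack_from(">H", key, 0)[0]
    n = (len(key) - 2) // 8
    columns = struct.unpack_from(">%dQ" % n, key, 2)
    return table_id, columns


USERS, MEMBERSHIP = 1, 2   # hypothetical table ids

kv = {}
# Table with a primary key: pk goes in the key, other columns in the value.
kv[encode_key(USERS, 42)] = b"alice,alice@example.com"
# Many-to-many table of integer pairs: the whole row is the key, value is empty.
kv[encode_key(MEMBERSHIP, 42, 7)] = b""
kv[encode_key(MEMBERSHIP, 42, 300)] = b""
kv[encode_key(MEMBERSHIP, 9, 7)] = b""

# All groups for user 42 form one contiguous range of sorted keys.
prefix = encode_key(MEMBERSHIP, 42)
groups = [decode_key(k)[1][1] for k in sorted(kv) if k.startswith(prefix)]

# Bytes spent per row on the table identifier alone.
overhead_bytes = len(encode_key(MEMBERSHIP, 42, 7)) - 16
```

Here every row of the pair table carries two extra bytes for the table id. That is small per row, but it is paid on every key and again on every index entry.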

Secondary indexes. Oh man. To do them well on top of a KV store, you’re going to need pretty strong consistency to ensure that index records point to the right base record, and that there are no base records without an index record. This rules out the aforementioned trick of relaxing consistency to get faster many-key transactions. It’s also a metadata problem; I’ve got metadata I need to read to understand how to use the secondary index. More than that, when I update a record in the base relation, I’m going to need to find all affected secondary indexes and make sure they reflect the new information. That might mean lots of reads to the system to query metadata to update the secondary indexes. And again, we have the efficiency problem where I have to put a bit of extra stuff in each “key” to identify the secondary index the KV-pair belongs to. So secondary indexes have locking/consistency, storage efficiency and metadata problems in this model.
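A minimal Python sketch of what keeping a secondary index in step with its base record involves. The key shapes and the `put_row`/`lookup` helpers are invented for illustration, a plain dict stands in for the KV store, and each `put_row` call is assumed to run inside one transaction in a real system.

```python
def put_row(kv, table, pk, row, indexed_columns):
    """Write a row and keep its secondary index entries in step with it."""
    old = kv.get((table, "row", pk))   # extra read needed on every write
    for col in indexed_columns:
        # Drop the stale index entry if the indexed value changed.
        if old is not None and old[col] != row[col]:
            del kv[(table, "idx", col, old[col], pk)]
        # Index entries carry everything in the key; the value is empty.
        kv[(table, "idx", col, row[col], pk)] = b""
    kv[(table, "row", pk)] = dict(row)


def lookup(kv, table, col, value):
    """Find primary keys via the index with a prefix match over sorted keys."""
    prefix = (table, "idx", col, value)
    return [k[4] for k in sorted(k for k in kv if k[:4] == prefix)]


kv = {}
put_row(kv, "users", 1, {"name": "ann", "city": "Boston"}, ["city"])
put_row(kv, "users", 2, {"name": "bob", "city": "Vienna"}, ["city"])
put_row(kv, "users", 1, {"name": "ann", "city": "Vienna"}, ["city"])  # ann moves

in_vienna = lookup(kv, "users", "city", "Vienna")
in_boston = lookup(kv, "users", "city", "Boston")
```

A single logical update reads the old row, deletes one index key, and writes two more. If those steps are not atomic, the index can point at a row that no longer matches, which is why relaxed consistency and secondary indexes do not mix well.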

Finally you get data transfer. This is probably not a broadly applicable problem, but I think in the FDB implementation, there is a lot of needless data transfer between KV cluster nodes and SQL processing nodes. Too much data needs to be collected and processed, rather than pushing down that processing to the data’s resting location. In FDB-SQL, if I just want one value from a large record, do I have to move the whole record over the network? Most SQL systems build processing DAGs for each SQL statement and can in-process stream between nodes, or have efficient temp tables to buffer intermediate results.

This falls out of making the SQL layer so separate. In fact, I think the SQL team in Massachusetts was almost completely separate from the KV team in Virginia. If the SQL layer worked, then I can see how awesome this would be. The “Layers” model just sounds so appealing. It’s just technically quite hard.

So yes, I agree that under most SQL-relational systems, there is a storage engine that smells much like a KV store. Still, that system is much less “pure” than what they were going for at FDB. Locking, metadata, secondary indexes, native understanding of relations, moving the processing to the data — all critical to do right to get reasonable performance.

If FDB had more time to continue on the SQL path, I imagine it would dictate much deeper hooks into the KV side of things. I’m sure with time it could get a lot better. There’s a lot of research addressing some of the tradeoffs I’ve mentioned above, especially in the distributed transaction coordination space. Still, it’s always going to be easier to build the storage engine for the kinds of operations you want from day one.


One of their engineers gave a great talk at Strangeloop (https://m.youtube.com/watch?v=4fFDFbi3toc) that shows some of the amazing simulations they ran FoundationDB through. It's well worth a watch if you want to understand the punishment that they put this database through to make it stand out.


This actually makes me pretty sad. FoundationDB was the best shot the world-at-large had at a decent distributed database.


Well this is a very unusual announcement.

From what I know Apple is a big user of Cassandra and Teradata for iCloud, iTunes etc. Both of which are very solid databases that have been proven to scale.

I am not doubting FoundationDB's credentials but it's pretty extraordinary if they are having scalability issues with either.

My guess is that Apple plans to create an equivalent to Facebook's Parse. Either that or this is an acquihire.


It could be a complement to Swift, but it seems like an acquihire. Distributed DB startups are maturing, and the market is getting crowded. Yes, Apple is a heavy user of Cassandra, last time I checked.

Still a success for the team and should be celebrated appropriately. :) Startups are hard.


Why antagonize the community by pulling repos if they only wanted the brains?


Apple has a 1984-like policy about acquisitions where they erase any trace of the acquired company, and they follow this policy to a fault.


The irony is thick considering that landmark 1984 ad they ran during the Super Bowl...

Apple used to claim to fight against Big Brother, today they are largely seen as Big Brother.

Thank god for Android for keeping them honest.


Isn't following a policy like that at all a fault?


Yes, if putting "Apple" and "1984-like" in the same sentence wasn't obvious enough :)


Two good reasons:

1. Why waste resources maintaining something you don't intend to continue supporting?

2. Given the litigious nature of the US, Apple probably doesn't want to deal with a codebase that could include all kinds of potential legal issues that are now suddenly Apple's problem.


Apple released CloudKit last year. It's not web and multiplatform like Parse, but it's similar to Parse's initial offering.


Would be nice to know what this means for existing customers. Did we just waste a bunch of time with FDB? Is it going to be open-sourced?


You can no longer download from their website. So, I'd assume yes, we all wasted our time.


Yeah, noticed our CI failing :( ... so not cool


About FoundationDB, they have this blog post about achieving almost 15 million writes per second

http://blog.foundationdb.com/databases-at-14.4mhz

It's a pretty impressive number, though as always with benchmarks, should be taken with care.


As an anecdote, I was seeing outrageous, hard-to-believe numbers in my usage of FDB.

Nothing but good things to say, other than their multi-node configuration being a little strange.


One interesting aspect FDB was offering is the multi-model approach. Fortunately, there are still true open source alternatives for this like ArangoDB and OrientDB.


Neither of those even comes close to the out-of-the-box performance you could see with FDB.


Numbers? I would like to see some test results out of curiosity.


And both are not key/value stores. They are multi-model databases with the focus on documents and graphs.


Can anyone explain what's so special about FoundationDB? Why Apple would want to acquire it? Why exactly FDB and not some other *DB, which we have tons now after last… I don't know… seven years? Just why? I don't get it.


FoundationDB was a performant, distributed (multiple machines), shared-nothing (multi-master) key-value database with SQL on top (not sure if it was standards-compliant SQL) that supported transactions and serializable isolation (ACID compliant IIRC, but there were some size limitations).

In any case, given that narrow feature set, I can't think of many production quality NoSQL databases with that feature set, and I can't think of any that replicated the millions of writes/sec on GCE like Foundation did, so Foundation seems to be quite special.

Interestingly enough, Apple is a heavy user of Cassandra, so if this is related to iCloud I wonder if they decided to replace it, and why. Or maybe they had enough money that they decided they could just own the database, and DataStax's valuation was too high.


Can someone give a quick description of what made FoundationDB different than the bajillion other database engines out there?


CP database (in the CAP theorem sense), distributed transactions via Paxos. Most popular databases (Cassandra, Riak, et al) are AP* systems. CockroachDB is an interesting project in progress that aims to be a CP system from my understanding.

EDITED: seancribbs corrected my acronym spelling in the early morning; *CA - DOES NOT EXIST, what I wrote initially was completely wrong


You mean AP, not CA.

CA doesn't exist.


> CA doesn't exist.

I'm not sure that's a good way of putting it. You can have databases that offer both consistency and availability, but they need to fit on a single machine (no partition tolerance).


> I'm not sure that's a good way of putting it. You can have databases that offer both consistency and availability, but they need to fit on a single machine (no partition tolerance).

Garbage collection requires (a delay that is indistinguishable from) partition tolerance. It's pretty illusory on single machines as well. Especially if they're multi-core.


They might be related formally (though I'm not really sure). In practice, relational databases generally implement very special-purpose concurrent garbage collectors that will totally lock up only under pathological circumstances (though I won't pretend that it doesn't happen). If you're talking about GC in general, with special-purpose hardware and/or real time kernels, GC times can be bounded deterministically and are thus distinguishable from a partition (except in rare cases, as with all garbage collectors with cycle collection, where they must revert to stop-the-world because they can't reclaim memory fast enough). And, of course, it is possible to write actual real time systems that do not require garbage collection at all (though scant few of them have been formally verified to the extent that one might like).


CAP doesn't really apply to non-distributed systems. So I'm with the original sentiment. There is no CA.


It exists, it's just that assuming partitions never happen isn't very practical.


I may be incorrect, but my impression was that concurrent systems formalization (at least, in ways that are actually relevant on real hardware) was still an area of very active research. Is saying they exist meaningful even in a strictly theoretical sense?


Thanks man, good you caught it, I totally knew that, I need to be more careful with acronyms right after waking up :)


Interesting, they invest heavily in Cassandra (something like 10PB, 80K+ nodes). I've not really seen much of FoundationDB but is it not similar?


It really wasn't. FoundationDB's USP was transactions. Cassandra was designed as an AP system, and using it as a strongly consistent system came at great latency costs. In addition, atomicity outside of a row was all but impossible. Features such as lightweight transactions tried to remedy that, slightly, but suffered from poor performance (it was basically just bolting on Paxos using the Cassandra protocol, as I understand it) and correctness problems (see aphyr's post on Cassandra). FoundationDB, on the other hand, was designed from the ground up to support transactions. Features such as MVCC and multi-key (read: multi-machine) transactions, all while providing a sane-ish data model and very good performance... well, boy oh boy, they had some real potential.

And now they have been subsumed by Apple. "and they were never heard from again"

EDIT: a while back I commented on a comparison of FoundationDB to Cassandra, and Dave Rosenthal, the FoundationDB CEO, took the time to respond. I completely disagreed with him at the time, but think I understand his points now. Link: http://news.ycombinator.com/item?id=8745462
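
For readers unfamiliar with what "multi-key transactions" buys you, here is a toy, single-process sketch of optimistic concurrency control with version checks at commit time. Every name here (`ToyStore`, `Txn`, `Conflict`) is invented for illustration, and this is an assumption-laden stand-in, not how FoundationDB or Cassandra is actually implemented. It also reads the latest value rather than a true snapshot, so it is simpler than real MVCC.

```python
class Conflict(Exception):
    """Raised when a key this transaction read was changed by someone else."""


class ToyStore:
    def __init__(self):
        self.data = {}     # key -> (value, commit version that wrote it)
        self.version = 0   # global commit counter

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.read_version = store.version  # version seen at start
        self.reads = set()                 # keys read from the store
        self.writes = {}                   # buffered writes

    def get(self, key, default=None):
        if key in self.writes:
            return self.writes[key]
        self.reads.add(key)
        value, _ = self.store.data.get(key, (default, 0))
        return value

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if any key we read was written after we started.
        for key in self.reads:
            _, written_at = self.store.data.get(key, (None, 0))
            if written_at > self.read_version:
                raise Conflict(key)
        # Otherwise apply all buffered writes under one new version.
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.version)


store = ToyStore()
setup = store.begin()
setup.set("alice", 100)
setup.set("bob", 0)
setup.commit()

t1 = store.begin()
t2 = store.begin()
t1.set("alice", t1.get("alice") - 30)   # t1 moves 30 from alice to bob
t1.set("bob", t1.get("bob") + 30)
t2.set("alice", t2.get("alice") - 50)   # t2 races on the same key
t1.commit()
try:
    t2.commit()
    t2_committed = True
except Conflict:
    t2_committed = False
```

Both of t1's writes land together or not at all, and the racing transaction is rejected instead of silently overwriting the balance; in a real client it would simply be retried.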


I too am interested in this. I just took a look through the docs. SQL layer over a distributed key-value store...

Full ACID, and it looks to have support for FK/PK JOINs and multi-table queries. Their benchmarks page looks super awesome. How well does this work in practice?

One major difference from Cassandra (I think) seems to be that coordinator nodes are set statically (but can be changed on the fly). Cassandra is not this way. While there are coordinator nodes during client connections, they are chosen dynamically by the client during the session and not some fixed config point.

Another difference appears to be transaction limits (https://foundationdb.com/layers/sql/documentation/Concepts/k...). This is fundamentally different from Cassandra, but to be expected without tunable consistency.

Different tools for different problems. I think Cassandra fits Apple's content distribution model better (e.g. streaming music/movie blobs out of C* for iTunes all over the world), but for a traditional RDBMS that is distributed, this looks like a great escape.

Scaling seems pretty easy if it's as easy as copy-pasting the config file from node to node and bouncing the service. Anyone know about this in practice?


To support my previous comment. Quoting David the founder: http://techcrunch.com/2012/09/10/foundationdb-not-your-stand...

“The most important innovation with MongoDB is its API,” Rosenthal said. “We sell an amazing storage technology that could be compatible with NoSQL technologies like MongoDB.”


Not only did they have a SQL layer, but a MongoDB-compatible document layer was supposed to come out soon.


Whoa, that was unexpected! So sad to see this great piece of software dragged to Apple's dungeons.


If FoundationDB was open source, can't the community continue with the last open-sourced release?


It was closed source, with only parts of it being OSS.


Oh, from the comments here I got the impression that it was open source. If it was not open source to start with, why are people saying that amazing tech is lost? It was never available (in the knowledge sense, not as a product) in the first place, was it?


The full source wasn't available, but at least the product was available to the market. With Apple buying it up and taking down the source repos and the full product download page, the tech is lost to anyone outside of Apple.


I wonder whether Apple is planning on open-sourcing the code. If they want it for internal use, then open sourcing it could make the whole project less expensive.


As optimistic as you might be, I have to imagine the odds of that happening are exceptionally small.

While you could point to the pulling of downloads, and pulling of GitHub accounts as a transitional 'rebranding', I just don't see what incentive Apple has to open source a system that is useful to bigger players and generates a lot of revenue.


> it could make the whole project less expensive

I'm sure a company that's richer than many countries is really, really concerned about the cost of development.

The benefit of making sure no-one else can benefit from the technology, on the other hand...


Is it worth it for Apple to deny the industry a cutting edge database with the only cost being a single day's profits or less? Of course it is.

And you're spot on about the cost of development. I'd imagine 12 - 48 hours of profits from Apple would be enough to fund its database R&D for the next 10 years.


Congrats Nick and company! Way to go! And crap! I was hoping to use that stuff.


Be happy that you didn't!


Actually, not really. They had some really awe-inspiring stuff working: ACID at speed. One of the things that could have enabled would be multi-user gaming at a scale that is only imagined today, literally several thousand players engaged in the same space at the same time, all with predictable semantics that would allow you to provide for individual engagement statistics and client visibility (basically everyone would see the same things happening at the same time, the really "hard" bit of multiplayer gaming). I had started coding up a massively multiplayer version of rogue since my 3D art skills are crap, the idea being to see if we could get a thousand people into the same room of an ASCII dungeon at the same time and usefully engage enemies and collect loot. That simple example touches many places where things have to work predictably in order for a virtual "space" to succeed.


Hehe, that's so scary :) I'm busy sketching out something exactly like that at the instigation of my son (who is an avid gamer). I'm nowhere near realizing the vision but I've learned an awful lot just studying the problem from different angles. The key element seems to be 'who owns that data?'.


Exactly, so the player can be modeled as an index attached to several database records. You end up doing a geo-box select (see the earlier posting here about why geo-information databases are hard), an inventory select, an equipment select, and an attributes select by player-id. The linkage is (player-id, geo) -> key for 'world' based databases, (player-id) -> key for 'player specific' databases. When you go to do an action you combine the action mechanics: (pick up -> delete from world, add to inventory), (attack -> (roll) -> edit other attributes of monsters/players nearby), (defend -> (roll) -> edit player attributes), and (drop -> delete from inventory, add to world). Meanwhile, at 'n' frames per second you need to do a select for a given viewpoint to identify what the rendering engine needs to "show".
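
A rough sketch of that key layout, using a plain Python dict as a stand-in for an ordered key-value store. The key shapes (`("world", x, y, item)` and `("inv", player_id, item)`) and the function names are assumptions made up for illustration; in a real database each action would have to run inside one transaction so an item never exists in both places or in neither.

```python
# Toy keyspace: one dict standing in for an ordered key-value store.
kv = {}


def pick_up(player_id, item, x, y):
    """pick up -> delete from world, add to inventory."""
    del kv[("world", x, y, item)]
    kv[("inv", player_id, item)] = True


def drop(player_id, item, x, y):
    """drop -> delete from inventory, add to world."""
    del kv[("inv", player_id, item)]
    kv[("world", x, y, item)] = True


def geo_box(x0, y0, x1, y1):
    """Geo-box select: world entries inside an inclusive rectangle."""
    return sorted(
        k for k in kv
        if k[0] == "world" and x0 <= k[1] <= x1 and y0 <= k[2] <= y1
    )


def inventory(player_id):
    """Inventory select by player-id."""
    return sorted(k[2] for k in kv if k[0] == "inv" and k[1] == player_id)


kv[("world", 3, 4, "sword")] = True
kv[("world", 50, 50, "shield")] = True

pick_up("p1", "sword", 3, 4)
after_pickup_box = geo_box(0, 0, 10, 10)
after_pickup_inv = inventory("p1")

drop("p1", "sword", 5, 5)
after_drop_box = geo_box(0, 0, 10, 10)
after_drop_inv = inventory("p1")
```

The linear scans here are only for brevity; an ordered store would answer the same selects with range reads over the key prefixes.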

Clearly doing this with ASCII characters on a terminal screen is much easier than trying to render 3D avatars in the real world, but the number of database transactions per second becomes massive. If we posit that there are 1,000 players visible to the current player, and just the player / world select is a single transaction, then at 30 fps that is 60,000 or so queries per second. Throw in actions between players and between multiple players (area of effect attacks), add another 2,000 monsters, and you're easily over 250,000 transactions per second.
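
The back-of-envelope numbers above can be checked with a few lines. The number of selects per entity per frame is an assumption: with exactly one select per player per frame, 1,000 players at 30 fps comes to 30,000 queries per second, and the quoted figure of roughly 60,000 corresponds to two selects (say, one player select and one world select).

```python
def queries_per_second(entities, fps, ops_per_entity_per_frame):
    """Simple product: entities x frames per second x operations per frame."""
    return entities * fps * ops_per_entity_per_frame


one_select = queries_per_second(1_000, 30, 1)    # one select per player per frame
two_selects = queries_per_second(1_000, 30, 2)   # assumed: player + world select

# Adding 2,000 monsters gives 3,000 entities to track per frame.
with_monsters = queries_per_second(1_000 + 2_000, 30, 1)

# Operations per entity per frame needed to reach 250,000 per second.
ops_needed = 250_000 / (3_000 * 30)

print(one_select, two_selects, with_monsters, round(ops_needed, 2))
```

So 250,000 transactions per second works out to a bit under three operations per entity per frame across 3,000 entities, which seems plausible once attacks and area effects are added.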

Since FoundationDB was talking millions of transactions per second with full ACID it seemed like it would be a useful back end for this. Or put another way, you could contemplate building this and see if their product was actually able to keep up at this rate. And they asserted this was on modest hardware (like 32 nodes).

So yes, I was looking forward to the interesting places that investigation would lead and the things I could learn.


I sent you that mail, please check your spambox since google has an alarmingly high rate of false positives these days, even with people that you've already exchanged email with in the past they'll happily bin messages. Highly annoying.


I think I have an interesting take on this that I'm not sure has been used in a production game yet. I'll send you an email.


Reminds me a little of Apple's buyout of the compositing software Shake from Nothing Real and their brutal "end of life" process.

http://en.wikipedia.org/wiki/Shake_%28software%29

"Existing maintenance program subscribers had the option to license the Shake source code for $50,000 USD."


What is needed is a middle ground between OSS and proprietary licensing. Towards this end we are attempting to define a new software license that is organized around the idea of a software cooperative. This provides the basis for a sustainable business model that OSS, frankly, does not have.

This also helps solve the larger problem of concentration of wealth/power within large corporations as the cooperatives would be distributing the wealth to all participants and not just to founders.

It also provides a better path for startups so they don't have to borrow VC money and can get the help of the community to move them toward profitability in the long term rather than taking the first available bag of money from disinterested investors who only want a return on "their" money.

Cooperatives are among the most stable forms of enterprise and are run via direct democracy (at least in our view.)

To discuss this further, join our mailing list here:

groups.google.com/d/forum/coopsource


Now, what I do not understand is why Apple would wish to make an over-the-top TV set or whatever. I'd say that they have a loyal userbase (if it only consisted of the Mac-using coffeeshop-goers, the business would probably still survive) and a sustainable business model, and they are not going in a negative direction economically. Why add another arm to your business?

I also wonder if it is lawful to remove an open-source project from public access overnight when people outside the company might have contributed to it. I have not used this product before, and I know neither its licence nor the number or existence of its contributors, but if I had made a significant contribution, I would be very annoyed in this case and would have sued, if possible, Apple or whoever was responsible for publishing the sources until the acquisition.


How did it compare to ActorDB [0]?

[0] http://www.actordb.com/


Congrats to the FoundationDB team from us at Aerospike. We respect what FoundationDB was doing with NoSQL and ACID and believe this validates the importance of reliability and enterprise-readiness in the NoSQL market.

I'd love to hear from anyone who was actually using FoundationDB in development or production, and what other NoSQL open source alternatives you are considering. pcorless@aerospike.com ; http://goo.gl/KVQxyq


The company I'm working for is looking for a way to scale our DB-layer. Anyway, FoundationDB was more or less the only candidate against MySQL.

Does anyone know of any good (and proven) alternatives to FoundationDB?


Consider MySQL compatible ActorDB


Interesting. I was checking out FDB earlier this month for solving some interesting scalability challenges. I'm quite glad I hadn't decided to re-architect a bunch of things based on it!

Did any of you get bitten?



I feel really bad for all other database startups out there. I'm sure this is sending a negative shockwave through every devops team out there to not use up-and-coming databases.


They removed the npm package a few hours before our very important deployment. As a result, our client didn't get a feature and we didn't get our money. Die, FoundationDB, die!


Funny how everyone accepts this as fact despite techcrunch printing it. The only facts available are that the repos and downloads have been pulled.


community.foundationdb.com is a little more informative ... not much but a little.

"Thank you for your support of FoundationDB over the last five years. We’re grateful to have shared our vision of building the best database software and we strongly value your participation in this community. We have made the decision to evolve our company mission and, as of today, we will no longer offer downloads. If you have any technical questions, please email info@foundationdb.com."


I don't want to believe it, but I'm starting to think Apple is the new Microsoft / Oracle.


What's so surprising? Apple has been trying to be a monopoly for more than three decades. Just because they have consistently failed doesn't make them less evil than companies that succeeded at it.


Actually nothing is surprising about it IMO. But I always get downvotes when I appear to be bashing Apple too hard :)


And this is why my Chef broke today.


Very well said!


I don't think people should have downvoted you for this, but you've also taken this thread way off-topic, and not in the unpredictable satisfying way.

(I've detached this subthread and flagged it as off-topic.)


What? I can't do more than just silently upvote if I have nothing to add? The "karma" here means nothing when you think about it - you lose karma in the real world with your purposeless negation.


> The "karma" here means nothing when you think about it

it affects sorting IIRC.

And in the case of your comment it makes it stand out as less readable as a sign for anyone who hasn't picked up the house rules yet. : )


It surely does affect sorting or silencing unpopular voices, but unless it's a top-level comment, it doesn't really matter. Plus, most of the people one should care about when making the effort to post a comment read at least all the top comments and scan the rest. Second-level and deeper comments usually get lost unless they are attached to a "popular" top comment or the discussion is small. Also, if you have an unpopular opinion, you should be scared to reply to a reply, as the same people who downvoted your first comment will invest their time in downvoting the reply as well. The commenting system here is prehistoric and lacks a basic understanding of psychology. It encourages ass kissers, not people with unique views and unpopular opinions who can actually add some additional perspectives to the otherwise monotonous high-fiving.


You can't downvote a reply to your own comment. Also, if your comment is dead, it's hidden by default, which removes the noise from the thread.


Never said you can downvote a reply, but this gives more privileges to the silent downvoters, as they keep downvoting while waiting for others to do the work. I know this because in most cases my unpopular comments get the same negative score (well, usually -4, to be fair). I've seen some cowards who delete their comments when they start getting downvotes, which breaks the discussion as well and makes the replies out-of-context.

How can a -4 comment be "noise" if there are tons of replies? I often have a root comment with -4 and then tens and at times hundreds of comments underneath?


The comment regime on HN is broken. Down voting is anonymous & carries zero social cost. The fact of a homogenous karma space is a contributing factor: user x /may/ have gained karma points based on input on subject matters that have nothing to do with the topic at hand.

And I agree with you regarding the real world karma consequences of this for HN :)


HN is very important, but it is also greatly neglected. I'm not sure if the goal is not to compete with some of the startups or if Paul Graham has some future plans about it. Unfortunately, due to loyalty, people keep sticking in here even if better alternatives pop up from time to time. Some of those being Lobsters [0] and Monocle [1].

[0] https://lobste.rs/ [1] http://monocle.io/


I agree, what a bunch of horseshit.


I think the problems you describe are with people (so therefore should be dealt with through community standards etc.), not the karma system. How would you improve it?


The karma system is a crowd-sourced, pseudo-automated, and greatly imperfect system to enforce community standards. It's not even using the "wisdom of the crowds", as it's mixing different things into a single meaningless number. It has some completely random ranges that give you different privileges (read "lift restrictions").

I haven't really thought about this, but here are some changes that I would borrow from here and there with some that I haven't seen elsewhere:

- a downvote costs you something (a karma point or half a point) ala StackExchange, which makes you think twice before you're urged to punish;

- categorize votes ala BuzzFeed reactions instead of mindlessly up-/downvoting;

- your karma gets a share of the collective karma of your reply tree - those who start vivid discussions should be rewarded;

- reward with own karma - if you really like somebody's comment, why not donate some of your own karma and reward quality comments with more than 1 point;

- there's no point to keep punishing somebody for their multiple replies - this silences voices; you should be able to pick and rate the overall participation of somebody in the thread, not punish them multiple times for each attempt for them to convey the same thing.


I don't even see a downvote button. When I first came here I thought: nice, you can only upvote here, so the community must be nice. Got downvoted pretty quickly. What does a green username even mean?


From some discussion earlier today, you must have a collective score of 500 upvotes before you have the ability to downvote. I would guess that is to enforce a period where new accounts can get used to the generally accepted standards for discussion here.


Thanks. I had 7 but now I have 5... lol


It means you're "green", in the sense of young or inexperienced - i.e. a relatively new account.


Huh. Cool. I just kind of figured I'd stumble across an answer to that one and here I am. Makes good sense too. Thanks.

Hey, here's a question, why does the comment reply link sometimes not render until I refresh the page? Are they on a timer or something?


Yes, if a thread is gaining posts too quickly, HN will suppress reply buttons to try and prevent people from making hasty/not well thought out posts.


yet another argument for rethinking the open source model.


yet more downvotes, with no reasoning, on a relevant comment. hacker news is like that significant other that abuses you and you keep going back to. i can't keep making excuses for you hacker news.


I believe you were downvoted because you tried to inject your agenda even though this has nothing to do with Open Source.

FoundationDB was closed source; they just allowed people to use their product for free in small deployments.


I believe you don't know what you're talking about. Forbes thought it was an open source issue. I don't think many would say Forbes has a radical open source "agenda". http://www.forbes.com/sites/benkepes/2015/03/25/a-cautionary... EDIT: just to clarify, i'm not saying Forbes said FDB is open source. i'm pointing to the fact that someone who writes for a national business magazine about open source issues thought it was an open source "issue". if persons working in open source can hope to make a living, counting business as a stakeholder is a good idea.


Are you for real? Are you telling me that someone's blog on a business magazine has a greater knowledge of technology than someone who's actively involved in it? If you have a medical condition, do you also read the health industry section in the Wall Street Journal?

By the way, if you did not see it there was an update posted that states that FDB was closed source. It also doesn't matter that some components were open source or not, because without FDB they are useless either way.


Hope app store can be faster



