In the coming weeks we will hopefully answer the questions people have and will be calling on the community to help forge the initial RIAK Roadmap.
One of the initial questions we have for the community is which OSS license would people like applied to the code? Our thought is the most open and permissive.
There are companies that refuse to use GPLed code. They want to modify and then sell closed forks of your product, without paying you, and by doing so they fragment the userbase. GPL prevents this.
If you want to allow them to do so you can always switch to BSD or dual-license - perhaps after a private agreement and if you find the companies reputable.
Will this include the Basho technical documents? Much of the better technical documentation lives on basho.com and was returning 404s.
[edit: and a +1 for Apache V2 (or MPL)]
As far as licensing goes, it would be great to have one that would allow me to link the client code libraries and run the servers without having to worry about anything. If I understand correctly, Apache2 goes way beyond that, while GPL is somewhat ambiguous on that. (Yes, one could implement their own client, but that'd be reinventing the wheel).
Thank you for keeping this work alive!
Apache 2 or MIT would be the most permissive.
Strange world we are living in ...
For libraries, yes, absolutely, companies should rightly be cautious, because it's relatively easy to expose yourself to legal risk.
Nobody is "going away with the work of others" – the whole point of a license is so contributors decide how others can use their work. What exactly is the issue here?
Apache 2.0 is the standard for almost every company I've dealt with, with MIT and BSD 3 clause being accepted as well. But Apache 2 seems to make most lawyers happy.
I'd speculate that for many companies the GPLv3 is too strong in its copy-left stance, by disallowing Tivoization.
The distinction between the two texts is in the case of patent agreements.
GPL is a great license that ensures that improvements to the product are shared back with the community. In fact, for a database like Riak, Affero GPL, like the MongoDB license, may even be better, as organizations that make improvements to Riak and don't redistribute it but run it internally on their servers would also be required to contribute back their modifications.
Two things wrong with this right off the bat. First, not all of these are GPLv3, which is the version with the problematic patent clauses that scare companies away. Second, it is not just in libraries that the GPLv3 is generally accepted to pose potential issues. You need look no further than Apple and the great investment they went through to avoid ever shipping GPLv3 software. The GPL kicks in as soon as you distribute the software.
So, if Linux were GPLv3 you're saying that most companies wouldn't use it?
> You need look no further than Apple and the great investment they went through to avoid ever shipping GPLv3 software.
Most companies aren't like Apple.
> The GPL kicks in as soon as you distribute the software.
That's right, and that's a good thing. If you distribute GPL software that you modify, you need to make your modifications public. But as Riak is not a library, it does not infect any other component. If you distribute software with (say, Android) Linux but don't modify Linux then the GPL doesn't infect the other components of your software, as they're not actually linked (or Linux has some explicit exceptions).
What about the client libraries? Do you consider them to be a part of the product? It might make sense to release the client code (library) and the server code under different licenses (Apache2/GPL, respectively) if GPL is to be used.
But that doesn't mean it would work for most projects.
My favorite story about that migration is that four months into the project, we were moving along nicely and ready to start cutting over to Riak after the Christmas holiday. My office had decided to go out to lunch our last work day before everyone headed off for the holidays, and I received a call from a SDB support team member informing me that we either needed to move off their platform by Christmas Day or they'd have to shut us off (we'd start degrading other customers). Fortunately we were nearly ready to pull off the cutover, so we quite literally cut all of our traffic over on Christmas Eve (one of our busiest days of the year).
Over the upcoming years, our database continued to grow and grow, and all the while Riak trucked along. It wasn't always a smooth road, and we certainly had our challenges from time to time, but I have yet to hear of anyone who has used a massive 100TB hot database and not had to do work and maintenance.
Throughout the years I used Riak, it served me very well. I'm grateful to the innovative work so many at Basho created during their tenure, and I'm glad to see Bet365 attempt to steward the project to a new phase of its life. If I could toast every last Basho employee, I would. Thank you all!
Wat?! That's an excellent reason to burn fields and salt them.
> Amazon SimpleDB is designed to integrate easily with other AWS services such as Amazon S3 and EC2, providing the infrastructure for creating web-scale applications.
If I hadn't heard your story, I wouldn't see anything there to steer me away from it.
> While SimpleDB has scaling limitations, it may be a good fit for smaller workloads that require query flexibility.
> Within a datacenter, the Mean Time To Failure (MTTF) for a network switch is one to two orders of magnitude higher than servers, depending on the quality of the switch.
A switch is highly unlikely to fail. They seem to be bulletproof. But having worked with a datacenter (on the engineering team of an early AWS competitor), switch _misconfiguration_ was all too common. Maybe a tech accidentally plugs in the wrong ethernet cable and forms a switching loop. Maybe someone fat-fingers a tag and a broken VLAN gets automatically deployed to 10,000 nodes. Either way, the _switch_ is alive, well, and pushing packets - but they're the _wrong_ packets and the result is indistinguishable from hardware failure to the end user.
At datacenter scales, these things happen... not infrequently. If you engineer your database to expect that netsplits are rare, you're going to have a bad time.
EDIT: I don't mean to disparage it, just that it doesn't come as purely from one direction as Riak. It certainly appears to have won
- They raised $60mil (the last round being a Series G for $25mil!)
- They're being sold for what I can only think is a pittance / acquihire by Bet365
- Bet365 is buying them because they're heavily invested in Riak internally
- They're open sourcing the code (likely because they need help)
I'm not trying to disparage anyone here, but this makes my blood run cold.
Is Riak so critical to Bet365 that the right move was to _buy the company_ versus switching to a different storage system?
Is there a NoSQL vendor that's doing well out there?
Should we all just be using Postgres anyway?
Unlikely. I genuinely believe Martin Davies is doing it as a service to the Erlang community. bet365 doesn't need to open source anything. They've got deep enough pockets to hire developers to maintain Riak internally for the foreseeable future.
> Is Riak so critical to Bet365 that the right move was to _buy the company_ versus switching to a different storage system?
Yes. There isn't a storage system readily available which offers the same capabilities as Riak. At the scale at which bet365 operates, switching in a gradual fashion takes years. By capabilities I mean the ability to make different tradeoffs in different use cases. Riak has a nice set of levers to trade off consistency for availability. Its built-in support for CRDTs is quite amazing.
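For anyone who hasn't run into CRDTs: here is a minimal sketch of the idea, using a grow-only counter (G-Counter). This is a generic textbook illustration, not Riak's implementation, and the class and replica names are made up for the example. Each replica increments only its own slot, and merging takes the per-replica maximum, so replicas that accepted writes during a partition converge no matter what order they sync in.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter CRDT (illustrative; not Riak's actual data type)."""
    # Maps a replica id to the number of increments that replica has accepted.
    counts: dict = field(default_factory=dict)

    def increment(self, replica_id: str, amount: int = 1) -> None:
        # A G-Counter can only grow; decrements would break the max-based merge.
        if amount < 0:
            raise ValueError("G-Counter cannot be decremented")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self) -> int:
        # The logical value is the sum over all replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Element-wise max is commutative, associative, and idempotent,
        # which is what lets replicas converge without coordination.
        merged = dict(self.counts)
        for replica_id, n in other.counts.items():
            merged[replica_id] = max(merged.get(replica_id, 0), n)
        return GCounter(merged)


# Two replicas (hypothetical names) accept writes independently, then sync.
a = GCounter()
b = GCounter()
a.increment("node_a", 2)
b.increment("node_b", 3)

assert a.merge(b).value() == 5
assert a.merge(b).counts == b.merge(a).counts  # merge order does not matter
assert a.merge(a).value() == a.value()         # merging twice changes nothing
```

The point is that both sides of a netsplit can keep accepting increments and still agree once they can talk again, which is the availability-leaning end of the tradeoff.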
Quite possibly, yes. Data has incredible inertia. Not only because of the storage system used, but also all of the years of ops tooling, application integration, etc. that surrounds it. I don't know anything about Bet 365, but if they had any non-trivial amount of their business data locked up in Riak then from a cost/risk analysis they may have decided that purchasing the company was the safest and cheapest option to protect their own company.
Plus a lot of talented people jumped ship before it sank on account of the management team.
How much of a halo do you get from writing a nice post about moving to Postgres? How much of a halo do you get from buying and open sourcing something cool?
Open sourcing the software will surely be a good thing, though. I'm eager to see where it leads.
(Admittedly, our use of Riak was clearly a case of premature optimization: we didn't have anything like the scale to require an available data-store.)
And, although clearly we're swimming in more data than ever, not all that many people have both massive data sets and a burning need to never lose any of it. A lot of data can be lossy without any real consequence.
However, the Dynamo design is just borderline unpalatable from a developer's perspective. I enjoy being aware of exactly the trade-offs being made, and Basho was always really open about it, but I don't blame anyone for not wanting to dig deep in the weeds to make it work for fairly common use cases.
(Disclaimer: former Basho technical evangelist & engineer.)
Suggestion for something much more meaty: https://youtu.be/3SWSw3mKApM
This talk on LVars by Lindsey Kuper blew my mind https://www.youtube.com/watch?v=8dFO5Ir0xqY
There's a fair number of RICON videos on Vimeo too. One of my all-time favorites from Joe Hellerstein: https://vimeo.com/53904989
Some time later, even the guy who interviewed me was out from Basho.
Idk, the whole process just felt very weird.
I have a client with a support contract and while I don't know for certain that they didn't contact someone within the organization, I know that most people, including myself, found out only when someone asked if we could contact Basho support and they were informed that it wasn't there anymore.
Reminded me of a usenet post from years ago but can't find it now, anyone know which one I mean?
While looking for said usenet post I found something else though, here: https://www.slashdot.org/story/18304
Glad to hear their code will be open-sourced.
Not too much Open Source code released by them. Mainly some forks and two RPC libs.
For me, their restriction on usage in cluster mode, very limited language support, and lack of file storage abilities like Mongo's GridFS made it unattractive.
Eventually Mongo seemed a good choice because it was completely free for any kind of use, with amazing features and lots of libraries. I am glad I made that decision :)