Not to take away from some of the great projects featured on HN recently, but statements by Simperium like this make me much happier as a developer and business owner: "We believe the best apps have both a great user experience and unique backend services."
It is not a backend-as-a-service. We believe the best apps have both a great user experience and unique backend services. Our focus is to provide a great data layer between your frontend and backend while integrating with other providers of tools, hosting, and services.
That said, the data needs to be on the cloud to be useful. So...what to do?
My sense is that a dedicated, trustworthy company needs to store the data and only the data. In particular, they need to not be in any other business, e.g. advertising, or selling stuff to developers, etc.
They need to be owned by the people entrusting them with their data, full stop. That means they have to charge for it.
BTW, I wrote code to do this very thing back in 2009, called HubSync. It's great to see all of these other companies making a go at it.
It's sitting in Simperium's database and you're locked into their platform once you decide to use it. Maybe there's an export, but that's irrelevant because all your apps in the wild and all your users are relying on this service to move data around.
If Simperium gets hacked, you get hacked. If Simperium goes down, you go down. If AWS goes down, Simperium goes down, and then you go down.
Say your app gets huge, and Simperium can't scale (and really, "half a million users" is nothing for a free service; that's not any meaningful scale and not a number that can really be cited as a signifier of ability to scale) -- what happens then?
Say you outgrow Simperium, and it's now costing you too much. What happens then?
The above isn't really about Simperium, but really about all backend-as-a-service services. They're great for starting out, but if you actually have a successful product you're going to need to be running your own backend at some point.
If you get too big/successful/etc and things start dying and you end up like Friendster before Facebook decides to buy your company, you're going to look back and wish you built your own backend and knew what was going on behind the scenes.
I went through all of these business decisions for HubSync in late 2009, and came to the conclusion that you had to do the following to make "owning your own data" a reality:
1. Cannot charge developers or earn money on the actual data itself in any way. (Google will not be competing in this space.)
2. MUST, MUST, MUST allow removing any data from the cloud AND reloading it later, without losing any features/abilities.
3. USERS need to pay for cloud sync storage/service, NOT app developers.
4. Code must be open source, visible, and peer reviewed, just like a crypto algorithm.
If Simperium will do all of these, IMO they can raise money, and achieve a Dropbox-style exit. There's no doubt the potential is there.
2 is the most important. If you cannot remove your data from Simperium's backend and then reload it again later, either to their own backend, or a competitor, you have vendor lock in of the data, and you don't own your data.
3 means you have to do something like Dropbox, where you give away a certain amount of storage/sync for free. Ultimately, this is why I didn't pursue HubSync further, even though the products are nearly identical (even down to syncing CoreData on iOS with a web app running in the browser).
(In a nutshell, I didn't want to get funding and devote years of my life to data sync, when I'm in the middle of working on building a live action Pixar. But it's a great idea, and I really hope Simperium or someone else makes it happen.)
This extra degree of control, though not absolute, lets you develop your own solution (or integrate someone else's) in parallel should you stop using Simperium.
Our libraries are designed to have a small footprint; they're easy to both add and remove. And we plan to open source them in the future.
Beyond that, you cite valid concerns for any service.
2. You can write server-side logic
3. Operational Transformation baked in
Seriously I cannot find anything bad about this project. (I really wanted to.)
1. Is the server side written in Python?
(I tried this with my own library and I think it was a wrong direction -- it only made it more complicated than it should be)
2. We've experimented with it, we'll likely add it as an option in the future since there may be security concerns depending on the app.
1. peer-to-peer sync (no intermediate server)
2. cloud server isn't passive and stateless
No. 2 is very important for scalability. 1 is mostly a nice to have, especially on the WAN, where you can sync directly from, e.g. an iPad to your iPhone.
Both are doable (I did both in HubSync back in 2009).
iOS / Objective-C: https://github.com/couchbaselabs/TouchDB-iOS
Simperium team: can you give any guidance on when you'll announce pricing, or roughly what pricing is going to be?
What do you think about Urban Airship's model based on active monthly users? What we like is that the costs are obvious and map clearly to your business.
You guys are a hosting/infrastructure service, and it's probably for good reason that such services have historically charged based on usage. For you that could be something fairly raw like "GB transferred, GB stored" or something a little more abstract like "pushes" or "versions".
You do have control over the granularity of your updates. For example, we spoke to a game developer who would want to disable Simperium while a game is being played, and then enable it again at the end of levels. These coarse changes are supported since they'll resolve automatically when the client comes back online.
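One way to picture coarse-grained updates is a small buffer that collects local changes while sync is paused and sends only the latest value per key once it resumes. This is a rough Python sketch of the idea, not Simperium's client API; `CoalescingSync`, `pause`, `resume`, `set`, and the `send` callback are all hypothetical names.

```python
from typing import Any, Callable, Dict


class CoalescingSync:
    """Buffers changes while paused; flushes one merged update on resume."""

    def __init__(self, send: Callable[[Dict[str, Any]], None]) -> None:
        self._send = send              # hypothetical transport callback
        self._paused = False
        self._pending: Dict[str, Any] = {}

    def pause(self) -> None:
        self._paused = True

    def resume(self) -> None:
        self._paused = False
        if self._pending:
            # Hand off the batch and start with a fresh, empty buffer.
            batch, self._pending = self._pending, {}
            self._send(batch)

    def set(self, key: str, value: Any) -> None:
        if self._paused:
            self._pending[key] = value  # later writes overwrite earlier ones
        else:
            self._send({key: value})


sent = []
sync = CoalescingSync(sent.append)
sync.pause()                  # level starts: stop syncing
for score in (10, 20, 30):
    sync.set("score", score)
sync.set("lives", 2)
sync.resume()                 # level ends: a single update goes out
```

Here the three intermediate scores never leave the device; only the final state of each key is sent when the level ends.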
A nice product, yes, but I wouldn't expect this to change what developers are going to be able to do.
The key here is (I believe) operational transformation (http://en.wikipedia.org/wiki/Operational_transformation), the algorithm behind products like Etherpad and Google Wave.
The observer pattern only handles the case where you have one client accessing the datastore. OT expands those ideas to handling multiple clients. Sync is a really hard problem, which is why almost nobody can get it right. Simperium seems to come closer than most and is packaged in a nice service.
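To make the OT idea concrete, here is a minimal Python sketch that transforms two concurrent text insertions so that both clients end up with the same document. It illustrates the general technique only; the `Insert` type and the `apply_op` and `transform` helpers are hypothetical names, and nothing here is claimed to be Simperium's actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text goes
    text: str   # text to insert
    site: int   # client id, used only to break ties deterministically


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `other` was already applied."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        # `other` landed at or before our position, so shift right.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


base = "sync"
a = Insert(0, "live ", site=1)   # client 1 types at the start
b = Insert(4, "!", site=2)       # client 2 types at the end, concurrently

# Each client applies its own op first, then the transformed remote op.
client1 = apply_op(apply_op(base, a), transform(b, a))
client2 = apply_op(apply_op(base, b), transform(a, b))
```

Both clients converge on the same text even though they applied the two edits in opposite orders. Deletes, and more than two clients, are where it gets genuinely hard.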
It's really exciting to imagine the possibilities.
Fair enough that it is multiple observer context and I didn't spell it out. It's essentially locking in an OT consistency model/control algo. I'll give you that sync is a really hard problem, but Etherpad, Wave, Google Docs, etc. are just a few examples of some off the shelf solutions (the Wikipedia page details more and doesn't even cover the whole set of what is out there, particularly if you consider version control systems). The big commercial success for generic document sync was probably Lotus Notes, and they made the sync solution a separate product.
Again, not knocking the product, it looks like a quality solution and one I might even recommend, but I'm not sure I grok how one sees this as opening up a new set of possibilities.
That leaves room for improvement, including some low hanging fruit like eventually giving you the ability to disable versioning for certain kinds of data (like multiplayer updates).
BTW, this is generally really awesome: I am (right now, as in I'm sitting there right now ;P) helping teach a class on cloud computing at UCSB that happens to currently be discussing database synchronization and replication; after spending a bunch of time today discussing "how PostgreSQL is implemented and the basis of different isolation levels in the SQL standard and in MVCC" I took the time to tell everyone about Simperium (which probably makes more sense if I mention that I've looked into building something similar before for my projects; I'm glad someone else finally seems to be coming at it from the correct mindset). Everyone here seems to agree: this is going in a great direction.
Perhaps it's too late to predict that 2012 will be the year of realtime interaction, but between this, Meteor, Firebase, etc., not only are all the tools converging in that direction, but they all appear to be drop-dead easy to use.
Thank you for this. Is there an IRC channel or Google Group for questions?
You can also reach us at firstname.lastname@example.org.
I'm amazed that we're on the same wavelength - we've had to build very similar infrastructure for ourselves for Unipost (www.post.fm). Can't believe we didn't collaborate on this, we'd happily be your first customer :(
A few interesting differences:
- Our approach is more like Meteor - web only, no iOS support
- The backend is a python tornado app that handles validation and conflict resolution before saving stuff to dynamodb
- We have a JS datastore backed by websql/indexeddb/memory that syncs with our backend datastore
- We have "live" Backbone collections that update themselves when datastore queries return different results
- We have a Backbone sync adapter that uses the datastore to persist data locally and kick off synchronization
- We sync a subset of the data (eg 3 months of mail) - that's a core requirement for us
- We sync all of the tables at once, not per bucket, cause queries are joining tables so the datastore has to be consistent at all times
- No operational transforms cause it doesn't seem to apply to us - pretty "notepad" specific I think
- No versioning as we didn't see benefits for us
- We'll probably open source this stuff when we're done
What do you use for storage?
Right, we hear that quite a bit. A few comments:
- Here's our Backbone sync adapter:
- Dealing with subsets is a priority for us
- OT and versioning are generally helpful for managing changes/deltas
- We're using MongoDB for storage
- PouchDB: https://github.com/daleharvey/pouchdb
- Backbone PouchDB Adapter: https://github.com/jo/backbone-pouchdb
But the Map-Reduce paradigm didn't seem to fit well onto our data model.
Whereas CouchDB does master-master replication among instances of itself, Simperium can accomplish something similar with any database: e.g. sqlite on iOS, to MongoDB in our backend, and to whatever database you use in your own backend.
That's probably cause you're used to writing native iOS apps, but we're attempting a web app that works via phonegap across all devices from one source code base.
So for us storing all the data in Backbone collections isn't an option - models add too much memory overhead, versus native JSON objects.
As I understand it, multiple collections update independently of each other? How do you deal with relational data in that case?
You can model relational data in Backbone quite effectively, or use another library (or your own). For example: https://github.com/PaulUithol/Backbone-relational/
The data syncing is by far the priority
The biggest difference is that since we are not a general platform, we can make assumptions about the model and how we version each release, and we can build in some constraints and unique security models.
We took a lot of cues from the OData spec and Microsoft's reference design.
The rest of this comment is mostly targeted at the creators of Simperium.
You made very similar design decisions to us in a lot of ways. I see a lot of problems that you will face on your path here, though, so I have a few tips for you.
* It sucks the iOS client isn't open source. I get scared of linking in third party libs into iOS projects because I have to account for anything you do when I go to Apple to submit my app.
* You really got to brush up on the objective-c naming conventions. Not to be harsh but `-(id)getCustomObjectForKey:(NSString * )key;` makes me cringe.
* Don't require me to have to know about your categories.
* Namespace your categories so you don't smash mine (something like "SP_encodeBase64WithString" instead of "encodeBase64WithString")
* If you include third party libs, you MUST rename them and prefix with your prefix. I see you use ASIHTTPRequest, DDLogger, SocketIoClient, AsyncSocket, Reachability, and a few others. You will smash everyone else's implementations if they already had them included.
* Don't use ASIHTTPRequest internally (it's old and unmaintained and doesn't play nicely with ARC)
* PREFIX ALL YOUR CODE. We don't have any real namespacing in Objective-C. As a framework developer, you have to be aware of that more than anyone else. I shouldn't be seeing DiffMatchPatch and SocketIoClient show up in my symbol list after linking your lib
* Your addDelegate/removeDelegate is funny. After you exhausted the need for one delegate, switch to NSNotificationCenter.
* DON'T USE XIB/NIBs. Interface Builder for iOS was an afterthought and it's only a 90% solution (unlike with Cocoa, where Interface Builder was a first class product built side by side with AppKit). Especially don't make me have to include your XIBs in my bundle. At the very least give me a bundle with it in it.
* Separate your GIT repo. If I want to include your library as a submodule I have to take all the client libraries as well.
Now when it comes to the actual sync layer and how you generate JSON dictionaries and apply "patches" this is fine code.
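For anyone who hasn't seen this kind of sync layer, the general idea of diffing JSON dictionaries and applying patches can be sketched roughly like this in Python. The `diff` and `patch` functions and the `set`/`remove` patch format are made up for illustration; they are not Simperium's wire format or the code in its iOS library.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel: distinguishes "key absent" from a None value


def diff(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Return the per-key operations that turn `old` into `new`."""
    ops: Dict[str, Dict[str, Any]] = {}
    for key in old.keys() | new.keys():
        before = old.get(key, _MISSING)
        after = new.get(key, _MISSING)
        if after is _MISSING:
            ops[key] = {"op": "remove"}
        elif before is _MISSING or before != after:
            ops[key] = {"op": "set", "value": after}
    return ops


def patch(obj: Dict[str, Any], ops: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Apply operations produced by `diff` to a copy of `obj`."""
    result = dict(obj)
    for key, change in ops.items():
        if change["op"] == "remove":
            result.pop(key, None)
        else:
            result[key] = change["value"]
    return result


old = {"title": "Groceries", "done": False, "tag": "home"}
new = {"title": "Groceries", "done": True, "note": "buy milk"}
delta = diff(old, new)   # only the changed keys would travel over the wire
```

Unchanged fields such as `title` never appear in `delta`, which is the point: the receiver rebuilds the new object from its own copy plus the patch.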
Here are some feature requests:
* Instead of having to give you a single NSManagedObjectContext let us register them. We have a few (some use different concurrency types).
* Let us override what gets generated or if we want to ignore a field with userInfo keys in the core data model.
* Let us get an idea of your backend sync processes to be able to suspend and start them when we need and know if anything is pending. Give us a callback that you still have things queued and when we are done so we can at our leisure set up UIBackgroundTasks on iOS 4+.
On an unrelated note, why not create this as an NSIncrementalStore and just put your code behind the persistent store coordinator instead of monitoring it? We are doing the same as you at the high level because we wrote our code pre iOS 5.0 but iOS 5 gives you an awesome new toy there.
A great quote!
There's a lot of good feedback here, particularly regarding playing more nicely with other code. We'll do a pass, thanks.
* You can add overrides in your model's userInfo but this isn't documented yet. We'll do that after cleaning up the naming a bit more.
* The next major release uses NSNotificationCenter for the reason you mentioned.
* We chose a single repo for now since our samples tend to span languages/devices. We'll revisit this eventually. Your point about submodules is a good one.
* We're not currently using NSIncrementalStore for the same reason as you (needed < iOS 5 support).
Renaming seems like an ugly solution.
Objective-C is dynamic and the last implementation wins if they are named the same so it's possible for third party libs to suddenly smash your implementation.
When storing data in services provided and managed by other companies, it becomes more difficult to be certain that the risk tradeoffs are reasonable, as I'm in essence trusting that you aren't just storing the data on a single server in RAM on memcache or something ;P. Some more details would thereby be much appreciated.
(I understand, btw, that I can also have a server keeping a mirror of all of my data, which is definitely awesome, but I also would then want to get a better understanding of whether I would be a fool not to be doing that, or whether I can feel comfortable with having data stored on your servers ;P.)
I have a few questions:
- As a potential customer, I'm curious as to how sustainable the company is. I understand there was a YC seed investment in '10, but is the company well funded for the next couple of years at least?
- The platform seems to be ahead of Meteor, Firebase, etc, in that it already seems to have implemented a basic login and security model based on expiring tokens, but from the docs it seems like "finer grained control of these permissions are under development". Does this mean that presently any user can erase/modify data from any bucket, such as global data, data from other users, etc? If so, that's a big deterrent for me.
- Are there any plans to allow querying the data? Key-value stores are fine for simple games, to-do lists, etc, but for any non-trivial app querying arbitrary fields is a basic requirement.
Thanks again, those are some great strides in the right direction.
An example of where read-only permissions are useful is the live dashboard you see at simperium.com after you sign in. The "number of syncs" and alerts at the top are all pulled live from Simperium, but the token used on that page is a read-only token. We just need to expose the ability to create these read-only tokens to developers.
Actually, as a Simplenote user, you might be interested to know that our alerts and blog posts are pulled from Simplenote via Simperium. When we tag a note as "Alert" or "Published" it instantly appears on the dashboard.
Regarding querying, we're working on something for apps that can't or don't want to keep all data locally. In the meantime you can locally query however you'd like in your database of choice.
There is currently limited support for keeping a subset of data synchronized (we have plans to improve this) - if you're using the client libraries in an app, we've focused on supporting the common case of keeping all the data per bucket for a user synced. In the case of keeping all data mirrored to a backend, we provide an endpoint per bucket that you can listen to all changes for all users so you can keep the entire data store synchronized.
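As a rough picture of the backend-mirroring case, the sketch below applies a stream of per-bucket changes to a local copy keyed by user and object id. The change record shape (`user`, `id`, `op`, `data`) is an assumption made for illustration, not the documented format of the Simperium endpoint, and a real mirror would read changes from the network rather than from a list.

```python
from typing import Any, Dict, Iterable, Tuple

# Local mirror of one bucket: (user, object id) -> object fields.
Mirror = Dict[Tuple[str, str], Dict[str, Any]]


def apply_changes(mirror: Mirror, changes: Iterable[Dict[str, Any]]) -> Mirror:
    """Apply hypothetical change records, in order, to the mirror."""
    for change in changes:
        key = (change["user"], change["id"])
        if change["op"] == "delete":
            mirror.pop(key, None)
        else:
            # "upsert": create the object if needed, then merge changed fields.
            mirror.setdefault(key, {}).update(change["data"])
    return mirror


feed = [
    {"user": "alice", "id": "note1", "op": "upsert", "data": {"text": "hi"}},
    {"user": "bob", "id": "note1", "op": "upsert", "data": {"text": "yo"}},
    {"user": "alice", "id": "note1", "op": "upsert", "data": {"pinned": True}},
    {"user": "bob", "id": "note1", "op": "delete"},
]
mirror = apply_changes({}, feed)
```

Keying on the user as well as the object id keeps two users' objects with the same id from colliding in the mirror.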
With keeping the objects on separate timelines, do you or can you run into any problems with CoreData inverse relationships? I'd be concerned that my objects would have issues where a change got synced for object A that had it linking to object B, but somehow my application crashed before it got the synchronization for object B sent. I can imagine various reasons why this concern is stupid, including "saurik doesn't understand how CoreData works" (as I only even learned of its existence a couple days ago).
Core Data relationships (including their inverses) that come over the wire can be lazily resolved by Simperium if necessary.
Question that may be a bit off-topic but one I'd really like to know: what editor are you using there when you're editing the python service? Thanks and congratulations on your launch.
The editor is Sublime Text 2: http://www.sublimetext.com/2
Releasing more libraries is a priority.
For auth, we'll be adding cross-origin support for https://auth.simperium.com. Up till now we've been focused on supporting apps with existing backends, which generally use the HTTP API to generate auth tokens from their server.
Clients will be iOS and Mac OS X only, using Core Data.
I'd much much rather use something that's a hybrid of CouchDB and Unhosted...
It's definitely the future to build syncing as the core operation though.
The Zelda-like game running across three screens separately and independently in the same game space was particularly inspiring.