You then add rules to limit users so they can only read/write their own data. This works fine for many simpler scenarios and avoids the need for a backend completely, with the potential risk for overuse.
If you need more control and protection then you should have a backend layer to get the data server-side.
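To make the "users can only read/write their own data" idea concrete, here is a rough Python model of the check such a rule performs. This is not Security Rules syntax; the `users/<uid>/...` path layout and the `can_access` name are assumptions made up for the sketch.

```python
from typing import Optional


def can_access(auth_uid: Optional[str], doc_path: str) -> bool:
    """Allow access only to documents under users/<auth_uid>/ (hypothetical layout)."""
    if not auth_uid:  # unauthenticated (or empty uid) requests are always denied
        return False
    parts = doc_path.strip("/").split("/")
    # Expect at least "users/<uid>"; the uid segment must match the caller.
    return len(parts) >= 2 and parts[0] == "users" and parts[1] == auth_uid
```

Note that a check like this only says who may touch which document; it does nothing about how often they do it, which is the overuse risk mentioned above.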
As far as I can tell, any naked frontend-only client serving content without its own access management sounds quite open to a single bad actor spawning a large number of sessions on the same host, and on a small number of distributed hosts, to eat up all available resources. I'm not inclined to ever trust a number of users behaving nicely at n>100 personally.
Given that Firestore appears to advocate "serverless" client-driven applications, unless there are some foolproof DDoS mitigations I'm finding it a hard sell, especially on the whole "if you need more control and protection, do X" argument - you rarely need that until somebody straight-up exploits you, and when you do, you're not going to be particularly sympathetic towards Google's marketing speak.
At the time we were using the Firestore beta, Security Rules were also very poorly documented with no tooling apart from the editor (beta product - I know, I know).
Ultimately we found several vulnerabilities in our own implementation of Firestore that were difficult or practically impossible to resolve on our end. The workaround was to communicate with Firestore only by using the admin SDK.
I still completely vouch for Firestore and Firebase as a product.
I think the main problem is that they bolted security (access control) on top of their existing system, rather than make it an integral part of the design.
Two minor notes (one on security rules):
(1) in the tutorials I’ve seen, the security rules have received little attention and the documentation I’ve read is a bit confusing about the spectrum of possibilities for security rules. Tbh, I’ve punted security rules altogether until I’m closer to “beta ready” as opposed to writing them along the way. I suspect this is bad practice but haven’t found a better way. Much of this is my fault lol.
(2) “presence” systems still require the use of Real-Time DB. +1 feature request to support user connect/disconnect events in Firestore.
And yes, also agreed on needing a solution for presence system, either in Firestore itself, or separate from it.
Limiting reads is interesting. I think generally the idea is that data is cheap enough that most people don't have to worry about it. If you do, you could gate access through an API in a Function.
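For what gating reads behind a Function might look like, here is a minimal sketch of a per-user fixed-window quota. Everything in it (the `ReadGate` name, the window size, keeping counters in memory) is an assumption for illustration; a real Function would need shared storage for the counters, since instances don't share memory.

```python
import time
from typing import Callable, Dict, Tuple


class ReadGate:
    """Fixed-window read quota per user (illustrative, in-memory only)."""

    def __init__(self, max_reads: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_reads = max_reads
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        # uid -> (window start time, reads used in that window)
        self._usage: Dict[str, Tuple[float, int]] = {}

    def allow(self, uid: str) -> bool:
        """Return True and count the read if the user is under quota."""
        now = self.clock()
        start, used = self._usage.get(uid, (now, 0))
        if now - start >= self.window_seconds:
            start, used = now, 0  # window expired, start a fresh one
        if used >= self.max_reads:
            self._usage[uid] = (start, used)
            return False
        self._usage[uid] = (start, used + 1)
        return True
```

The handler would call `gate.allow(uid)` before doing the read server-side and return an error when it comes back False.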
In terms of rate limiting, the only limit I'm aware of is that writes to a single record are rate limited. This impacts normal development when making counters that need to be updated, which leads to an odd "sharded counter" model.
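The "sharded counter" idea is easier to see in code. This is an in-memory sketch of the pattern only, not Firestore API calls: each list slot stands in for a separate shard document, and the class name is made up.

```python
import random
from typing import List


class ShardedCounter:
    """Spread increments over several shards so no single record takes every write."""

    def __init__(self, num_shards: int) -> None:
        # Each slot stands in for one shard document.
        self.shards: List[int] = [0] * num_shards

    def increment(self, amount: int = 1) -> None:
        # Pick a shard at random; concurrent writers rarely hit the same one.
        shard = random.randrange(len(self.shards))
        self.shards[shard] += amount

    def total(self) -> int:
        # Reading the counter means reading every shard and summing.
        return sum(self.shards)
```

The trade-off is visible here: writes get cheaper per record, but every read of the total now costs one read per shard.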
Works great so far for no cost and it hasn't required a single bug fix or any maintenance since launch. Firestore definitely has a lot of surprising caveats though and you need to design your app and data around this to avoid trouble later.
Does anyone have any stories to share about when they outgrew Firestore, what they migrated to and how? I wouldn't be keen to use NoSQL if my data became more complex.
EDIT: nvm, you answered the Stripe vs Paddle question in another reply.
> Stripe does not calculate any tax information for you.
> We automatically handle the calculation, collection and remittance of taxes in every single country on your behalf: you don’t need to do anything.
The gist of it is when a subscription is purchased a web hook triggers a Firebase Cloud Function that stores the subscription data (i.e. email address + subscription expiry date) in Firestore then when someone logs in via Firebase authentication you check if the email address is linked to a valid subscription. When the next payment is made to the subscription, you bump up the subscription expiry date. I've found there's little that can go wrong once it's up and running.
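Roughly, the flow above could be sketched like this. It's an in-memory stand-in, not real webhook or Firestore code: the `SubscriptionStore` name, the 30-day period, and extending from the later of "now" and the current expiry are all assumptions for the example.

```python
from datetime import datetime, timedelta
from typing import Dict


class SubscriptionStore:
    """Maps email -> subscription expiry (a dict standing in for a collection)."""

    def __init__(self, period: timedelta = timedelta(days=30)) -> None:
        self.period = period  # assumed billing period
        self.expiry: Dict[str, datetime] = {}

    def record_payment(self, email: str, now: datetime) -> datetime:
        """What the webhook-triggered function would do: create or bump the expiry."""
        current = self.expiry.get(email)
        # Extend from the existing expiry if it is still in the future.
        base = current if current is not None and current > now else now
        self.expiry[email] = base + self.period
        return self.expiry[email]

    def is_active(self, email: str, now: datetime) -> bool:
        """What the login path would check after authentication."""
        expires = self.expiry.get(email)
        return expires is not None and expires > now
```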
1. Disappointing that it has gone to GA without providing proper search (after a long beta). Can anybody explain why a service run by the world's #1 search company continues to point users to third party services if you want to implement a basic text search in a database? I genuinely don't understand that.
2. Feature requests (or complaints) re backup, queries and documentation are not new. Nor are the answers (or excuses) we see here, which revolve around scalability and (until now) being in beta. BUT all the users I have ever heard from say they use the product for speed & ease of setup & convenience - for MVPs, not for massive googlesque data. It almost feels like the product-market fit is not quite right. So, forgive my technical ignorance here, but worst-case scenario, why not provide the features with the caveat that they are slow or expensive or won't work above a certain size of db? Isn't half a solution better than no solution?
The database and whole Google/Firebase app suite thing has some strong points. But to be frank, and I'm sorry if I'm being dismissive of hard work and technical wonders here, from the perspective of a customer and outside observer, a number of things smell quite off.
1. Mainly because it's both a different search problem (general DB vs specific to web search) and hard engineering wise given our model; we implement not only the cloud database, but embedded versions for iOS, Android, and Web - not to mention real-time functionality and tailoring it to how our index engine works, etc. While we have a lot of customers and use-cases that don't need Fulltext Search, we totally agree it's important and have done explorations on how we'd deliver something along these lines.
2. Agreed. During the beta program we delivered the managed export and import service for backups, added array-contains capabilities to queries, and got close enough to delivering Collection Group queries to mention them as part of GA. For documentation, our tech writing team has done a lot of updates, new pages, and fixes; we know there is always more to do. Cloud Firestore is definitely used in production and at scale by our customers, and with nearly 1 million databases being created the range of use cases and traffic/load patterns has been vast. Our beta program involved working with a lot of them to improve things like hardening and scalability to ensure we can meet our 5 nines of availability SLA.
"Isn't half a solution better than no solution?" -> In a lot of cases, absolutely not. A half solution that falls over when you tip a certain point of scale can result in extended downtimes, since the solution often ends up being "we need to completely rearchitect this", which isn't easy or quick when your business is out of commission.
"from the perspective of a customer and outside observer, a number of things smell quite off." -> Sorry to hear this, I can only hope the continued hard work from the team will turn you around.
e.g. yes, I realise they are different search problems, but I'd presume that Google is nonetheless well-equipped to handle the document db one. The only apps I can imagine that couldn't benefit from a search box are games: anything content-focused or ecommerce-focused needs one, and the majority of utilities benefit too (yes, I can do chat without search, but it certainly benefits from being able to search through chats). Any examples? Yes, I realize having to do Backend/iOS/Android/Web is hard (as it is for everybody else), but with on-device cases at least the db is smaller. I'm sure you do have big users, I didn't mean to imply otherwise, but with my admittedly very limited knowledge I'd still wager that a majority do not see uptime and scalability as the most urgent improvements, but rather those we are discussing. In our case, give us just 2 9s of uptime, and give us the above queries and searches even if 2x as slow and expensive as you'd like them to be, and limited to a db the size of an average relational db, and that would beat the extra 3 9s of uptime and the super scalability any day. Not least because, I don't mean to be rude here, just candid, but if we were ever to reach a point where we needed that massive scale and uptime, I'm not sure I'd be keen to trust Google with user data.
To be clear - I like several things about Firebase/Firestore, which is why we use it and why I'm insisting on badgering you here. I just wish I could be completely comfortable with my choice rather than wondering every day if I shouldn't just use something else.
Disclosure: I've contributed to their database tech, but I don't work for Sanity. Mostly just a fan.
We have played with Firestore quite a bit, but rely heavily on the ability to do aggregate queries. Reading all of the documents and performing this on the client side is nowhere near good enough. Nor is triggering functions to update a "count" or "sum" property on a doc.
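For anyone unfamiliar with the "trigger a function to update a count or sum" workaround mentioned above, this is roughly its shape. It's a sketch only: `Aggregate` and `apply_write` are made-up names, and the before/after values stand in for the document snapshots a write trigger would hand you.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Aggregate:
    """Stands in for the document holding the running count and sum."""
    count: int = 0
    total: float = 0.0


def apply_write(agg: Aggregate, before: Optional[float], after: Optional[float]) -> None:
    """Adjust the aggregate for one write; None means the document is absent."""
    if before is None and after is not None:        # document created
        agg.count += 1
        agg.total += after
    elif before is not None and after is None:      # document deleted
        agg.count -= 1
        agg.total -= before
    elif before is not None and after is not None:  # field value updated
        agg.total += after - before
```

Even in this toy form you can see the weak spots: every write funnels into one aggregate record, and a missed or duplicated trigger leaves the numbers wrong with no query to recompute them cheaply.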
Edit: Looks like a PM answered on another thread...
"It's a point of internal discussion on scalable ways to achieve this, but nothing we can promise. We definitely see the need for it."
By far the biggest pain point of the RTDB was its poor querying capabilities. Firestore solved some of those problems, but it's still very limited.
We moved away from Firebase primarily because of the serious limitations of the 2 DBs.
The Firebase team is aware of that need but it’s very hard to deliver such features at scale.
Firebase is used everywhere these days, so a single change in the platform affects petabytes of data and the load on many thousands of CPUs.
As a student developing a system for a friendly NGO I must say that having angularfire integrate with Angular made this an instapick because I wanted to get started asap.
The issue now is that it really feels like the docs need more examples. I'd take 4-6 hours to add a new feature and I'd spend them reading the docs. Sometimes I'm not sure if I don't get something or it's a missing feature.
I guess the community is still growing but it's really hard to find answers to some simple problems.
angularfire has an inactive gitter and there isn't one for firebase specifically. It would be really amazing if firebase devs could take some time to answer the community on gitter or something, at least at the beginning before there's more people knowing the answers.
EDIT: Found your link to the google group. I guess I'll ask there then :)
There's also the Firebase community Slack https://firebase.community/ and firebase-talk is our supported community channel for all things Firebase https://groups.google.com/forum/#!forum/firebase-talk
Any chance you and your team are working on a GraphQL access layer for Firestore?
It feels like nested queries (using refs for relations), basic mutations and (especially) GraphQL subscriptions would be a great match with Firestore and a big improvement in dev experience, imho.
Also an automatic backup would be nice (and can't be that hard to offer?).
Can you point towards papers that describe the algorithms and data structures that Firestore uses? What infrastructure does it rely on to perform reliably?
The SDKs are all open source and make use of other open source components:
* Android: https://github.com/firebase/firebase-android-sdk/tree/master/firebase-firestore
* iOS: https://github.com/firebase/firebase-ios-sdk/tree/master/Firestore
* Web: https://github.com/firebase/firebase-js-sdk/tree/master/packages/firestore
* .NET: https://github.com/googleapis/google-cloud-dotnet/tree/maste...
* Go: https://github.com/googleapis/google-cloud-go/tree/master/fi...
* Java: https://github.com/googleapis/google-cloud-java/tree/master/...
* Node.js: https://github.com/googleapis/nodejs-firestore
* PHP: https://github.com/googleapis/google-cloud-php-firestore
* Python: https://github.com/googleapis/google-cloud-python/tree/maste...
* Ruby: https://github.com/googleapis/google-cloud-ruby/tree/master/...
For example, Cloud Firestore uses TrueTime to provide similarly strong consistency guarantees to Cloud Spanner (see https://cloud.google.com/spanner/docs/true-time-external-con...)
I'm not aware of any whitepapers describing the listen feature that provides real-time updates.
Another option would be a typed JSON db: essentially you could store JSON that corresponds to typed structs a la serde. Wouldn't solve a lot of my problems, but at least I'd have some built-in validation.
This makes sense for many apps in a document-based/NoSQL model, and may let you avoid any need for your own back end.
But it makes less sense when you start introducing relations and data models become more complex.
e.g. pretty soon we'll need to join to a lookup table. That presumably means that that lookup table needs to somehow be opened up to all users, while for other tables the user can only access their own data. Starts getting complex real fast.
This might not be useful advice for everyone, but I can highly recommend this youtube video on DynamoDB data modelling: https://www.youtube.com/watch?v=HaEPXoXVf2k
Do you have a source for this? Thanks in advance.
If you want a relational database for just one user's data, you could read a SQLite data file over the network when your user opens the app, perform SQL queries over the data in memory, then write the entire database file back across the network when the user hits a "save" button.
But you probably don't want a "save" button, you just want the user's state to be constantly updating back to the server.
And you probably don't want to write back the entire data file every time you save, just the parts that have changed.
So Firebase is optimized to solve these problems, and I bet the data structures they use are not also optimized for supporting all kinds of relational queries.
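To illustrate the naive whole-database approach described above, here is a small sketch using Python's built-in `sqlite3`. It swaps the binary data file for a SQL text dump to keep the example short, and the `load_db`/`save_db` names and the `notes` table are made up; the network transfer itself is left out.

```python
import sqlite3


def load_db(dump_sql: str) -> sqlite3.Connection:
    """Rebuild an in-memory database from a SQL dump fetched over the network."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(dump_sql)
    return conn


def save_db(conn: sqlite3.Connection) -> str:
    """Serialize the whole database; this is what the 'save' button would upload."""
    return "\n".join(conn.iterdump())


# Usage: open, query and modify in memory, then write everything back.
conn = load_db("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT);")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
conn.commit()
blob = save_db(conn)  # the entire database, even though only one row changed
```

The last line is the problem in a nutshell: the upload grows with the whole database rather than with the change.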
If you want a relational DB as a cloud service, it sounds like you want Cloud SQL (as far as GCP is concerned).
(Disclaimer: I work on GCP.)
Do you have any other suggestions?
Firestore is a complete engineered Google solution based on the Google Datastore but with added features from recent years (spanner, cloud functions). This is very obvious if you compare the limits between Datastore and Firestore.
IMHO they are not really comparable.
Edit: As wsh91 says below, you are still responsible for the scheduling.
(I work for GCP)
(I work on Cloud Firestore.)
I am just using this https://github.com/ChristianAlexander/FirestoreRestore with cloud functions & cloud storage.
We decided to move off the Firebase RTDB, not only because we weren't happy with it but also because it was unreliable.
We still use its static hosting and cloud functions.
What are good reasons to use NoSQL over SQL?
Another is when you need read and more specifically write performance that a single system cannot keep up with. When you hit these boundaries, it gets interesting. Sometimes it's just easier to design for such a system up front.
If it's an internal application SQL first is probably fine, if it's SaaS you may want to look at alternatives.
If you're doing a long term project, then Postgres makes sense since the deployment/setup costs are one time. But for short projects Firebase is very nice.
My understanding is that Firestore specifically is a document store with some conflict resolution semantics that Postgres does not provide.
What makes Firestore interesting to me (not used it in production) is that you can avoid the need for a backend completely. Your normal architecture is Client <-REST-> Backend <-> DB. You can avoid all the deployment and development in the backend by doing the work in the client.
I’m using re-base to synchronise Firestore to my local state. Do I not need to be?
Any pointers gratefully received, thanks.
The only local state I have at the app level is state that doesn't get stored in the DB.
Here's the helper HOC I created for it:
And an example of it being used as a render prop:
And another as an HOC:
And here's a mutation:
The HOCs make it so you can create just plain stateless functional components and not have to worry about componentDidMount.
Looking forward to it. If this comes in, pretty much it removes the need for creating top level collections or am I missing anything?
Polar is basically a document management and annotation platform. You put all your reading in it and maintain it long term along with annotations, highlights, etc.
Firestore is really nice in that you can target multiple platforms pretty easily. There are SDKs for basically every platform.
It's definitely not perfect but I'm pretty happy with the decision.
To run any kind of specific query you’d need to handle that application side. Just something to consider.
I have a rather large dataset that's tough to scan over, and I find the functions/transactions I use to build my own aggregates aren't accurate enough, since a document has rate limits for writes.
(Cloud Firestore eng. here.)
Where do you feel it is shoehorned in? Firestore is easy to access from App Engine and Cloud Functions (and GKE/GCE as well) using the built-in service accounts, it has managed export into BigQuery. Would love to know where you found friction when using it with other GCP services.
> Why is Google so bullish on this thing?
While I can't speak for Google as a whole, I'm personally bullish on Firestore for a few reasons:
1) Gives you the same "client-side only" model that Firebase popularized. This makes creating applications much faster as you remove the whole server side component and basically all your ops work.
2) True "serverless" pricing and ops. Because you only pay for reads/writes and not instance time, your costs scale linearly with your app's usage. And you don't have to worry about sharding or other operationally complicated things to scale your database as you grow. The caveat here is you have to structure your data and queries well, otherwise your costs will skyrocket.
3) Gives you the same "server side" functionality as Cloud Datastore (which Firestore is basically the next generation of). This means you can use Firestore in place of a traditional NoSQL database like MongoDB.
4) Strong consistency. One of the biggest problems with Cloud Datastore (and most NoSQL databases) is eventual consistency. Datastore worked around this with "Entity Groups" which in my opinion were super clunky to work with and very limiting. Firestore is strongly consistent so you don't have to worry about this even at scale, which is super nice.
5) GCF Triggers. The fact that you can trigger a cloud function when something is written to the database is very powerful, it's basically like a traditional database trigger or stored procedure but you can do anything.
The biggest feature gap for Firestore is the querying abilities. While it is way better than the original Firebase DB, it's nowhere close to a relational database or something like Cloud Spanner. The team is working on it though.
The "real time" stuff is interesting but not really super relevant to the things I like to build.
I don't know what the roadmap is for removing these restrictions (or even if it is on the roadmap) but I'll update this post if I learn more.
We are... We're not massive yet but we have about 250 users who've updated about 20GB since our launch. Still really early as we haven't had a massive launch yet.
We're still soft launching.