This is not true, and not what we talk about when we say it's almost always the db.
The usual cause of database problems is a single big query causing a slow page, or a lot of medium-size queries inside a poorly written loop. Whether it's bad joins, missing indexes, SELECT statements nested inside other SELECT statements, or operations in SELECT statements that force the db to run the query row by row instead of as a set (the technical term suddenly escapes me), it's the queries that almost always cause db slowdowns.
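To make the "queries inside a poorly written loop" point concrete, here's a minimal sketch using Python's sqlite3 with a made-up customers/orders schema (all names are illustrative): the slow version fires one query per row, the set-based version lets the db do the work in one pass.

```python
import sqlite3

# Hypothetical schema, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")

# The slow pattern: one query per row in an app-side loop (N+1 queries).
totals_slow = {}
for cust_id, name in conn.execute("SELECT id, name FROM customers"):
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
        (cust_id,),
    ).fetchone()
    totals_slow[name] = row[0]

# The set-based version: one join, one round trip, same answer.
totals_fast = dict(conn.execute("""
    SELECT c.name, COALESCE(SUM(o.total), 0)
    FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
"""))

assert totals_slow == totals_fast == {"alice": 15.0, "bob": 7.5}
```

With two customers the difference is invisible; with a hundred thousand, the loop version is a hundred thousand round trips.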
It rarely has anything to do with writes; when it did, in my experience it was usually escalating locks, and I haven't seen that problem in ages as hardware has gotten faster.
I think part of why low-feature, specific-use-case DBs like Mongo can get such traction as general-purpose datastores (when marketed as such, of course, to sell more licenses, no matter how bad an idea that is; looking at you, Neo4j) is that so many devs don't know what a half-decent SQL DB can (and should) do for them in the first place.
I have a pet project that works with data that's very graphy (a timetabling app with lots of interdependent events); I tried using Postgres at first since I'm used to it, but found myself writing really ugly-looking recursive queries. Neo4j seems to fit my use case a lot better, and their GraphQL plugin has been really useful.
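For anyone who hasn't hit this: even a basic "all transitive prerequisites of an event" question needs a recursive CTE in SQL. Here's a toy sketch using sqlite3 (which supports WITH RECURSIVE, like Postgres); the schema and event names are invented for illustration, and real versions with cycle guards, depth limits, and path tracking get ugly fast.

```python
import sqlite3

# Made-up dependency graph between timetable events.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE depends_on (event TEXT, prerequisite TEXT);
    INSERT INTO depends_on VALUES
        ('exam', 'lecture'), ('lecture', 'room_booking'), ('exam', 'grading_setup');
""")

# Transitive closure of prerequisites for one event. UNION (not UNION ALL)
# dedupes rows, which also stops simple cycles from recursing forever.
rows = conn.execute("""
    WITH RECURSIVE prereqs(name) AS (
        SELECT prerequisite FROM depends_on WHERE event = 'exam'
        UNION
        SELECT d.prerequisite
        FROM depends_on d JOIN prereqs p ON d.event = p.name
    )
    SELECT name FROM prereqs
""").fetchall()

assert {r[0] for r in rows} == {'lecture', 'room_booking', 'grading_setup'}
```

In a graph query language the same traversal is a one-liner, which is exactly the appeal being described.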
[EDIT] it's also not nearly as well-supported so you'll find a lot of supporting libraries with multi-datastore support don't support N4J yet, or do but only in some crippled or poorly-tested (not widely deployed) fashion.
[EDIT EDIT] it's also not a great fit if you have a lot of constraints on or structure to your graph. At least as of when I used it ~1 year ago it had essentially no support for expressing things like "this type of node should only permit two outbound edges". You can work around that but the solutions will be less safe and/or suck.
Constraints and hosted multi-datastore solutions aren’t really an issue for me, but the type of query you mentioned about subgraphs might be. One query I know I’ll need is one that can quickly identify nodes with lots of neighbors that have a specific property.
The data is definitely very graphy, so I really wanted a graph database as the primary. Dgraph seems pretty good (advertises itself as more reliable and faster), but it sounds like graph databases might just be kind of oversold in general, and that I might want to reconsider. One other issue is speed of development, which the graphql plugin really increases, so I’ll probably stick with Neo4j for now. Swapping dbs theoretically shouldn’t be THAT painful if I don’t care about migrating data and keep the graphql layer the same.
It depends who's talking and what they are talking about.
If we're talking about fixing a slow website, then sure, big ugly queries are one dominant issue, but not the only one. If we're talking about the architecture of a greenfield project, then balancing concurrent writes against consistency is going to be your ultimate problem; everything else can be "duplicated" (read-replicated DB servers, caching strategies, multiple app servers, etc.).
Again, I just think you're not considering that the vast majority of us here work in terms of thousands, tens of thousands, or millions of users, not the sort of scales where anything you're saying ever becomes relevant.
For most of us, if you spend any time thinking about that, you're utterly wasting it. It's pointless. Greenfield or not.
It comes in faster than you think. One of the companies I worked for was small; at the time I think we were 4 or 5 people including the 2 founders, selling specific information as a service through APIs. They charged per call, so you have to update the customer's funds in their account on each API call. Yes, people can and do make lots of API calls very fast, and in a business that charges per call this is a good thing, but you have to find strategies and make business decisions about how to handle accepting and replying to new API calls while maintaining the account balance.
(BTW, the "can't reply" issue happens on HN when there's a long comment thread between just two people that's being rapidly updated. If you wait 10 minutes or so you'll be able to reply.)
So you only ever have one app server?
Sure, these specific issues could be handled other ways, but in general global state is hard.
You can get surprisingly far on one app server (or sometimes one app server with 2 mirrors for failover). See e.g. the latest TechEmpower benchmarks, where common Java and C++ web frameworks can serve 300k+ reqs/sec off of bare-metal hardware. The OP indicated that these API requests are basically straight queries anyway, and only need to write to update the billing information. Reads can run incredibly fast (both because they can usually be served out of cache and because they can be mirrored), so if your only bottleneck is the write, take the write out of the critical path.
In general global state is hard. Don't solve general problems, solve specific ones. Atomic counters have a well-understood and highly performant solution.
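As a sketch of the specific problem above (not anyone's real billing code, all names invented): debit a per-account balance on each call, where the check and the decrement must happen as one atomic step or concurrent calls can overdraw the account.

```python
import threading

class Account:
    def __init__(self, balance: int):
        self.balance = balance          # funds, in whole "call credits"
        self._lock = threading.Lock()   # one lock per account, not global

    def try_debit(self, cost: int = 1) -> bool:
        # Check-and-decrement as a single atomic step; otherwise two
        # concurrent calls can both pass the check and overdraw.
        with self._lock:
            if self.balance < cost:
                return False
            self.balance -= cost
            return True

account = Account(balance=1000)
counts = [0] * 4  # served-call count per worker; each slot has one writer

def worker(i: int) -> None:
    for _ in range(500):
        if account.try_debit():
            counts[i] += 1  # "serve" the call

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 4 workers x 500 attempts = 2000 calls against 1000 credits:
# exactly 1000 get served and the balance never goes negative.
assert account.balance == 0
assert sum(counts) == 1000
```

Of course, an in-process lock only covers one app server; spread across several, the same check-and-decrement has to move into the shared datastore (a conditional UPDATE, an atomic decrement in Redis, or similar), which is where the hard part starts.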
Also, I never said that you couldn't use a single app server because of performance; as you said, you can handle quite a bit of traffic on a single box, even with slower languages.
Even if it did, as the other comment says, you could easily work around it. If that was the behaviour, there was no good reason it had to be utterly perfect.
It's possible to kill your performance very quickly with databases if you request features that you don't actually need.
You need to decide whether you absolutely refuse to serve any API call unless you're sure it's been paid for, in which case you have to commit a transaction on that account for each call. Or you decide how much you can let a would-be rotten customer get away with, and use queues, which leads to eventual consistency.
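A toy sketch of the second option (all names invented, single-threaded for clarity): serve calls immediately off a cheap balance check, push usage events onto a queue, and let a billing worker apply the debits later. The overdraft a bad actor can rack up before the worker catches up is exactly the business decision being described.

```python
import queue

usage_events = queue.Queue()
balances = {"acct-1": 3}  # illustrative account with 3 call credits

def handle_api_call(account_id: str) -> str:
    # Cheap read-side check only; no transaction on the hot path.
    if balances.get(account_id, 0) <= 0:
        return "rejected"
    usage_events.put(account_id)  # the debit happens later
    return "served"

def billing_worker():
    # Drain the queue and apply debits; this is the only writer.
    while not usage_events.empty():
        balances[usage_events.get()] -= 1

# Five rapid calls against 3 credits: all five get served, because the
# balance check lags behind actual usage until the worker runs.
results = [handle_api_call("acct-1") for _ in range(5)]
billing_worker()
assert results == ["served"] * 5
assert balances["acct-1"] == -2  # the overdraft you chose to tolerate
```

The strict alternative moves the debit into the request path inside a transaction: no overdraft, but every call now contends on the account row.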