For me, charging for actual usage instead of reserved capacity actually helps with implementing the good design guidelines collected in books, such as low coupling and high cohesion, and this is what we wrote about in the paper as well. Charging for reserved capacity creates a financial incentive to bundle tasks and features into applications, which introduces runtime coupling (e.g. sharing the same /tmp dir, reusing security roles, etc.) even between components that were designed to be isolated. Charging for actual usage removes that incentive, so things that were designed to be isolated stay isolated when deployed.
Separation of concerns is easy to achieve without Lambda or microservices. If you want separation of concerns, write good code. Don't move your environment to a locked-in circus.
Serverless enables the extreme (many tiny services with minimal upfront cost) but doesn't require it.
This sounds better, and perhaps I could be on board with this. I went a different route, which is Kubernetes for my Django + Celery. Over 75% of my load is "stable", so Kubernetes + Celery ends up cheaper than Lambda for me. I can basically throw my web servers into my Kubernetes cluster "for free". But if I had just a webapp, I would feel better about my entire Django in one Lambda, or at least one entire "app" of my Django in one Lambda. My biggest gripe is warmup speed: 3 seconds for a cold request is pretty bad compared to the 100ms I have now on Kubernetes.
We use Lambda primarily to improve developer efficiency and to allow dev teams to own their own operations end-to-end, with cost efficiency only a secondary goal. It's been great for that as it's a small enough thing to integrate with that any given developer can learn the entire operations stack (for their team's services) well enough to work on it themselves.
3 seconds on a cold request sounds unusual; it should be around a second or less - have you measured it? If you use a VPC you'll unfortunately be in the realm of 20-30 second cold starts, which is why we've avoided VPCs for Lambda (and used stores like DynamoDB rather than RDS, which work with IAM as an alternative to security groups). The lack of ElastiCache without a VPC is going to be a big problem soon though...
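For what it's worth, the DynamoDB-with-IAM approach mentioned above usually just means scoping the function's execution role to the table, so no network-level controls (and hence no VPC) are needed. A minimal sketch of such a policy - the region, account ID, table name, and action list here are placeholders, not anything from the original comment:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
    }
  ]
}
```

Attach something like this to the Lambda's execution role and access control travels with the function's identity rather than with subnets and security groups.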
For health, there's certainly more noise now: we can look at overall invocation error rates (a metric Lambda gives us), but they're aggregated across several endpoints within the Lambda function. This is still an open question for us, but solvable.
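One way to cut through that noise is to tally errors per endpoint yourself instead of relying on the aggregate metric. A minimal sketch, assuming you can get structured invocation records (the `endpoint`/`error` record shape here is hypothetical, not anything Lambda emits by default):

```python
from collections import defaultdict

def error_rates_by_endpoint(invocations):
    """Compute per-endpoint error rates from structured invocation
    records of the form {"endpoint": str, "error": bool}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in invocations:
        totals[record["endpoint"]] += 1
        if record["error"]:
            errors[record["endpoint"]] += 1
    return {ep: errors[ep] / totals[ep] for ep in totals}

# The aggregate error rate below is 25%, which hides that one
# endpoint is failing half the time while the other is healthy.
sample = [
    {"endpoint": "/users", "error": False},
    {"endpoint": "/users", "error": False},
    {"endpoint": "/orders", "error": True},
    {"endpoint": "/orders", "error": False},
]
print(error_rates_by_endpoint(sample))
```

In practice you'd feed this from structured logs or emit the per-endpoint counts as custom metrics, but the disaggregation itself is this simple.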
Having said that, tooling is key, and I find it somewhat lacking: there's no switch I can flip to run my existing code as a Lambda and, say, compare costs.