
How I write back ends - fpereiro
https://github.com/fpereiro/backendlore
======
tristanperry
A great write-up. Two interesting points:

1) No Ansible. No Kubernetes, nor containerisation. Starting on a single
server/vm and controlling deploys with a bash script.

2) An interesting take on microservices: "One server, one service... 2-3k of
code can hardly be called a monolith.... if two pieces of information are
dependent on each other, they should belong to a single server."

As a (mainly) backend dev who does like devops and microservices, I agree on
both points.

KISS - there's no need to start off with over-complicating everything for no
reason. When there's a need for Ansible, introduce it at that point. When you
think you need microservices, think about it a lot and move to them if you
really do require them.

But certainly don't start off using a full Ansible + Kubernetes + Docker +
microservices stack. You'll spend all your time managing these, and relatively
little time actually being productive.

~~~
bvm
> You'll spend all your time managing these, and relatively little time
> actually being productive.

I see this a lot, and I'm always somewhat surprised by it. It took a couple of
days to put together our kops-based cluster on EC2, wire it into Gitlab CI
and get the various load balancers, ingresses et al. plumbed in, and then
perhaps an hour a week maintaining it from there?

I do agree it's not the right place to start, but I think the standard
operational complexity of k8s is frequently overstated. The argument I _do_
buy against k8s is that when it goes wrong, debugging can be a nightmare.

~~~
fpereiro
I fully concur with your last statement. I find that the main advantage of
having fewer moving parts is that there are exponentially fewer things that can
go wrong - and that makes reliability possible with very small teams. I also
fully agree that the setup cost is usually not steep when compared to the
overall lifecycle of a project.

The other argument I have against many DevOps tools (and most software tools
in general) is that the patterns of the solutions they provide don't generally
match the basic patterns of the problems they are solving - the consequence is
the sheer complexity of understanding the tool, especially when something goes
wrong. In the past, this hasn't only cost me time, but also a lot of frustration
that I'd rather avoid - even at the price of having to build my own solutions.

~~~
bvm
> I find that the main advantage of having fewer moving parts is that there
> are exponentially fewer things that can go wrong - and that makes reliability
> possible with very small teams

Right, but it is a trade-off to some degree. For us, to have ephemeral feature
branches deployed to QA completely in a hands-off manner means we can do
multiple releases per day as a small team. Most (almost all!) of the
advantages we have seen from moving to a k8s infra have been DX, rather than
UX.

~~~
say_it_as_it_is
What is your workflow enabling this?

~~~
bvm
Nothing esoteric, git trunk with feature branches that build, test and deploy
in Gitlab CI to _a-feature-name.our-company-name.local_ on k8s, notifies QA
team that there's something to check, gets approved, merged to master and
released automatically, functionally tested on canary prod deploy and then
rolled out wider.

------
procombo
Awesome writeup.

You mentioned you're still "finding an elegant way to generate, use and clean
up CSRF tokens".... Consider implementing an "httponly", "samesite=strict"
cookie as a security check. Apple recently got on board, so all major browsers
now support this. One goal of samesite is to make CSRF tokens redundant.

We roll out at least three different cookies with varying expiration times,
including one each of "strict", "lax", and "none". They are there for various
reasons.

Of course additional checks occur, especially for big operations, but we no
longer keep track of multiple CSRF tokens for each user/device/browser combo.

Keeping track of CSRF tokens via JavaScript, considering multiple tabs, is
kind of a nightmare... and it's never as secure as you initially think.

~~~
fpereiro
Thank you for your suggestion about using samesite=strict! I'll look into it a
bit more - the only downside I see is support for older browsers, which is a
personal vice I can't kick. If I see that there's no elegant CSRF solution,
this will probably be my fire escape. Thank you for that!

I added more info in the document regarding cookie/session management,
particularly its expiry ([https://github.com/fpereiro/backendlore#identity--
security](https://github.com/fpereiro/backendlore#identity--security)). The
approach I take is to make cookies last until the server says so. I'm very
curious as to how you use cookies with varying expiration times and would love
to know more if you have the time.

~~~
pbowyer
Seconding @fpereiro's request - I too would like to know more. CSRF handling is
painful, so your approach is very interesting, but I don't understand it fully
yet.

~~~
procombo
Until several years ago cookies couldn't be used for CSRF tokens, because they
were sent with every request to your site, regardless of where the request
originated. SameSite changes that because you can now control when a given
cookie is sent with a given request, based on how a link was followed. It puts
the work on the browser.

SameSite is just a property you use when setting a cookie, like the
expiration or URL path. There are three settable values: None, Lax, and
Strict.

'None' sends the cookie with every request, like how browsers "historically"
acted.

'Lax' is the new default [1]. If someone follows a link to your page, these
cookies are sent with the request. If a third-party site POSTs to one of your
pages, these cookies won't be sent.

'Strict' cookies will never be sent to your server if the request originated
from a third party, even if it's as innocent as following a link from a search
engine.

So use these types to your liking. Your needs vary. If you want the user to
appear logged in when they follow a link to your site you're going to use None
or (most likely) Lax.

As far as CSRF goes, you can generate a single CSRF token at login and set it
in a cookie with SameSite=Strict. Regenerate it every so often, or at "big"
events, and voila: one CSRF token per device session. No worries about
multiple tabs open, and you have a reasonable expectation of only having to
keep track of a single token at a time. The kicker is, you technically don't
even have to have a CSRF token if you don't want one. If you're serious about
nobody linking to an "internal" authenticated page, just set the primary
session id cookie to Strict. (I would personally do more than this, but hey.)
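
To make that concrete, here's a rough sketch in Node (using only the built-in
http and crypto modules; the cookie names, paths and the login check are
illustrative, not something prescribed):

    const http = require('http');
    const crypto = require('crypto');

    http.createServer(function (req, res) {
      // ... after a successful credentials check on a login request ...
      const sessionId = crypto.randomBytes(32).toString('hex');
      const csrfToken = crypto.randomBytes(32).toString('hex');
      res.setHeader('Set-Cookie', [
        // never sent on cross-site requests, never readable from JS
        'session=' + sessionId + '; HttpOnly; Secure; SameSite=Strict; Path=/',
        // the single per-session CSRF token, if you still want one
        'csrf=' + csrfToken + '; Secure; SameSite=Strict; Path=/'
      ]);
      res.end('ok');
    }).listen(8080);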

I do secure portals. So we set several cookies across the board. Varying
expiration times. They all have their purposes.

 _[1] While you can consider Lax as the new default, if you omit the SameSite
property on a cookie it is actually attributed as something referred to as
'NotSet', which is unsettable, and acts as a hybrid between Lax and None.
Browsers vary a little here, but it's only to maintain reasonable backwards
compatibility at the moment, and both Mozilla and Chrome developers plan to
remove this hybrid functionality in the future. Basically.. a cookie will act
like None for 2 minutes, then act like Lax after that, etc._

~~~
fpereiro
Beautiful answer, thank you!

Since I want to support as many browsers as possible, I'm still going to use
CSRF tokens, but a single token per session as you suggested. I just outlined
the approach here:
[https://news.ycombinator.com/item?id=22268152](https://news.ycombinator.com/item?id=22268152).

Thanks for your detailed feedback, it's very valuable to me.

------
deathanatos
> _Speed - S3 is not very fast, even accounting for network delays. Latencies
> of 20-50ms are par for the course_

I've used S3 in past projects. These numbers are in line with my experience.
And frankly, I'm good with them. S3 is dirt cheap compared to EBS, and it IMO
doesn't generate a lot of production incidents (compared to, say, AWS RDS
MySQL). The latency is on the high side compared to some things, but if the
user is doing an "upload" operation of some sort, 50ms is tolerable, IMO. And
it always seems to be _way_ more efficient than just about everything else; in
particular, it's usually next to some Mongo/SQL DB that is struggling under the
load of feature creep. S3 has basically never been on my bottleneck radar.

The biggest thing I find people doing wrong, particularly in Python, is a.
failing to keep the connection open (and so paying the TCP & TLS handshakes
repeatedly when they don't need to, which adds a _significant_ latency hit)
and b. making boto calls they don't need to, such as "get_bucket" which in
boto will check for the bucket's existence. You can almost always guarantee
the bucket exists as a precondition of normal system operation… and the next
call to GET/PUT object will fail anyways if something is amiss.
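
The same principle carries over to Node (the stack in the article). A rough
sketch, assuming the aws-sdk v2 package and a made-up bucket name:

    const https = require('https');
    const AWS = require('aws-sdk');

    // Create the client once, with keep-alive, and reuse it for every request,
    // instead of paying the TCP & TLS handshakes again and again.
    const s3 = new AWS.S3({
      httpOptions: { agent: new https.Agent({ keepAlive: true }) }
    });

    const getObject = function (key, cb) {
      // no existence checks up front; the call itself fails if something is amiss
      s3.getObject({ Bucket: 'my-bucket', Key: key }, cb);
    };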

~~~
fpereiro
Fully agree. 20-50ms for something like S3 is fantastic and proof that we live
in the future. S3 is only slow when compared to interacting with redis or a FS
layer powered by node (by about an order of magnitude). That's why I feel that
complementing it with a self-managed FS can work well in certain settings. And
interestingly enough, because S3 is so solid, you don't have to be so paranoid
with your own FS layer, since S3 is there to bail you out in case of data
loss.

Regarding S3 pricing, ~0.09 USD per downloaded GB could potentially become a
significant bill (~90 USD per TB). I'm building a service that serves pictures
and limiting the financial downside is very important for the economics of the
business. More than the actual magnitude, the certainty that costs are a
linear and predictable function of storage makes me sleep better.

An implicit point that I just realized by re-reading your comment: if you have
a significant amount of data (let's say, > 100GB) a local FS is only cheaper
if you're using instances or dedicated servers with large disks - and probably
this is prohibitively expensive in AWS (EBS). I'm particularly considering
Hetzner dedicated servers with large disks.

~~~
say_it_as_it_is
Can you break down the financials for us?

~~~
fpereiro
At some point I'll do a public writeup of my planned cost structure. Hetzner's
EX62 servers ([https://www.hetzner.com/dedicated-
rootserver](https://www.hetzner.com/dedicated-rootserver)) are about 10
USD/mo/TB of disk and it would be hard to go over their bandwidth limits.
Even allowing for RAID and/or multiple locations/servers for a file node, the
cost is still low. The main advantage in terms of cost is the possibility of
not spending money on outgoing bandwidth.

~~~
ablekh
Hetzner's pricing for dedicated servers is impressive. However, I'm curious
about how you deal with (or whether you are worried about) the lack of
their presence in North America, Asia and Oceania. Unless most of your customers
are in Europe, latency might be an issue ... What is your experience in this
regard?

~~~
fpereiro
That's a good question. I'm going to try first with Hetzner and measure
latencies from different places; if this becomes an issue, I'll definitely
consider complementing the infrastructure from elsewhere. But I'd rather start
simple.

I currently run a web service from North American servers and the performance
here in Europe is quite good. I think this experience - very, very limited -
gives me confidence that the wires across the world are good enough, provided
your data center is close to the main pipes.

~~~
ablekh
All right, good luck! Please share your Hetzner experiences here - I will be
watching this thread.

------
ketzo
I really love articles like this. There's so much value in quickly summarizing
years of experience with proven tools.

This writeup in particular is great because it starts with simple
architectures and progresses to horizontal scaling with a load balancer -- it
really is 75% of what you'd need to actually implement a production-ready
application backend, with clear links to get the other 25%.

~~~
juangacovas
+1 and I'm sure the article could inspire others to do the same with other
stacks. I like how it compares to other similar stacks where the OS is CentOS,
the server-side language is just PHP, and some other layers are added, like
HAProxy and/or a Varnish cache, or even Apache or Lighttpd instead of the
omnipresent Nginx as the webserver.

------
fpereiro
Wow. I wasn't expecting this level of interest, not in a million years. I'm
very grateful (and a bit overwhelmed!). Thank you for your thoughtful
comments. I just added a few more clarifications to the document, mostly
inspired by your interactions.

~~~
mercer
Your article showed up at exactly the right time for me :). A project I'm
working on has been increasing in 'devops' complexity, and I'd been doing some
research into Docker/Ansible/etc. because I felt that perhaps my simpler
solutions weren't ideal.

After some research I concluded my approach was still preferable, at least for
now, but I still felt some unease because everyone seems to be using Docker
and whatnot. Your article really helped me feel more confident in my choices.
Thanks!

~~~
say_it_as_it_is
Based on your analysis, how did you decide not to move to containers?

~~~
mercer
Argh. Long comment got deleted. So, short version:

1. I already know how nginx/apache/linux work, so it's been pretty easy to
spin up a new VPS and configure it to handle potentially multiple apps, rather
than learning how to work with Docker or the like.

2. I run pretty much the same stack everywhere (Phoenix/Elixir) and I don't
foresee running into issues where I need to run multiple versions of, say,
elixir or node.

3. For deployment I quite like git + hooks to handle things. And because of
some of Elixir's particulars, it's really easy to deploy a new version and
recompile without having to completely restart the app.

4. I don't like the idea of adding another layer of complexity. The way I see
it, if anyone else needs to work with me on the 'devops' part, they better
know how to work with linux/nginx anyways. And if they already know, they can
pick things up pretty quickly. Knowing how to use Docker would just add
another thing for them to learn, another thing for me to keep an eye on, and
another thing to follow updates and security issues on.

5. I'm very much of the article's school of thought that scaling stuff on a
single server for quite a long time is possible. Perhaps especially with
Phoenix/Elixir. So at worst I could see myself moving an app to a separate
VPS, but I don't need to 'orchestrate' things and whatnot.

6. I haven't properly researched this, but I suspect that, for quite a few of
our clients, running things on a separate server is a legal requirement. A
container would not only be an alien concept to them, but it might very well
just not be allowed (would love to hear input on this though).

tl;dr: I just don't need it. I can spin up a server and get everything set up
in < 30 mins, some of it done manually and some of it with some simple bash
scripts. And I can't think of a very good reason to learn and implement
another layer that sits in between my server and the running app(s).

All that said, I don't know what I don't know, so I'd really love to hear
where perhaps I might benefit from using Docker/Ansible or the like! I'm not
at all against these tools or anything.

EDIT: I'll add that I do think perhaps Docker might be useful for local
development, especially when we start hiring other devs. Am I correct in
assuming that's one of the use cases?

~~~
Izkata
> EDIT: I'll add that I do think perhaps Docker might be useful for local
> development, especially when we start hiring other devs. Am I correct in
> assuming that's one of the use cases?

As I understand it, something like that is supposed to be one of the big
draws: Since everything runs in a container the environment is the same
regardless of the system packages, so you shouldn't have any "works on my
machine" bugs.

That said, we had a session recently where a system that had just been
converted to docker was being handed off from development to maintenance; it
also acted as a crash course for anyone interested in docker on the other development
teams. I think only a fraction of us actually got it working during that
session, the rest having various different issues, mostly with docker itself
rather than the converted system.

~~~
mercer
Interesting. Thankfully most of my current work involves a stack that works
fine on Linux/Mac, and I don't foresee many other devs needing to work with
it, least of all devs using Windows.

But I recall doing contract work where it took me and every new developer
literally a day (at least) to get their Ruby on Rails stack working with the
codebase, and where we often ran into issues getting multiple projects running
alongside each other on account of them needing different versions of gems or
Ruby. I can
see how Docker would be a great timesaver there.

~~~
AlchemistCamp
Are you using ASDF? I generally commit a .tool-versions file to source control
for every project. It specifies the versions of each language (usually
Erlang/Elixir/Node) to be used for a given project.

I started using ASDF for Elixir projects, but over time I've basically
replaced RVM, NVM and all other similar tools with it.

~~~
mercer
Back in the RoR days I guess it was RVM that I used. I've also used NVM in the
past.

But honestly in my current work I've not needed it yet. As far as
Elixir/Phoenix goes, things have been stable enough so far that I've not
needed to run multiple versions, and I try to rely on node-stuff as little as
possible (more and more LiveView), so I can get away with either updating
everything at once, or leaving things be.

Furthermore, when various apps diverge too much, I usually find there's a need
to run them on separate VPSes, which I find cleaner and safer than trying to
make them run alongside each other.

But thanks for reminding me about ASDF. It seems like a much nicer solution
than using multiple tools!

------
Epskampie
Awesome writeup, I agree with mostly everything. A few small tips:

* Use `npm ci` instead of `npm i --no-save`. This gives you the exact packages from the lockfile, which is more reliable.

* You might like systemd .service files instead of `mon`. It's available by default on ubuntu LTS, and you're down a dependency that way. The .service files can be really simple: [https://techoverflow.net/2019/03/11/simple-online-systemd-se...](https://techoverflow.net/2019/03/11/simple-online-systemd-service-generator/)

~~~
fpereiro
Thanks for your comment!

- I don't use lockfiles - instead I use very very few dependencies, and I
always target a specific version of each of them. I know this is not the same
as a shrinkwrap, but I'm crazy enough to feel that package-lock.json adds
visual bloat to my repo.

- I am biased towards mon because it's probably less susceptible to changes
over time than Ubuntu itself, and because it can be installed with a single
command. But systemd definitely is a good option too if you're using Ubuntu!

~~~
Merad
The main problem with your approach is that you have full control over your
dependencies but no control over _their_ dependencies, unless you're
inspecting every single package in your dependency graph to make sure that
every one of them targets specific versions in its package.json.
Realistically, using package-lock.json and npm ci is the only way to ensure
that everyone on your team and your deployments are all working off of the
same node_modules folder.

------
aprdm
If you replace NodeJS with Django and Redis with Postgres it pretty much maps
1:1 with how I deal with it.

Once/if postgres becomes a bottleneck I simply add redis as a cache; the next
step is doing some stuff async with RabbitMQ + Celery, and you can scale
really, really, really far...

~~~
StavrosK
To many people's surprise, I can crank out a fully-functional side project in
two to five days (latest ones were [https://imgz.org](https://imgz.org) and
[https://nobsgames.stavros.io](https://nobsgames.stavros.io)) by having
already completely automated devops.

Much like you, I have a Django template project and use Dokku for deployment,
so I can use the template to create a new project, commit, push, and it's
already up on production.

Here's mine:

[https://github.com/skorokithakis/django-project-
template](https://github.com/skorokithakis/django-project-template)

Having all the best practices and boilerplate already sorted helps immensely.

~~~
williamdclt
I've worked a lot on this sort of thing in my company, as we spawn new
projects every few weeks. We went from 4 days to deploy a fully-functioning
Django+React+CI on AWS to literally 30min (much of it just waiting) using
similar approaches.

And now we have the same thing for API Platform, NestJS, Vue, NextJS... Even
React-Native with Fastlane and CodePush :) It's been invaluable, and a very
strong commercial argument

~~~
say_it_as_it_is
The situation isn't clear to me. What changed that led to more productive
deployments?

~~~
williamdclt
Using templates, boilerplates and generators.

Creating a new project is simply "make generate" from our generator, then
answering a bunch of questions about what tech stack you need. This works
mostly with templating (using Plop) and a sprinkle of automation for shell
tasks (eg `pipenv install`)

~~~
say_it_as_it_is
Thanks for clarifying. What used to take 4 days to set up a new server can now
be done in <1hr. How often is that being taken advantage of? "Any time we need
a new server" is governed by a policy for deciding whether to bolt onto
additional servers or not. Curious about that policy...

------
hardwaresofton
Is there a specific reason you haven't replaced the Vagrant stuff with a
container-based approach? You can also do stuff like using the lxc backend[0]
for Vagrant if that's more your speed. While containers aren't strictly
better, they are certainly lighter weight and quicker to spin up/tear down,
though performance characteristics and OS settings/etc can definitely differ.

I've found linux containers to be fantastic for both replicating pieces of the
environment and making your e2e tests a little more official. Running postgres
itself (migrating it like you would your real app, inserting data like the app
would, then tearing it all down in seconds), apps like mailcatcher[1][2] and
actually letting your app send mail (this way you don't even need a local/log-
only implementation), running redis locally, and replicating S3 with a project
like minio[3] is very easy and fast.

You can take this even further by building a way to completely spin up
your production environment to a certain point in time locally -- though all
the resources for the ecosystem of apps may not be on the same VM image
(otherwise you could just snapshot and technically go to the point in time),
if you can assemble the state and the backups, you can hook them all up to the
containers running locally (or in a VM running locally to be fair) and get an
approximation of your entire system.

Of course, it is somewhat less true-to-production (given that production is
you running a VM), but you can remove that barrier with things like
linuxkit[4].

[0]: [https://github.com/fgrehm/vagrant-
lxc](https://github.com/fgrehm/vagrant-lxc)

[1]: [https://mailcatcher.me](https://mailcatcher.me)

[2]:
[https://github.com/sj26/mailcatcher](https://github.com/sj26/mailcatcher)

[3]: [https://www.minio.io/](https://www.minio.io/)

[4]: [https://github.com/linuxkit](https://github.com/linuxkit)

~~~
fpereiro
You raise interesting points!

In general, I don't use Vagrant. I simply use my host OS (Ubuntu) and my local
node & redis. There are usually no moving parts beyond these, so it is
easy to achieve development in an environment that is practically equivalent
to the Ubuntu servers/instances in the cloud, the main difference being a
different folder path, different files stored and different info on the DB.

When running the app locally, I still use services like S3 and SES - although
I might not send emails, to keep things as "real" as possible.

I do use Vagrant when the project requires installation of other tools beyond
my basic toolkit (like MongoDB or Elasticsearch) - in this case, I want that
software to stay within the VM.

In general, I'd rather avoid virtualization because it represents an extra
layer, but if it is necessary for idempotence or having a dev environment that
mimics the prod environment, I'd definitely embrace it.

~~~
hardwaresofton
I'd really recommend getting more familiar with containerization and
introducing it -- it is an essential part of the modern toolkit and really
isn't very hard to use these days. It will absolutely introduce/encourage
idempotence and bring your dev environment just a little closer to your prod
environment. Modern container runtimes can even run rootless containers, swap
out the "virtualization" engine underneath (as in you can even run your
container in QEMU without changing it for more isolation), so it can be a net
positive for security too. There's also no need to swap out folder paths and
file locations -- just moving around and disconnecting/reconnecting your data.

Containerization is actually not virtualization (which is why it was in quotes
earlier); they're basically better-sandboxed local processes (just like your
local redis instance), and the better sandboxing has benefits for both
development and production. So if you squint, it's actually very similar to
just running redis yourself on a similar OS -- just now when you shut it down
you don't have to manage any folders.

Of course, use what works -- there's no need to fix things that aren't broken,
but there is a reason that the de facto deployment artifact of the modern
application is very quickly becoming a container (if it isn't already) -- in
relatively trusted environments they give just the right amount of isolation
and reproducibility, and centralized management (systemd even has space for
containers via systemd-nspawn[0]).

[EDIT] - somewhat unrelated but if you haven't I'd _really_ suggest you give
GitLab a try for your source code management -- not only does it come with a
free to use docker image registry, it comes with a fantastically easy CI
platform which can make automated testing easier. You get 2000 free minutes a
month, and actually if you bring your own hardware (which is as easy as
running a container on your local machine and pointing it at gitlab.com to run
your jobs for you) it's all free. There are lots of other integrations and
benefits (kanban style issue boards, gitlab pages, wiki, etc), but it's one of
my gotos for small projects and I introduce it to corporate clients every
chance I get. Microsoft-owned Github is doing its best to compete now that it
has endless pockets, but GitLab has been offering an amazing amount of value
for free for a long time now.

[0]: [https://www.freedesktop.org/software/systemd/man/systemd-
nsp...](https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html)

------
rakoo
While this is a fantastic resource, I feel like it's indicative of a larger
problem: why is all of this still needed? Is there no way to abstract all of
this and focus on the actual functionality of the webservice?

This is probably the answer that Google App Engine provides: all the bricks
needed to replace Redis or the FS or S3, so all you need to do is write your
own application code. All the provisioning and monitoring is taken care of. Is
there an open alternative to it that I could run locally or on my own server?

~~~
osener
Depending on what you're building, you can even go with
[https://hasura.io/](https://hasura.io/) and focus on the user-facing parts of
your application! It still makes it reasonably straightforward to implement
custom logic, and you only need to write backend code that directly benefits
your domain, skipping all the CRUD plumbing.

~~~
rakoo
Hasura seems to be very specifically about a database and its API. What I'm
looking for is an environment that is more focused on application code, and
which gives you the API to communicate with the external world. That
environment would provide all that is given in the article by default.

------
cdoxsey
If you're going to go the redis route, Amazon has a managed service:
[https://aws.amazon.com/elasticache/](https://aws.amazon.com/elasticache/)

It shouldn't be too expensive and can save you a lot of time in setup.

I think I'd also recommend using ansible instead of plain SSH. It has a lot of
stuff built in and for simple deployments shouldn't be too hard to pick up.

Future you will probably appreciate an off-the-shelf solution, instead of
something custom :).

It is nice seeing something like this put together though. There are so many
moving pieces in a project it can lead to decision making paralysis where you
find yourself going down rabbit holes, second guessing everything you do and
seemingly never getting ahead. At least that's what happens to me when I'm
doing something new.

~~~
fpereiro
Wow, I didn't know that AWS had a managed redis service! I thought Elasticache
was only for Elasticsearch. Thank you for pointing this out.

In my experience, setting up redis is very easy and not time-consuming; what
is not trivial is to store large amounts of data in it, which requires either
getting more RAM or considering a clustering option.

I'll very much have this option in mind for upcoming projects where I'm not
the sole owner.

Regarding Ansible vs bash/ssh, I'll probably (hopefully!) come to the point
where I'll need something more sophisticated. Ansible is definitely my
benchmark for an elegant solution to idempotent provisioning and
infrastructure as code. I might not be able to help writing something custom in js, but I
definitely will have to go beyond bash and purely imperative approaches. I'll
be sure to share what I learn along the way.

~~~
devjam
Another option over Ansible/Saltstack for managing remote hosts, is using
something like Fabric [1]. It's a "remote execution framework"; basically a
Python wrapper around SSH for defining hosts and writing tasks that are
executed in a shell on your fleet.

[1] [https://www.fabfile.org/](https://www.fabfile.org/)

~~~
fpereiro
Will take a look at Fabric, thank you!

------
marai2
I liked this comment in the article (might use a variation myself):

"If you must sting, please also be nice. But above all, please be accurate."

------
Scarbutt
_An HTTP API allows us to serve, with the same code, the needs of a user
facing UI (either web or native), an admin and also programmatic access by
third parties. We get all three for the price of one._

At a high cost: writing an intermediate JSON API is no free lunch. For a solo
dev, unless the UI is someone else's problem, ask yourself if you really
need all this (many frontends, third-party access, etc.) before blindly
following that advice. Server-rendered HTML (with some JS) might be more than
enough.

~~~
fpereiro
You raise a great point.

The web frontends I write always draw views entirely through client-side js,
and interact with the server mostly through JSON. Beyond JSON, the only things
served are the HTML to bootstrap the page and static assets (js & images). If you use server-
side rendering, writing an HTTP API might indeed represent more work. And I'm
in no position to say that client-rendering is better than server-rendering.
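
As a minimal sketch of that shape (assuming Express; the route and the data
are illustrative):

    const express = require('express');
    const app = express();

    app.use(express.static('public'));           // bootstrap HTML, js & images

    app.get('/api/items', function (req, res) {  // the same JSON endpoint serves the
      res.json([{id: 1, name: 'example'}]);      // web UI, an admin and third parties
    });

    app.listen(3000);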

Just added a clarification in this regard in the document. Thanks for bringing
this point up.

------
keyle
Fantastic article.

Less "how we used fancy schmancy crazy stuff your mom warned you about, and
you should too, or feel bad."

More real world truth.

------
randomsearch
Thanks a lot for the article! Loved it. Within my limited experience, I also
agreed a lot, which I find encouraging.

A few quick questions for the author if I may:

- Do you have recommendations for other articles and books or courses like
this?

- What’s your take on a few alternatives: Python (rather than node) and
Google Cloud or Azure (rather than Amazon)?

- Do you lose sleep worrying about the security of your servers? :-)

Thanks!

~~~
fpereiro
Glad you liked the article!

I don't have recommendations for other articles like these; most of what I
wrote came from personal experience or things I learned directly from other
people working with me. I also have learned a lot from writeups by people
running web services, especially when they encounter issues and/or suffer
downtime or data loss. These real-world lessons can be highly enlightening,
particularly if they're written in an honest way - I cannot point to any one
in particular, but that's what I would look for, especially for services that
run a stack similar to the one you're planning to use.

As for Python vs node, I recommend using the language you love the most, as
long as the runtime you use is decently performant. node itself is blazing
fast, but I would probably use it anyway if it was 10x as slow. I have no
experience running Python as a web service (I've only done some local
scripting with it) but I'm sure it can work perfectly well and I know of at
least a couple projects which use it successfully at big scales.

I recently read a writeup by DHH about Ruby representing only 15% of the
infrastructure cost of Basecamp ([https://m.signalvnoise.com/only-15-of-the-
basecamp-operation...](https://m.signalvnoise.com/only-15-of-the-basecamp-
operations-budget-is-spent-on-ruby/)). The takeaway here is that your choice
of database and architecture will probably determine your costs and
performance, not the programming language.

Regarding alternatives to AWS, I haven't taken the time to consider them,
honestly. I use AWS very sparingly (with the exception of S3 - which has a
track record that no equivalent service offers, if only for the sheer amount
of time they've been at it - and SES, which I use only because I'm already
using S3 and I might as well just have one cloud provider). For work with
clients, they all seem to prefer embracing the dominant vendor and I don't
have any powerful arguments to dissuade them from this.

At the moment I'm only responsible for a couple of services with low traffic
(and handling no financial transactions nor pacemakers), so I sleep with my
phone off. I've been responsible for largish infrastructures and in those
cases, the quality of your sleep drastically declines. Soon and hopefully,
however, I'll be running a service with a lot of user data and reliability
guarantees, so I won't be able to turn off my phone anymore at night. This
time, however, I'll be fully in control of all design decisions and I hope
that, by systematic treatment and elimination of sources of failure, outages
and emergencies should happen asymptotically less as time goes by - and with
any luck, the lessons learnt can also help others that are in the same
position.

~~~
randomsearch
Brilliant reply, thank you! I think you just persuaded me to switch to AWS.

Do you do any work with web sockets? I'm a bit worried about that wrt Python,
whether it will scale well.

I guess my biggest security concerns are messing up my REST implementation,
dependencies in Node being compromised, and vulnerability in servers I'm
using, e.g. nginx, redis.

~~~
fpereiro
I normally don't work with websockets, just JSON, but perhaps that's because
all the applications I work on are basically glorified CRUDs. If your
application is essentially a CRUD, however, JSON might be more than enough.
I've seen thousands of JSON requests/responses per second being handled by
small virtual instances running node.

The first thing I'll do as soon as the product I'm building starts bringing in
revenue will be to pay a security expert to perform a security audit of it
all. In the meantime, I try as much as possible to eliminate backdoors (use
the API for everything), simplify the data flows, drastically reduce the
amount of dependencies and of moving parts, and as much as possible use
technology that's been around for a few years (nginx, node and redis all make
the cut :).

~~~
randomsearch
As I need to push events/data (collaborative app) I'll need websockets unless
I decide to do some insane polling.

Totally with you on the security expert, definitely a priority.

Thanks a lot for taking the time to reply, you've been really helpful!

~~~
fpereiro
Using websockets for a collaborative app makes all the sense in the world
(incidentally, I realize I've never written an app of that kind).

My pleasure! Being part of this thread has been an amazing learning
experience.

------
_bxg1
Good stuff. I love seeing really practical articles that just lay out, "Given
my experience and the lessons I've learned, these are a good baseline of
general best practices".

------
a_imho
Regarding Architecture B, I was always curious why use nginx as a proxy to
node? In my experience it is perfectly capable as an application server as
well (OpenResty in particular), no need for yet another component.

Also, I consider deployment via Docker a sensible default with all the
niceties in tooling. For example I find it convenient to set a restart policy
on the container process.

~~~
fpereiro
The main reason I use nginx is because it is _so easy_ to configure HTTPS with
LetsEncrypt & Certbot. Perhaps there's an almost equally easy solution now
that only uses node, but last time I checked (2018) it was messy.

In this article ([https://medium.com/intrinsic/why-should-i-use-a-reverse-
prox...](https://medium.com/intrinsic/why-should-i-use-a-reverse-proxy-if-
node-js-is-production-ready-5a079408b2ca)) there's another good reason: by not
making node manage the HTTPS certificates, no node library that I use
(directly or indirectly through a dependency) can represent an attack surface
regarding the certificates. But I must confess this is marginal compared to
the ease of installation and maintenance that I already mentioned.

~~~
mercer
Another good reason is that once you start running multiple apps with
different tech stacks, it's _so_ much nicer to be able to take care of the
config using the regular sites-enabled 'workflow'! I have a server that runs a
node app, and a few Phoenix apps, and all I have to do on the nginx end is
forward to the correct port (and of course setting up https with LetsEncrypt
is real easy too).

------
say_it_as_it_is
Can anyone recommend resources related to CAP theorem decision making? I want
to go deeper than I have so far in understanding the tradeoffs.

------
vearwhershuh
Great article in general, but this is _absolute gold_:

"Before you rush to embrace the microservices paradigm, I offer you the
following rule of thumb: if two pieces of information are dependent on each
other, they should belong to a single server. In other words, the natural
boundaries for a service should be the natural boundaries of its data."

~~~
fjp
I'm curious what's a good rule of thumb for "dependent on each other". A
foreign-key relationship? That's... pretty much everything unless you have two
completely separate business lines.

~~~
fpereiro
This is not an easy question to answer, but in my experience it pays off to
ponder it carefully.

So far, the heuristic that seems to work better for me is to keep the overall
surface of the app small and try to keep everything in a single app. Lately,
however, I'm convinced that extracting parts of the code as separate services
(in particular, for handling users, logs, and statistics) would make the
overall structure simpler. Dissecting this a little bit, an external user
service is equivalent to middleware, whereas logs and statistics are write-
only, after-the-fact side effects (from the perspective of the app), so
bidirectional interaction with them is not needed for the purposes of serving
a particular request. What I try to avoid is to have a server route that
involves (necessarily) asynchronous logic between services to check the
consistency of an operation.

In terms of "business logic", I feel that whatever lies within a transactional
boundary should be in the same service; i.e., if two users share a resource,
both the information of who owns the resource and who can access the resource
should be in the same service. But I suspect there might be a lot more
thinking to be done in this regard.

------
swah
Thanks for the writeup - I feel like trying redis again just because it opens
some interesting possibilities (yes yes could be done in SQL whatever).

Regarding "ulog" in your example application, do you keep those lists
unbounded? Does this work well in practice?

~~~
fpereiro
I am considering moving "ulogs" out of redis and into files because of the
potential memory usage - perhaps a hybrid approach where only new logs are in
redis would work best. I don't have a definitive solution yet. However, this
would be a problem when handling tens of thousands of users or more, which
makes it a problem I'd love to have!
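
For the redis side, a rough sketch of capping the list (node_redis v3-style
callbacks; the key name and the cap are illustrative):

    const redis = require('redis');
    const client = redis.createClient();

    const logEvent = function (userId, entry, cb) {
      const key = 'ulog:' + userId;
      client.lpush(key, JSON.stringify(entry), function (err) {
        if (err) return cb(err);
        // keep only the most recent 10k entries in redis; older ones would move to files
        client.ltrim(key, 0, 9999, cb);
      });
    };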

I'm also considering letting users delete their own history of logins that are
older than N days, since it's their data in the end, not mine.

------
zeveb
My only feedback is to consider pure Debian stable rather than Ubuntu,
_particularly_ on the server side. Other than that, it sounds like a pretty
decent way to get things done.

~~~
fpereiro
I'm curious as to the advantages of using Debian over Ubuntu on the server -
is it considered generally more stable or better debugged? This is something
I've never explored and am interested in your thoughts about it.

At some point in the future, I'd like to give OpenBSD a try.

~~~
notyourday
Debian stable moves very slowly, which makes it an excellent base OS for a
server, since predictability dictates that your app brings with it whatever
it needs to run.

For example, nodejs on debian would be outdated, which would force you to
set up your app to use the specific version of nodejs that you want.

------
nymanjon
So, why not use KeyDB if you need even higher throughput? Supposedly it gets
5 times the performance.

~~~
fpereiro
Before your comment I didn't know about the existence of KeyDB, but now I do -
thank you for that!

My main issue with scaling Redis is not performance or throughput, but doing
it in a controlled, understandable manner which maintains consistency and
ideally controlling what information goes on which node. In any case, it's
great to find out about KeyDB. Thanks!

~~~
nymanjon
Thanks for the write up. I didn't know you could use Redis as a primary data
store that is reliable enough to put in production. It has really made me
curious about how to model data outside of a SQL database. For my next personal
project I think I might try it with KeyDB's free tier service (you can host
your own, but I would just use the service since the project is unlikely to
gain steam and is more for having fun).

My mind always wants to normalize the data to the extreme in a SQL
database. So it will be an interesting exercise for myself!

Glad I could introduce you to KeyDB. It is a project I found in the Hacker
Newsletter that I've been curious about for a while, but I've been thinking
SQL, SQL all the time, so I never thought I would have a use case for it.

~~~
fpereiro
I recently added a bit more detail to the document explaining when I would
use redis (and when I wouldn't) to build production systems, here:
[https://github.com/fpereiro/backendlore#redis](https://github.com/fpereiro/backendlore#redis).
For most applications, I think it's a viable alternative.

Modeling data with redis is great fun, and quite different to the relational
paradigm. I hope you really enjoy your upcoming project!

------
christiansakai
Please enlighten me about why Redis is a good use case here.

Won't Redis' cache eviction policy make it a poor fit as a general
purpose database, compared to MongoDB?

~~~
fpereiro
I added a few more notes on when/why to use Redis and when to use something
else:
[https://github.com/fpereiro/backendlore#redis](https://github.com/fpereiro/backendlore#redis).
Redis can take you surprisingly far as a general purpose database. I prefer
to use either Redis or a relational database, but I'm in no position to tell
you that you should do that too, particularly if MongoDB works well for you.

~~~
mercer
Could you elaborate on why your first choice would be Redis? I tend to go for
Postgres, but mostly out of habit and familiarity.

~~~
fpereiro
I love working with fundamental data structures (sets, lists, hashes) much
more than working with relational databases. I feel I'm more productive and my
code is shorter. It does, however, require far stricter validations in the
code; in general a lot of what a relational database does for you (in terms of
type validation, schemas and even data consistency), you must do yourself in
code or by using redis carefully.
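
A tiny sketch of what I mean (node_redis v3-style callbacks; the key names and
the validation are illustrative):

    const redis = require('redis');
    const client = redis.createClient();

    const addUser = function (id, email, cb) {
      // redis won't enforce types or schemas, so validate in code before writing
      if (typeof email !== 'string' || email.indexOf('@') === -1) {
        return cb(new Error('Invalid email'));
      }
      client.hset('user:' + id, 'email', email, function (err) {
        if (err) return cb(err);
        client.sadd('users', String(id), cb); // keep an index of all user ids in a set
      });
    };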

------
kristianp
I like the openness here. It makes me nervous that the only db used is redis.
I prefer a database that's backed by disk files, not one that's backed up
occasionally. You get the performance of an in-memory database with redis
though.

~~~
foxbarrington
Redis is my favorite database, and it’s interesting to me how many people are
afraid of it losing data.

It’s easy to use both RDB and AOF persistence methods if you want a degree of
data safety comparable to what PostgreSQL can provide you. See
[https://redis.io/topics/persistence](https://redis.io/topics/persistence)

~~~
Scarbutt
_and it’s interesting to me how many people are afraid of it losing data._

It's more about data integrity.

------
gyrgtyn
A lot of good stuff here I'm gonna steal.

How about leveldb instead of redis to start? One less moving part, though then
you might have to migrate to a networked db at some point.

abstract-blob-store might help with the s3/filesystem woes?
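
For reference, a minimal sketch of that starting point (assuming the `level`
npm package; the key and value are made up):

    const level = require('level');
    const db = level('./data'); // an embedded store: no separate server process to run

    db.put('user:1', JSON.stringify({email: 'a@example.com'}), function (err) {
      if (err) return console.error(err);
      db.get('user:1', function (err, value) {
        if (err) return console.error(err);
        console.log(JSON.parse(value));
      });
    });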

~~~
gyrgtyn
also, browserify --node or noderify for deployment?

------
jopsen
Curious, why deal with updating Ubuntu instead of just using heroku?

I've generally found it quite capable of serving node.js apps.

~~~
fpereiro
I like dealing with the underlying OS - well, it's not precisely that I like
it, but rather the absence of an intermediate layer between the OS and me. Fortunately,
dealing with the OS is very straightforward.

However, if Heroku works well for you, by all means use it! Your time is
probably better spent working on your app instead of on the infrastructure.
Things that work should only be reconsidered when they start to create real
world problems - and the changes that you make in response to real challenges
tend to be wiser and more lasting.

------
ablekh
Nice write-up. However, I would strongly recommend replacing your text-based
diagrams, which are difficult to read, with ones produced by relevant tools
(e.g., draw.io or a wonderful Mermaid project: [http://mermaid-
js.github.io/mermaid](http://mermaid-js.github.io/mermaid)).

~~~
fpereiro
Thank you for your suggestion! However, I'm a sucker for text diagrams (ever
since I saw them in Jonesforth:
[https://github.com/nornagon/jonesforth/blob/master/jonesfort...](https://github.com/nornagon/jonesforth/blob/master/jonesforth.S#L199)),
so I strongly prefer to leave them as they stand. Hopefully they're still
clear enough and get the point across.

~~~
swah
I found your code [1] very compact; it reminded me of Forth or those old
array languages. This comment kinda explains it :)

[1]
[https://github.com/altocodenl/acpic/blob/master/server.js](https://github.com/altocodenl/acpic/blob/master/server.js)

------
raziel2p
Using /tmp for a program's binary files as well as logs seems like a bad idea.

~~~
fpereiro
Please see
[https://github.com/fpereiro/backendlore/issues/5](https://github.com/fpereiro/backendlore/issues/5).

------
dangerface
Great design - simple, stable and scalable! No need for trendy gimmicks.

------
nnq
...why not use a managed db, e.g. AWS RDS, from the get-go?

~~~
fpereiro
It's actually a fine solution and I often do this with infrastructures that I
don't intend to maintain myself. In general, however, I try to minimize
complexity, and having an external service feels a tad more complex than
running the database inside the instance. But you could probably make a
successful case to the contrary - namely, that using a service rather than
running your own is actually much simpler and easier to maintain.

I must confess that in this decision, the complexity of the AWS console (which
I consider awash in buckets of accidental complexity) usually tilts my balance
towards self-hosting.

------
Scarbutt
Why not start with ELB from the beginning?

~~~
seanhunter
Because it's easy to add later and you probably don't need it at the
beginning.

~~~
Scarbutt
Using ELB (yes, even for one machine) from the start means one less process to
worry about; you get HTTPS and don't have to deal with certs, and it's also cheap.

Especially since he's striving for "simplicity" and is already in AWS.

~~~
aprdm
I feel it is one more thing to worry about, as I now have a dependency on
something that I can't run locally or control. It is also more expensive and
locked in.

For someone with experience, setting up nginx with HTTPS from letsencrypt is
about the same complexity as a hello world. Same thing for running redis
yourself instead of adding a cloud provider.

