1) No Ansible. No Kubernetes, nor containerisation. Starting on a single server/vm and controlling deploys with a bash script.
2) An interesting take on microservices: "One server, one service... 2-3k of code can hardly be called a monolith.... if two pieces of information are dependent on each other, they should belong to a single server."
As a (mainly) backend dev who does like devops and microservices, I agree on both points.
KISS - there's no need to start off with over-complicating everything for no reason. When there's a need for Ansible, introduce it at that point. When you think you need microservices, think about it a lot and move to them if you really do require them.
But certainly don't start off using a full Ansible + Kubernetes + Docker + microservices stack. You'll spend all your time managing these, and relatively little time actually being productive.
I see this a lot, and I'm always somewhat surprised by it. It took a couple of days to put together our kops-based cluster on EC2, wire it into GitLab CI and get the various load balancers, ingresses et al. plumbed in, and then perhaps an hour a week maintaining it from there?
I do agree it's not the right place to start, but I think the standard operational complexity of k8s is frequently overstated. The argument I do buy against k8s is that when it goes wrong, debugging can be a nightmare.
The other argument I have against many DevOps tools (and most software tools in general) is that the patterns of the solutions they provide don't generally match the basic patterns of the problems they are solving - the consequence is the sheer complexity of understanding the tool, especially when something goes wrong. In the past, this hasn't only cost me time, but a lot of frustration that I'd rather avoid - even at the price of having to build my own solutions.
Right, but it is a trade-off to some degree. For us, having ephemeral feature branches deployed to QA in a completely hands-off manner means we can do multiple releases per day as a small team. Most (almost all!) of the advantages we have seen from moving to a k8s infra have been DX, rather than UX.
Agree on the other points, no need for k8s or anything on a small server.
I love the idea of infrastructure as code and at some point I will try to write a small js library that performs these types of tasks - but first I'm trying to get to the point where I'll actually need it. I feel that bash won't scale beyond a certain point (or rather, a scalable bash solution would be much easier to write in js or another high level language). In any case, I'll have Ansible as my benchmark of a great solution to run a large infrastructure.
You mentioned you're still "finding an elegant way to generate, use and clean up CSRF tokens"... Consider implementing an "httponly", "samesite=strict" cookie for security checks. Apple recently got on board, so all major browsers now support this. One goal of samesite is to make CSRF tokens redundant.
We roll out at least three different cookies with varying expiration times, including one each of "strict", "lax", and "none". They are there for various reasons.
Of course additional checks occur, especially for big operations, but we no longer keep track of multiple CSRF tokens for each user/device/browser combo.
SameSite is just a property you use when setting a cookie, like the expiration, or url path. There are three settable values: None, Lax, and Strict.
'None' sends the cookie with every request, like how browsers "historically" acted.
'Lax' is the new default. If someone follows a link to your page, these cookies are sent with the request. If a third-party site POSTs to one of your pages, these cookies won't be sent.
'Strict' cookies will never be sent to your server if the request originated from a third party, even if it's as innocent as following a link from a search engine.
So use these types to your liking. Your needs vary. If you want the user to appear logged in when they follow a link to your site you're going to use None or (most likely) Lax.
As far as CSRF, you can generate a single CSRF token at login and set it in a cookie with SameSite=Strict. Regenerate it every so often, or at "big" events, and voila. One CSRF token per device session. No worries about multiple tabs open, and you have a reasonable expectation of only having to keep track of a single token at a time. The kicker is, you technically don't even have to have a CSRF token if you don't want one. If you're serious about nobody linking to an "internal" authenticated page, just set the primary session id cookie as strict. (I would personally do more than this, but hey.)
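For concreteness, here's a rough sketch of what that could look like in an Express app - the cookie names and the `createSession` stub are invented for illustration, not taken from the OP's code:

```js
const express = require('express');
const crypto = require('crypto');

const app = express();

// Stub for illustration: a real app would persist the session server-side.
function createSession(credentials) {
  return crypto.randomBytes(16).toString('hex');
}

app.post('/login', express.json(), (req, res) => {
  // ...authenticate the user first...
  const sessionId = createSession(req.body);

  // Primary session cookie: HttpOnly so scripts can't read it,
  // Strict so it is never sent on requests initiated by third-party sites.
  res.cookie('sid', sessionId, { httpOnly: true, secure: true, sameSite: 'strict' });

  // One CSRF token per device session, regenerated at login (and at "big" events).
  const csrfToken = crypto.randomBytes(32).toString('hex');
  res.cookie('csrf', csrfToken, { secure: true, sameSite: 'strict' });

  res.sendStatus(204);
});

app.listen(3000);
```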
I do secure portals. So we set several cookies across the board. Varying expiration times. They all have their purposes.
While you can consider Lax as the new default, if you omit the SameSite property on a cookie it is actually attributed as something referred to as 'NotSet', which is unsettable and acts as a hybrid between Lax and None. Browsers vary a little here, but it's only to maintain reasonable backwards compatibility at the moment, and both Mozilla and Chrome developers plan to remove this hybrid functionality in the future. Basically... a cookie will act like None for 2 minutes, then act like Lax after that, etc.
Since I want to support as many browsers as possible, I'm going to still use CSRF tokens, but a single token per session as you suggested. I just outlined the approach here: https://news.ycombinator.com/item?id=22268152 .
Thanks for your detailed feedback, it's very valuable to me.
I've used S3 in past projects. These numbers are in line with my experience. And frankly, I'm good with them. S3 is dirt cheap compared to EBS, and IMO it doesn't generate a lot of production incidents (compared to, say, AWS RDS MySQL). The latency is on the high side compared to some things, but if the user is doing an "upload" operation of some sort, 50ms is tolerable, IMO. And it always seems to be way more efficient than just about everything else; in particular, it's usually next to some Mongo/SQL DB that is struggling under the load of feature creep. S3 has basically never been on my bottleneck radar.
The biggest thing I find people doing wrong, particularly in Python, is a. failing to keep the connection open (and so paying the TCP & TLS handshakes repeatedly when they don't need to, which adds a significant latency hit) and b. making boto calls they don't need to, such as "get_bucket" which in boto will check for the bucket's existence. You almost always can guarantee the bucket exists as a precondition of normal system operation… and the next call to GET/PUT object will fail anyways if something is amiss.
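The same advice carries over to node (the OP's stack): create one S3 client with a keep-alive agent at startup and reuse it, and skip existence checks. A small sketch with the aws-sdk v2 client - the bucket name is a placeholder:

```js
const https = require('https');
const AWS = require('aws-sdk');

// One shared client, created at startup: keep-alive avoids paying the
// TCP/TLS handshake on every request.
const s3 = new AWS.S3({
  httpOptions: { agent: new https.Agent({ keepAlive: true }) },
});

async function saveUpload(key, body) {
  // No getBucket/headBucket first: assume the bucket exists as a precondition,
  // and let putObject fail loudly if something is actually wrong.
  await s3.putObject({ Bucket: 'my-app-uploads', Key: key, Body: body }).promise();
}
```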
Regarding S3 pricing, ~0.09 USD per downloaded GB could potentially become a significant bill (~90 USD per TB). I'm building a service that serves pictures and limiting the financial downside is very important for the economics of the business. More than the actual magnitude, the certainty that costs are a linear and predictable function of storage makes me sleep better.
An implicit point that I just realized by re-reading your comment: if you have a significant amount of data (let's say, > 100GB) a local FS is only cheaper if you're using instances or dedicated servers with large disks - and probably this is prohibitively expensive in AWS (EBS). I'm particularly considering Hetzner dedicated servers with large disks.
I currently run a web service from North American servers and the performance here in Europe is quite good. I think this experience - very, very limited - gives me confidence that the wires across the world are good enough, provided your data center is close to the main pipes.
This writeup in particular is great because it starts with simple architectures and progresses to horizontal scaling with a load balancer -- it really is 75% of what you'd need to actually implement a production-ready application backend, with clear links to get the other 25%.
Doesn't Redis generally keep all keys in memory and have features for deleting keys? (mostly meant as a cache)
After some research I concluded my approach was still preferable, at least for now, but I still felt some unease because everyone seems to be using Docker and whatnot. Your article really helped me feel more confident in my choices. Thanks!
1. I already know how nginx/apache/linux work, so it's been pretty easy to spin up a new VPS and configure it to handle potentially multiple apps, rather than learning how to work with Docker or the like.
2. I run pretty much the same stack everywhere (Phoenix/Elixir) and I don't foresee running into issues where I need to run multiple versions of, say, elixir or node.
3. For deployment I quite like git + hooks to handle things, and because of some of Elixir's particulars, it's real easy to deploy a new version and recompile without having to completely restart the app.
4. I don't like the idea of adding another layer of complexity. The way I see it, if anyone else needs to work with me on the 'devops' part, they better know how to work with linux/nginx anyways. And if they already know, they can pick things up pretty quickly. Knowing how to use Docker would just add another thing for them to learn, another thing for me to keep an eye on, and another thing to follow updates and security issues on.
5. I'm very much of the article's school of thought that scaling stuff on a single server for quite a long time is possible. Perhaps especially with Phoenix/Elixir. So at worst I could see myself moving an app to a separate VPS, but I don't need to 'orchestrate' things and whatnot.
6. I haven't properly researched this, but I suspect that, for quite a few of our clients, running things on a separate server is a legal requirement. A container would not only be an alien concept to them, but it might very well just not be allowed (would love to hear input on this though).
tl;dr: I just don't need it. I can spin up a server and get everything set up in < 30 mins, some of it done manually and some of it with some simple bash scripts, and I can't think of a very good reason to learn and implement another layer that sits in between my server and the running app(s).
All that said, I don't know what I don't know, so I'd really love to hear where perhaps I might benefit from using Docker/Ansible or the like! I'm not at all against these tools or anything.
EDIT: I'll add that I do think perhaps Docker might be useful for local development, especially when we start hiring other devs. Am I correct in assuming that's one of the use cases?
As I understand it, something like that is supposed to be one of the big draws: Since everything runs in a container the environment is the same regardless of the system packages, so you shouldn't have any "works on my machine" bugs.
That said, we had a session recently where a system that was just converted to docker was being handed off from development to maintenance, that also acted as a crash course for anyone interested in docker on the other development teams. I think only a fraction of us actually got it working during that session, the rest having various different issues, mostly with docker itself rather than the converted system.
But I recall doing contract work where it took me and every new developer literally a day (at least) to get their Ruby on Rails stack working with the codebase, and where we often ran into issues getting multiple projects running alongside on account of them needing different versions of gems or Ruby. I can see how Docker would be a great timesaver there.
I started using ASDF for Elixir projects, but over time I've basically replaced RVM, NVM and all other similar tools with it.
But honestly in my current work I've not needed it yet. As far as Elixir/Phoenix goes, things have been stable enough so far that I've not needed to run multiple versions, and I try to rely on node-stuff as little as possible (more and more LiveView), so I can get away with either updating everything at once, or leaving things be.
Furthermore, when various apps diverge too much, I usually find there's a need to run them on separate VPS', which I find cleaner and safer than trying to make them run alongside each other.
But thanks for reminding me about ASDF. It seems like a much nicer solution than using multiple tools!
Docker for me became easier after the whole switchover and not being involved in Linux land for a few years. Many parts that I’d built up muscle memory for years back are different. Now I log in to a recent rev of a Linux distro and I have to spend time looking up how network interfaces are now handled, how to create a service, how to look at logs. The Docker CLI has been a more stable “api” for me the past few years. Though lately I’m doing embedded, so I just took a Nerves x86 image and got it running on a VPS. A minute of downtime when updating a rarely used service seems worthwhile.
Yup. I have a few umbrella apps where I'll do Application.stop() and .start(), and sometimes recompiling isn't enough, but 99% of the time it's all I need to do.
For embedded stuff I suppose Docker could be very useful too. Any reason you don't run Erlang 'bare' and use releases? Apologies if that question makes no sense; I have no experience (yet) in that area but the coming year I'm excited to start working with Nerves and various IoT stuff :).
* Use `npm ci` instead of `npm i --no-save`. This gives you the exact packages from the lockfile, which is more reliable.
* You might like systemd .service files instead of `mon`. It's available by default on ubuntu LTS, and you're down a dependency that way. The .service files can be really simple: https://techoverflow.net/2019/03/11/simple-online-systemd-se...
- I don't use lockfiles - instead I use very very few dependencies, and I always target a specific version of each of them. I know this is not the same as a shrinkwrap, but I'm crazy enough to feel that package-lock.json adds visual bloat to my repo.
- I am biased towards mon because it's probably less susceptible to changes over time than Ubuntu itself, and because it can be installed with a single command. But systemd definitely is a good option too if you're using Ubuntu!
Lock files are used to lock dependency versions all the way down your dependency tree, not just your immediate dependencies.
Thanks everyone for pointing out this issue.
I wonder whether one of the npm alternatives has a "--sane" mode that would always pick the oldest possible dep...
Once/if postgres becomes a bottleneck I simply add redis as a cache; the next step is doing some stuff async with RabbitMQ + Celery, and you can scale really really really far...
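In node terms (since that's the OP's stack), that first caching step is usually a small cache-aside wrapper along these lines - the query helper and key names are stand-ins, not a real API:

```js
const Redis = require('ioredis');
const redis = new Redis(); // defaults to 127.0.0.1:6379

// Stand-in for the real Postgres query; swap in your pg/knex call here.
async function queryPostgres(sql, params) {
  return { sql, params }; // placeholder result
}

// Cache-aside: check Redis first, fall back to Postgres, cache the result with a TTL.
async function getProduct(id) {
  const key = `product:${id}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const row = await queryPostgres('SELECT * FROM products WHERE id = $1', [id]);
  await redis.set(key, JSON.stringify(row), 'EX', 300); // expire after 5 minutes
  return row;
}
```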
Much like you, I have a Django template project and use Dokku for deployment, so I can use the template to create a new project, commit, push, and it's already up on production.
Having all the best practices and boilerplate already sorted helps immensely.
And now we have the same thing for API Platform, NestJS, Vue, NextJS... Even React-Native with Fastlane and CodePush :) It's been invaluable, and a very strong commercial argument
Creating a new project is simply "make generate" from our generator, then answering a bunch of questions about what tech stack you need. This works mostly with templating (using Plop) and a sprinkle of automation for shell tasks (eg `pipenv install`)
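A minimal plopfile for that kind of generator might look roughly like this - the prompts, paths and template names are invented for the example:

```js
// plopfile.js - a rough sketch of a prompt-driven project generator.
module.exports = function (plop) {
  plop.setGenerator('service', {
    description: 'Scaffold a new project',
    prompts: [
      { type: 'input', name: 'name', message: 'Project name?' },
      { type: 'list', name: 'stack', message: 'Tech stack?', choices: ['api-platform', 'nestjs', 'nextjs'] },
    ],
    actions: [
      // Copy templated files into the new project directory.
      { type: 'add', path: '{{name}}/package.json', templateFile: 'templates/{{stack}}/package.json.hbs' },
      { type: 'add', path: '{{name}}/README.md', templateFile: 'templates/{{stack}}/README.md.hbs' },
      // Shell tasks (e.g. `pipenv install`) would run separately after generation.
    ],
  });
};
```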
This is something that did not occur to me until I had done a lot of consulting for small and medium size companies. All of them have the same basic challenges - they don't have tooling to do basic crap. For example, they can blast some notification to everyone that is registered in their service, but sending an email message to a specific set of people requires a workflow deploy. Same goes for provisioning a new domain and wiring it up to a nodejs backend, etc, etc, etc.
Automate basic devops first, from DNS management to email sending to deploys to workflows, etc. Think of every project you are going to do later as being a "client" of your devops "infrastructure".
Even though there is some mention of one system doing multiple things, it would really help if OP added a section justifying the all-js approach - and especially, what OP's opinion is on programming in 'JS The Good Parts' (a la Crockford) vs 'anything goes'.
I believe OP may have good opinions about how to approach JS programming. It would be great if that is added to the document.
In any case, I highly appreciate OP sharing this work. Thank you.
edit: OP addresses part of this in his comment
- I chose js/node as my server-side language. This is because I love both js the language and node itself, and so far haven't seen the need to try something else. That doesn't mean, however, that it should be the solution for everyone. Particularly if you don't like js - you're probably better off programming in a language you enjoy!
Additionally it offers better fault tolerance and zero-downtime online upgrades. For more advanced use cases it can be deployed in a geo-replicated topology and configured with partitioning for low-latency transactions when confined to a single region, while providing strong consistency and a single logical database globally.
Disclaimer: I work on cockroach.
I've found linux containers to be fantastic for both replicating pieces of the environment and making your e2e tests a little more official. Running postgres itself (migrating it like you would your real app, inserting data like the app would, then tearing it all down in seconds), apps like mailcatcher and actually letting your app send mail (this way you don't even need a local/log-only implementation), running redis locally, and replicating S3 with a project like minio is very easy and fast.
You can take this even further also by building a way to completely spin up your production environment to a certain point in time locally -- though all the resources for the ecosystem of apps may not be on the same VM image (otherwise you could just snapshot and technically go to the point in time), if you can assemble the state and the backups, you can hook them all up to the containers running locally (or in a VM running locally to be fair) and get an approximation of your entire system.
Of course, it is somewhat less true-to-production (given that production is you running a VM), but you can remove that barrier with things like linuxkit.
In general, I don't use Vagrant. I simply use my host OS (Ubuntu) and my local node & redis. There's usually no further moving parts than these, so it is easy to achieve development in an environment that is practically equivalent to the Ubuntu servers/instances in the cloud, the main difference being a different folder path, different files stored and different info on the DB.
When running the app locally, I still use services like S3 and SES - although I might not send emails, to keep things as "real" as possible.
I do use Vagrant when the project requires installation of other tools beyond my basic toolkit (like MongoDB or Elasticsearch) - in this case, I want that software to stay within the VM.
In general, I'd rather avoid virtualization because it represents an extra layer, but if it is necessary for idempotence or having a dev environment that mimics the prod environment, I'd definitely embrace it.
Containerization is actually not virtualization (which is why it was in quotes earlier); containers are basically better-sandboxed local processes (just like your local redis instance), and the better sandboxing has benefits for both development and production. So if you squint, it's actually very similar to just running redis yourself on a similar OS - just now when you shut it down you don't have to manage any folders.
Of course, use what works -- there's no need to fix things that aren't broken, but there is a reason that the defacto deployment artifact of the modern application is very quickly becoming a container (if it isn't already) -- in relatively trusted environments they give just the right amount of isolation and reproducibility, and centralized management (systemd even has space for containers via systemd-nspawn).
[EDIT] - somewhat unrelated, but if you haven't, I'd really suggest you give GitLab a try for your source code management -- not only does it come with a free-to-use docker image registry, it comes with a fantastically easy CI platform which can make automated testing easier. You get 2000 free minutes a month, and actually if you bring your own hardware (which is as easy as running a container on your local machine and pointing it at gitlab.com to run your jobs for you) it's all free. There are lots of other integrations and benefits (kanban-style issue boards, gitlab pages, wiki, etc), but it's one of my go-tos for small projects and I introduce it to corporate clients every chance I get. Microsoft-owned GitHub is doing its best to compete now that it has endless pockets, but GitLab has been offering an amazing amount of value for free for a long time now.
This is probably the answer that Google App Engine provides: all the bricks needed to replace Redis or the FS or S3, and all you need to do is write your own application code. All the provisioning and monitoring is taken care of. Is there an open alternative to it that I could run locally or on my own server?
That's the million dollar question! It also vexes me that all of this is necessary. I hope like crazy that there will be a radically better approach in 5-10 years. I am not aware of solutions that truly abstract all these problems without creating 1) further, larger problems; and/or 2) severe vendor lock-in.
I also hope that at one point I can just write my code, wrap it behind a simple HTTP API, and give it to some orchestrator that manages everything (monitoring, logs, deployment, TLS, gives me a FS API, manages authentication, ...). I feel like web platforms such as OpenResty (https://openresty.org/en/), Caddy (https://caddyserver.com/) already provide part of it, but can probably go a bit further
https://github.com/dokku/dokku/releases shows a release a week ago, and https://github.com/dokku/dokku/pulse/monthly shows pull requests and issues being addressed.
Seems dokku picked up pace since.
Specifically, I used alt version for the nginx setup
It shouldn't be too expensive and can save you a lot of time in setup.
I think I'd also recommend using ansible instead of plain SSH. It has a lot of stuff built in and for simple deployments shouldn't be too hard to pick up.
Future you will probably appreciate an off-the-shelf solution, instead of something custom :).
It is nice seeing something like this put together though. There are so many moving pieces in a project it can lead to decision making paralysis where you find yourself going down rabbit holes, second guessing everything you do and seemingly never getting ahead. At least that's what happens to me when I'm doing something new.
I only set up a few VMs a year, and found the open-source Ansible modules I used weren't maintained any longer or weren't updated. So I was having to maintain them to keep my playbook working.
In the end I decided bash plus some manual work to copy / update files was going to be quicker overall.
In my experience, setting up redis is very easy and not time-consuming; what is not trivial is to store large amounts of data in it, which requires either getting more RAM or considering a clustering option.
I'll very much have this option in mind for upcoming projects where I'm not the sole owner.
Regarding Ansible vs bash/ssh, I'll probably (hopefully!) come to the point where I'll need something more sophisticated. Ansible is definitely my benchmark for an elegant solution to idempotent provisioning and infrastructure as code. I might not be able to help myself writing something custom in js, but I definitely will have to go beyond bash and purely imperative approaches. I'll be sure to share what I learn along the way.
"If you must sting, please also be nice. But above all, please be accurate."
Writing an intermediate JSON API comes at a high cost - it's no free lunch. For a solo dev, unless the UI is someone else's problem, ask yourself if you really need all this (many frontends, third-party access, etc.) before blindly following that advice. Server-rendered HTML (with some JS) might be more than enough.
The web frontends I write always draw views entirely through client-side js, and interact with the server mostly through JSON. The only HTML served is that to bootstrap the page, plus static assets (js & images). If you use server-side rendering, writing an HTTP API might indeed represent more work. And I'm in no position to say that client-rendering is better than server-rendering.
Just added a clarification in this regard in the document. Thanks for bringing this point up.
But still when I see how locked down applications are today and all the bloat in clients, I wish more services would provide an API to allow people to build their own clients.
Less "how we used fancy schmancy crazy stuff your mom warned you about, and you should too, or feel bad."
More real world truth.
A few quick questions for the author if I may:
- Do you have recommendations for other articles and books or courses like this?
- What’s your take on a few alternatives: Python (rather than node) and Google Cloud or Azure (rather than Amazon)?
- do you lose sleep worrying about the security of your servers? :-)
I don't have recommendations for other articles like these; most of what I wrote came from personal experience or things I learned directly from other people working with me. I also have learned a lot from writeups by people running web services, especially when they encounter issues and/or suffer downtime or data loss. These real-world lessons can be highly enlightening, particularly if they're written in an honest way - I cannot point to any one in particular, but that's what I would look for, especially for services that run a stack similar to the one you're planning to use.
As for Python vs node, I recommend using the language you love the most, as long as the runtime you use is decently performant. node itself is blazing fast, but I would probably use it anyway if it was 10x as slow. I have no experience running Python as a web service (I've only done some local scripting with it) but I'm sure it can work perfectly well and I know of at least a couple projects which use it successfully at big scales.
I recently read a writeup by DHH about Ruby representing only 15% of the infrastructure cost of Basecamp (https://m.signalvnoise.com/only-15-of-the-basecamp-operation...). The takeaway here is that your choice of database and architecture will probably determine your costs and performance, not the programming language.
Regarding alternatives to AWS, I haven't taken the time to consider them, honestly. I use AWS very sparingly (with the exception of S3 - which has a track record that no equivalent service offers, if only for the sheer amount of time they've been at it - and SES, which I use only because I'm already using S3 and I might as well just have one cloud provider). For work with clients, they all seem to prefer embracing the dominant vendor and I don't have any powerful arguments to dissuade them from this.
At the moment I'm only responsible for a couple of services with low traffic (and handling no financial transactions nor pacemakers), so I sleep with my phone off. I've been responsible for largish infrastructures and in those cases, the quality of your sleep drastically declines. Soon and hopefully, however, I'll be running a service with a lot of user data and reliability guarantees, so I won't be able to turn off my phone anymore at night. This time, however, I'll be fully in control of all design decisions and I hope that, by systematic treatment and elimination of sources of failure, outages and emergencies should happen asymptotically less as time goes by - and with any luck, the lessons learnt can also help others that are in the same position.
Do you do any work with web sockets? I'm a bit worried about that wrt Python, whether it will scale well.
I guess my biggest security concerns are messing up my REST implementation, dependencies in Node being compromised, and vulnerability in servers I'm using, e.g. nginx, redis.
The first thing I'll do as soon as the product I'm building starts bringing in revenue will be to pay a security expert to perform a security audit of it all. In the meantime, I try as much as possible to eliminate backdoors (use the API for everything), simplify the data flows, drastically reduce the amount of dependencies and moving parts, and as much as possible use technology that's been around for a few years (nginx, node and redis all make the cut :).
Totally with you on the security expert, definitely a priority.
Thanks a lot for taking the time to reply, you've been really helpful!
My pleasure! Being part of this thread has been an amazing learning experience.
src: "GCP Cloud Architect" cert and lots of hands-on experience with the other two.
Also, I consider deployment via Docker a sensible default with all the niceties in tooling. For example I find it convenient to set a restart policy on the container process.
In this article (https://medium.com/intrinsic/why-should-i-use-a-reverse-prox...) there's another good reason: by not making node manage the HTTPS certificates, no node library that I use (directly or indirectly through a dependency) can represent an attack surface regarding the certificates. But I must confess this is marginal compared to the ease of installation and maintenance that I already mentioned.
Also Fail2Ban reads Nginx logs
Public Web -> Reverse Proxy -> Application Server
Public Web -> Nginx -> Node (In his example)
Public Web -> Node reverse proxy -> Node Application Server
Your reverse proxy can also handle blocking dodgy requests (Fail2Ban) or act as a lightweight static file handler, leaving your application server to do the grunt work.
It's amazing that this exists, I would have never thought it possible (because of what I saw as inherent limits in what can be expressed in terms of nginx configuration), but now that I see it, of course it is :). From what I gather, custom code can be written as Lua scripts, and many of the modules are written in Lua themselves.
Thank you for sharing this!
Are you building a project (or projects) on OpenResty?
> [nginx's] main use is to provide HTTPS support
I've been using nginx purely for HTTPS stuff because I have been too lazy to learn how to do it with node. When working on little no-user-info sites that don't need SSL, I have cloudflare in front with no encryption to the origin (just an A record pointing at the IP of the server), and then just have a simple express app which proxies the requests on port 80 to apps on 3000, 3001, 3002, etc. based on the host header - makes me so happy compared to playing around with nginx config files.
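For what it's worth, a bare-bones version of that host-header proxy can be put together with the http-proxy package - something like the sketch below, where the hostnames and ports are placeholders:

```js
const http = require('http');
const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer();

// Map incoming Host headers to local app ports (placeholders).
const apps = {
  'blog.example.com': 3000,
  'shop.example.com': 3001,
};

http.createServer((req, res) => {
  const host = (req.headers.host || '').split(':')[0];
  const port = apps[host];
  if (!port) {
    res.writeHead(502);
    return res.end('Unknown host');
  }
  // Forward the request to the matching local app; report upstream failures as 502.
  proxy.web(req, res, { target: `http://127.0.0.1:${port}` }, () => {
    res.writeHead(502);
    res.end('Upstream error');
  });
}).listen(80);
```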
"Before you rush to embrace the microservices paradigm, I offer you the following rule of thumb: if two pieces of information are dependent on each other, they should belong to a single server. In other words, the natural boundaries for a service should be the natural boundaries of its data."
So far, the heuristic that seems to work better for me is to keep the overall surface of the app small and try to keep everything in a single app. Lately, however, I'm convinced that extracting parts of the code as separate services (in particular, for handling users, logs, and statistics) would make the overall structure simpler. Dissecting this a little bit, an external user service is equivalent to middleware, whereas logs and statistics are write-only, after-the-fact side effects (from the perspective of the app), so bidirectional interaction with them is not needed for the purposes of serving a particular request. What I try to avoid is to have a server route that involves (necessarily) asynchronous logic between services to check the consistency of an operation.
In terms of "business logic", I feel that whatever lies within a transactional boundary should be in the same service; i.e., if two users share a resource, both the information of who owns the resource and who can access the resource should be in the same service. But I suspect there might be a lot more of thinking to be done in this regard.
Regarding "ulog" in your example application, do you keep those lists unbounded? Does this work well in practice?
I'm also considering letting users delete their own history of logins that are older than N days, since it's their data in the end, not mine.
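If it helps, one way to keep that bounded is to store each user's log as a sorted set scored by timestamp, so anything older than N days can be trimmed cheaply. A small sketch with ioredis - the key names are illustrative, not the OP's actual schema:

```js
const Redis = require('ioredis');
const redis = new Redis();

// Record a login, scored by timestamp so old entries can be removed by age.
async function recordLogin(userId, entry) {
  const now = Date.now();
  await redis.zadd(`ulog:${userId}`, now, JSON.stringify({ ...entry, at: now }));
}

// Drop everything older than maxAgeDays (run on login, or as a periodic job,
// or expose it so users can prune their own history).
async function trimLogins(userId, maxAgeDays) {
  const cutoff = Date.now() - maxAgeDays * 24 * 60 * 60 * 1000;
  await redis.zremrangebyscore(`ulog:${userId}`, '-inf', cutoff);
}
```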
At some point in the future, I'd like to give OpenBSD a try.
For example, nodejs on debian would be outdated, which would force you to setup your app to use a specific version of nodejs that you want to use.
My main issue with scaling Redis is not performance or throughput, but doing it in a controlled, understandable manner which maintains consistency and ideally controlling what information goes on which node. In any case, it's great to find out about KeyDB. Thanks!
My mind always wants to normalize the data to the extreme in a SQL database. So it will be an interesting exercise for myself!
Glad I could introduce you to KeyDB. It is a project I found in the Hacker Newsletter that I've been curious about for a while, but I've been thinking SQL, SQL all the time, so I never thought I would have a use case for it.
Modeling data with redis is great fun, and quite different from the relational paradigm. I hope you enjoy your upcoming project a lot!
Won't Redis' cache eviction policy make it a poor fit as a general-purpose database, compared to, say, MongoDB?
> _redis can be replaced or complemented by another NoSQL database (such as MongoDB) or a relational database (such as Postgres). Throughout this section, where you see redis, feel free to replace it with the database you might want to use._
It’s easy to use both RDB and AOF persistence methods if you want a degree of data safety comparable to what PostgreSQL can provide you. See https://redis.io/topics/persistence
It's more about data integrity.
Lots more info here: https://redis.io/topics/persistence
Also please be aware that if you use AOF persistence, and need to restore a redis instance from an AOF file, you MUST have at least the same amount of RAM available as the AOF file is in size or you cannot restore, and you can even corrupt the AOF in that instance.
I work somewhere that previously (before I started) decided to use redis as a database for many applications and it has caused a lot of pain and the engineering department severely regrets it. I do still recommend it as a great cache/session store, especially if you outsource it to elasticache.
Definitely agree. I would not, for example, create a financial transaction system that is not backed by an ACID database that persists to disk.
If you have time/interest, I'd love to know more about the problems encountered by your team when using redis. In my experience, the main problem is scaling it to multiple instances (because RAM grows linearly with use). If the team also encountered other reliability issues, it would be valuable knowledge to me and perhaps others.
Redis still has an edge on use cases with auto-expiration.
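For example, a session cache where Redis does the cleanup for you - a small sketch with ioredis, with made-up key names:

```js
const Redis = require('ioredis');
const redis = new Redis();

// EX attaches a TTL in seconds; Redis removes the key on its own once it expires.
async function cacheSession(sessionId, data) {
  await redis.set(`session:${sessionId}`, JSON.stringify(data), 'EX', 60 * 60 * 24); // 24 hours
}

async function getSession(sessionId) {
  const raw = await redis.get(`session:${sessionId}`);
  return raw ? JSON.parse(raw) : null; // null once the key has expired
}
```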
Besides having all the data structures, I love just connecting to Redis without specifying a database name (there's just a database number, which defaults to 0), a database user, or needing to create a table or a schema. Redis is, so to speak, just there. Not saying this is better than Postgres objectively - I just like it more.
As for a redis-like disk-first database, it would be amazing, but I don't know if it is really possible (unless performance was severely sacrificed). This could be an interesting area to explore.
How about leveldb instead of redis to start? One less moving part, though then you might have to migrate to a networked db at some point.
abstract-blob-store might help with the s3/filesystem woes?
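In case it's useful, the embedded flavour of that looks roughly like this with the `level` package (assuming a recent version with the promise-based API) - no separate server process to run:

```js
const { Level } = require('level');

// Data lives in a local directory; valueEncoding handles JSON (de)serialisation.
const db = new Level('./data', { valueEncoding: 'json' });

async function main() {
  await db.put('user:1', { name: 'Ada', plan: 'free' });
  console.log(await db.get('user:1')); // { name: 'Ada', plan: 'free' }
}

main();
```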
I've generally found it quite capable of serving node.js
However, if Heroku works well for you, by all means use it! Your time is probably better spent working on your app instead of in the infrastructure. Things that work should only be reconsidered when they start to create real world problems - and the changes that you do in response to real challenges tends to be wiser and more lasting.
It's really nice when you can type "less readme.md" in a shell and actually get something useful. Markdown was designed to be human-readable as plain text too.
plain text is hard to read? I've heard it all
Diagrams not so much.
I must confess that in this decision, the complexity of the AWS console (which I consider awash in buckets of accidental complexity) usually tilts my balance towards self-hosting.
If you don't need the features (updates, slaves etc) you can get away with just a normal DB in an EC2 instance.
Since he's striving for "simplicity" and already is in AWS.
For someone with experience, setting up nginx with https from letsencrypt is the same complexity as a hello world. Same thing for running redis yourself instead of adding a cloud provider.