The former examples are all managed! That's amazing for scaling teams.
(SQL can be managed with, say, RDS. Sure. But it's not the same level of managed as Dynamo (or Firebase or something like that). It still requires maintenance and tuning and upkeep. Maybe that's fine for you (remember: the point of this article was to tell you to THINK, not to just ignore any big tech products that come out). But don't discount the advantage of true serverless.)
My goal is to be totally unable to SSH into anything that powers my app. I'm not saying that I want a stack where I don't have to. I'm saying that I literally cannot, even if I wanted to real bad. That's why serverless is the future; not because of the massive scale it enables, but because fuck maintenance, fuck operations, fuck worrying about buffer overflow bugs in OpenSSL, I'll pay Amazon $N/month to do that for me, all that matters is the product.
> I'll pay Amazon $N/month to do that for me
Until you pay Amazon $N/month to provide the service, and then another $M/month to a human to manage it for you.
In this case you're only shifting the complexity from "maintaining" to "orchestrating". "Maintaining" means you build (in a semi-automated way) once and most of your work is spent keeping the services running. With "orchestrating", you spend most of your time building the orchestration and little time maintaining.
If your product is still small, it makes sense to keep most of your infrastructure in "maintaining" since the number of services is small. As the product grows (and your company starts hiring ops people), you can slowly migrate to "orchestrating".
I see this a lot and it bugs me, because it implies that it's all zero sum and there's nothing that's ever unconditionally better or worse than anything else. But that's clearly ridiculous. A hammer is unconditionally better than a rock for putting nails in things. The beauty of progress is that it enables us to have our cake and eat it too. There is no law that necessitates a dichotomy between powerful and easy.
All tools are better at one thing than another. The point of examining trade-offs is to decide which advantages are more appropriate to which circumstance.
Not necessarily - to start with, a good trade-off is unconditionally better than a bad trade-off.
Also, progress brings with it increasing complexity. Recognizing the best path means assessing many more parameters and is far more difficult than deciding whether to use a hammer or a rock to drive nails. The puzzling tendency of many people to over-complicate things instead of simplifying them makes the challenge even more difficult.
By the way, the closest thing I've ever found to a silver bullet in software development (and basically any endeavor) is "Keep it simple". While this is a cliche already, it is still too often overlooked. I think this is because it isn't related to the ability to be a master at manipulating code and logic, but the ability to focus on what's really important - to know how to discard the trendy for the practical, and the adventurous methods for the focused and solid ones - basically passionately applying Occam's razor on every abstraction layer. If this were more common, I think articles like "You are not Google" would be less common.
Are you better off with a simple dokku server, or a k8s cluster?
There's lots of good/bad on both sides... it really depends on your needs. You could use one locally or through dev layers and another for production.
Hammer pros: very efficient
Hammer cons: costs money, only available in industrialized societies
Rock pros: ubiquitous and free
Rock cons: inefficient
The point is that in tech, there is no choice which has only pros, and no choice which has only cons.
Given any important problem field, it's quite likely that among the extant top solutions there is no product that is strictly dominant (better in every dimension) - most likely there will be trade offs.
However, if you bend any nails you can't pull the nail out like with a normal framing hammer. Plus you can drive a nail faster with a proper hammer. So just use a hammer.
Problem is, with software the tool also becomes part of what you're working on. So it's never quite like this hammer and stone analogy.
Really a better analogy would be fasteners. So for instance we just have screws, nails, and joints. A simplified comparison would look like this, and they all become part of what you're building:
Nails:
- Fast to drive
- Still pretty strong
- Easier to remove mistakes or to disassemble
- Not as strong as other methods of fastening
- May crack wood if driven at the end of a board

Screws:
- Almost as fast as nails with a screw gun
- Stronger joint than nails
- Can crack wood like nails without pre-drilling
- Slower to remove

Joints:
- As strong as the wood used
- Last as long as the wood
- Very slow; requires chiseling and cutting wood into tight interlocking shapes
- If it's any type of joint with adhesive, it can't be taken apart
Screws can be pretty finicky; for example, using a too-small Phillips screwdriver can strip the screw-head and make it very difficult to tighten further or to remove.
I'm pretty fond of those screws that can take a flathead or Phillips screwdriver, though.
How about nuts+bolts?
For instance I made an email newsletter system which handles subscriptions, verifications, unsubscribes, removing bounces, etc. based on Lambda, DynamoDB, and SES. What's nice about it is that I don't need to have a whole VM running all the time when I just have to process a few subscriptions a day and an occasional burst of work when the newsletter goes out.
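For a sense of scale, one of those handlers is only a handful of lines; a rough sketch (the table name, field names, and sender address here are invented for the example, not the actual system):

```python
# Hypothetical sketch of a subscription handler running on Lambda.
# Table name, field names, and sender address are illustrative only.
import os
import boto3

dynamodb = boto3.resource("dynamodb")
ses = boto3.client("ses")
TABLE = dynamodb.Table(os.environ.get("SUBSCRIBERS_TABLE", "newsletter-subscribers"))

def handler(event, context):
    """Triggered by API Gateway when someone submits the subscribe form."""
    email = event["queryStringParameters"]["email"]

    # Record the pending subscription; a separate handler would flip
    # "verified" to True when the confirmation link is clicked.
    TABLE.put_item(Item={"email": email, "verified": False})

    # Send the verification mail through SES.
    ses.send_email(
        Source="newsletter@example.com",
        Destination={"ToAddresses": [email]},
        Message={
            "Subject": {"Data": "Please confirm your subscription"},
            "Body": {"Text": {"Data": "Click the link in this mail to confirm."}},
        },
    )
    return {"statusCode": 200, "body": "Check your inbox to confirm."}
```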
It is strictly for dev and staging. Not actual production use because prod doesn't exist yet anyways.
My question is what kind of maintenance should I be doing? I don't see any maintenance. I run apt upgrade about once a month. I'm sure you'd probably want to go with something like Google Cloud or Amazon or ElephantSQL for production purely to CYA but other than that if you don't anticipate heavy load, why not just get a cheap VM and run postgresql yourself? I mean I think ssh login is pretty safe if you disable password login, right? What maintenance am I missing? Assuming you don't care about the data and are willing to do some work to recreate your databases when something goes wrong, I think a virtual machine with linode or digital ocean or even your own data center is not bad?
Just like early electricity, people ran their own generators. That got outsourced, so the standard "sparky" wouldn't have the faintest idea of the requirements of the generation side, only the demand side.
The same is happening to programming.
2. The effects of the back end have a nasty habit of poking through in a way that the differences between windmill generators and pumped hydraulic storage don't.
"For example, we ran Basecamp on a single server for the first year. Because we went with such a simple setup, it only took a week to implement. We didn’t start with a cluster of 15 boxes or spend months worrying about scaling.
Did we experience any problems? A few. But we also realized
that most of the problems we feared, like a brief slowdown,
really weren’t that big of a deal to customers. As long as you keep people in the loop, and are honest about the situation, they’ll understand."
I wouldn’t work for a company that expects devs to both develop and manage resources that could instead be managed by a cloud provider.
How well can you “manage” a Mysql database with storage redundancy across three availability zones and synchronous autoscaling read replicas?
How well can you manage an autoscaling database that costs basically nothing when you’re not using it but scales to handle spikey traffic when you do?
If you think using a cloud "service" out-of-the-box will "just work" your scale is probably small enough that a single server with the equivalent package installed with the default settings is going to "just work" too.
We could also have 4 other servers running all of the time even when we weren’t demoing anything in our UAT environment.
We could also not have any redundancy and separate out the reads and writes.
No one said the developers didn’t need to understand how to do it. I said we didn’t have to worry about maintaining infrastructure and overprovisioning.
We also have bulk processors that run messages at a trickle based on incoming messages during the day, but at night, and especially at the end of the week, we need 8 times the resources to meet our SLAs. Should we also overprovision that and run 8 servers all of the time?
You assume this is bad, but why? Your alternative seems to be locking yourself into expensive and highly proprietary vendor solutions that have their own learning curve and a wide variety of tradeoffs and complications.
The point being made is that you are still worrying about maintaining infrastructure and overprovisioning, because you're now spending time specializing your system and perhaps entire company to a specific vendor's serverless solutions.
To be clear, I don't have anything against cloud infrastructure really, but I do think some folks really don't seem to understand how powerful simple, battle-tested tools like PostgreSQL are. "Overprovisioning" may be way less of an issue than you imply (especially if you seriously only need 8 machines), and replication for PostgreSQL is a long solved problem.
So having 4-8 times as many servers, where unlike AWS we would also have many times as much storage (1 master and 8 slaves, each with a full copy), is better than the theoretical “lock-in”? You realize that with the read replicas you’re only paying once for storage, since they all use the same (redundant) storage?
Where is the “lock-in”? It is Mysql. You use the same tools to transfer data from Aurora/MySQL that you would use to transfer data from any other MySQL installation.
But we should host our entire enterprise on digital ocean or linode just in case one day we want to move our entire infrastructure to another provider?
Out of all of the business risks that most companies face, lock-in to AWS is the least of them.
How are we “specializing our system”? We have one connection string for read/writes and one for just reads. I’ve been doing the same thing since the mid 2000s with MySQL on prem. AWS simply load balances across the read replicas and adds more as needed.
And you don’t see any issue in spending 5 to 9 times as much on both storage and CPU? You do realize that while we are just talking about production databases here, just like any other company we have multiple environments, some of which are only used sporadically. Those environments are mostly shut down, including the actual database server, until we need them, and they scale up with reads and writes via Aurora Serverless when we do.
You have no idea how easy it is to set up autoscaling read replicas, do you?
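For what it's worth, the setup amounts to roughly this; a hedged sketch where the cluster identifier, capacity range, and CPU threshold are all invented for the example:

```python
# Illustrative sketch: Aurora read-replica autoscaling via Application Auto Scaling.
# Cluster name and numbers are placeholders; check the AWS docs for your own setup.
import boto3

autoscaling = boto3.client("application-autoscaling")

# Tell Application Auto Scaling it may vary this cluster's replica count.
autoscaling.register_scalable_target(
    ServiceNamespace="rds",
    ResourceId="cluster:my-aurora-cluster",          # placeholder cluster name
    ScalableDimension="rds:cluster:ReadReplicaCount",
    MinCapacity=1,
    MaxCapacity=8,
)

# Add replicas when average reader CPU goes above 70%, remove them when it drops.
autoscaling.put_scaling_policy(
    PolicyName="reader-cpu-target-tracking",
    ServiceNamespace="rds",
    ResourceId="cluster:my-aurora-cluster",
    ScalableDimension="rds:cluster:ReadReplicaCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
        },
        "TargetValue": 70.0,
    },
)
```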
If for production we need 5x of our baseline capacity to handle peak load are you saying that we could get our server from a basic server provider for 4 * 0.20 ( 1/5 of the time we need to scale our read replicas up) + 1?
Are you saying that we could get non production servers at 25% of the cost if they had to run all of the time compared to Aurora Serverless where we aren’t being charged at all for CPU/Memory until a request is made and the servers are brought up. Yes there is latency for the first request - but these are our non production/non staging environments.
Can we get point in time recovery?
And this is just databases.
We also have an autoscaling group of VMs based on messages in a queue. We have one relatively small instance that handles the trickle of messages that come during the day in production, and it can scale up to 10 at night when we do bulk processing. This is just in production. We have no instances running when the queue is empty in non production environments. Should we also have enough servers to keep 30-40 VMs running with only 20% utilization?
Should we also set up our own servers for object storage across multiple data centers?
What about our data center overseas close to our offshore developers?
If you have more servers on AWS you’re doing it wrong.
We don’t even manage build servers. When we push our code to git, CodeBuild spins up either prebuilt or custom Docker containers (on servers that we don’t manage) to build and run unit tests on our code based on a yaml file with a list of Shell commands.
It deploys code as lambdas to servers we don’t manage. AWS gives you a ridiculously high amount of lambda usage in the always-free tier. No, our lambdas don’t “lock us in”. I deploy standard NodeJS/Express, C#/WebAPI, and Python/Django code that can be deployed to either lambda or a VM just by changing a single step in our deployment pipeline.
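For the Python/Django case, one common way to get that "same code, Lambda or VM" property is a thin WSGI adapter such as apig-wsgi; this is a sketch under assumed module and project names, not necessarily the exact mechanism used here:

```python
# lambda_entry.py - hypothetical adapter module.
# Assumes the apig-wsgi package and a standard Django project named "mysite".
# On a VM the same application object would be served by gunicorn/uwsgi instead;
# only the deployment step changes.
from apig_wsgi import make_lambda_handler
from mysite.wsgi import application  # the ordinary Django WSGI app (name assumed)

# API Gateway (lambda proxy integration) events get translated into WSGI
# requests, and the WSGI response is translated back.
lambda_handler = make_lambda_handler(application)
```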
I see cloud services as a proxy to hire talented devops/dba people, and efficiently multiplex their time across several companies, rather than each company hiring mediocre devops/dba engineers. That said, I agree that for quite a few smaller companies, in house infrastructure will do the job almost as well as managed services, at much cheaper numbers. Either way, this is not an engineering decision, it's a managerial one, and the tradeoffs are around developer time, hiring and cost.
It's only a proxy in the sense that it hides them (the ops/dbas) behind a wall, and you can't actually talk directly to them about what you want to do, or what's wrong.
If you don't want to hire staff directly, consulting companies like Percona will give you direct, specific advice and support.
But in your experience, what has “gone wrong” with AWS that you could have fixed yourself if you were hosting on prem?
Locally hosted basic synchronous read replicas are a solved problem?
If you do not do something regularly, you tend to lose the ability to do it at all. Personally, and especially organizationally.
There is no value add in the “undifferentiated heavy lifting”. It is not a company’s competitive advantage to know how to do the grunt work of server administration - unless it is. Of course Dropbox or Backblaze have to optimize their low profit margin storage business.
For a trivial website in production: 2 web servers + 1 database + 1 replica.
For internal tooling: 1 CI and build server + 2 development and testing servers + 1 storage, file share, ftp server + 1 backup server.
For desktop support: At least 1 server for DHCP, DNS, Active Directory + firewall + router.
That's already 10 servers and not counting networking equipment. Less than that and you're cutting corners.
Build server - with CodeBuild you either run prebuilt Docker containers or a custom-built Docker container that automatically gets launched when you push your code to GitHub/CodeCommit. No server involved.
Fileshare - a lot of companies just use Dropbox or OneDrive. No server involved
FTP - managed AWS SFTP Service. No server involved.
DHCP - Managed VPN Service by AWS. No server involved.
DNS - Route 53 and with Amazon Certificate Manager it will manage SSL certificates attached to your load balancer and CDN and auto renew. No servers involved.
Active Directory - Managed by AWS no server involved.
Firewall and router - no server to manage. You create security groups and attach them to your EC2 instances, databases, etc.
You set your routing table up and attach it to your VMs.
Networking equipment and routers - again, that’s a CloudFormation template, or go old school and just do the configuration through the web console.
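To illustrate the "no firewall box" point, a security group really is just an API call away; a rough sketch where the VPC id, port, and CIDR range are placeholders:

```python
# Hedged sketch: creating a "firewall" rule set as a security group.
# VPC id and CIDR are placeholders for this example.
import boto3

ec2 = boto3.client("ec2")

sg = ec2.create_security_group(
    GroupName="web-tier",
    Description="Allow HTTPS in from anywhere",
    VpcId="vpc-0123456789abcdef0",   # placeholder
)

ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)
# The group is then attached to instances or load balancers instead of
# configuring a physical firewall or router.
```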
We weren’t allowed to use regular old FTP in the early 2000s when I was working for a bill processor. We definitely couldn’t use one now and be compliant with anything.
I was trying to give you the benefit of the doubt.
Amateur mistake that proves you have no experience running any of this.
If it doesn’t give you a clue the 74 in my name is the year I was born. I’ve been around for awhile. My first internet enabled app was over the gopher protocol.
How else do you think I got shareware from the info-Mac archives over a 7 bit line using the Kermit protocol if not via ftp? Is that proof enough for you or do I need to start droning on about how to optimize 65C02 assembly language programs by trying to store as much data in the first page of memory because reading from the first page took two clock cycles on 8 bit Apple //e machines instead of 3?
We don’t “share” large files. We share a bunch of Office docs and PDFs, as do most companies.
Yes, you do have to run DNS, Active Directory + VPN. You said you couldn’t do it without running “servers”.
No, we don’t have anything we’d call “servers”, either on prem or in the cloud.
Even most companies that I’ve worked for that don’t use a cloud provider have their servers at a colo.
We would still be using shares hosted somewhere not on prem. How is that different from using one of AWS’s storage gateway products?
Not to mention the four lower environments some of which the databases are automatically spun up from 0 and scaled up as needed (Aurora Serverless)
Should we also maintain those same read replicas servers in our other environments when we want to do performance testing?
Should we maintain servers overseas for our outsourced workers?
Here we are just talking about Aurora/MySQL databases. I haven’t even gotten into our VMs, load balancer, object store (S3), queueing server (or lack thereof, since we use SQS/SNS), our OLAP database (Redshift - no, we are not “locked in”, it uses standard Postgres drivers), etc.
AWS is not about saving money on like-for-like resources compared to bare metal, though in the case of databases where your load is spiky you do. It’s about provisioning resources as needed, when needed, and not having to pay as many infrastructure folks. Heck, before my manager, who hired me and one other person, came in, the company had no one onsite with any formal AWS expertise. They completely relied on a managed service provider, who they pay much less than they would pay for one dedicated infrastructure guy.
I’m first and foremost a developer/lead/software architect (depending on which way the wind is blowing at any given point in my career), but yes I have managed infrastructure on prem as part of my job years ago, including replicated MySQL servers. There is absolutely no way that I could spin up and manage all of the resources I need for a project, and still develop, as efficiently at a colo as I can with just a CloudFormation template on AWS.
I’ve worked at a company that rented stacks of servers that sat idle most of the time but that we used to simulate thousands of mobile connections to our backend servers - we did large B2B field services deployments. Today, it would be running a Python script that spun up an autoscaling group of VMs to whatever number we needed.
This obsession with making something completely bulletproof and scalable is the exact problem they are discussing. You probably don't need it in most cases but just want it. I am guilty of this as well and it is very difficult to avoid doing.
We have a process that reads a lot of data from the database on a periodic basis and sends it to ElasticSearch. We would either have to spend more and overprovision it to handle peak load or we can just turn on autoscaling for read replicas. Since the read replicas use the same storage as the reader/writer it’s much faster.
Yes we need “bulletproof” and scalability or our clients we have six and seven figure contracts with won’t be happy and will be up in arms.
Just have a cron job taking backups in a server and sending somewhere else?
It has been working for the last 40 years...
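To be fair, the cron job version really is only a few lines; a rough sketch (database name, bucket, and the lack of failure alerting are all assumptions of this example):

```python
# backup.py - hypothetical nightly cron job: dump Postgres and ship it off-box.
# Database name and bucket are placeholders; alerting on failure is left out.
import datetime
import subprocess
import boto3

DB_NAME = "appdb"                    # placeholder
BUCKET = "my-offsite-backups"        # placeholder
stamp = datetime.datetime.utcnow().strftime("%Y%m%d%H%M")
dump_path = f"/tmp/{DB_NAME}-{stamp}.dump"

# Compressed custom-format dump, suitable for pg_restore.
subprocess.run(["pg_dump", "-Fc", "-f", dump_path, DB_NAME], check=True)

# Ship it somewhere that is not this server.
boto3.client("s3").upload_file(dump_path, BUCKET, f"postgres/{DB_NAME}-{stamp}.dump")
```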
Can a cron job give me automatic failover to another server that is in sync with the master? Can it give me autoscaling read replicas?
Who is going to get up in the middle of the night when the cron job fails?
Yes I am sure my company that has six and seven figure contracts would be just as well served self hosting MySQL on Linode.
Cost to run it for 4 years: $70000.
QPS ? 5 was the highest I ever found.
You don't need this capacity. You just don't.
But it's got to be a good business to be in ...
Storage cost of 500Gb of data for four years - $2400
Backing up is free up to the amount of online storage.
IO requests are $0.12 per million.
Transfer to/from VMs in the same availability zone is free.
A r5.large reserved instance is $6600 a year.
Of course if your needs are less, you could get much cheaper cpu and memory.
So QPS really low, <5. But very spread out, as it's used during the workday by both team leader and team members. Every query results in a database query or maybe even 2 or 3. So every, say 20 minutes or so there's 4-5 database queries. Backend database is a few gigs, growing slowly. Let's say a schedule is ~10kb of data, and that has to be transferred on almost all queries, because that's what people are working on. That makes ~800 gig transfer per year.
This would be equivalent to something you could easily store on say a linode or digital ocean 2 machine system, for $20 month or $240/year + having backup on your local machine. This would have the advantage that you can have 10 such apps on that same hardware with no extra cost.
And if you really want to cheap out, you could easily host this with a PHP hoster for $10/year.
So how do you calculate the AWS costs here ?
And storage for AWS/RDS is $0.10/GB per month, and that includes redundant online storage and backup.
I’m sure I can tell my CTO we should ditch our AWS infrastructure that hosts our clients who each give us six figures each year to host on shared PHP provider and Linode....
And now we have to still manage local servers at a colo for backups....
We also wouldn’t have any trouble with any compliance using a shared PHP host.
We are a B2B company with a sales team that ensures we charge more than enough to cover our infrastructure.
Yes infrastructure cost scale with volume, but with a much lower slope.
But it’s kind of the point, we save money by not having a dedicated infrastructure team, and save time and move faster because it’s not a month long process to provision resources and we don’t request more than we need like in large corporations because the turn around time is long.
At most I send a message to my manager if it is going to cost more than I feel comfortable with and create resources using either a Python script or CloudFormation.
How long do you think it would take for me to spin up a different combination of processing servers and database resources to see which one best meets our cost/performance tradeoff on prem?
Just for reference, we "devops" around 2 thousand VMs and 120 bare-metal servers and a little cloud stuff through the same scripts and workflows.
We don't really leverage locked in cloud things because we need the flexibility of on prems.
In my business hardware is essentially a drop in the bucket of our costs.
P.S.: I totally think there are legit use cases for cloud; it's just another tool you can leverage depending on the situation.
Lambda vs VMs (yes, you can deploy standard web apps to Lambda using a lambda proxy)
Serverless Aurora vs Regular Aurora.
Tx instances vs regular instances
DELETE FROM orders WHERE $condition_from_website;
You also get point in time recovery with BackTrack.
The redundant storage is what gives you the synchronous read replicas that all use the same storage, and the capability of having autoscaling read replicas that are already in sync.
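And rewinding the cluster after a fat-fingered DELETE like the one above is a single API call once Backtrack is enabled; a hedged sketch, with the cluster identifier and timestamp as placeholders:

```python
# Hypothetical sketch: rewinding an Aurora MySQL cluster with Backtrack after
# a bad DELETE. Cluster identifier and timestamp are placeholders; Backtrack
# must have been enabled on the cluster beforehand.
import datetime
import boto3

rds = boto3.client("rds")

rds.backtrack_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",   # placeholder
    BacktrackTo=datetime.datetime(2019, 5, 1, 12, 0, tzinfo=datetime.timezone.utc),
)
```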
So... A cronjob?
I also have done enough recoveries to know that actually placing back a checkpoint is admitting defeat and stating that you
a) don't know what happened
b) don't have any way to fix it
And often it means
c) don't have any way to prevent it from recurring, potentially immediately.
It's a quick and direct action that often satisfies the call to action from above. It is exactly the wrong thing to do. It very rarely solves the issue.
Actually fixing things mostly means putting the checkpoint on another server, and going through a designed fix process.
We are just focusing on databases here. There is more to infrastructure than just databases. Should we also maintain our own load balancers, queueing/messaging systems, CDN, object store, OLAP database, CI/CD servers, patch management system, alerting/monitoring system, web application firewall, ADFS servers, OAuth servers, key/value store, key management server, ElasticSearch cluster, etc.? Except for the OLAP database, all of this is set up in some form in multiple isolated environments with different accounts under one Organizational Account that manages all of the other sub accounts.
What about our infrastructure overseas so our off shore developers don’t have the latency of connecting back to the US?
For some projects we even use lambda where we don’t maintain any web servers and get scalability from 0 to $a_lot - and no there is no lock-in boogeyman there either. I can deploy the same NodeJS/Express, C#/WebAPI, Python/Django code to both lambda and a regular old VM just by changing my deployment pipeline.
How a company which seems entirely driven by vertical integration is able to convince other companies that outsourcing is the way to go is an absolute mystery to me.
Yes “the people in the know” are at AWS. They handle failover, autoscaling, etc.
We also use Serverless Aurora/MySQL for non production environments with production like size of data. When we don’t need to access the database, we only pay for storage. When we do need it, it’s there.
By the way, what is autoscaling and why are we autoscaling databases? I'm guessing the only resource that autoscales is the bandwidth? Why can't we all get shared access to a fat pipe in production? I was under the impression that products like Google Cloud Spanner have this figured out. What exactly needs to auto scale? Isn't there just one database server in production?
In dev (which is the use case I'm talking about) you should be able to just reset the vm whenever you want, no?
For regular autoscaling of read replicas. It brings additional servers online using the same storage array. It isn’t just “one database”.
That’s just it. There isn’t “just one database”.
One worked, one didn't ¯\_(ツ)_/¯
That's when I gave up on servers, virtual or not
The thing separating you (in this case) and Amazon/Google/* is that they make sure their processes produce predictable outcomes; if a process fails, it can be shut down and redone until it yields the correct outcome.
Devs are more than capable of pushing each other to the same conclusions.
Block storage is for adding volumes to your droplets (like adding a second hard drive to a computer). One advantage is that they are separated from your droplet and replicated, if your droplet disappears or dies, you can reattach the volume to another droplet. So one common use is to place the data dir of PG for example, in the external volume. You should still have proper backups. DO also has a managed PG service.
Edit: those don't include a proper DB, though; I have been basically using S3 as the DB. Not sure how much an RDS instance or something would add to the bill.
Just run it as a systemd service or a Docker/Kubernetes-managed container on one of the VMs that run the rest of the application.
Also no need to use all those AWS-proprietary services, just use an existing task queue system based on PostgreSQL/RabbitMQ/etc.
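A minimal sketch of the kind of Postgres-backed queue meant here, using the SELECT ... FOR UPDATE SKIP LOCKED pattern; the table schema, DSN, and work function are assumptions for the example:

```python
# Hedged sketch of a Postgres-backed task queue worker using
# SELECT ... FOR UPDATE SKIP LOCKED (PostgreSQL 9.5+).
# Table schema and connection string are assumptions.
import time
import psycopg2

conn = psycopg2.connect("dbname=appdb")     # placeholder DSN

def process(payload):
    # Application-specific work would go here.
    print("processing", payload)

def claim_and_run_one_job():
    with conn:                               # commit (or roll back) per job
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT id, payload FROM jobs
                WHERE status = 'pending'
                ORDER BY id
                LIMIT 1
                FOR UPDATE SKIP LOCKED
                """
            )
            row = cur.fetchone()
            if row is None:
                return False                  # nothing to do right now
            job_id, payload = row
            process(payload)
            cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
            return True

while True:
    if not claim_and_run_one_job():
        time.sleep(1)                         # idle poll when the queue is empty
```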
Sometimes the benefits of one thing compared to another are so great that there are no downsides. If the Raspberry Pi in the closet doesn't suffice, you can always add another one.
12 outages since 2011, and none of them are anything like what you're describing: https://en.wikipedia.org/wiki/Timeline_of_Amazon_Web_Service...
Using AWS has been a legitimate excuse for an outage or service delivery issue a dozen times since 2011. The end. Write your apps to be more resilient if you've had more than that many outages at those times (and honestly even those outages weren't complete).
AWS is a hell of a lot better than what most people here could come up with for solving AWS's specific set of problems and priorities. But those problems and priorities generally do not match the problem space of the people using it exactly.
For example, AWS has a lot of complexity for dealing with services and deployment because it needs to be versatile enough to meet wildly different scenarios. In designing a solution for one company and their needs over the next few years, you wouldn't attempt to match what AWS does, because the majority of companies use only a small fraction of the available services, so that wouldn't be a good use of time and resources.
AWS is good enough for very many companies so they appeal to large chunks of the market, but let's not act like that makes them a perfect or "best" solution. They're just a very scalable and versatile solution so it's unlikely they can't handle your problem. You'll often pay more for that scalability and versatility though, and if you knew perfectly (or within good bounds) what to expect when designing a system, it's really not that hard to beat AWS on a cost and growth level with a well designed plan.
Edit: Whoops, s/dell/well/, but it probably works either way... ;)
diminoten -- relax, take a breath. Your rude and condescending tone is unnecessary. We don't see eye-to-eye, but I'm not discounting your experience. I can't say I'm getting the same vibe from you.
And again, I cannot over emphasize how much more control over your own time AWS gives you. It's night/day, and discounting that is a mistake.
The reality is, if you're having problems with AWS, it's you, and not AWS, for 99.9999999% of your problems. Continuing to pretend it's AWS is a face-saving, ego protecting activity that no rational person plays part in.
-No changes to the config
-I'm the only person with access
-No major outages at the time
It went down, and the AWS monitor didn't actually see it go down until 45 minutes later, which it needs to be down for a certain amount of time before their techs will take a look.
It was my first time using AWS, and I didn't want to risk waiting for tech so I rebooted the instance and it started back up again. I have no idea why, but it failed without any apparent reason, and on reboot worked like it always did.
My point is that AWS has been solid, but they are like anything else; there are tradeoffs in using their service, and they aren't perfect.
Many applications won't be as resilient. That's the trade-off. We don't have a single stateful application. RDS/Redis/Dynamo/SQS are all managed by someone else. We had to build things differently to accommodate that constraint, but as a result we have ~35M active users and only 2 ops engineers, who spend their time building automation rather than babysitting systems.
If you lean in completely, it's a great ecosystem. Otherwise it's a terrible co-lo experience.
My point is that AWS is very solid, and while there are plenty of trade-offs, to be sure, the trade-off is "operational" vs. "orchestration", and operational work doesn't let you decide when to work on it whereas orchestration does.
I haven't actually said anything about my own experience, so it's funny you claim I have...
Ah, I had assumed that your internal information was from experience, either your own or your organisation's. Since that is not the case I am curious where your "internal information" comes from considering you completely disregarded @bsagdiyev's personal experience.
What you tried to say was I was claiming that since it worked for me it was therefore good. That isn't the case. I'm saying because it worked for me AND EVERYBODY ELSE, it's therefore good.
I have noticed that too. With some managed services you are trading a set of generally understood problems with a lot of quirky behavior of the service that's very hard to debug.
The daily/monthly maintenance cycle on a self hosted SQL server is “generally understood” but you still have to wake up, check your security patches, and monitor your redeployments.
You can do some of that in an automated fashion with public security updates for your containers and such. But if monitoring detects an anomaly, it’s YOU, not Heroku who gets paged.
It’s a little like owning a house vs renting. Yes if you rent you have to work around the existing building, and getting small things fixed is a process. But if the pipes explode, you make a phone call and it’s someone else’s problem. You didn’t eliminate your workload, but you shrunk the domain you need to personally be on call for.
I've found & reported a bug in RDS whereby spatial indexes just didn't work; merely hinting the server to not use the spatial index would return results, but hinting it to use the spatial index would get nothing. (Spatial indexes were, admittedly, brand new at the time.)
I've had bugs w/ S3: technically the service is up, but trivial GETs from the bucket take 140 seconds to complete, rendering our service effectively down.
I've found & worked w/ AWS to fix a bug in ELB's HTTP handling.
All of these were problems with the service, since in each case it's failing to correctly implement some well-understood protocol. AWS is not perfect. (Still, it is worth it, IMO. But the parent is right: you are trading one set of issues for another, and it's worth knowing that and thinking about it and what is right for you.)
Further, didn't I say that it's trading one set of issues for another? Or at least, I explicitly agreed with that.
I feel like you didn't read what I wrote honestly, and kind of came in with your own agenda. All I ever said was that the issues you trade off are orchestration issues vs. operational issues, and operational issues are 10x harder than orchestration issues because you don't get to decide when to work on operational issues, you tend to have to deal with them when they happen.
What deathanatos wrote sounds awfully like problems with the service to me.
I don’t think S3 taking 100+ seconds to respond to a GET request can be solved by orchestration alone.
Guess what, if your service is not required to be up because the consuming service is super tolerant to it timing out after 140 seconds, self-hosting it becomes even more of a no-brainer. After all, you clearly need none of the redundancy AWS features.
Sorry, but AWS/GCP is infinitely better at managing infrastructure than you or your company will ever be.
And I don't think you know what an echo chamber is if you think one person can create one alone...
You sound like someone who hasn't had much real world experience and thinks AWS or whatever is the best thing because it's the only thing you know.
But this kind of toxic fanatism(yet trying to sound logical) is just harmful for the HN community.
dang: Can this kind of behavior be punished?
I don't think anyone is actually upset, do you? I certainly hope I haven't upset anyone... :/
When the tool breaks under correct use, criticize the tool. Maybe the user should also have redundancy. The tool is still failing!
Even when you're supposed to have redundancy, there are still certain failure rates that are acceptable and some that are not. And redundancy doesn't solve every problem either.
Of course there are unacceptable failure rates. AWS doesn't have them, and pretending like they do is simply lying to yourself to protect your own ego.
Until the managed service simply goes away, of course, taking your data with it.
EDIT: I should be clear, our PEs are generally pretty good, but because their product isn't seen by upper management as the thing which makes money, they're perpetually understaffed.
I can't explain just how much developing on GCP has helped me, simply by having such amazing documentation. I don't think I appreciated how little I knew: at every company where we worked with on-premise/internal services, we would have to use custom services built by others. With GCP, you have complete freedom, not just to design your application architecture from scratch, but to understand how others (coworkers mostly) have designed _their_ applications too! And as a company, it allows the sharing of a common set of best practices, automatically, since it's "recommended by Google".
Its kinda like Google/Amazon are now the System/Operations engineers for our company. Which they're good at. And its awesome.
And it _really_ is a lot. A company I work with switched from running two servers with failover to AWS, bills went from ~€120/m to ~€2.2k/m for a similar work load. Granted, nobody has to manage those servers any more, but if that price tag continues to rise that way, it's going to be much cheaper to have somebody dedicated to manage those servers vs use AWS.
Also, maybe that's just me, but I prefer to have the knowledge in my team. If everything runs on AWS, I'm at the mercy of Amazon.
However, often teams will use or attempt to use _a_lot_ more resources than what's needed, or they simply don't optimise for cost when using AWS.
My anecdote is that a company I worked for had an in-house app that ran jobs on 8 m4.16xlarge instances costing around $20k/m, and they were complaining it took hours to run said jobs. The actual tasks those servers were running were easily parallelizable, and the servers would run at capacity only around 10% of the time, since jobs were submitted by users every few hours up to every 24 hours. The app was basically a lift-and-shift from on-prem to AWS, the worst way one can use the cloud. I created a demo for them where the exact jobs they were running on those m4.16xlarge instances ran on lambda, using their existing code modified slightly. The time a job took to run went from many hours to a few minutes, with around 2k lambda functions running at the same time. The projected cost went from $20k/m to $1-5k/m depending on workload. I was quite happy with the result; unfortunately they ended up not using lambda and migrating off the app, which in its entirety cost around $50k/m for the infrastructure. The point I'm trying to make here is that a properly used cloud can save you a lot of money if you have a spikey workload.
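To give a flavor of that fan-out shape, a rough sketch; the function name, chunk size, and payload format are placeholders, not the actual system:

```python
# Illustrative fan-out: split a big job into chunks and invoke a Lambda
# function asynchronously for each chunk. Names and sizes are placeholders.
import json
import boto3

lambda_client = boto3.client("lambda")

def fan_out(job_items, chunk_size=100):
    """Invoke one Lambda per chunk; 'Event' means fire-and-forget (async)."""
    for start in range(0, len(job_items), chunk_size):
        chunk = job_items[start:start + chunk_size]
        lambda_client.invoke(
            FunctionName="process-chunk",    # placeholder function name
            InvocationType="Event",          # asynchronous invocation
            Payload=json.dumps({"items": chunk}),
        )

# Each "process-chunk" invocation then runs the same per-item code the big
# instances used to run, just massively in parallel and billed only while running.
```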
Also, for internal apps that are used sparingly one can't beat a mix of client side JS code served from S3, using AWS Cognito tied to AD federation for auth with DynamoDB for data storage and lambda for typical server-side stuff. Such apps are easy to write, cost almost nothing if they are not used a lot and don't come with another server to manage. The only downside is that instead of managing servers, now you have to manage changes in AWS's api..., but nothing's perfect.
Yes, some places will need a full-time team of multiple people, but a lot of places don't, and can get a tailored solution to suit their needs perfectly, rather than just trying to fit into whatever seems to work best using the soup of Rube Goldberg-like interdependent AWS services.
Not even getting into the fact that an m5.xlarge has roughly as much muscle as an old laptop.
Get normal disks of a larger size; they come with 3 IOPS per GB.
"We'd rather not manage the database ourselves" was the answer from the lead dev, and I do understand that: it's not the developers' job to manage servers. In this company, management has removed themselves from anything technical and the devs don't like managing servers, so they say "let's use AWS, we'll write some code once to create our environment and Amazon takes care of the rest" - and by this point they are committed, with months having been spent on getting things running on AWS, changing the code where necessary etc, so reversing it would be a hard thing to sell to investors and the team.
I've seen this with many AWS deployments, which on a pure hardware cost is 3-5 times more, but with the way it makes it 'easy' to scale instead of optimise instead costs 10-20 times more. When the investor cash starts drying up and the focus is going to be on competing for profits in a market that is much smaller in general, many organisations are going to find themselves locked into AWS and for them it's going to feel like IE6 and asp.net all over again.
And back in dotcom days, it was ASP, not ASP.NET. Which is to say, much like PHP, except with VBScript as a language, and COM as a library API/ABI.
We make a decent business out of doing just this, at scale for clients today.
We manage about 150T of data for company on relatively inexpensive raid array + one replica + offline backup. It is accessible on our 10Gbps intranet for processing cluster, users to pull to their workstations, etc. The whole thing is running on stock HP servers for many years and never had outages beyond disks (which are in raid) predicting failure and maybe one redundant PSU failure. We did the math of replicating what we have with Amazon or Google and it would cost a fortune.
We help make that happen, and help folks manage the lifecycle associated with it.
Chances are you will find tens of instances without any naming or tagging, of the more expensive types, created by previous dudes who long left. Thanks to AWS tooling it's easy to identify them, see if they use any resource and either delete or scale down dramatically.
Our goal is to provide actual value and do real DevOps work, regardless of whether you're AWS, Azure, GCP, etc. This includes physical, mixed-mode, cloud bursting, and changing constantly. <----- That's what keeps it interesting.
Do you think you are gonna get AWS's attention with that $10 s3 instance when something goes wrong?!? You will have negative leverage after signing up for vendor lock in.
I'll take linux servers any day, thanks.
I’m sure with all of your developers using the repository pattern to “abstract their database access”, you can change your database at the drop of a hat.
Companies rarely change infrastructure wholesale no matter how many levels of abstraction you put in front of your resources.
While at the same time you’re spending money maintaining hardware instead of focusing on your business’s competitive advantage.
Netflix does NOT use AWS for their meat-and-potatoes streaming, just the housekeeping. They use their own Open Connect CDN for the heavy lifting.
But re: maintaining hardware, I only maintain my dev box these days. We are fine with hosted linux services, but backup/monitoring/updating is just too trivial, and the hosting so affordable (and predictable) w/linux it would have to be a hype/marketing/nepotic decision to switch to aws in our case. The internet is still built on a backbone and run by linux, any programmer would be foolhardy to ignore that bit of reality for very long.
Netflix is by far AWS's largest customer.
From the horse's mouth on why they decided to move to AWS:
You could also go to YouTube and watch any of the dozens of talks Netflix has done at re:Invent.
Of course Netflix caches its videos in the ISPs' data centers. But caching is not the “heavy lifting”.
I built a system selling and fulfilling 15k tshirts/day on Google App Engine using (what is now called) the Cloud Datastore. The programming limitations are annoying (it's useless for analytics) but it was rock solid reliable, autoscaled, and completely automated. Nobody wore a pager, and sometimes the entire team would go camping. You simply cannot get that kind of stress-free life out of a traditional RDBMS stack.
I have noticed that people tend to get into trouble when they force the datastore to be something it's not. For example, I mentioned that it's terrible for analytics - there are no aggregation queries.
In the case of the tshirt retailer, I replicated a portion of our data into a database that was good for analytics (postgres, actually). We could afford to lose the analytics dashboard for a few hours due to a poorly tested migration (and did, a few times), but the sales flow never went down.
The datastore is not appropriate for every project (which echoes the original article) but it's a good tool when used appropriately.
DynamoDB vs RDS is a perfect example. Most of that boils down to the typical document store and lack-of-transactions challenges. God forbid you start with DynamoDB and then discover you really need a transactional unit of work, or you got your indexes wrong the first time around. If you didn't need the benefits of DynamoDB in the first place, you will be wishing you had just gone with a traditional RDBMS on RDS to start with.
Lambda can be another mixed bag. It can be a PITA to troubleshoot, and there are a lot of gotchas like execution time limits, storage limits, cold start latency, and so on. But once you've invested all the time getting everything set up, wired up...
In for a penny, in for a pound.
The default config for a LAMP stack will easily handle 100 requests per second. 10 if your app isn't optimized.
Run apt upgrade once a month and enable automatic security updates on ubuntu.
That is neither hard nor "voodoo".
I've used managed services and I don't see the point until you hit massive scale, at which point you can afford to hire your engineers to do it.
Now that person I pay to operate the service can focus on tuning, not backups and other work that's been automated away.
Sounds like a massive win to me.
That is, the shift to managed services does remove a large portion of the work, but mostly just changes another portion.
That's going too much to the other extreme, ec2, droplets, etc.. are fine.
That's nice in theory, but my experience with paying big companies for promises like that is I still end up having to debug their broken software; and it's a lot easier to do that if I can ssh into the box. I've got patches into OpenSSL that fixed DHE negotiation from certain versions of windows (especially windows mobile), that I really don't think any one's support team would have been able to fix for me / my users , unless I had some really detailed explanation -- at that point, I may as well be running it myself.
 And as proof that nobody would fix it, I offer that nobody fixed this, even though there were public posts about it for a year before my patches went in; so some people had noticed it was broken.
I've been informally tracking time spent on systems my team manages and managed tools we use. (I manage infra where I work.)
There is very little difference.
And workarounds and troubleshooting spent on someone else's system mean we only learn about some proprietary system. That's bad for institutional knowledge and flexibility, and for individual careers.
Our needs don't mesh well with typical cloud offerings, so we don't use them for much. When we have, there has yet to be a cost savings - thus far they've always cost us more than we pay owning everything.
I mean, I personally like not dealing with hardware or having to spend time in data centers. But I can't justify it from a cost or a service perspective.
Based on all the articles I see from time to time, I fully expect if I worked for a company that was built on AWS, I would spend a lot of time debugging AWS, because I end up debugging all the stuff that nobody else can, and it's a lot easier to do when I have access to everything.
But you criticize AWS based on a few articles and hardly any real world experience?
On this specific issue; I worked with sales support, and also looked on their forums; and found things like this: https://forums.aws.amazon.com/message.jspa?messageID=721247
Which says 'yes, we know there's a limit; no we won't tell you what it is' and ignores the elephant in the room, that apparently if you reconfigure the network rules, the problem evaporates.
Excuse me for not banging my head against the wall or watching random videos from third parties. Actually, I only know there's a way to avoid this because I complained about it for a long time in HN threads, and finally, some lovely person told me the answer. In the mean time, I had continued my life in sensible hosting environments where there wasn't an invisible, unknowable connection limit hovering over my head.
I will reply on that because I have experience with both.
If you worked in big companies, you do have to constantly debug half baked internal tools written by other people, with little or no documentation. It's horrible when you change company later, the institutional knowledge is lost when you leave and it's worthless for you to get a new job.
AWS is absolutely nothing like that. The tooling is working and polished and documented. It's much much better than any internal tools you will find in big companies. When you're out of your depth, you can just google for help. When changing company you can continue to apply the experience and never worry about the tooling in place.
And just “extrapolating”?
What am I supposed to do to satisfy you here? After finding the network does weird things, and not being able to get timely help with, I should conclude that AWS will solve my needs, if only I paid them more money? If only I had wasted more time watching videos that don't come up against reasonable search terms, maybe I wouldn't have come to the conclusion that having an amorphous, uninspectable, blob of a network between my servers and the internet is a bad thing?
P.S. This may be a little of pot and kettle, but what makes it so important to you that I'm convinced that AWS is unicorns and rainbows and having more control and visibility is a bad thing?
I'm sure AWS has its perks; it's great if it works for you, and if you can take advantage of running drastically different number of instances at peak and off-peak, I think there's cost savings to be had. But when it comes down to it, I'm not willing to hitch my availability to complex systems outside my control, when I don't have to, because I don't like the feeling of not being able to do anything to fix things; and I get enough of that from dealing with the things that are broken that I can't control and don't have any choice about depending on.
Because I was developing and maintaining web servers, database servers, queueing systems, load balancers, mail servers and doing development at small companies before AWS in its current form existed. I’ve done both.
The problem most posters here have isn’t lack of visibility, it’s not understanding the tools - it’s a poor carpenter who doesn’t understand their tools.
Do you usually jump in on a new technology or framework and get frustrated because you don’t know how something works when you didn’t take the time to learn the technology you use?
I’m sure you’ve never had a problem with any piece of open source software that was “out of your control”, or did you just download the software and patch it yourself?
The truth is that yes any piece of technology is complex and you thought you could just jump in without any research and start using it. I don’t do that for any technology. When I found out that we were going to be using ElasticSearch for a project, I spent a month learning it and still ran into some issues because of what I didn’t know. That didn’t mean there was something wrong with ES.
I usually jump in and try to solve my problem with the tools that seem right for the job, reviewing the documentation as needed. I get frustrated when the documentation doesn't match what I have observed. In this case, the connection limits of the default stateful firewall on EC2 were not mentioned anywhere I could find (knowing exactly what to look for and where to look, I think I could find a vague mention of the general limit today). The specific limits per instance type certainly still aren't.

I found forum posts about the connection limit with useless responses from employees. I reached out to my support contact and got useless information (they did get me access to bigger instances though, which just have a larger, unspecified, connection limit, that was still small enough that I could hit it). I reached the conclusion that AWS must be great, because people love it, but it has very non-transparent networking, and useless support for non-trivial issues.

I'm sure if we were spending more, we would get better support people and they'd share insights into their networks -- I've seen that with our other hosting providers, although network insights there weren't needed for everyday things, just more details were nice when the network broke, so we could help them detect future issues and plan for failures.
If the technology mostly works, but I need deeper knowledge, maybe for optimizing, I'll seek out deeper references, but third party references during discovery is very dangerous. When there is a conflict between what is observable, what is documented in first-party sources and what is documented in third-party sources, I would have no context to know which documentation indicates intended behavior and which is most out of date.
> I’m sure you’ve never had s problem with any piece of open source software that was “out of your control” or did you just download the software and patch it yourself?
Download and patch myself, and upstream the patches if I have the time and patience. Isn't that the point of open source? Everything is broken, but at least I can inspect and fix parts of the system where I can see the code. Hell, binary patching is a thing, although I wouldn't want to do that on the regular. I've had patches accepted in the FreeBSD kernel, OpenSSL, Haproxy to fix issues, some longstanding I ran into.
At the end of the day, I'm responsible for the whole stack, because if it's not working, my users can't use my product. It doesn't matter if it's the network, software I wrote, open source software, managed software; even software on the client devices is my problem if it doesn't work. The more I can inspect, the better.
In the official documentation they do mention that different instance sizes have different networking capabilities. This is stuff you learn early on when learning AWS - from the official books.
> In this case, the connection limits of the default stateful firewall on EC2 were not mentioned anywhere I could find
The fact that security groups are stateful and Nacls are stateless are questions I ask junior AWS folks when interviewing. That’s like one of the level 1 questions you ask to know whether to actually bring them in for an on-site.
> and useless support for non-trivial issues. I'm sure if we were spending more, we would get better support people and they'd share insights into their networks
So what did you find when you turned on VPC logging for the network interface of the EC2 instance? Again this level of troubleshooting is a question I would ask junior admins during an interview process.
I’ve had 100% success rate with our business support using live chat. With things that were a lot hairier and with my own PEBKAC issues.
I’m sure my company wouldn’t have any trouble with approving our running our own patched version of our production database and OpenSSL....
Hey --- this sounds like something my support tech should have asked me, or should have been mentioned by the employee response in the forum. I didn't look at VPC logs; I did see that SYN packets (or maybe SYN+ACK responses, I don't remember) were mysteriously missing in tcpdump between the ec2 instance and the server.
Either way, I'm not interviewing for an AWS position; I was just trying to use a managed, off-the-shelf service. Looking through the docs now, I did find the mention of connection tracking, but even there, there's no mention of a limit (of course, as someone familiar with firewalls, I know there's always a limit with connection tracking, which is why I only rarely write stateful rules, and wouldn't have assumed default rules were stateful). I had read the bit about the default security group, which says:
> A default security group is named default, and it has an ID assigned by AWS. The following are the default rules for each default security group:
> Allows all inbound traffic from other instances associated with the default security group (the security group specifies itself as a source security group in its inbound rules)
> Allows all outbound traffic from the instance.
> You can add or remove inbound and outbound rules for any default security group.
Unfortunately, "allows all outbound traffic" was misleading, because it's really allows all outbound traffic subject to connection limits.
> I’m sure my company wouldn’t have any trouble with approving our running our own patched version of our production database an OpenSSL....
I'm assuming that's sarcastic, and you're actually saying you don't think you would be able to get approval to run a patched version of software. Are you saying that if you find a problem in OpenSSL (or whatever enabling technology) that causes your system to be unreliable, your employer will not let you fix it; you'll need to wait for a fixed release from upstream? From experience, upstream releases often take weeks and some upstreams are less than diligent about providing clean updates; not to pick on OpenSSL, but a lot of their updates will fix important bugs and break backwards compatibility, occasionally breaking important bits of the API that were useful. I guess, if you're in an environment where you have no ability to fix broken stuff in a timely fashion, it really doesn't matter whose responsibility it is to fix it, since it won't be fixed.
I really hope you didn't need my patch for your database systems; but maybe you want/wanted it for your HTTPS frontends if you were doing RSA_DHE with Windows 8.1-era Internet Explorer or Windows Mobile 8.1, so that your clients could actually connect reliably. Anyway, if you're running OpenSSL 1.0.2k or later, or 1.1.0d or later (might have been 1.1.0c), you've got my patch, so you're welcome. It fixes the issue well described here.
No, when you actually need to debug in production that's usually not what you want. Changing or restarting the software you are debugging might well make the behaviour you want to understand go away.
> introduce tracing
Yeah, well, that's basically "logging in". Just over a less mature and likely less secure protocol than SSH.
You don't need ptrace and tcpdump to debug software. It's just that it can shave a few weeks off your time when you need to reproduce something in the more tricky cases.
These discussions tend to surface in the context of containers, but that's all very much beside the point. The need to debug software isn't affected by the way you package it.
Perhaps whenever a developer wants to troubleshoot, the orchestrator could make a clone of the container. The clone continuously receives the same input as the original and gets to believe that it affects the back end in the same way. That way the developer can dissect the container without impacting real data.
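A toy sketch of that idea (everything here is made up: the upstream URLs, the port): a little proxy that answers the client from the real service and replays the same request at a shadow copy whose responses are thrown away. Real traffic shadowing, e.g. in a service mesh, is far more involved, but the shape is the same.

```python
# Toy request-mirroring proxy: the client only ever sees the primary's
# response; the shadow copy receives identical traffic and its answers
# are discarded. Upstream hostnames are invented placeholders.
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading
import urllib.request

PRIMARY = "http://primary.internal:8080"   # made-up real service
SHADOW = "http://shadow.internal:8080"     # made-up clone

class MirrorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Fire the same request at the shadow, but never wait on it and
        # never let its answer reach the client.
        threading.Thread(
            target=lambda: urllib.request.urlopen(SHADOW + self.path, timeout=2),
            daemon=True,
        ).start()

        # Serve the client from the primary only.
        with urllib.request.urlopen(PRIMARY + self.path, timeout=10) as resp:
            body = resp.read()
        self.send_response(resp.status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), MirrorHandler).serve_forever()
```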
That sounds like a nightmare to me, and I'm not even a server guy, I'm a backend guy only managing my personal projects.
> I'll pay Amazon $N/month to do that for me, all that matters is the product.
I don't want to pay Amazon anything; I want to pay Linode/Digital Ocean 5 or 10 or 20 dollars per month, and I can do the rest myself pretty well. My personal projects are never ever going to reach Google's scale, and seeing that they don't bring me anything (they're not intended to), I'm not that eager to pay an order of magnitude more to a company like Amazon or Google in order to host those projects.
but companies throw money, hardware and vendor lock-in at problems way before committing to sound engineering practices.
i have a friend who works at an online sub-prime lender, says their ruby backend does hundreds of queries per request and takes seconds to load. but they have money to burn so they just throw hardware at it. they spend mid-five figures per month on their infrastructure and third party devops stuff.
meanwhile, we run a 20k/day pageview ecommerce site on a $40/mo linode at 25% peak load. it's bonkers how much waste there is.
i think offloading this stuff to "someone else" and micro-services just makes you care even less about how big a pile of shit your code is.
Seems the insanely bad coding and ignorance of speed and performance goes all the way to the backend too!
When you can't use more than a couple seconds of CPU time per transaction, or you have a fixed and spartan heap size for a big algorithm, you learn good habits real quick.
I imagine a lot of engineers today haven't had the experience of working within hardware limits, hence all the waste everywhere.
hmm, i guess that explains why salesforce is universally known for being slow; 2 seconds is an eternity.
we deliver every full page on the site within 250ms from initial request to page fully loaded (with js and css fully executed and settled). images/videos finish later depending on sizes and qtys.
i think it's expected that if you're doing some editing/reporting/exporting en masse, then a couple seconds could be okay (assuming it's thousands or tens of thousands of records, not a hundred). but not for single-record actions / interactions.
One large bank I used to work for had a payment processing system doing about 40 thousand transactions per day. It ran on about $8 million worth of hardware, mainly IBM mainframe. I was impressed until I found out each transaction was essentially a 60-100 byte text message. I can confidently say a well-designed system can do 1000 times that load on an iPhone.
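Back-of-envelope on those numbers (my arithmetic, not a measurement of anything):

```python
# 40k transactions/day of ~60-100 byte messages, and the "1000x" claim:
tx_per_day = 40_000
tx_per_sec = tx_per_day / 86_400      # ~0.46 tx/s on average
claimed = 1_000 * tx_per_sec          # ~463 tx/s
payload = claimed * 100               # ~46 KB/s of message bytes at the high end
print(f"{tx_per_sec:.2f} tx/s -> {claimed:.0f} tx/s, ~{payload / 1000:.0f} KB/s")
```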
I mean if you can't trust your dev team, or the procedures surrounding them, you are kinda screwed.
Nothing to do with trust of devs.
Between that and the dogshit documentation, it's truly thrilling to be paying them a ridiculous amount of money for the privilege of having a managed service go undebuggably AWOL on the regular, instead of being able to resolve the issues locally.
The article isn’t asking why use Dynamo vs SQL in the traditional NoSQL vs SQL sense.
It’s why Dynamo vs QLDB.
Or why Cloud Firestore vs Cloud Spanner.
Or Firehose + some lambda functions vs EMR
It’s wholly separate of the managed angle and focuses on the actual complexity of the products involved and the trade offs therein.
On the surface the two sound similar, but there’s much more nuance to this article's point (take the example of Dynamo: the author isn’t opposed to it for being NoSQL, they’re opposed to it for low-level technical design choices that are a poor fit for the problem their client had).
You're gaining an expertise, but instead of it being with the underlying technology, it's with the management services for that technology. I think the premise of your argument is correct that understanding these services is simpler than the technology itself, but I seem to inevitably find myself in some edge use case of the service that is poorly documented or does not work as described, and trawling Stack Overflow or the AWS developer forums is the only solution other than trial and (no) error.
Let me tell you: it's not fun when you're entrenched into a proprietary service that is now outdated and kept only for legacy reasons, and the version you are on doesn't allow you to do things like update your programming language to a more recent version. Now you get to move to another managed service and lose all that expert knowledge you had with the old one.
If the 0day in your familiar pastures dwindles, despair not! Rather, bestir yourself to where programmers are led astray from the sacred Assembly, neither understanding what their programming languages compile to, nor asking to see how their data is stored or transmitted in the true bits of the wire. For those who follow their computation through the layers shall gain 0day and pwn, and those who say “we trust in our APIs, in our proofs, and in our memory models and need not burden ourselves with confusing engineering detail that has no value anyhow” shall surely provide an abundance of 0day and pwnage sufficient for all of us.
Thus preacheth Pastor M. Laphroaig.
In that way, cloud/serverless migration is really the whole worldwide IT workforce progressively and willingly migrating IT (design, development, operations & applications) out of companies and into the hands of just a few big providers.
Just like energy.
And who gets to decide destinies in the world? Those who control, have and provide energy.
What a time to be alive.
The end of the 19th century was the rush to oil-based energy to control the world.
The end of the 20th was the rush to data-based capture and control.
Such centralized systems also become significant points of failure. What happens to your small business if one of the many services you rely on disappears or changes their API?
"All that matters is the product" sounds very much like "all that matters is the bottom line" and we've seen the travesties that occur when profit is put above all else.
Also having granular IAM for services and data is very helpful for security. You have a single source of truth for all your devs, and extremely granular permissions that can be changed in a central location. Contrast to building out all of that auditing and automation on your own. Granted, IAM permissions give us tons of headaches on the regular, but on balance I still think it's better when done well.
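To make "granular" concrete, a minimal sketch (the role, policy, and table names are all invented): a role that can read exactly one DynamoDB table and nothing else, applied from code via boto3 so it lives in version control like everything else.

```python
# Hypothetical illustration of a narrowly-scoped, centrally-managed
# permission: read-only access to a single DynamoDB table.
import json
import boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:Query"],
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/orders",
    }],
}

iam.put_role_policy(
    RoleName="orders-reader",      # made-up role
    PolicyName="orders-readonly",  # made-up policy name
    PolicyDocument=json.dumps(policy),
)
```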
If you're concerned about AWS/GCP/Azure looking at your private business's data, I think (1) that's explicitly not allowed in the contracts you sign, (2) many services are HIPAA-compliant, so again by law they can't, and (3) they'd for sure suffer massively, both legally and through loss of business, if they ever did that.
Depending on the problem, choose the right tools. For example, so far everything I've seen about React and GraphQL tells me that maybe GQL isn't necessarily the best solution for a 5-person team, but a 100-person team may have a hard time living without it.
Kubernetes / Docker is significant work. And when we had 4 developers, making that effort was wasteful and we literally didn't have the time. Now at 20 we're much better staffed, and the things Kubernetes / Docker solve are useful.
Meanwhile we have a single PostgreSQL server. No NoSQL caching layer. We're at a point where we aren't sure how far we can go before we hit the limits of a single PostgreSQL server, but it's at least 3-4 years before we hit them.
Point is, look at the tools. See where these tools win it for you. Don't just blindly pick something because it can scale super well. Pick it because it gives you the capabilities and the simplicity you need for the current and near-term context, and have a long-term strategy for how to move away from it when it is no longer useful.
In reality that means that when something doesn't work as expected, instead of logging in and solving the problem, you'll be spending your time writing tickets and fighting through the tiers of support and their templated replies, waiting for them to eventually escalate your issue to some actual engineer. Wish you a lot of luck with that, I prefer being able to fix my own stuff (or at least give it a try first).
This is an incredibly common fallacy in our industry. Remember, you are responsible for the appropriateness of the dependencies you choose. If you build your fancy product on a bad foundation, and the foundation crumbles, your product collapses with it and it's unprofessional to say "Well, I didn't pour the foundation, so I can't be blamed." Maybe not, but you decided to build on top of it.
If you don't believe that this counts as "managed," then for all your words at the end, you believe at the end of the day that making a solution more complex so that failures are expected and humans are required is superior to doing a simple and all-in-code thing. For all that you claim to not like operations, you think that a system requiring operational work makes it better. You are living the philosophy of the sysadmins who run everything out of .bash_history and Perl in their homedir and think "infrastructure as code" is a fad.
The Raspberry Pi is managed by you, as opposed to managed by a different company / different people so you can focus on your business problem.
ALTER TABLE is quite limited in SQLite (for example there is no DROP COLUMN). Have you missed this?
Yes, I miss proper ALTER TABLE a lot when it would have been useful. But my experience shows me that I can design software pretty well following YAGNI, and the occasional instance where I need the missing functionality is more than made up for by the ease of use otherwise.
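For completeness, the usual workaround for a missing DROP COLUMN is the create-copy-swap dance; a minimal sketch with a throwaway users table (not anyone's real schema), with an explicit transaction so a failure doesn't leave you half-migrated:

```python
# Sketch of dropping a column in SQLite: rebuild the table without the
# column, copy the rows across, drop the old table, rename the new one.
import sqlite3

con = sqlite3.connect("app.db", isolation_level=None)  # autocommit; manage txn by hand
cur = con.cursor()
cur.execute("BEGIN")
try:
    cur.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    # Copy everything except the column being dropped.
    cur.execute("INSERT INTO users_new (id, email) SELECT id, email FROM users")
    cur.execute("DROP TABLE users")
    cur.execute("ALTER TABLE users_new RENAME TO users")
    cur.execute("COMMIT")
except Exception:
    cur.execute("ROLLBACK")
    raise
con.close()
```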
What are the advantages, in your own experience, of using SQLite instead of something like PostgreSQL? With PostgreSQL, you could run database migrations during business hours and use a proper ALTER TABLE.
I'm curious: why do you queue instead of directly writing to the database?
I was thinking of something more "graphical", similar to pgAdmin or SequelPro, when you need to navigate in a dataset.
But I suppose you'll have to string up something if you want it to access a remote db.
(Sure, managed solutions can satisfy all of those. But not without understanding a few orders of magnitude more complexity than a simpler solution, and you'll never get there if your attitude is "I'll pay Amazon to do that for me." Amazon doesn't care whether you succeed.)
Wait, what? Why wouldn't they? Successful companies will spend money with them; defunct companies can't.
So you want technology that prevents you from doing something you might want to do, real badly?
Now that many SQL databases support JSON, I don't quite understand why you would use a document DB for anything other than actual unstructured data, or key-value pairs that need specific performance.
I used MongoDB for a while; now I'm 100% back in SQL. Document DBs really increased my love for SQL. But I'm also far from being a NoSQL expert, and I never have to deal with data above a TB.
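To make the "SQL databases support JSON" point concrete, a small sketch (SQLite via Python, assuming a build that ships the JSON1 functions, which most modern builds do; Postgres's jsonb operators play the same role there): structured columns where the data is structured, a JSON blob where it genuinely isn't.

```python
# Relational where it's structured, JSON where it isn't -- queried with SQL.
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, payload TEXT)")
con.execute(
    "INSERT INTO events (kind, payload) VALUES (?, ?)",
    ("signup", json.dumps({"plan": "pro", "referrer": "newsletter"})),
)

rows = con.execute(
    "SELECT id, json_extract(payload, '$.plan') FROM events WHERE kind = ?",
    ("signup",),
).fetchall()
print(rows)  # [(1, 'pro')]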
It's tough to get the correct level, because you can go Ansible > Docker > install an RPM, or just change the file in place. Both have their place, and "hacky" solutions can work just fine for many years.
Future of what exactly?
What's the safest way to go skiing? Don't ski.
The choice of technology should first be based on the problem and not whether it is serverless. Choosing DynamoDB when your application demands strict serializability would be totally wrong. I hope more people think about the semantic specifications of their applications (especially under concurrency).
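To make that concrete with one small example (table and key names are made up): a plain read-then-write against DynamoDB can silently lose updates under concurrency; a conditional write at least turns the race into an explicit failure you can retry. It's still per-item optimistic concurrency, not strict serializability across items.

```python
# Hypothetical counter update. The ConditionExpression makes the write fail
# (ConditionalCheckFailedException) if another writer changed the item since
# we read it, instead of silently overwriting their update.
import boto3

dynamo = boto3.client("dynamodb")

def bump_hits(page_id: str, seen_hits: int) -> None:
    dynamo.update_item(
        TableName="pages",                 # made-up table
        Key={"pk": {"S": page_id}},
        UpdateExpression="SET hits = :new",
        ConditionExpression="hits = :old",  # only if nobody beat us to it
        ExpressionAttributeValues={
            ":new": {"N": str(seen_hits + 1)},
            ":old": {"N": str(seen_hits)},
        },
    )

bump_hits("page:home", seen_hits=41)  # raises if another writer got there first
```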
Serverless sounds great for "some" kinds of systems, but in other cases it's a complete misfire.
Our job is to know which tool to use.
On the other hand, I find it frustrating how many developers either don't know basic SQL or try to get cute and use the flat side of an axe to smack down nails instead of just using the well-known, industry-standard beauty of a hammer that is SQL.
It's managed at every shared hosting in the world.
Great goal. I guess you never had any use case for logging in to a production server to investigate a business-critical issue.