When working with formatted text content, as in Google Docs or Zoho Writer, moving a list item down, adding a new table column, or any other table/list operation is essentially a tree-manipulation op.
Concurrent conflicts in such cases are notoriously hard to converge without contextual special handling [1]. Does this implementation generalize a solution for such use cases?
I guess it should be possible to combine a list (or string) CRDT for leaf nodes (i.e. text blocks) and use this tree CRDT for structural nodes (lists & tables).
But that will mean augmenting every op with a two-dimensional address: (parent-id, index_offset_into_that_parent).
That’s always how I’ve imagined it. Rich text is plain text with 2 additions: annotation ranges (for bolded regions and such) and non-character elements (e.g. a table or embedded image). A text CRDT is fundamentally just a list CRDT that happens to contain character data. So embedded elements can easily be modelled as a special item (the embedded child node), with a size of 1 like any other item in the string. And then with the right approach, you can mix and match different CRDTs in a tree as needed. (Rich text contains a table, one of the cells has an image, and so on.)
Augmenting every op with a parent-crdt-id field is unfortunate but I think unavoidable. Thankfully I suspect that in most real world use cases, it would be very common for runs of operations to share the same parent crdt. As such, I think those ID fields would run-length encode very well.
The implementation can indeed combine multiple different CRDTs. Within Loro's internal implementation, each op does need to store a parent ID. However, as Seph mentioned, consecutive operations under the same parent can be effectively compressed, so the amortized overhead of these parent IDs is often not significant.
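To make the run-length point concrete, here is a rough sketch (illustrative names only, not Loro's actual internals) of ops carrying a (parent-id, offset) address and collapsing into runs when consecutive ops share the same parent:

```ts
// Illustrative op shape: each op is addressed by (parent CRDT id, offset).
interface Op {
  parent: string;   // id of the containing CRDT (text block, list, table cell)
  offset: number;   // index into that parent
  content: string;  // inserted text; an embedded element would have length 1
}

// A run of consecutive ops under one parent stores the parent id only once.
interface Run {
  parent: string;
  startOffset: number;
  content: string;
}

function runLengthEncode(ops: Op[]): Run[] {
  const runs: Run[] = [];
  for (const op of ops) {
    const last = runs[runs.length - 1];
    // Extend the current run when this op continues it under the same parent.
    if (last && last.parent === op.parent &&
        op.offset === last.startOffset + last.content.length) {
      last.content += op.content;
    } else {
      runs.push({ parent: op.parent, startOffset: op.offset, content: op.content });
    }
  }
  return runs;
}

// Typing "abc" into one text block yields a single run, not three ops:
// runLengthEncode([{ parent: "p1", offset: 0, content: "a" },
//                  { parent: "p1", offset: 1, content: "b" },
//                  { parent: "p1", offset: 2, content: "c" }])
//   => [{ parent: "p1", startOffset: 0, content: "abc" }]
```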
We export to the closest available thing (e.g. file chips become links, people chips become mailto links, etc.) or drop the content entirely if there's truly nothing it can be converted to (rare case)
Great work Alon! Yet another case of impressive engineering.
I have a couple of questions:
1. What is the landscape of WasmGC browser support like? Given that it's relatively new, is it OK to use it and ship production apps now? How does Sheets handle older/unsupported browsers?
2. At Google I/O the Workspace team mentioned they'd leverage Kotlin Multiplatform to achieve use cases like this. I see Sheets is using the in-house J2CL compiler, but are there cases where KMM is used to target Wasm, in Sheets or Docs? What are your thoughts?
Chrome (and other Chromium-based browsers) and Firefox have had WasmGC enabled for a few releases now. But Safari doesn't yet last I heard.
Sheets can serve a JS version to browsers without WasmGC, since they can compile the same code to either JS or Wasm.
About Kotlin, I know they have a working WasmGC toolchain that works very well, so personally I think it is a good option in general. But I don't know offhand about current users or plans.
For the name of the internal framework, I can't remember which podcast I heard it on, but they haven't mentioned it publicly much if at all. It's something they threw together in like 2012.
Recently I was shocked to learn that CKEditor and TinyMCE are now owned by the same parent company. The acquisition happened so quietly that many people still think TinyMCE and CKEditor are competing companies.
Given that they are from the same governing body, it's no surprise they want to make TinyMCE adopt CKEditor's licensing path as well.
I can confirm. I've had quite a few projects that made it to the front page of HN and handled the traffic like a piece of cake. All of them ran on $5 DigitalOcean droplets.
I accept that some projects are more resource-expensive than others, but the majority of the time you can get away with a bit of asynchronous responses plus a scheduler/queue to spread the load horizontally over time.
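For what it's worth, the accept-now-process-later pattern is tiny to sketch. Assuming a Node runtime and made-up names, something like:

```ts
import { createServer } from "node:http";

const queue: string[] = [];

// Accept requests cheaply and defer the expensive work.
createServer((req, res) => {
  queue.push(req.url ?? "/");
  res.statusCode = 202; // accepted, will be processed later
  res.end("queued\n");
}).listen(8080);

// One worker drains a bounded batch per second, spreading load over time.
setInterval(() => {
  for (const job of queue.splice(0, 10)) {
    console.log("processing", job); // the expensive work would go here
  }
}, 1000);
```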
Unpopular opinion: I blame the new-age devops culture that made cloud app deployments unnecessarily complicated with k8s and cool new tech (that'd get them high-profile jobs). I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap VPS?"
I'm convinced it's the hiring that has shaped the scene.
If you're hiring, you want to be able to add/replace people as easily as possible. If you're being hired, you want to charge as much as you can.
And to satisfy those two demands, the current web stack is great. Almost like it's been built for it.
It's got very little to do with the tech itself, a lot more to do with market dynamics. That's the problem it's trying to solve.
The root cause is the decade of zero interest rates which led to companies intentionally overcomplicating their stacks to justify neverending VC rounds. Early prospective employees took notice and adjusted their skills as a result.
The dangerous part is that in the meantime we've got brand new and budding talent that actually took this charade seriously and effectively got high on their own supply, seeing this performance art as the actual normality even in a post-ZIRP world where tech/engineering is primarily there to drive business profits and not a VC mating ritual.
The only winners are the cloud/infra/tooling providers who got an entire generation of "engineers" to perpetuate their con without even realizing it.
> The root cause is the decade of zero interest rates which led to companies intentionally overcomplicating their stacks to justify neverending VC rounds.
Extended hot take: the true customers of a company are the current and prospective holders of capital, whether VC, private investment, or public markets. Keeping them happy is the main goal of the exec.
If your owners' portfolios include real estate, you push "back to the office". If they include container startups, you deploy on k8s. Your "strategic partnerships" are determined by how much your owners care about propping up investment A or investment B.
Except zero interest rates stopped almost four years ago, but we're still seeing VCs ape into AI this last year.
So clearly the underlying cause of this squandering of resources must come from somewhere else.
I point the finger at the rising class of super rich who don't know what to do with their money. Why do they exist? Why is taxation seemingly not applying to them any longer?
> Why is taxation seemingly not applying to them any longer?
Investment returns were never taxed like this; as long as they don't use their wealth for consumption, they won't get taxed. This is a good thing, since it encourages investment over excessive consumption: building a startup is much better than buying another yacht.
That's a very simplistic view.
Like with all things, there is a point where more investment money does not make things faster or better. In fact, since pretty much everything is still dependent on people and social groups, if all the relevant people are already busy, you won't get much for your money.
And too much money chasing too few choices creates bubbles, which is exactly what has happened. And not just in tech; it is the problem of the real estate market in many places of the rich world, and similar bubbles can be found in many markets.
I believe it is actually a modern problem: a good part of the rich world is getting paid a lot more than there is really a need for, and since we glorified "investment" as a way to make even more money, too big an amount of money is sitting idle instead of being useful right now.
If you look at society as a whole, spending every resource every year would be bad, but setting too much aside for later is also wasteful.
It's a lot like stocking a pantry: you need to put in enough so that you don't easily run out of food, but if you have stuff in there that hasn't been used 5 years later, it shows the "investment" was too much.
Food gets stale and loses nutritional value over time (even canned) but money also gets stale in a way.
As for the startup vs yacht I would say it largely depends on how useful the startup can ultimately be to society and how many people it can employ (how much actual value is being created).
Because even though yachts have fossil fuel consumption problems, it takes a lot of people to make, maintain, and service them. It can potentially have a better social impact than a startup...
And this is the root of the issue, people getting a lot of money actually have a responsibility to redistribute it (in an intelligent manner preferably); but what is happening is that people try to get even richer even though it stops having much value to anyone.
All this money that is aping into an AI gold rush could have been taxed to fund schools and other broadly useful things.
I'm very much a free market capitalist, but seeing all VCs ape into AI at the same time does not make me think there is value in having so many super-rich people in the world.
I hate it when discussion devolves into high taxation vs low taxation or high regulation vs low regulation. There must be such a thing as the appropriate type of taxation and the appropriate type of regulation. And that depends on what we want as a society. Do we want a big class of super-rich who don't know what to do with their money, so they squander it on an AI gold rush? Or do we want to tax them so we can put that money to a use we all benefit from?
I think you're mixing up cause and effect. Low interest rates caused investors to chase higher returns in things like VC, VC had pressure to make investments, startups had easy money, so they were less scrappy and had more funding for overengineering.
I can’t confirm but it does look like a significant portion of current software engineering zeitgeist is developed for compartmentalized careerism with a good splash of non-value adding complexity.
If that keeps food on the table and the wheels of business spinning, fine, but I’ve seen this leading to situations where a simple thing expands to a role, and then department and … oh, hi, enterprise software, it’s you!
Hipster devops was overcomplicated long before kubernetes.
In my first job we had autoscaling on AWS to handle peaks... except that our servers took about 30 minutes to do the upgrades, download gcc, and compile all the needed Python modules. We'd always reach the autoscaling limit because the new servers weren't doing anything at all. All of them would be downloading and compiling the same things.
I was very junior, but I asked my boss whether we shouldn't use base images that already contained everything, instead of a blank default Ubuntu for the servers.
The boss said no, because that wouldn't be agile.
The whole thing of using YAML files to configure servers already existed; it worked terribly.
It was basically meme development there. Including using mongodb for no reason at all, and using it badly so that every write was actually moving thousands of records around.
> I was very junior, but I asked my boss whether we shouldn't use base images that already contained everything, instead of a blank default Ubuntu for the servers.
Honestly, nowadays Docker and other OCI containers do this pretty well. Spinning up new instances and even provisioning nodes has become very easy, in addition to load balancing features those provide.
12-factor apps are also an amazingly concise way to develop and manage multiple services without going into YAML hell: https://12factor.net/
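As a trivial illustration of factor III ("store config in the environment"), the same artifact can run in dev and prod with nothing but different env vars. A sketch along those lines (defaults invented for the example):

```ts
// All deploy-specific settings come from the environment, not from config
// files baked into the image. The fallbacks are for local development only.
const config = {
  port: Number(process.env.PORT ?? 3000),
  databaseUrl: process.env.DATABASE_URL ?? "postgres://localhost/dev",
  logLevel: process.env.LOG_LEVEL ?? "info",
};

console.log(`starting on :${config.port} with log level ${config.logLevel}`);
```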
The problem is that the management layers around containers are typically overcomplicated to the point of comedy.
Kubernetes with something like K0s or K3s might make some sense, but it is still inherently a full-time position to run on prem (with updates and observability) and will cause headaches. HashiCorp Nomad is better, but is meant for scales greater than those of most companies. Docker Swarm hits the spot (especially with something like Portainer), but nobody seems to actually care because it's not a trendy piece of tech.
The day Docker Swarm dies is the day I go back to writing PHP in a shared hosting environment in protest.
Docker made doing the right thing really easy, to the point where doing it is a no-brainer. I guess there are people out there who might still use OCI containers as glorified stateful VMs, but luckily the majority of documentation and examples out there build proper images that have everything included by the time it actually runs, and you have to actively go against that if you want the other approach.
The problem with devops replacing the old title “sysadmin” was that the dev part dragged in the worst thing about developer culture: the love of complexity and the tendency to build massive towers of it.
Sysadmins usually avoided complexity because their attitude toward it was more sensible: it’s expensive, fragile, and tends to actually multiply failure modes.
I also blame cloud marketing. This stuff is a gigantic money printer for cloud companies. It’s in their interest to encourage as much over engineering as possible, especially if it locks you into things like Kubernetes that are hard to run and thus usually used as services.
I swear, I've come to identify as a simple stacker. Complexity should be avoided wherever reasonably possible. But where it's not, it's best left and maintained with the devs who want it.
If you guys don't see the value of being able to click a button on a website to deploy, perfectly, every time, over some guy sshing into the box and running git pull, I dunno what to tell you.
The discussion here is not the method of deployment, but the infra architecture.
You don't need k8s, containers, and multiple cloud instances to automate deployment.
It's perfectly possible and simple to implement a button to deploy on a single machine.
Heck, on OVH or Hetzner, you can have a dedicated bare-metal machine with many cores and lots of RAM exclusive to you, cheaper than the famous cloud instances. These bare-metal machines will handle what sometimes takes hundreds of containers to handle, with a much simpler and easier-to-maintain infra.
You can also have ugly manual ad hoc Kubernetes workflows where random devs hack YAML in prod directly. I’ve even seen people ssh into running containers and change things.
This isn’t any better than Joe sysadmin and could be worse since the complexity is higher and there’s a greater chance to do more damage with one change. Adding complexity makes bad process worse.
The last time I checked, Docker was on libcontainer and not running LXC any longer.
libcontainer and LXC all use the same API calls to the kernel... so in effect you're still correct (mostly).
They all have overhead if you start using their features... cgroups can have some very funny impacts on app performance, more so if you're running containers on top of a Linux install on a hypervisor vs. *nix on bare hardware.
Why would someone want to self-torture into k8s if they only need a single machine?
If you need isolation for multiple services, just use a container... I don't see an orchestration need that justifies using k8s on a single machine. For such context, it's too much hassle for very little value in return.
Setting up testing & deploys via a CI script is basically free. AWS gives away CodeDeploy for free. Ansible is open source. I learned all this stuff working on open source projects, where platforms like GitHub give you free compute time.
> If you guys don't see the value of being able to click a button on a website to deploy, perfectly, every time, over some guy sshing into the box and running git pull, I dunno what to tell you.
Maybe clicking a button that sshes into the box and runs git pull?
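Only half joking: the button version is a few lines. A sketch, with the hostname and paths made up for illustration:

```ts
import { createServer } from "node:http";
import { execFile } from "node:child_process";

// POST /deploy runs the same ssh + git pull a human would, just repeatably.
createServer((req, res) => {
  if (req.method !== "POST" || req.url !== "/deploy") {
    res.statusCode = 404;
    return res.end();
  }
  execFile(
    "ssh",
    ["deploy@example.com", "cd /srv/app && git pull && systemctl restart app"],
    (err, stdout, stderr) => {
      res.statusCode = err ? 500 : 200;
      res.end(err ? stderr : stdout);
    }
  );
}).listen(9000);
```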
PSA: please don't follow a manual checklist over ssh. At least write an Ansible playbook that does those things repeatably, or even better, script idempotent changes in an .rpm or .deb to install.
Yeah, my search engine, back when it was hosted on a PC in my living room off domestic broadband, would shrug off HN[1][2] without the fans even spinning faster than usual.
And like, internet search should be more resource-heavy than the sort of websites that regularly keel over under HN traffic. Every query is like up to 50 MB in disk reads.
The shift to cloud-based workloads (with oversubscribed CPUs and mandatory networked storage) means that a lot of people lost track of just how fast physical hardware (even mid-range consumer-grade) has become.
That’s also a deliberate thing: cloud providers have consciously avoided increasing the per-unit performance of a vCPU; you still get the same Sandy Bridge performance in 2024 as you did in 2012. They go all the way to the extent of having AMD design smaller, higher-density “cloud” cores that don’t clock as high, to avoid ever increasing that vCPU unit.
Not so unpopular. I worked at a startup where they wasted huge amounts of money on a massively complex setup using Kubernetes (and this was in the early days of Kubernetes). Despite this, or maybe because of it, our AWS bill was killing us.
The irony was that the cloud was supposed to be the simplest bit. All the computation and cryptography occurred on the mobile clients; the cloud was really just there to provide storage. The same team then rewrote the mobile clients to have a "beautiful" API that took hundreds of times more resources than the original code.
I guess they just loved complexity for the sake of it.
Maybe that still counts as an unpopular opinion in the large, but I’ve never seen it be an unpopular opinion by the standards of really effective teams or really productive hackers I had the privilege to be around at times, and it seems to me that your good idea is coming back in a big way recently.
When I worked on teams that denominated egress in terabits/s, TPS in millions or higher (sometimes much higher), and daily warehouse ingest in petabytes, it was just the default thing to spin up an instance and a hot standby (sometimes per region or something) if that’s all that was required, and containers and whatnot were used only in the context of bare metal: you do usually want one level of indirection, so containers are really useful if you’re racking metal from a small menu of SKUs.
But as for why it ever became conventional wisdom to wrap a venv inside a container on a hypervisor, often with multiple images composed on a dizzying array of low-friction SKUs?
>I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale...
I'd argue it's just as much the dogmatic nature of all things software-related, together with the attendant shiny new object syndrome.
See SPAs, NoSQL, micro-services, etc. There's generally a use case for all of these, but they tend to be too easily extrapolated into, "if you're not using these, you're not doing it right."
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap VPS?"
I’ve lost startup jobs for basically this. As in “let’s focus on our MVP instead of adopting k8s and discussing ‘the definition of done’ for literally six hours a week”. That one ran out of money and folded, sadly. They had tons of potential and a few bad hires.
But also, on the other side, my time is expensive, engineering time is expensive, and downtime is catastrophic for a new business. Some people want to spend 4, 5, or 6 figures to save twenty dollars a month. Imagine my time is worth $500 an hour; if the effort to make something cheaper doesn’t pay for itself in a year, then it’s probably a waste of resources.
Once when working in devops I asked my team lead why we didn't just move everything to Heroku, rather than reinventing a bad in-house version of their features.
I was firmly told not to suggest it again unless I wanted to put all of us out of a job.
That seemed ridiculous to me — we had a laundry list of ways we could help the business if we got basic platform stuff off of our plates. But I sure learned something about how incentives affect otherwise-good engineers.
The current DevOps culture comes from the FAANG guys who really are getting a bazillion requests per second. In the last decade I worked for Amazon, Avalara and Audible Magic. None of these could build an app around a DigitalOcean droplet.
But I think you're pointing out there are PLENTY of useful webapps that can run on a minimal system.
I'm just curious where the middle ground is. Cause I think we've all seen sites blow up after a reference on HN or SlashDot or whatever and by the time you get there the only thing you see is a stock error from the PHP engine saying "My MySQL Engine is Melting" or somesuch.
It would be very cool if one could write an app using {NODE|Ruby|Python|Whatever} and have the infrastructure around it notice when things spike and do some magic under the hood to spin up new containers in geographically distributed data centers and scale up a simple persistence tier.
That way you could move forward with a SIMPLE application instance and not freak out that you'll disappoint new users if there's a spike in demand. You know, sort of like what AWS Lambda was supposed to be.
Hmm... I think I might have come up with a plan for my next startup. Thank you for speaking your truth.
[Edit: To reiterate, I think I'm saying it's just as wrong to think a small, simple app needs Amazon-level redundancy and immediate scalability as it is to say Amazon could run on a single machine. But... the "Slash-Dotted Website Goes Down" scenario is real, and there should be SOMETHING the industry could do that's easier on people than forcing them into a custom AWS ECS solution across multiple continents.]
sure. but there's a middle ground there somewhere. it's not just service blackout after 5 requests per second or a $10k monthly aws bill. if your budget was $500, you could set off alarms or auto-shutoff after some threshold. and turn on syn cookies if you're worried about the kiddies.
the point is... there's a middle ground there somewhere and I think different people put the cost/availability tradeoff at different places.
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap VPS?"
I work on the Ops side partially, and our platforms are defined by the highest level of complexity we need to support. That is to say that your basic web app can run on k8s, but that auto-scaling, message-based behemoth that 85% of the business flows through will not run on a cheap VPS.
So I can either support k8s, or I can support k8s _and_ old-school rsync deployments to VPSes.
The complexity of running a basic web app on k8s is entirely too high, but the cost of keeping an entirely separate deployment/monitoring/oncall/permissions stack for VPSes is worse. Better hope your monitoring vendor has an agent that can run on a VPS with the OS you want, or you're back to running Nagios yourself.
If at least 1 group is going to write an app that runs on k8s, you might as well run most of the company on k8s. Otherwise you're either going to manage 2 separate stacks, and/or try to write an abstraction layer that will probably itself resemble k8s.
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap VPS?"
Was a devops engineer at my previous job.
We already had k8s clusters setup, pre-made CI templates and pre-tailored helm charts (along with monitoring and much more). All those things you (a developer) could mostly clone, slightly customize and ship both to a development k8s cluster and to a prod k8s cluster (with all the safety nets already in place).
Creating (and maintaining) a single VM for a pet project is way more work than using the pre-made, pre-customized and curated toolkit.
This was at a 100+ developers organisation.
If you think you could easily get off with a single VM then you've never seen devops done right, I'm fairly sure.
EDIT: I probably fell for the bait, but the post I'm replying to really made me remember why we went on a killing spree to eradicate everything that was not k8s at my previous job and to remove as much developer access to prod as possible. Some idiot developers think they know better and usually end up reinventing a square wheel that breaks as soon as it's not running on their laptop anymore.
IMO the vast majority of software development happens in much smaller organisations than that. Dev Ops still matters there, and the requirements are different.
I am working in an organisation with one and a half developers. I am lobbying that the third or fourth developer concentrates on Dev Ops here.
It is very important. Look at the backup/recovery procedures at your organisation. Has there ever been a fire drill? Are you sure the backups are sound? Can you recover? What if data corruption occurred a week/month ago? Do you have a backup of the uncorrupted data?
That is a very unsexy aspect of Dev Ops, and without somebody dedicated to the job, your backups will not be any of those things.
> I am working in an organisation with one and a half developers. I am lobbying that the third or fourth developer concentrates on Dev Ops here.
not sure you're doing the right thing here. you might want to consider hiring some kind of linux guy that can do some basic devops. or maybe hire some devops contractor that can work with you on a part-time basis and "curate" some specific aspects of your operations.
I've seen this done in the past: so you've got this consultant on retainer, and you tell them something like: "i've got this issue, can we do something about it? our constraints would be x y z" and the consultant would make 1-3 proposals (different approaches, different pricing levels, different ETAs etc) and then you agree on what gets done. The key aspect here is that a good devops consultant can get stuff done very quickly.
> It is very important. Look at the backup/recovery procedures at your organisation. Has there ever been a fire drill? Are you sure the backups are sound? Can you recover? What if data corruption occurred a week/month ago? Do you have a backup of the uncorrupted data?
Yes (to all questions). I ended up working in heavily regulated environments. All the things you mentioned were not just niceties, but mandatory legal requirements.
> That is a very unsexy aspect of Dev Ops, and without somebody dedicated to the job, your backups will not be any of those things.
That's basic system administration. Most devops engineers are former sysadmins.
> IMO the vast majority of software development happens in much smaller organizations than that
I guess it's okay to have an opinion about that, but this seems like something that should probably be a fact. Unfortunately, not sure I can find reliable stats on sizes of engineering organizations.
The thing is, while there are obviously lots of small companies, there are also some really big software development organizations out there. A company like Netflix has 2,500 engineers. Microsoft employs over 100,000 engineers. Walmart employs over 15,000 software developers.
You need a lot of little 10-50 engineer dev shops to add up to the combined size of the engineering orgs of the Fortune 500.
According to https://www.statista.com/statistics/507530/united-states-dis..., at least, 29.4 + 25.8 = 55.2% of the US "IT industry" workforce is employed at companies with >100 employees. That's a long way from telling us about sizes of engineering orgs though.
But still... I'd be careful assuming that the vast majority of developers are in organizations of less than 100 people.
Why do you assume that only the US landscape is discussed here?
Plus I'm not very sure those statistics are reliable.
Anecdotal evidence: I have 22 years of career and I've only worked in big organizations twice, for a total of a year. Everything else was much smaller.
But you’ve got 100 developers. That’s not “most web apps”, that’s firmly in the set of companies that need standardisation and potential scale, where the devops team makes the life of many devs far better. When it’s just a few devs and cash is limited, the business doesn’t need the complexity. It’s mostly for ego and branding.
When the company is set up well on k8s, then choose k8s. If the company is set up well on VPS, then choose VPS.
If it has neither, I'm unsure what the better way to go is on a greenfield.
k8s has nice tooling, but part of that is required because it is massively complex.
But with managed k8s providers (e.g. GKE with Autopilot) you can just put in proper CPU and memory limits and essentially get yourself a VPS provisioned, without having to worry about anything on the k8s side.
If you have different pricing or location constraints, then it might be better to go with different models and some custom orchestration.
Yeah, I'm an independent, selling these solutions. My minimum stack costs about $500/mo in AWS costs, and you could save 90% of that. But then you'd be paying me $20k more to set it up again when you expand to another dev team, while this way I can add your second dev team for $10k and practically zero additional AWS cost.
Going straight to overkill is good business sense for any company that's going to make it to a medium-sized business, and it's not going to be the differentiating factor for a company that burns out.
> All those things you (a developer) could mostly clone, slightly customize and ship both to a development k8s cluster and to a prod k8s cluster (with all the safety nets already in place).
How did/do the devs create and test new code and debug issues? Can they do that locally, on their own laptops? If so, how?
>> How did/do the devs create and test new code and debug issues? Can they do that locally, on their own laptops? If so, how?
I used to buy the idea of this when there were monoliths.
Then I had the joy of running a shop with dozens of web properties.
Local dev became untenable. My systems admin was an early adopter of Xen. That shop ran like a dream... devs could come in, have a new environment up and in place, and just start working. Staying in sync with prod was never a problem.
By making systems guys keep devs fed, and by making devs work closely with systems folks, you get better software. Containers just hid developers' shitty decision-making in a wrapper that systems folks can tolerate.
And how DO you debug... because what you do to figure a problem out locally is not how you troubleshoot when the shit hits the fan in prod. These tools should be the same, sane, and well used and loved by everyone. Local debugging is part of the problem, even more so if yours is a service-based app that lives and dies on the wire.
Then I assume you have a custom Kubernetes LB that can handle non-HTTP TCP and UDP traffic, because your choice of Kubernetes and the design restrictions that come with it does not affect how the dev solves problems?
The underlying orchestrator definitely affects how the software needs to behave and is definitely not irrelevant.
> Then I assume you have a custom Kubernetes LB that can handle non-HTTP TCP and UDP traffic, because your choice of Kubernetes and the design restrictions that come with it does not affect how the dev solves problems?
nginx-ingress-controller does that. no custom stuff required.
I think that the $5 tier is a little tight for a web app as opposed to a CRUD app, but 2x $40 tiers is enough for a decent amount of traffic, with one as a failover.
The problem is that containers are excellent, and IMO there's a gap in the market between "I want to run one container" and "I want a fully managed k8s cluster"
For my personal projects (that right now seem to revolve around April 1st jokes for people in my industry), I've found a good middle ground to be K3s on a single Hetzner VM that I scale up and down if I think more/fewer people are going to be looking at it (i.e. in the $5-10/month price range).
I set this up with Terraform and some bash scripts. Infra as code is just too convenient to pass up on for something that I might not come back to for months at a time (and then ask - how did I set that up?), and containers too mean that I can play with some shiny fun technology one time, leave it in a box, and then come back to a clean-slate for the next thing that I want to play with without having to pay for a new VM etc.
The other great thing about containers is that you can have entire isolated stacks for your projects, and you can stand the entire thing up with `docker compose up -d` in a matter of seconds. Gone are the days of accidentally connecting to the wrong database with WAMP.
> The problem is that containers are excellent, and IMO there's a gap in the market between "I want to run one container" and "I want a fully managed k8s cluster"
A single container: Docker
A few containers on the same node: Docker Compose
Containers across multiple nodes with load balancing and networking: Docker Swarm (with Portainer to manage it)
Alternatively, Podman is also pretty nice.
If you need to run more than just containers in clusters: HashiCorp Nomad
If you want to go for Kubernetes but without it being too hard: K0s or K3s or MicroK8s or RKE (with Portainer or Rancher to manage it)
I don't think I ever got around to making it self-healing if a container dies, but it does support GitOps-style deployments through a cronjob / conf repo, similar to Argo CD.
It's been running happily on a <$10 / month aws lightsail instance for a few years now, though tbh I'd still reach for k8s for anything serious
All my team's infra runs on Fargate + ECS, so I'm pretty familiar (and happy) with it. Running in Fargate requires knowledge of AWS: VPCs, public and private subnets, security groups, ECR, ALBs, target groups, IAM policies + roles. Then, when you want to add a database, you're back into all of the above, plus the database-specific ones like database subnet groups.
When it comes to health checks you have Docker health checks, container health checks, and load balancer health checks, all of which are configured separately. Not to mention doing it "properly", where your task doesn't have a public IP and is only accessible through your load balancer - [0] might be one of the most infuriating responses on Stack Overflow.
Meanwhile, with DO droplets, it's pretty much "here's a registry URL, and some configuration in YAML, go for it".
A ton of the problem is people turning everything into dynamic this or that. WordPress culture, basically.
Most pages can really just serve a Jekyll/Astro static-generated page and be fine. But if you shove a database and PHP in the middle, it's gonna be multiple orders of magnitude slower.
Kubernetes is/was a way to fight off walled gardens from cloud providers. The other path would have been to learn the bespoke implementation of each cloud provider depending on what that employer ended up using.
Kubernetes was at the right place, at the right time just as AWS was trying to force feed people their own proprietary solution, as Azure was trying to wall off people into their own walled garden, as GCP was being Google just not giving a damn about any other usecase than what works great at a massive search company.
With Kubernetes, developers can learn one API to deploy their applications and hopefully it works on AWS, Azure, GCP, DO, OVH or a laptop at home.
So that way, developers can learn one thing and transfer their knowledge at an employer that hosts on AWS, and then another that hosts on Azure and so on.
This is in contrast to the experience of a Python developer who's mastered FastAPI/Flask/SQLAlchemy and feels absolutely lost in a Django project, or an Angular developer who stares at a Next.js project wondering what the heck is happening and how it all works. Neither a Next.js nor an Angular developer would start off with an AWS Amplify solution if they could help it.
> With Kubernetes, developers can learn one API to deploy their applications and hopefully it works on AWS, Azure, GCP, DO, OVH or a laptop at home.
That's one of the lies developers tell themselves, because at some point you're going to need to manage Accounts, VPCs and ELBs, Certificates, Security Groups, IAM policies, and everything else. All of those underlying primitives that are required and have massive differences in behavior that are expressed differently in GCP, Azure, and AWS.
On top of that Kubernetes is itself a walled garden.
You will inevitably end up cargo culting the entire ecosystem of plugins, like Cilium and Helm and so on. All of this IaC is meaningless outside of Kubernetes. Soon enough, you have 10,000 lines of YAML configuring highly proprietary infrastructure with multiple variants for each cloud. At some point you will have to rewrite controllers to add functionality or correct bugs the upstream maintainers don't want to prioritize, and so on.
Your "knowledge" of the stack ends up being the ability to orchestrate 15 levels of templated YAML. Eventually your company ends up hiring people who only know how to copy/paste YAML, and lose institutional knowledge of how underlying systems work. You didn't break out of the walled garden, you created an elaborate prison. And Amazon and GCP and Azure love you, because you're their #1 customer. The more complex you make it to deploy a CRUD app the more they profit.
> I've never come across a devops person who'd say "Hey, that software is too simple to prematurely scale for sudden spikes of irrational amounts of traffic, so why not just deploy it on a cheap VPS?"
It Is Difficult to Get a Man to Understand Something When His Salary Depends Upon His Not Understanding It
> Unpopular opinion: I blame the new-age devops culture that made cloud app deployments unnecessarily complicated with k8s and cool new tech (that'd get them high-profile jobs)...
Ah yes, time for the annual debate on the complexities of Kubernetes versus the unparalleled genius of custom scripts that seem to work...sometimes. Because reinventing the wheel is always superior to something with a standardized API.
And let's not forget the sheer elegance of homegrown scripts that rival the structured approach of Kubernetes. A true testament to intuitive design.
Sure, Kubernetes might have a few minor benefits beyond 'scaling out'. But honestly, who needs the ability to manage complex applications with any semblance of ease?
> honestly, who needs the ability to manage complex applications with any semblance of ease?
The argument being made here is that the majority of applications are rarely complex and hence don't require managing that complexity.
A simple webservice fronted by a simple reverse proxy like Caddy running on a single "modern PC" can do wonders without any Kubernetes needing to get involved.
I'm always amazed by Marijn's brilliance (author of Lezer, CodeMirror and the awesome ProseMirror toolkit).
The level of depth he dives into to perfect his projects is insane. For example, Lezer is a parser generator (which by itself is not a trivial feat, with novel ideas like incremental computation applied to parsing) built to power his mainstream project, CodeMirror.
Likewise, ProseMirror has some insane levels of engineering underneath, married with thoughtful architectural decisions.
He's incredibly responsive too. He just answered my question about an hour after it was asked... on a Sunday evening.
Slightly off topic: is anyone aware of any Lezer grammar for regex? I've not been able to find any in the FOSS world. I suspect Regex101 has one, but it's sadly closed source.
Yeah, this happened to us too a couple of times. Once, we asked for a clarification and he took it as a sensible requirement, made the changes and took it to master in just over an evening.
The next day we just had to update our package to the latest version and marvel at his response time.
Massively agree. CodeMirror 5 was excellent. CodeMirror 6 was a big enough improvement to justify the upgrade. I've used it as part of 2 large projects and it's handled every expanding use case I've needed it to. It supports themes, SQL, JS, and 2 weeks ago I used its diff functionality. Really great library that I can fully recommend.
CodeMirror 6 is super hard to use, though, unless you are a front-end expert. CodeMirror 5 was basically plug and play. CodeMirror 6 is a box of Legos and you are responsible for building and bundling your own editor from scratch.
I publish to npm because 1) Package managers are a fabulous idea: they're an easy way to download a package and its dependencies, and update them over time. 2) It's not a good idea to push build artifacts to your repo. 3) This depends on CodeMirror, which is published to npm; otherwise I would have to provide it too, and that's too much work. 4) You don't need to be a Node developer to use npm.
> CodeMirror 6 is a box of Legos and you are responsible for building and bundling your own editor from scratch.
Yep. Let me stress that there's absolutely nothing wrong with this if that's what you want to do -- but it's less than optimal for the person who just wants to add a JavaScript file for the editor, with maybe a JS/CSS combo to support syntax highlighting for a specific language. Codemirror 5 was like that, pretty much.
You basically have to set up a whole independent build and packaging system to configure Codemirror 6, and a lot of people just don't want to deal with that.
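To be fair, the editor assembly itself is short once that build setup exists; per the CodeMirror 6 docs, the minimal version is roughly:

```ts
import { EditorView, basicSetup } from "codemirror";
import { javascript } from "@codemirror/lang-javascript";

// One editor with the default keymap, history, gutter, etc., plus JS highlighting.
new EditorView({
  doc: "console.log('hello')",
  extensions: [basicSetup, javascript()],
  parent: document.body,
});
```

It's everything around those few lines (npm, a bundler, a build step) that's the hurdle.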
It's a pity it's the only JavaScript-based editor that works at all reliably on mobile (I would be delighted to be proven wrong about this).
This is an amazing piece of work. Thanks for making this open-source.
The default way of generating beautiful PDFs on the backend these days is running a headless browser and using browser APIs to turn HTML/CSS into PDFs. And apparently it's quite costly to run browser instances on the server and scale them properly for huge workloads.
This is literally a game changer. Now it's possible to design PDFs using HTML/CSS and generate them without the browser overhead!
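For comparison, the headless-browser pipeline this replaces is typically a few lines of Puppeteer, with a whole Chromium instance doing the work behind them (a sketch, not anyone's production setup):

```ts
import puppeteer from "puppeteer";

// Launching the browser is the expensive part at scale: each instance is a
// full Chromium process that has to be pooled, sandboxed, and monitored.
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setContent("<h1>Invoice #42</h1><p>1 x widget: $10.00</p>");
await page.pdf({ path: "invoice.pdf", format: "A4" });
await browser.close();
```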
As an aside, it's amazing how far the web has come, where the best way to make pretty PDF documents is to literally run a web browser on the server. This would have been unthinkable back in the 90s and 2000s.
I needed to transform a 12MB HTML file into a PDF document and headless Chrome quickly ran out of memory (4GB+).
We are now using a commercial alternative that seems to be using a custom engine that implements the HTML and CSS specs. The result is reduced memory usage (below 512MB during my tests) and a much smaller resulting PDF: 3.3MB vs 42MB.
We are also using DocRaptor. It takes around 20 seconds to generate the PDF, and we only need to generate it every night. So the costs are also not an issue at the moment.
Yes, I’ve tried all the open source projects I could find, including WeasyPrint and wkhtmltopdf. WeasyPrint was much slower than headless Chrome and also required a lot of memory to process the HTML. And wkhtmltopdf is no longer maintained and crashed while processing.
Have you tried Typst? It's like a modern version of LaTeX and allows you to generate nice-looking documents quickly. It can be called from the console and makes it easy to create templates and import resources like images, fonts and data (CSV, JSON, TOML, YAML, raw files, ...). Of course it is its own language instead of HTML/CSS, but so far I have found it quite pleasant to use.
Back around 2002 at least there were some products, ABCpdf is one I used a lot, which ran Internet Explorer on the server to generate PDFs from HTML. Worked pretty well from what I recall.
I'm fairly certain that using a headless browser on the server is mainly about sandboxing all the security concerns that PDFs have, not aesthetics, but yes.
it's actually because layout-via-code for arbitrary documents is a humblingly complex problem, so leveraging existing layout engines is preferred.
This impressive effort looks far better than what I'd achieve, but when this approach has been tried before, it is eventually discovered that few organizations have the resources to maintain a rendering engine long-term.
I do think complexity could be part of why we don't have many options here, but I don't agree that a layout engine is too difficult to maintain. More of the issue is that CSS layout (and maybe layout in general) is not widely well-understood. I've almost _never_ come across people interested in layout because generally it's a few properties to get something working and then you move on.
> few organizations have the resources to maintain a rendering engine long-term
I'm curious are there other instances of this happening than Edge switching to Blink? That event was one of my main motivators; it felt like further consolidation of obscure knowledge.
Very fun project! Did you ever consider integrating with web-platform-tests? It's shared between all the common browser vendors, and we're always interested in more contributors :-)
True. But I wonder if there are more special-purpose engines similar to Prince that have been abandoned.
> Did you ever consider integrating with web-platform-tests?
I've run some of the WPT tests manually, but I don't yet have <style> support, and some of them use <script> I think? That's a path I'm wary of (eval()?) but I could have a special mode just for tests.
I did discover lots of weird corners that would be great to make some WPT tests for. Definitely something I want to do!
Yes, a _lot_ of WPT tests depend on <script>. But there's also a bunch of ref-tests, where you just check that A and B match pixel for pixel (where B is typically written in the most obvious, dumb way possible). It lets you test advanced features in terms of simple ones. But yes, you'd need selector support in particular.
I maintain a standalone web layout engine[0] (currently implementing Flexbox and CSS Grid) which has no scripting support. WPT layout tests using <script> is a major blocker to us running WPT tests against our library. Yoga (used by React Native) is in a similar position.
Do you think the WPT would accept pull requests replacing such tests with equivalent tests that don't use <script> (perhaps using a build script to generate multiple tests instead - or simply writing out the tests longhand)?
I could run against only the ref-tests, but if I can't get full coverage then the WPT seems to provide little value over our own test suite.
I don't decide WPT policies (and I honestly don't know who does), but I'm pretty sure using a build script would be right out, as WPT is deeply embedded in a lot of other projects. E.g., if you made a build script, you would need to add support for running that script in Blink and in Gecko and in WebKit, and their multiple different runners, and probably also various CI systems.
As for the second option, I don't actually know. If it becomes 10x as long, I would assume you get a no for maintainability and readability reasons. If it's 20% longer and becomes no less clear, I'd say give it a try with a couple tests first? It's possible that the WPT would be interested in expanding its scope to a wider web platform than just browsers. You would probably never get people to stop writing JS-dependent tests, though, so you would need to effectively maintain this yourself.
Of course, for a bunch of tests you really can't do without <script>, given that a lot of functionality is either _about_ scripting support (e.g. CSSOM), intimately tied to it (e.g. invalidation) or will be tested only rather indirectly by other forms of tests (e.g. computed values, as opposed to used values or specified values or actual values or …).
To reply mostly with my WPT Core Team hat off, summarising the history of how we've ended up here:
A build script used by significant swaths of the test suite is almost certainly out; it turns out people like being able to edit the tests they're actually running. (We _do_ have some build scripts — but they're mostly just mechanically generating lots of similar tests.)
A lot of the goal of WPT (and the HTML Test Suite, which it effectively grew out of) has been to have a test suite that browsers are actually running in CI: historically, most standards test suites haven't been particularly amenable to automation (often a lot of, or exclusively, manual tests, little concern for flakiness, etc.), and with a lot of policy choices that effectively made browser vendors choose to write tests for themselves and not add new tests to the shared test suite: if you make it notably harder to write tests for the shared test suite, most engineers at a given vendor are simply going to not bother.
As such, there's a lot of hesitancy towards anything that regresses the developer experience for browser engineers (and realistically, browser engineers, by virtue of sheer number, are the ones who are writing the most tests for web technologies).
One could definitely imagine a world in which these are a test type of their own, and the test logic (in check-layout-th.js) can be rewritten in a custom test harness to do the same comparisons in an implementation without any JS support.
The other challenge for things like Taffy only targeting flexbox and grid is we're unlikely to add any easy way to distinguish tests which are testing interactions with other layout features (`position: absolute` comes to mind!).
One of the benefits of using the browser is that the generated PDF will be using vectors/fonts etc whereas Canvas will be mostly an image in the PDF. Not a big deal for the most use cases.
I feel like it’s probably not a leap to go from this to having a PDF renderer as a backend. The trickiness is in the layout, which this is already doing. What's left looks to be a lower-level API and a way to render to absolutely positioned HTML. That gets you most of the way there.
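In other words, once layout has produced positioned boxes, a backend is mostly a walk over them. A hypothetical sketch (not this project's actual API):

```ts
// Hypothetical laid-out box, the kind a layout engine would emit.
interface LaidOutBox {
  x: number;
  y: number;
  width: number;
  height: number;
  text?: string;
}

// One possible backend: absolutely positioned HTML. A PDF backend would walk
// the same boxes and emit drawing commands instead of divs.
function toHtml(boxes: LaidOutBox[]): string {
  return boxes
    .map(
      (b) =>
        `<div style="position:absolute;left:${b.x}px;top:${b.y}px;` +
        `width:${b.width}px;height:${b.height}px">${b.text ?? ""}</div>`
    )
    .join("\n");
}
```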
I'm a little confused by your comment. I've been using the Prawn library to generate PDFs on the backend for a side project I am working on for quite some time: https://github.com/prawnpdf/prawn
(Admittedly, the PDFs I generate are most certainly not beautiful, so maybe that's the difference)
Prawn really is great. I use it to generate invoices and for exporting a billing overview in client projects. And it’s quite fast as well, since it generates the PDF directly without the need to spin up a browser.
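If you're in Node rather than Ruby, pdfkit is a comparable direct-generation library; the browserless approach looks roughly like this (content invented for the example):

```ts
import PDFDocument from "pdfkit";
import { createWriteStream } from "node:fs";

// Build the PDF directly: no headless browser, just drawing commands.
const doc = new PDFDocument();
doc.pipe(createWriteStream("invoice.pdf"));
doc.fontSize(18).text("Invoice #42");
doc.moveDown();
doc.fontSize(12).text("1 x widget: $10.00");
doc.end();
```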
I've been building PDF renderers for a few clients. The biggest request in PDFs has been accessibility. Everyone needs to be ADA compliant. I sadly don't see myself switching to a canvas renderer because then there is no accessibility.
I was taught programming using an obscure language called LOGO that allowed moving a cursor (aka a turtle) to draw shapes algorithmically.
128MB of RAM was considered decent, and if you could afford to plug in another 128MB, you'd be able to use Windows XP for an hour or so before it froze your computer.
One of the authors of the book is Chris, who leads the Blink rendering team at G :)