Hacker News | nijave's comments

Lower power if you have idle stuff running in the cluster but need RAM to run it. Mini PCs also work.

These are also portable and quiet, so you can stick one in your bedroom and leave it running 24/7 without feeling like you're in a data center (which makes it more accessible to people in shared housing).


But what size of a NUC do you need to have the power of 3 RPis, and virtualize on that for learning? I'd expect just about any decent x86 box to be able to pull that off, and they can be silent / fanless / suspend too.

Actually, you might have a better chance of finding a fanless NUC than running the latest RPi fanless.

My brandless fanless mini PCs from 2016 are doing great. i5, 32 GB RAM etc. For a lot more oomph, get https://store.minisforum.com/products/minisforum-um790-pro?v... (though Ryzen 9 -> fan).


I think bare metal makes more sense in big setups, where you avoid the overhead of a hypervisor. For home setups, I think it's easier to get going with VMs, since the acquisition cost and setup time are lower once you have a single, larger piece of hardware to run it all on.

I was debugging some issues with Thanos and had pretty good success tweaking the codebase to add additional telemetry.

The code was fairly well organized and more importantly worked out of the box with a Makefile and IDE integration (GoLand). All it took was `git clone` and opening GoLand to get started.

For C (maybe it's C++), fluentbit seemed pretty straightforward (I don't have much experience in C, though).


Besides just KISS, a lot of messes I've seen have been implementing patterns outside the framework or implementing complex patterns that didn't add value.

Besides KISS (or maybe as an extension of it), try to keep framework-based codebases as close to the official documented setup as possible. You automatically get lots of free, high-quality documentation available on the Internet.


I think codegen/compilation is a middle ground here. A higher level language like starlark can be compiled down to a set of instructions that provide the described guarantees.

This is how Pants (build system) works. You write declarative Starlark, which supports basic programming semantics, and this generates a desired state that the engine reads and tries to produce.
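As a rough illustration, a Pants BUILD file is declarative Starlark (syntactically a restricted Python dialect). The target and file names below are a made-up sketch, not from any real repo:

```python
# BUILD -- Starlark, a restricted Python dialect. Nothing here "runs";
# the declarations describe desired state, and the Pants engine reads
# them and works out how to produce it.

python_sources(
    name="lib",                    # addressable as //src/app:lib
    sources=["*.py", "!*_test.py"],
)

python_tests(
    name="tests",
    sources=["*_test.py"],
    dependencies=[":lib"],         # explicit dependency edge for the engine
)
```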

I've been meaning to dive into jsonnet for a while but it'd be good to have a higher level representation that didn't rely on sophisticated templating and substitution engines like current k8s.

Compare k8s to Terraform, where you have modules, composability, and variables. These can be achieved in k8s, but you need to layer on more tooling (kustomize, helm, etc.). There could be a richer config system than "shove it in YAML".

Things like explicit ordering and dependencies are hard to represent in pure YAML since they're "just text fields" without additional tools.


This pattern is powerful since you can pick arbitrary tooling and easily make modifications with your own tooling. For instance substituting variables/placeholders or applying static analysis.
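As a toy example of that kind of layered tooling, a few lines of Python can substitute placeholders in otherwise-plain YAML manifests. The `${VAR}` convention and the manifest contents here are made up for illustration; nothing about k8s mandates them:

```python
import re

# Hypothetical manifest with shell-style placeholders. Any convention works,
# since to the k8s API server it's all "just text" until you render it.
manifest = """\
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config-${ENV}
data:
  log_level: ${LOG_LEVEL}
"""

def render(template: str, variables: dict) -> str:
    """Replace ${NAME} placeholders, failing loudly on anything undefined."""
    def sub(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined placeholder: {name}")
        return variables[name]
    return re.sub(r"\$\{(\w+)\}", sub, template)

print(render(manifest, {"ENV": "prod", "LOG_LEVEL": "info"}))
```

The same hook is where you'd bolt on static analysis: once you control the render step, you can lint the output before it ever reaches the cluster.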

I used ELevate for a few VMs at home and it worked pretty well. I upgraded from CentOS 7 to Alma 9 one release at a time.

https://almalinux.org/elevate/


That looks very useful. Thank you for linking!

Lambda scales faster if you really do need that. For instance, imagine bursts of 100k requests. Cold start on Lambda is going to be faster than you can autoscale anything else.

What actually happens is you hit the concurrency limit and return 99k throttling errors, or your lambdas will try to fire up 100k database connections and will again probably just give you 99k-100k errors. Meanwhile a 1 CPU core container would be done with your burst in a few seconds. What in the world needs to handle random bursts from 0 to 100k requests all at once though? I struggle to imagine anything.

Lambda might be a decent fit for bursty CPU-intensive work that doesn't need to do IO and can take advantage of multiple cores for a single request, which is not many web applications.


You'd obviously need to make sure the persistence pieces and downstream components can also handle this. You could dump items onto a queue or use a key-value store like DynamoDB.

A 1 CPU container isn't going to handle that many "app" requests unless you have a trivial (almost no logic) or highly optimized (like a static C web server) app.

Lambda apps don't need to take advantage of multiple cores since you get a guaranteed fractional core per request

One example is retail flash sales. I interviewed with a company that had this exact use case. They produced anti-bot software for limited product releases and needed to handle this load (they required their customers to submit a request ahead of time so they could pre-scale).

Also useful with telemetry systems, where you might get a burst in logs or metrics and want to consume as fast as possible to avoid dropped data and buffering at the source, but can dump to a queue for async processing.
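A minimal sketch of that decoupling, with in-memory stand-ins: in practice the queue would be something like SQS or Kafka and the consumer a separate process, but the shape is the same — accept fast, persist asynchronously:

```python
import queue
import threading

buffer = queue.Queue()   # stand-in for SQS/Kafka
stored = []              # stand-in for the slow persistence layer

def ingest(event: dict) -> None:
    """Hot path: accept the event as fast as possible, defer the real work."""
    buffer.put(event)

def worker() -> None:
    """Async consumer drains the queue at whatever rate storage can absorb."""
    while True:
        event = buffer.get()
        if event is None:        # sentinel to shut down
            return
        stored.append(event)     # pretend this is a slow DB/object-store write

t = threading.Thread(target=worker)
t.start()
for i in range(1000):            # a burst of telemetry
    ingest({"metric": "cpu", "seq": i})
buffer.put(None)
t.join()
print(len(stored))               # → 1000
```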


In my experience, real world applications mostly do deal with basically trivial requests (e.g. simple CRUD with some auth checks). The starting point I'm used to for multiple web frameworks is in the thousands of requests/second. Something like telemetry is definitely trivial, and should be easy to batch to get into the 10s of thousands of requests/second.

The 25 ms request you mentioned in another comment is what I'd categorize as extremely CPU intensive.


If you go from 0 to 100K legit requests in the same instant, any sane architecture will ramp up autoscaling, but not so instantly that it tries to serve every last one of them in that moment. Most of them get throttled. A well-behaved client will back off and retry, and a badly-behaved client can go pound sand. But reality is, if you crank the max instances knob to 100k and they all open DB connections, your DB had better handle 100k connections.

A sane architecture would be running your application on something that has at least the resources of a phone (e.g. 8+ GB RAM), in which case it should just buffer 100k connections without issue if that's what you need/want it to do. A sane application framework has connection pooling to the database built in, so the 100k requests would share ~16-32 connections and your developers never have to think about such things.
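A sketch of why pooling caps connections no matter how many requests arrive — a semaphore-guarded pool, with the pool size and the request count chosen arbitrarily for illustration:

```python
import threading

POOL_SIZE = 16
pool_gate = threading.Semaphore(POOL_SIZE)
peak = 0
in_use = 0
lock = threading.Lock()

def handle_request() -> None:
    """Each request borrows a connection; at most POOL_SIZE are ever open."""
    global peak, in_use
    with pool_gate:              # blocks when all 16 connections are busy
        with lock:
            in_use += 1
            peak = max(peak, in_use)
        # ... run the query on the borrowed connection ...
        with lock:
            in_use -= 1

threads = [threading.Thread(target=handle_request) for _ in range(200)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)  # never exceeds 16, however many requests came in
```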

You need to multiply that out by request servicing time.

Say your application uses 25 ms of real CPU time per request. That's 40 reqs/sec per CPU core. On a 4-core server, that's 160 reqs/sec, or 625 seconds to clear that backlog assuming a linear rate (it's probably sub-linear unless you have good load shedding).

So that's over 10 minutes to service 100k requests in your example. I'm ignoring any persistent storage (DB) since that would exist with or without Lambda, so it needs its own design/architecture.
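Working that arithmetic out explicitly (same assumptions as above: 25 ms CPU per request, 4 cores, 100k backlog):

```python
cpu_ms_per_request = 25
cores = 4
backlog = 100_000

reqs_per_sec_per_core = 1000 / cpu_ms_per_request        # 40.0
reqs_per_sec = reqs_per_sec_per_core * cores             # 160.0
seconds_to_clear = backlog / reqs_per_sec                # 625.0
print(seconds_to_clear, seconds_to_clear / 60)           # 625 s, ~10.4 minutes
```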


I'd call pooling part of "the DB". "DB Layer" if you must, or the interface to it, whatever. Anyway, AWS has RDS Proxy, which held up pretty well against my (ad hoc and unscientific) load tests. But if you're actually trying to handle 100K DB requests in flight at once, your DB layer probably has some distributed architecture going on already.

If you're using RDS proxy, now you're not scaling to zero, and you still can't handle 100k burst requests because lambda can't do that. So why not use a normal application architecture which actually can handle bursts no problem and doesn't need a distributed database?

Lambda could be a compelling offering for many use cases if they made it so that you could set concurrency on each invocation, e.g. only spin up one invocation if you have fewer than 1k requests in flight, and let that invocation process them concurrently. But as long as it can only do 1 request at a time per invocation, it's just a vastly worse version of spinning up 1 process per request, which we moved away from because of how poorly it scales.
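A sketch of what per-invocation concurrency buys you, using asyncio: one worker process holds 1000 IO-bound requests in flight at once, so total wall time is roughly one request's latency rather than 1000 times it. The 50 ms "downstream call" and the request count are illustrative:

```python
import asyncio
import time

async def handle(i: int) -> int:
    await asyncio.sleep(0.05)  # pretend this is a 50 ms downstream IO call
    return i

async def main() -> float:
    start = time.monotonic()
    # One "invocation" processing 1000 requests concurrently: wall time is
    # ~50 ms, not 1000 * 50 ms, because the requests are IO-bound.
    results = await asyncio.gather(*(handle(i) for i in range(1000)))
    assert len(results) == 1000
    return time.monotonic() - start

elapsed = asyncio.run(main())
print(f"{elapsed:.2f}s for 1000 concurrent requests")
```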


Lambda can scale to 10k in 1 minute https://aws.amazon.com/blogs/aws/aws-lambda-functions-now-sc...

If your response time is 100ms, that's 100k requests in 1 minute.

Lambda runs your code in a VM that's kept hot so repeated invocations aren't launching processes. AWS is eating the cost of keeping the infra idle for you (arguably passing it on).
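At full scale the arithmetic is even more favorable than "100k in 1 minute" suggests: throughput is concurrency divided by per-request time. The 10k figure is from the AWS post; the 100 ms response time is this comment's assumption:

```python
concurrency = 10_000        # concurrent Lambda executions after ~1 minute
response_time_s = 0.1       # assumed 100 ms per request
throughput = concurrency / response_time_s
print(throughput)           # → 100000.0 requests/second at steady state
```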


A normal application can scale to 10k concurrent requests as fast as they come in (i.e. a fraction of a second). Even at 16kB/request, that's a little over 160 MB of memory. That's the point: a socket and a couple objects per request scales way better than 1 process/request which scales way better than 1 VM per request, regardless of how hot you keep the VM.

Serving 10k concurrent connections/requests was an interesting problem in 1999. People were doing 10M on one server 10 years ago[0]. Lambda is traveling back in time 30 years.

[0] https://migratorydata.com/blog/migratorydata-solved-the-c10m...
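The memory arithmetic behind "a little over 160 MB", assuming the 16 kB there means 16 KiB per in-flight request:

```python
connections = 10_000
bytes_per_request = 16 * 1024                       # 16 KiB of buffers/objects
total_mb = connections * bytes_per_request / 1_000_000
print(total_mb)                                     # → 163.84 (MB)
```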


Abstracting the runtime environment is nice for mixed environments and supporting local development. Maybe some deployments have really low load and Lambda is dirt cheap whereas other instances have sustained load and it's cheaper to be able to swap infrastructure backend.

It doesn't have to be about life support for neglected code.


Infinite growth!

