Hacker News | sgarland's comments

Those same people are utterly incapable of reading logs. I’ve had devs send me error messages that say precisely what the problem is, and yet they’re asking me what to do.

The form of this that bothers me the most is in infra (the space I work in). K8s is challenging when things go sideways, because it’s a lot of abstractions. It’s far more difficult when you don’t understand how the components underpinning it work, or even basic Linux administration. So there are now a ton of bullshit AI products that are just shipping return codes and error logs out to OpenAI, and sending it back rephrased, with emoji. I know this is gatekeeping, and I do not care: if you can’t run a K8s cluster without an AI tool, you are not qualified to run a K8s cluster. I’m not saying don’t try it; quite the opposite: try it on your own, without AI help, and learn by reading docs and making mistakes (ideally not in prod).


That’s why you install gnu-coreutils. Eliminate the difference.

I think you have something misconfigured, or are timing incorrectly. I'm working on a project right now with ~10K LOC. I haven't timed it, but it's easily <= 2 seconds. Even if I nuke MyPy's cache, it's at most 5 seconds. This is on an M3 MBP FWIW.

And with dmypy (included with mypy) it’s even faster.

I've found dmypy very underbaked. It's very easy to get it to regularly crash or pin a CPU indefinitely in my codebase.

Yeah it’s far from perfect, but speed is usually not its biggest fault.

I’ll still be switching to the astral offering as soon as it’s production ready.


I've avoided Pyright explicitly for that reason. I have a severe dislike of Node, and don't want it installed on my computer for any reason. I'm aware that this is a self-limiting position.

Anyway, agreed that this is very exciting news. Poetry was great, and then I found uv. It's... wow. It's really good.


Node is fine. This is very much old man shouts at cloud. Pyright is really quick. Node is fast enough for a majority of uses.

Tbh, I understand the hatred of Node just from an administration perspective. I also avoid tooling just to avoid dealing with it (although, shout out to n rather than nvm).

Not old and not yelling at anything. I don’t have any qualms about installing tools that depend on Node, as I use VSCode and therefore Pyright all the time.

That said, Node is awful as a backend language, and speed has nothing to do with it. I will write Go, Python, Rust, or anything else because of how those languages have been designed. JS doesn’t belong in my backend—YMMV.


I don’t dislike it because it’s not fast enough; I dislike it because I associate it with the rise of tech influencers, who are at best charlatans, and who at worst have helped to usher in an Eternal September far worse than Web2.0 could ever have dreamt of doing.

Node did this? A fully optional JS runtime that isn’t used in browsers? Who hurt you?

> Who hurt you?

Web devs; I thought that was implied.


I have to wonder – are they using a connection pooler? I'm leaning towards no, since what they did in code can be natively done with PgBouncer, PgCat, et al. That would also explain the last footnote:

> The big bottleneck is all the active connections

For anyone who is unaware, Postgres (and Aurora-compatible Postgres, which sucks but has a great marketing team) uses a process per connection, unlike MySQL (and others, I think) which use a thread per connection. This is inevitably the bottleneck at scale, long before anything else.
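For anyone unfamiliar with why poolers like PgBouncer matter here, a minimal sketch of the idea (the class and names are mine, not any real library's API): a fixed set of backend connections gets multiplexed across many clients, so the Postgres side never sees more than `max_size` processes.

```python
import queue


class TinyPool:
    """Toy connection pool: many clients share max_size backend
    connections, instead of each client holding its own Postgres
    process open."""

    def __init__(self, connect, max_size: int):
        self._conns: queue.Queue = queue.Queue()
        for _ in range(max_size):
            # These are the only backend connections ever opened.
            self._conns.put(connect())

    def run(self, fn):
        conn = self._conns.get()  # block until a backend connection is free
        try:
            return fn(conn)
        finally:
            self._conns.put(conn)  # hand it to the next waiting client
```

With `max_size=5`, a hundred callers still only ever open five backend connections; real poolers add health checks, timeouts, and transaction-level pooling on top of this.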

I did feel for them here:

> We couldn’t create a blue-green deployment when the master DB had active replication slots. The AWS docs did not mention this. [emphasis mine]

The docs also used to explicitly say that you could run limited DDL, like creating or dropping indices, on the Green DB. I found this to be untrue in practice, notified them, and I see they've since updated their docs. A painful problem to discover though, especially when it's a huge DB that took a long time to create the B/G in the first place.


> are they using a connection pooler

We use Hikari [1], an in-process connection pooler. We didn't opt for PgBouncer et al. because we didn't want to add the extra infra yet.

> since what they did in code can be natively done with PgBouncer, PgCat, et al.

Can you point me to a reference I could look at, about doing a major version upgrade with PgBouncer et al? My understanding is that we would still need to write a script to switch masters, similar to what we wrote.

> The big bottleneck is all the active connections

The active connections we were referring to were websocket connections; we haven't had problems with PG connections.

Right now the algorithm we use to find affected queries and notify websockets starts to falter when the number of active websocket connections on one machine gets too high. We're working on improving it in the coming weeks.

I updated the footnote to clarify that it was about websocket connections.

> I did feel for them here:

Thank you! That part was definitely the most frustrating.

[1] https://github.com/brettwooldridge/HikariCP


I’m not sure about a reference, other than their docs [0]. Basically, you’d modify the config to point to the new servers, issue PAUSE to PgBouncer to gracefully drain connections, then RELOAD to pick up the new config, then RESUME to accept new traffic.
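As a sketch of that sequence (PAUSE, RELOAD, and RESUME are real PgBouncer admin-console commands; the two callables are hypothetical stand-ins for however you talk to the console and edit pgbouncer.ini):

```python
def switch_primary(run_admin_cmd, rewrite_config):
    """Drive a PgBouncer-level switchover to a new primary.

    run_admin_cmd: hypothetical callable that runs one command on
    PgBouncer's admin console (e.g. via psql connected to the
    special 'pgbouncer' database).
    rewrite_config: hypothetical callable that points the
    [databases] entries in pgbouncer.ini at the new server.
    """
    run_admin_cmd("PAUSE")   # gracefully drain: wait for in-flight work to finish
    rewrite_config()         # edit the config to target the new primary
    run_admin_cmd("RELOAD")  # make PgBouncer re-read the edited config
    run_admin_cmd("RESUME")  # let waiting clients proceed against the new primary
```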

This would result in client errors while paused, though, so perhaps not quite the same. To me, a few seconds of downtime is fine, but everyone has their own opinions. EDIT: you could of course also modify your client code (if it doesn’t already) to gracefully retry connections, which would effectively make this zero downtime.
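A client-side retry wrapper of that sort might look like this (a sketch; `connect` stands in for whatever your driver's connect call is, and the exception type will vary by driver):

```python
import time


def connect_with_retry(connect, attempts: int = 10, base_delay: float = 0.2):
    """Retry a failed connection with linear backoff, so a brief
    PAUSE window looks like latency rather than an outage."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (attempt + 1))
```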

ProxySQL (which I think now supports Postgres) has a global delay option where you can effectively make clients think that the query is just taking a long time; meanwhile, you can do the same sequence as outlined.

If you had HA Bouncers (which hopefully you would), you could cheat a little, as you alluded to in the post, and have one still allow read queries to hit the old DB while cutting over writes on the other one, so the impact wouldn’t be as large.

[0]: https://www.pgbouncer.org/usage.html


> you’d modify the config to point to the new servers, issue PAUSE to PgBouncer to gracefully drain connections, then RELOAD to pick up the new config, then RESUME to accept new traffic.

The function we wrote effectively executes these steps [1]. I think it would look similar if we had used PgBouncer. I could see it being an option, though, if we couldn't scale down to "one big machine".

[1] https://github.com/instantdb/instant/blob/main/server/src/in...


> This would result in client errors while paused, though, so perhaps not quite the same.

What? Docs say:

> New client connections to a paused database will wait until RESUME is called.

Which fits what I remember from testing pgbouncer as part of automatic failover ages ago: if the connection from pgbouncer to the database dropped, it would block until it reconnected, without the app erroring.


I stand corrected! It may also depend on the application itself, timeouts, etc. I’ve seen errors before when doing this, but now that I think about it, it was on the order of a handful of connections out of thousands, so it was probably poor client handling, or something else.

I think he means already established connections, but not sure.

Edit: not true, actually. PAUSE will wait for the connections to be released (disconnected in session pooling, transaction ended in transaction pooling...)


Curious what you don't like about Aurora? We've found it to generally be better than the older PG offering: since it uses clustered storage, you don't pay for storage per replica. Additionally, you can pay 30% more per instance for unlimited IOPS.

Serverless is generally a non-starter unless you have a really, really spiky workload.


As a disclaimer, I generally dislike most managed offerings of anything, because I don’t think you get nearly the value out of them for the price hike (and performance drop). For DBs especially, I don’t see the value, but I’m also a DBRE with extensive Linux experience, so the maintenance side doesn’t bother me.

For Aurora in general, here’s a short list:

* Since the storage is separated, and farther than even EBS, latency is worse. Local, on-hardware NVMe is blindingly fast, enough that you can often forget that it isn’t RAM.

* I’ve yet to see Aurora perform better, whether the MySQL or Postgres variant. My 13 year old Dell R620s literally outperform them; I’ve tested it.

* The claimed benefit of being able to take a DB up to 128 TiB is (a) an artificial limit that they’ve made worse by denying the same to RDS, and (b) difficult to reach in practice, because of a bunch of gotchas like fixed-size temporary storage, which can make it impossible to do online DDL of large tables.

* For the MySQL variant, they removed the change buffer entirely (since storage is distributed, it was necessary for their design), which dramatically slows down writes to tables with secondary indices.

* It’s not open-source. I can and have pored through Postgres and MySQL source code, built debug builds, etc. to figure out why something was happening.


I’ve never been on a team that migrated to Aurora PG for raw query perf. It is slower than a bespoke setup that is optimized for raw latency, but Aurora is going to hold up under much higher traffic with much less fuss. It also has an excellent snapshot/restore facility.

Lack of local storage is a fair criticism. I understand balancing reliability with performance, but there's some middle ground, like allowing NVMe storage on replicas but not the primary.

I don't know much about the MySQL variant.

Aurora isn't open source, but I'm also not sure there's a compelling reason for it to be. It's highly reliant on AWS's ability to run massive-scale storage systems that amortize the I/O cost across tons of physical devices (their proprietary SAN).

If you have dedicated staff, managed services are definitely less compelling. We have 2 infrastructure engineers to run 15+ platforms so we're definitely getting a lot of leverage out of managed services. We'd have to 5x in size/cost to justify a specialist.


Aurora has "Optimized Reads" feature these days that allows using local nvme storage: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

I know, but IMO if you’re wanting higher performance, then Aurora is the wrong approach. Just learn how to manage Linux; it’s not that hard.

Aurora has been excellent in my experience. Many operational problems (eg managing replica lag) disappear

Agreed. I caught a bug in Python code I wrote yesterday by adding a typehint to a variable in a branch that has never been executed. MyPy immediately flagged it, and I fixed it. Never had to watch it fail.

I put typehints on everything in Python, even when it looks ridiculous. I treat MyPy errors as show-stoppers that must be fixed, not ignored. This works well for me.
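A toy reconstruction of that kind of catch (the function and names here are hypothetical, not the actual code; the point is that annotating the return type lets MyPy flag a branch that no test has ever executed):

```python
def lookup_port(name: str, overrides: dict[str, int]) -> int:
    """Return the configured port for a service, falling back to a default."""
    if name in overrides:
        return overrides[name]
    # The buggy version of this branch returned `name` (a str).
    # With the `-> int` annotation, mypy reports the type mismatch
    # statically, even though nothing had ever run this branch.
    return 5432
```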


> These days often DevOps is done by former Software Engineers rather than "old fashioned" Sys admins.

Yes, and the world is a poorer place for it. Google’s SRE model works in part because they have _both_ Ops and SWE backgrounds.

The thing about traditional Ops is, while it may not scale to Google levels, it does scale quite well to the level most companies need, _and_ along the way, it forces people to learn how computers and systems work to a modicum of depth. If you’re having to ssh into a box to see why a process is dying, you’re going to learn something about that process, systemd, etc. If you drag the dev along with you to fix it, now two people have learned cross-areas.

If everything is in a container, and there’s an orchestrator silently replacing dying pods, that no longer needs to exist.

To be clear, I _love_ K8s. I run it at home, and have used it professionally at multiple jobs. What I don’t like is how it (and every other abstraction) have made it such that “infra” people haven’t the slightest clue how infra actually operates, and if you sat them down in front of an empty, physical server, they’d have no idea how to bootstrap Linux on it.


That's a fair point I also observed.

> SQL has an ugly syntax

Speak for yourself. I’ve always found it to be very clear and concise.

    SELECT <tuples> FROM <table> [[type of] JOIN other_table ON ?, …] [WHERE ? [boolean operator], …] [ORDER BY ? [DESC]] [LIMIT ?]

Since you’re not returning anything from `parent`, it makes much more sense to use a semijoin, which is something ORMs usually bury in an obscure section of docs, if they support them at all.

    SELECT * FROM `table` t
    WHERE EXISTS (
        SELECT 1 FROM parent p
        WHERE p.id = t.parent_id
          AND p.id = :parent)

Or, you know, just eliminate the other table entirely (which the optimizer may well do), since p.id = t.parent_id = :parent

You’re completely missing the point while also completely making my point.

The ORM is going to do the correct thing here, while the SQL I quickly typed out will work, but does the inefficient thing and requires more manual review and back and forth in discussions.


I disagree. The point you're making is predicated on not understanding SQL. If you know an ORM well, and don't understand SQL, then of course it will be easier to review. I would however argue that if you don't understand SQL, then you can never truly understand an ORM, in that you can't know what the DB is capable of doing. You'll probably get lucky for things like `WHERE foo IN (...) --> WHERE EXISTS` translations that the DB's optimizer does for you, but you also probably won't even know that's happening, since in this scenario you don't understand SQL.

ORMs typically do an OK job at producing OK queries, in that they're unlikely to be the worst possible option, but are unlikely to be optimal. This is largely driven by schema decisions, which, if you don't understand SQL, are unlikely to be optimal either. The clunkiest, least-performant queries I've ever dealt with were always rooted in having a poorly-designed schema.


> The point you're making is predicated on not understanding SQL.

This is not the point I'm making. But you seem to not care about that. Cheers man.


No, they are. RA2 and AoE2 came out within a year of each other, and it isn’t even a contest. RA2 is fun, but it’s so much simpler, and games are so much shorter.

I still like RA2 quite a bit, but it’s not in the same league as others.


> I still like RA2 quite a bit, but it’s not in the same league as others.

While I agree with the statement as written, what I take away from it is the polar opposite. RA2 is (to me) the best RTS ever created.


No judgement – I like them both. What is it about RA2 that you prefer to others? Also, YR or original?

(Not the previous poster but) for me, with RA2 there's a slickness to the gameplay, the mechanics, the controls, and the graphics. I've tried to get on with RA several times (via OpenRA) but it never quite clicks for me - it feels old and clunky in comparison.

Have you tried Beyond All Reason? It's a modern take on Total Annihilation, known for having a lot of "player comforts" that reduce the need for micro.

I haven't, but downloading now - thanks.

I've been intermittently grinding Mindustry recently...


YR or original, both. There’s a fluency to the animations, sounds, and gameplay that I haven’t seen anywhere before or after. It just feels like everything comes together perfectly.

The units are all unique. The few differentiators between nations are actually very relevant.

The only thing that comes close is Starcraft.

