Hacker News | qianli_cs's comments

My colleague did some internal benchmarking and found that LISTEN/NOTIFY performs well under low to moderate load, but doesn't scale well with a large number of listeners. Our findings were pretty consistent with this blog post.

(Shameless plug [1]) I'm working on DBOS, where we implemented durable workflows and queues on top of Postgres. For queues, we use FOR UPDATE SKIP LOCKED for task dispatch, combined with exponential backoff and jitter to reduce contention under high load when many workers are polling the same table.
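To make the polling pattern concrete, here is a minimal sketch, not DBOS's actual code. The `tasks` table, its columns, and the helper names `next_delay` and `poll` are hypothetical, and backing off on an empty poll is an assumption about when the delay kicks in. The SQL is the usual Postgres claim-a-batch query; the Python part shows the backoff with full jitter.

```python
import random

# Hypothetical schema: tasks(id, status, created_at). Rows locked by another
# worker are skipped instead of waited on, so pollers do not block each other.
DEQUEUE_SQL = """
UPDATE tasks
SET status = 'running'
WHERE id IN (
    SELECT id FROM tasks
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT %(batch)s
    FOR UPDATE SKIP LOCKED
)
RETURNING id;
"""


def next_delay(attempt, base=0.1, cap=5.0, rng=random):
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


def poll(dequeue, sleep, max_polls, base=0.1, cap=5.0, rng=random):
    """Call dequeue() up to max_polls times.

    dequeue() stands in for running DEQUEUE_SQL and returns a list of
    claimed task ids. An empty result grows the backoff; any claimed
    work resets it.
    """
    attempt = 0
    claimed = []
    for _ in range(max_polls):
        batch = dequeue()
        if batch:
            claimed.extend(batch)
            attempt = 0
        else:
            sleep(next_delay(attempt, base, cap, rng))
            attempt += 1
    return claimed
```

In a real worker, `dequeue` would execute the query in a transaction and `sleep` would be `time.sleep`; the jitter keeps many workers from waking up and hitting the table at the same instant.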

Would love to hear feedback from you and others building similar systems.

[1] https://github.com/dbos-inc/dbos-transact-py


Nice! I'm using DBOS and am a little active on the discord. I was just wondering how y'all handled this under the hood. Glad to hear I don't have to worry much about this issue

Why not read the WAL?

We considered using WAL for change tracking in DBOS, but it requires careful setup and maintenance of replication slots, which may lead to unbounded disk growth if misconfigured. Since DBOS is designed to bolt onto users' existing Postgres instances (we don't manage their data), we chose a simpler, less intrusive approach that doesn't require a replication setup.

Plus, for queues, it's so much easier to leverage database constraints and transactions to implement global concurrency limits, rate limits, and deduplication.
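As a rough illustration of the deduplication part, a unique constraint does most of the work. This is a sketch under assumptions: it uses SQLite from the Python standard library as a stand-in for Postgres, and the `tasks` table, the `dedup_key` column, and the `enqueue` helper are hypothetical names, not DBOS's schema or API.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical tasks table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE tasks (
            id INTEGER PRIMARY KEY,
            queue_name TEXT NOT NULL,
            dedup_key TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending',
            UNIQUE (queue_name, dedup_key)
        )
        """
    )
    return conn


def enqueue(conn, queue_name, dedup_key):
    """Insert a task; return False if the same key is already queued.

    The database rejects the duplicate, so no application-level
    check-then-insert race is possible.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tasks (queue_name, dedup_key) VALUES (?, ?)",
                (queue_name, dedup_key),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The same constraint works unchanged in Postgres, where `INSERT ... ON CONFLICT DO NOTHING` is a common alternative to catching the error.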


I've seen several blog posts that analyze HN data to find the best time to post, but the results are all over the place. For example, the two posts below reach different recommendations (weekend vs. weekday).

- https://blog.rmotr.com/the-best-time-to-post-on-hacker-news-...

- https://medium.com/@mi.schaefer/what-is-the-best-time-to-pos...

But the precondition is that you're submitting high-quality content.


I've read a similar series from Phil back in 2020: "Writing a SQL database from scratch in Go" https://notes.eatonphil.com/database-basics.html

The code is available on GitHub: https://github.com/eatonphil/gosql (it's specifically a PostgreSQL implementation in Go).

It's cool to build a database in 3000 lines, but a real production-ready database needs testing. Would love to see some coverage of correctness and reliability tests. For example, SQLite has about 590 times as much test code as library code. (https://www.sqlite.org/testing.html)


In addition to TypeORM, DBOS supports several popular ORMs:

- Drizzle (we're also a sponsor of Drizzle): https://docs.dbos.dev/typescript/tutorials/orms/using-drizzl...

- Knex: https://docs.dbos.dev/typescript/tutorials/orms/using-knex

- Prisma: https://docs.dbos.dev/typescript/tutorials/orms/using-prisma

More ORM support is on the way.


Why not always default to using transactions?


DBOS always uses transactions to perform database operations. If you're writing a function that performs database operations, you can use the @DBOS.transaction() decorator to wrap the function so that DBOS's bookkeeping records commit in the same transaction as your operation.

However, if you're interfacing with a third-party API, then that wouldn't be part of a database transaction (you'll use @DBOS.step instead). The reason is that you don't want to hold database locks when you're not performing database operations.
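Here is a toy model of why committing the bookkeeping record in the same transaction matters. It is only a sketch: it uses SQLite from the standard library in place of Postgres, and the `accounts` and `step_log` tables and the `run_transactional_step` helper are made up for illustration, not DBOS internals.

```python
import sqlite3


def setup():
    """Create a hypothetical user table and a bookkeeping table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE step_log ("
        "workflow_id TEXT, step_no INTEGER, PRIMARY KEY (workflow_id, step_no))"
    )
    conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
    conn.commit()
    return conn


def run_transactional_step(conn, workflow_id, step_no, operation):
    """Run operation(conn) and record the step in one transaction.

    Either both writes commit or both roll back, so a step that is
    already logged cannot apply its effect a second time.
    """
    with conn:
        operation(conn)
        conn.execute(
            "INSERT INTO step_log VALUES (?, ?)", (workflow_id, step_no)
        )


def debit(amount):
    """Build an operation that debits the hypothetical 'alice' account."""
    def op(conn):
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
            (amount,),
        )
    return op


def balance(conn):
    return conn.execute(
        "SELECT balance FROM accounts WHERE name = 'alice'"
    ).fetchone()[0]
```

Running the same `(workflow_id, step_no)` twice makes the second log insert fail, which rolls back the second debit too. A call to a third-party API has no such rollback, which is why it stays outside the transaction.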


Hello! I'm a co-founder at DBOS here and I'm happy to answer any questions :)


Hi! How does it perform under heavy load and with thousands of workflows trying to run concurrently since it relies on Postgres for a lot of things (including using a transaction)? In the end it seems that if I have an application with lots of distributed workers trying to run workflows, I'll still be limited by the CPU/memory of the DB.


Hi there, I think I might have found a typo in your example class in the github README. In the class's `workflow` method, shouldn't we be `await`-ing those steps?


Nice catch. Fixing it :)


Can you change the workflow code for a running workflow that has already advanced some steps? What support does DBOS have for workflow evolution?


It's not recommended: the assumed model is that every workflow finishes on the code version it started on. This is managed automatically in our hosted version (DBOS Cloud) and there's an API for self-hosting: https://docs.dbos.dev/typescript/tutorials/development/self-...

That said, we know sometimes you have to do surgery on a long-running workflow, and we're looking at adding better tooling for it. It's completely doable because all the state is stored in Postgres tables (https://docs.dbos.dev/explanations/system-tables).


I know this might sound scripted or clichéd, but what is the use case for DBOS?


The main use case is to build reliable programs. For example, orchestrating long-running workflows, running cron jobs, and orchestrating AI agents with human-in-the-loop.

DBOS makes external asynchronous API calls reliable and crashproof, without needing to rely on an external orchestration service.


How do you persist execution state? Does it hook into the Python interpreter to capture referenced variables/data structures etc, so they are available when the state needs to be restored?


That work is done by the decorators! They wrap around your functions and store the execution state of your workflows in Postgres, specifically:

- Which workflows are executing

- What their inputs were

- Which steps have completed

- What their outputs were

Here's a reference for the Postgres tables DBOS uses to manage that state: https://docs.dbos.dev/explanations/system-tables
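To show the idea without any interpreter hooks, here is a toy decorator that checkpoints step outputs and replays them on re-execution. It is a sketch under assumptions: the `Checkpoints` class and its in-memory dict are hypothetical stand-ins for the Postgres tables, and this is not DBOS's implementation.

```python
import functools
import json


class Checkpoints:
    """Toy in-memory store; a real system would persist this in a database."""

    def __init__(self):
        self.outputs = {}  # (workflow_id, step_no) -> JSON-encoded output
        self.workflow_id = None
        self.step_no = 0

    def begin(self, workflow_id):
        """Start (or restart) a workflow; step numbering begins at 0."""
        self.workflow_id = workflow_id
        self.step_no = 0

    def step(self, fn):
        """Wrap fn so its output is saved once and reused on replay."""
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            key = (self.workflow_id, self.step_no)
            self.step_no += 1
            if key in self.outputs:
                # Step already completed in an earlier run: skip the call.
                return json.loads(self.outputs[key])
            result = fn(*args, **kwargs)
            self.outputs[key] = json.dumps(result)
            return result
        return wrapper


store = Checkpoints()
calls = []  # records which steps actually executed


@store.step
def fetch(x):
    calls.append(x)
    return {"value": x * 2}


def workflow(workflow_id, x):
    store.begin(workflow_id)
    first = fetch(x)
    second = fetch(first["value"])
    return second["value"]
```

Calling `workflow("wf-1", 3)` a second time returns the same result without running `fetch` again, because both step outputs are already stored. Only inputs and outputs are serialized, so no local variables or interpreter state need to be captured.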


All of this seems like it would fit any transactional key-value store.


Hai, really cool project! This is something I can actually use.


About workflow recovery: if I'm running multiple instances of my app that uses DBOS and they all crash, how do you divide the work of retrying pending workflows?


Each workflow is tagged by the executor ID that runs it. You can command each new executor to handle a subset of the pending workflows. This is done automatically on DBOS Cloud. Here's the self-hosting guide: https://docs.dbos.dev/typescript/tutorials/development/self-...
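One way to picture the division of work is below. This is a hypothetical sketch, not the DBOS API: the `plan_recovery` function and the round-robin policy are assumptions. The only part taken from the description above is that each pending workflow carries the ID of the executor that was running it.

```python
from collections import defaultdict


def plan_recovery(pending, new_executors):
    """Assign pending workflows to new executors.

    pending: list of (workflow_id, old_executor_id) pairs.
    new_executors: non-empty list of new executor IDs.
    All workflows from one crashed executor go to the same new executor;
    crashed executors are spread round-robin across the new ones.
    """
    by_old = defaultdict(list)
    for workflow_id, old_executor in pending:
        by_old[old_executor].append(workflow_id)

    plan = {executor: [] for executor in new_executors}
    for i, old_executor in enumerate(sorted(by_old)):
        target = new_executors[i % len(new_executors)]
        plan[target].extend(by_old[old_executor])
    return plan
```

Because every pending workflow lands in exactly one bucket, no two new executors retry the same workflow.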


FYI the “Build Crashproof Apps” button in your docs doesn’t do anything.


You'll need to click either the Python or TypeScript icon. We support both languages and will add more icons there.


Thanks, the icons work!

I was originally looking at the docs to see if there was any information on multi-instance (horizontally scaled) apps. Is this supported? If so, how does that work?


Yeah, DBOS Cloud automatically (horizontally) scales your apps. For self-hosting, you can spin up multiple instances and connect them to the same Postgres database. For fan-out patterns, you may leverage DBOS Queues. This works because DBOS uses Postgres for coordination, rate limiting, and concurrency control. For example, you can enqueue tasks that are processed by multiple instances; DBOS makes sure that each task is dequeued by one instance.

Docs for Queues and Parallelism: https://docs.dbos.dev/typescript/tutorials/queue-tutorial


The article nicely explains how to build a minimalist OS, and it works great as intro material. I think understanding basic OS concepts is essential for performance tuning and debugging.


I noticed a bunch of downvotes. Apologies for being unfamiliar with the rules here (I've always read HN, but I'm new to commenting). I should've added a lot more detail to my previous comment and been more specific. Any other guides would be helpful too. I'll be careful in the future.

When I studied operating systems, I followed MIT 6.828 (https://pdos.csail.mit.edu/6.828/2017/overview.html) and implemented a small OS called JOS, based on xv6. So if you're looking for a teaching OS for x86, check it out.


I suspect you were downvoted because your comment sounded like it was generated by an LLM.


Exactly, you have to (vaguely) know what you’re looking for and have some basic ideas of what algorithms would work. AI is good at helping with syntax stuff but not really good at thinking.


I think truncating down allows us to procrastinate a little more :)


I thought it was about LLM training but it’s actually prompt engineering.


I'm thinking about training next! But deepseek is so good already


Great article. It clearly shows that the devil is in the details :) Would love to see another one on LSM-trees, and a comparison between B-trees and LSM-trees.

