Hi there, I think I might have found a typo in your example class in the github README. In the class's `workflow` method, shouldn't we be `await`-ing those steps?
It's not recommended--the assumed model is that every workflow finishes on the code version it started. This is managed automatically in our hosted version (DBOS Cloud) and there's an API for self-hosting: https://docs.dbos.dev/typescript/tutorials/development/self-...
That said, we know sometimes you have to do surgery on a long-running workflow, and we're looking at adding better tooling for it. It's completely doable because all the state is stored in Postgres tables (https://docs.dbos.dev/explanations/system-tables).
The main use case is to build reliable programs. For example, orchestrating long-running workflows, running cron jobs, and orchestrating AI agents with human-in-the-loop.
DBOS makes external asynchronous API calls reliable and crashproof, without needing to rely on an external orchestration service.
How do you persist execution state? Does it hook into the Python interpreter to capture referenced variables/data structures etc, so they are available when the state needs to be restored?
About workflow recovery: if I'm running multiple instance of my app that uses DBOS and they all crash, how do you divide the work of retrying pending workflows?
Each workflow is tagged by the executor ID that runs it. You can command each new executor to handle a subset of the pending workflows. This is done automatically on DBOS Cloud. Here's the self-hosting guide: https://docs.dbos.dev/typescript/tutorials/development/self-...
I was originally looking at the docs to see if there was any information on multi-instance (horizontally scaled) apps. Is this supported? If so, how does that work?
Yeah, DBOS Cloud automatically (horizontally) scales your apps. For self-hosting, you can spin up multiple instances and connect them to the same Postgres database. For fan-out patterns, you may leverage DBOS Queues. This works because DBOS uses Postgres for coordination, rate limiting, and concurrency control. For example, you can enqueue tasks that are processed by multiple instances; DBOS makes sure that each task is dequeued by one instance.