DBOS looks simple (good), but from the docs below, executor elasticity appears to be locked behind license purchase. So it truly is like docker compose, good and bad parts?
>When self-hosting in a distributed setting without Conductor, it is important to manage workflow recovery so that when an executor crashes, restarts, or is shut down, its workflows are recovered. You should assign each executor running a DBOS application an executor ID through DBOS configuration. Each workflow is tagged with the ID of the executor that started it. When an application with an executor ID restarts, it only recovers pending workflows assigned to that executor ID.
This is a good question! No, it's not like docker compose (I imagine you implied the swarm and hub pull limits?)
DBOS Conductor is an out-of-band management service that, IIRC, mainly helps you observe your DBOS setup and recover from failures seamlessly. As far as I could see, it's not necessary for using DBOS workflows and queues. Don't quote me, though; reach out on their forum to verify in case I'm missing certain use cases.
Personally, I do not use DBOS Conductor. I have my own observability setup using Grafana/VictoriaMetrics, as my workflows are instrumented with OTel. I had initially set Conductor up for development (it looked to be free for development, although I recall some major limitations on how many workflows etc., which is why I put together my own alternate monitoring setup).
They also have a very reasonably priced cloud hosted DBOS Conductor. I think my first 30 days were completely free and then they moved me to a "hobby" tier. It's a fantastic way to help decide whether it's for you.
I believe DBOS Conductor is how DBOS pays the bills, but you can use DBOS workflows and queues without limits and without DBOS Conductor. If you don't want to pay for Conductor (their out-of-band management service), you can put together your own just fine, like I did. My own Grafana/VictoriaMetrics setup answers my questions, but I would imagine Claude/Codex/Cursor should be able to put together something fairly useful if you didn't want to go down my route.
> executor elasticity appears to be locked behind license purchase
DBOS has designed their system to be extremely flexible and extensible. While yes, Conductor can absolutely manage your executors for you, it's not the only thing that can. You're not limited to using Conductor. As I said, I manage my own - everything you need to know to do so is in the code and documentation. They even have a document for LLMs and agents. I have had to interact with the DBOS team 0 times to set everything up.
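To make the recovery rule quoted from the docs concrete, here is a toy model in plain Python. It is not the DBOS API: `Workflow`, `workflows_to_recover`, and `reassign` are hypothetical names, and the status strings are assumptions for illustration. It only shows the rule that an executor recovers pending workflows tagged with its own ID, plus the kind of hand-off your own tooling would presumably need when an executor ID never comes back.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    workflow_id: str
    executor_id: str  # ID of the executor that started the workflow
    status: str       # "PENDING" or "SUCCESS" in this toy model (assumed values)


def workflows_to_recover(workflows, executor_id):
    """Return pending workflows tagged with the given executor ID."""
    return [
        w for w in workflows
        if w.status == "PENDING" and w.executor_id == executor_id
    ]


def reassign(workflows, departed_executor_id, live_executor_id):
    """Hypothetical helper: hand a departed executor's pending workflows to a live one."""
    for w in workflows:
        if w.status == "PENDING" and w.executor_id == departed_executor_id:
            w.executor_id = live_executor_id


table = [
    Workflow("wf-1", "executor-a", "PENDING"),
    Workflow("wf-2", "executor-a", "SUCCESS"),
    Workflow("wf-3", "executor-b", "PENDING"),
]
```

In this model, restarting `executor-a` picks up only `wf-1`. If `executor-b` is scaled away for good, `wf-3` stays pending until something like `reassign` runs. How DBOS itself exposes that step is worth checking in their docs.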
I prefer this business model (an optional tool, Conductor, is paid) vs. DBOS offering everything across the stack on a "free tier" but with caps on DBOS workflows and queues. In their current business model, DBOS workflows and queues are completely uncapped (at least from what I can make out).
If you do reach out to them, I would appreciate it if you let me know of anything to the contrary.
Well, in my early days programming Python I wrote a lot(!!) of code assuming non-concurrent execution, and some of that code will break in the future with GIL removal. Hopefully the Python devs keep these important changes opt-in.
I expect to have at least 15 more years in the workforce and I hate that I have to live through this "revolution". I worry about what the final balance of lives improved vs. lives worsened will be.
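As a minimal sketch of the kind of code at risk: a shared counter whose `value += 1` is a read-modify-write and is not guaranteed to be atomic. The `Counter` class and `run` helper are made up for illustration. Guarding the update with a `threading.Lock` keeps the result correct whether or not threads truly run in parallel.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the increments would be lost.
        with self._lock:
            self.value += 1


def run(n_threads=4, n_increments=10_000):
    """Increment one shared counter from several threads and return the total."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run()` should always return `n_threads * n_increments`. Code that skipped the lock because "only one thread runs at a time anyway" is the part that would need auditing.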
Congrats on the progress!
What is the behavior of PgDog if it receives some sort of query it can't currently handle properly? Is there a linter/static analysis tool I can use to evaluate if my query will work?
The current behavior unfortunately is to just let it through and return an incorrect result. We are adding more checks here and rely heavily on early adopters to have a decent test suite before launching their apps to prod.
That being said, we do have this [1]:
[general]
expanded_explain = true
This will modify the output of EXPLAIN queries to return routing decisions made by PgDog. If you see that your query is "direct-to-shard", i.e. goes to only one shard, you can be certain that it'll work as expected. These queries will talk to only one database and don't require us to manipulate the result or assemble results from multiple shards.
For cross-shard queries, you'll need your own integration tests, for now. We'll add checks here shortly. We have a decent CI suite as well, but it doesn't cover everything. Every time we look at that part of the code, we just end up adding more features, like the recent support for LIMIT x OFFSET y (PgDog rewrites it to LIMIT x + y and applies the offset calculation in memory).
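The LIMIT/OFFSET rewrite can be sketched roughly like this. It is a simplified illustration in Python, not PgDog's actual code: it assumes the query has an ORDER BY on a single sortable value, and `cross_shard_page` is a made-up name. Each shard returns its first `limit + offset` rows, the results are merged in order, and the offset is applied in memory.

```python
import heapq
from itertools import islice


def cross_shard_page(shards, limit, offset):
    """Emulate `ORDER BY value LIMIT limit OFFSET offset` across several shards."""
    # Each shard only needs to return its first (limit + offset) rows:
    # any row in the global top (limit + offset) is also in its shard's top.
    per_shard = [sorted(rows)[: limit + offset] for rows in shards]
    # Merge the already-sorted per-shard results into one ordered stream.
    merged = heapq.merge(*per_shard)
    # Apply the offset in memory, then take `limit` rows.
    return list(islice(merged, offset, offset + limit))


shards = [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
```

Sending the original `OFFSET` to each shard would skip rows per shard rather than globally, which is why the offset has to move to the merge step.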
The distinction is clearer when indexing actual text and applying tokenization. A "typical" index on a database column goes like "column(value => rows)". When people mention inverted indexes, it's usually in the context of full-text search, where the "column value" usually goes through tokenization and you build an index for all N tokens of a column: "column:(token 1 => rows)", "column:(token 2 => rows)",... "column:(token N => rows)".
Not the person you asked, but at work (we are a CRM platform) we allow our clients to arbitrarily query their userbase to find matching users for marketing campaigns (email, SMS, WhatsApp). These campaigns can sometimes target a few hundred thousand people. We are on a really ancient version of ES, and it sucks at this job in terms of throughput. Some experimenting with BigQuery indicates it is so much better at mass exporting.
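A small sketch of the two shapes, using whitespace splitting as a stand-in for a real tokenizer. The sample rows and function names are made up for illustration.

```python
from collections import defaultdict

rows = {1: "red apple", 2: "green apple", 3: "red apple"}


def value_index(rows):
    """'Typical' index: whole column value => row IDs."""
    index = defaultdict(set)
    for row_id, value in rows.items():
        index[value].add(row_id)
    return index


def inverted_index(rows):
    """Inverted index: each token of the column value => row IDs."""
    index = defaultdict(set)
    for row_id, value in rows.items():
        for token in value.lower().split():
            index[token].add(row_id)
    return index
```

Here `value_index(rows)` can answer "which rows equal 'red apple'", while `inverted_index(rows)` can answer "which rows contain 'apple'" without scanning every value.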
Fair; my question was mostly in the context of ANN, since that was the discussion point - I have to assume ES (as a search engine) would not necessarily be the right tool for data warehousing types of workloads.
https://docs.dbos.dev/production/workflow-recovery#recovery-...
>When self-hosting in a distributed setting without Conductor, it is important to manage workflow recovery so that when an executor crashes, restarts, or is shut down, its workflows are recovered. You should assign each executor running a DBOS application an executor ID through DBOS configuration. Each workflow is tagged with the ID of the executor that started it. When an application with an executor ID restarts, it only recovers pending workflows assigned to that executor ID.
https://docs.dbos.dev/production/hosting-conductor
> Self-hosted Conductor is released under a proprietary license. Self-hosting Conductor for commercial or production use requires a paid license key.