I’m familiar with Xorq. One of the features of the Xorq library that I find interesting is that it catalogs data processing (compute) expressions as it compiles them, along with call lineage. That makes reuse easier for both SQL and non-SQL processing.
Making software durable and resilient to failures can be a big architectural investment. But what if it wasn't? Would we make everything durable by default?
Xorq is a Python library (https://github.com/xorq-labs/xorq) that provides a declarative syntax for defining portable, composite compute stacks for different AI/ML use cases.
In this example, Xorq is used to compose an open source FeatureHouse that runs on DuckLake and interfaces via Apache Arrow Flight.
The post explains how:
- The FeatureHouse is composed with Xorq
- Feature leakage is avoided
- The FeatureHouse can be ported to any underlying query engine (e.g., Iceberg)
- Observability and lineage are handled
- Feast can be integrated with it
Composite data engines, such as the one in this Trino-DuckDB example, can be created with the xorq framework to simplify multi-engine data pipelines. This is useful when a dataset's native query engine does not support a required operation.
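To make the pattern concrete, here is a minimal sketch using plain Ibis (which xorq builds on), not xorq's own API: SQLite stands in for the dataset's native engine and DuckDB supplies an aggregate the source lacks. xorq adds caching, lineage, and moving expressions between backends on top of this idea, so treat the names below as an illustration of the pattern rather than xorq usage.

```python
# Sketch of the multi-engine pattern with plain Ibis (not xorq's API):
# the source engine lacks an aggregate, so the data is handed to DuckDB.
import ibis
import pandas as pd

# 1. "Native" engine: SQLite (assumed in-memory when no path is given),
#    standing in for something like Trino.
src = ibis.sqlite.connect()
src.create_table("events", pd.DataFrame({
    "user": ["a", "a", "b", "b", "b"],
    "amount": [10.0, 20.0, 5.0, 7.0, 9.0],
}))

# 2. SQLite has no MEDIAN aggregate, so materialize the table and
#    register it into DuckDB, which does support it.
events_df = src.table("events").execute()
duck = ibis.duckdb.connect()
duck.create_table("events", events_df)

# 3. Run the operation the source engine could not express natively.
events = duck.table("events")
expr = events.group_by("user").aggregate(median_amount=events.amount.median())
print(expr.execute())
```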
(from DBOS) Great question. For better or worse, it seems like discussions about workflows and durable execution often intertwine, and they usually end up debating which types of jobs or workflows actually require durable execution.
But really, any system that runs the risk of failing or committing an error should have something in place to observe it, undo it, or resume it. Your point about "big enough scale" is true - you can write your own code to handle that, and manually troubleshoot and repair corrupted data up to a certain point. But that takes time.
By making durable execution more lightweight and seamless (a la DBOS or Restate), using a durable execution library becomes just good programming practice for any application where the cost of failure is a concern.
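As a rough sketch of what "lightweight" looks like here, this follows the DBOS Python quickstart as I understand it; the decorator names come from those docs, but the configuration details (Postgres URL, dbos-config.yaml) are assumptions, so verify against the current API.

```python
# Hedged sketch: decorator names per the DBOS Python quickstart; configuration
# (Postgres URL, dbos-config.yaml) is assumed rather than shown exactly.
from dbos import DBOS

DBOS()  # assumed to pick up connection settings from dbos-config.yaml / environment

@DBOS.step()
def charge_card(order_id: str) -> None:
    # Each completed step is checkpointed, so a crash after this point
    # resumes the workflow without re-running it.
    print(f"charged card for {order_id}")

@DBOS.step()
def send_receipt(order_id: str) -> None:
    print(f"sent receipt for {order_id}")

@DBOS.workflow()
def fulfill_order(order_id: str) -> None:
    # Ordinary Python control flow; durability comes from the decorators,
    # not from hand-rolled checkpoint/retry/repair code.
    charge_card(order_id)
    send_receipt(order_id)

if __name__ == "__main__":
    DBOS.launch()
    fulfill_order("order-123")
```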
No, they don't. They work just like Temporal and the others, which send the durable state to a separate store. I totally understand that this is good for the tool's business model, since many users will end up paying for the separate store instead of keeping the state in the same DB where the application already lives, without paying extra for it. After all, the amount of data needed to keep that state should be relatively small, so no one would provision a separate DB for it if the SDK didn't force them to.
My bad, I had misunderstood that. You're right, and thank you for sharing it.
That is a great differentiator. I see you have Python and JS/TS libraries, but my team is working with Go right now, so I'll have to pass. However, I'll keep it on my radar. Being able to wrap DB commands in the same transaction as the workflow store is awesome! [2]
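For anyone following along, my (hedged) reading of that feature in the Python SDK is that a @DBOS.transaction() function runs your SQL and the workflow-state bookkeeping in one Postgres transaction via DBOS.sql_session; names follow the DBOS docs as I understand them, so double-check the current API.

```python
# Hedged sketch of the same-transaction point: per my reading of the DBOS Python
# docs, DBOS.sql_session is a SQLAlchemy session whose transaction also carries
# the workflow-state checkpoint, so both commit or roll back together.
from dbos import DBOS
from sqlalchemy import text

@DBOS.transaction()
def record_payment(order_id: str, amount: float) -> None:
    # The INSERT and the step's completion record share one Postgres transaction,
    # so a crash can't leave the business data and workflow state disagreeing.
    # The payments table is a hypothetical application table.
    DBOS.sql_session.execute(
        text("INSERT INTO payments (order_id, amount) VALUES (:order_id, :amount)"),
        {"order_id": order_id, "amount": amount},
    )
```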
It depends what part of DBOS you're looking at.
DBOS Transact is the framework (TypeScript) used to develop apps/workflows such as those in the benchmark.
DBOS Cloud hosts and executes DBOS Transact apps/workflows, a la AWS Lambda + Step Functions. So it is an apples-to-apples comparison: functionally, DBOS Cloud is like Lambda and Step Functions in one.