Hacker News
Show HN: Workflow orchestrator in Golang (github.com/harshadmanglani)
82 points by harshadmanglani on March 4, 2024 | 23 comments
A brief overview: 1. Workflow steps share a running context, with access to the data they require. 2. Steps in the workflow (builders) are chained together based on a topologically sorted graph built from their predefined inputs & outputs. 3. No servers spin up (unlike Conductor/Cadence) - the orchestrator is low level and meant for simplifying business logic. 4. Before/After listeners for each step.
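A minimal sketch of point 2, for readers who want to see the idea in code. This is not the project's actual API: the `Builder` type, its fields, and the `Order` function are hypothetical names chosen for illustration. It assumes each builder declares the data it consumes and the single piece of data it produces, that builder names are unique, and that any input nobody produces is supplied from outside the workflow. The ordering uses Kahn's algorithm.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Builder is a hypothetical workflow step: it consumes named data
// and produces exactly one named piece of data.
type Builder struct {
	Name     string
	Consumes []string
	Produces string
}

// Order returns builder names so that every builder appears after the
// builders that produce its inputs (Kahn's algorithm).
func Order(builders []Builder) ([]string, error) {
	producer := map[string]string{} // data name -> builder producing it
	for _, b := range builders {
		producer[b.Produces] = b.Name
	}

	indegree := map[string]int{}
	dependents := map[string][]string{}
	for _, b := range builders {
		indegree[b.Name] += 0 // make sure every builder has an entry
		for _, in := range b.Consumes {
			// Inputs without a producer are treated as external data.
			if p, ok := producer[in]; ok {
				dependents[p] = append(dependents[p], b.Name)
				indegree[b.Name]++
			}
		}
	}

	var ready []string
	for name, d := range indegree {
		if d == 0 {
			ready = append(ready, name)
		}
	}
	sort.Strings(ready) // map iteration is random; sort for a stable result

	var order []string
	for len(ready) > 0 {
		n := ready[0]
		ready = ready[1:]
		order = append(order, n)
		for _, d := range dependents[n] {
			indegree[d]--
			if indegree[d] == 0 {
				ready = append(ready, d)
			}
		}
	}
	if len(order) != len(builders) {
		return nil, errors.New("cycle detected between builders")
	}
	return order, nil
}

func main() {
	order, err := Order([]Builder{
		{Name: "MakeInvoice", Consumes: []string{"Cart", "Total"}, Produces: "Invoice"},
		{Name: "PriceCart", Consumes: []string{"Cart"}, Produces: "Total"},
		{Name: "LoadCart", Consumes: []string{"Request"}, Produces: "Cart"},
	})
	fmt.Println(order, err)
}
```

Declaring inputs and outputs per step means the chain never has to be written by hand, and a cycle is reported before anything runs.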

Would love to hear your thoughts and feedback!




Looks like a great side project, congrats on launching. My basic feedback is that you should set expectations on the project page by answering the following questions:

- Why would someone use this instead of Airflow/Cadence/Temporal/Databuilderframework?

- What does this look like when it's used? Most frameworks provide some kind of example project, you should too.

- Related, but more specifically, what does the `IDataStore` interface contract mean? Beyond the two functions that I have to implement, are there any considerations related to the overall performance/scalability/durability of the system? Would it make sense to use a disk-backed store, or Redis, or Postgres?

- How do I observe the system? Which workflows are running, which have failed, what the current state is, etc. Are there metrics? Logs?

All of this is based on the assumption you want people to adopt this framework. If it's just a cool side project, that's fine too, but you should probably try to set that expectation in the README.


Super helpful feedback @peter. Thanks a ton!

I've noted all of these and I'll modify the README to include them. Thank you for taking the time to go through it in such detail :)

Given that:

- the current state of a workflow is inherently invisible

- all we can really check in the DB is whether, for a workflow, the available data contains the target data;

how would I address observability concerns?

This stems from the lack of workflow states, which in turn follows from the low level of abstraction the framework operates at. User-defined workflow states would do the trick, but I suppose that would take writing some more code after integrating the framework.


I would recommend either updating your framework to allow for instrumentation (logs/metrics/etc) or showing how to add that instrumentation in "user defined workflow states" via an example application.

The current pitch, which is "this is a workflow orchestration framework that does not allow for any monitoring or observability", is a complete non-starter. You may want to consider how other projects keep track of workflow state and allow for it to be instrumented.


Fair enough, will do. Thanks!


I don't generally believe in orchestrators (they miss the point, things are not single computers and neither is the world) and so I have that feedback here but also for:

> Airflow/Cadence/Temporal/Databuilderframework?

Which don't really think about modelling non-centralized things.

This of course doesn't mean they're not useful, it's just that they don't have what I believe is a good long-term value proposition.

I'm incredibly biased because I'm working on programmatic, real-time modelling of distributed systems with https://github.com/purpleidea/mgmt/


Can you elaborate on what you mean by modelling non-centralized things?


I don't think you understand what Temporal does.


Nice. It is great to see native, lightweight, open-source solutions hit this space (I hope it is open source, considering that someone said there is no license file yet). For what it's worth, I have built something similar to this but for the Java programming language. You can find it here -> https://github.com/americanexpress/unify-flowret. My reason for building something like this was that the product market is just too unwieldy to work with and has multiple layers of complexity which most of the time can be done away with. Just my opinion.

On a side note, you will at some point in time have to deal with multi version workflows. I know that this is one feature that limits wide adoption of an orchestrator.


Yes, I've added the license.

Thanks for sharing your work Deepak, you have some pretty extensive documentation! Funnily enough, the Golang framework is almost a clone of https://github.com/flipkart-incubator/databuilderframework (another orchestration engine in Java).

This is another HN post that extensively covers almost every major orchestrator in the market: https://news.ycombinator.com/item?id=24216317

As for multi-version workflows, I suppose that will have to be a tradeoff between maintaining somewhat redundant code and adding workflows as WorkflowV1 and WorkflowV2, stitching together the relevant steps in the respective versions (this reduces redundancy to some extent but won't eliminate it).
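A rough sketch of the WorkflowV1/WorkflowV2 idea, assuming each running instance records the version it started with so that in-flight instances keep finishing on their original definition. The `registry`, `Instance`, and `Resume` names, and the step functions, are hypothetical and only illustrate how shared steps can be reused across versions.

```go
package main

import "fmt"

// Step is a hypothetical step that reads and writes a shared context.
type Step func(ctx map[string]string)

// Steps shared by both versions.
func validate(ctx map[string]string) { ctx["validated"] = "true" }
func notify(ctx map[string]string)   { ctx["notified"] = "true" }

// The only step that differs between versions.
func chargeV1(ctx map[string]string) { ctx["charge"] = "flat" }
func chargeV2(ctx map[string]string) { ctx["charge"] = "tiered" }

// registry maps a workflow version to its ordered steps.
var registry = map[string][]Step{
	"v1": {validate, chargeV1, notify},
	"v2": {validate, chargeV2, notify},
}

// Instance is the persisted state of one workflow run. Version is fixed
// when the instance is created and never changes afterwards.
type Instance struct {
	Version string
	Next    int // index of the next step to run
	Data    map[string]string
}

// Resume continues an instance using the definition it started with.
func Resume(inst *Instance) error {
	steps, ok := registry[inst.Version]
	if !ok {
		return fmt.Errorf("unknown workflow version %q", inst.Version)
	}
	for inst.Next < len(steps) {
		steps[inst.Next](inst.Data)
		inst.Next++
	}
	return nil
}

func main() {
	// An old instance that already ran "validate" before v2 was deployed.
	old := &Instance{Version: "v1", Next: 1, Data: map[string]string{"validated": "true"}}
	fresh := &Instance{Version: "v2", Data: map[string]string{}}
	fmt.Println(Resume(old), old.Data["charge"])
	fmt.Println(Resume(fresh), fresh.Data["charge"])
}
```

The cost is that old versions must stay in the registry until their last instance completes, which is the redundancy tradeoff described above.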


Love this space. I’ve built a few versions of this in Go, glad to see some Go-native frameworks.

Would also love to see some examples.

Concerns I’ve had to deal with in the past (if these are solved, examples would go a long way):

- max queue size between elements is important. Don’t want the database reader role to outrun the workers, but also never want them desaturated

- some dynamic limits, both local to a role (limit DB readers to limit the connection pool) and global (how much CPU and memory share a job needs before starting)
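In plain Go, the first concern and the role-local half of the second can be sketched with a buffered channel (bounded queue, so readers block when workers fall behind) and a second channel used as a semaphore (cap on concurrent readers). This is a generic illustration, not something the framework is known to provide: `RunPipeline` and its parameters are hypothetical, the "DB read" is a stand-in, and `maxReaders` and `workers` are assumed to be at least 1. Global CPU/memory admission is not covered.

```go
package main

import (
	"fmt"
	"sync"
)

// RunPipeline pushes items through a bounded queue to a pool of workers.
// It returns the sum of processed values and the peak number of readers
// that were active at the same time.
func RunPipeline(items []int, queueSize, maxReaders, workers int) (int, int) {
	queue := make(chan int, queueSize)             // bounded: full queue blocks readers
	readerSlots := make(chan struct{}, maxReaders) // semaphore for reader concurrency

	var mu sync.Mutex
	total, active, peak := 0, 0, 0

	// Start workers first so readers can always make progress.
	var workerWG sync.WaitGroup
	for i := 0; i < workers; i++ {
		workerWG.Add(1)
		go func() {
			defer workerWG.Done()
			for v := range queue {
				mu.Lock()
				total += v * 2 // stand-in for real processing
				mu.Unlock()
			}
		}()
	}

	var readerWG sync.WaitGroup
	for _, it := range items {
		readerSlots <- struct{}{} // acquire a slot before spawning a reader
		readerWG.Add(1)
		go func(v int) {
			defer readerWG.Done()
			defer func() { <-readerSlots }() // release the slot

			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()

			queue <- v // stand-in for a DB read; blocks while the queue is full

			mu.Lock()
			active--
			mu.Unlock()
		}(it)
	}

	readerWG.Wait()
	close(queue) // no more input; lets workers drain and exit
	workerWG.Wait()
	return total, peak
}

func main() {
	total, peak := RunPipeline([]int{1, 2, 3, 4, 5}, 2, 2, 3)
	fmt.Println("total:", total, "peak readers:", peak)
}
```

Tuning `queueSize` is the knob for the saturation concern: too small and workers idle waiting on readers, too large and readers run far ahead of the workers.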


Thanks for the inputs! I'll add some examples in the documentation.

- with control over the database reader yourself, I think you should be able to find a way around saturation/desaturation?

- again, I'm fairly certain you can limit DB readers in the DataStore interface you pass to the orchestrator. I'll think about the CPU and memory share and whether there's a way to expose that.

A major concern I've always had with workflow orchestrators is the versioning of workflows. Think about long running workflows (>2 days). If you change your workflow logic, what happens to the existing ones that haven't completed yet?

Everyone handles this differently, and I've been thinking about a generic way to do this. Thoughts?


This looks very cool for embedding workflows in a native Go project. There is also Trackman (https://github.com/cloud66-oss/trackman), which is mostly built for workflow-based command-line execution.


So many tools in this space! This one looks a little bit like go-task, but it seems maybe better for production workflows because of timeout support, while go-task seems more aimed at command-line work/makefile replacement.

—-

https://github.com/go-task/task


Thanks for sharing this! There is no support for timeouts yet, but it might be a good addition.


Thanks for sharing! I'll check it out.


Added docs and an example with docusaurus: https://harshadmanglani.github.io/polaris


Argh, using this would cause name collisions in my brain. https://github.com/agersant/polaris/ ^^


It seems you overlooked adding a license so that people can know under what circumstances they can use your project's code


Let me add a license, I missed it. Thanks for pointing it out!


If there isn’t a license, you can’t use it. I think it’s pretty clear.


It's the opposite. Without a license, the default copyright laws apply, meaning that the author retains all rights to their source code and no one may reproduce, distribute, or create derivative works from their work.

Source: https://docs.github.com/en/repositories/managing-your-reposi...


thanks for sharing this. that's what @withinboredom mentioned too :)


That's uhh... exactly what I said?



